Annual: 2019

PR034 »
Sign Language Translation System Based on the DE10-Nano Development Board
📁Digital Design
👤耀琦 王
 (Lanzhou Jiaotong University)
📅Oct 03, 2019
Regional Final





Description

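This sign language translation and voice interaction system is designed modularly around the DE10-Nano development board. Flex sensors and a high-precision accelerometer-based electronic inclinometer capture data along multiple dimensions, and the data is organized into an 8-dimensional data model. A naive Bayes classifier, grounded in Bayes' theorem, accurately identifies the signed gesture by choosing the class with the maximum posterior probability. The voice interaction subsystem uses iFLYTEK's cloud speech recognition engine, which enables efficient two-way communication between hearing people and deaf or mute people.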

Demo Video

  • URL: https://v.youku.com/v_show/id_XNDM5MDQ2Mjg0NA==.html?spm=a2hzp.8253869.0.0

  • Project Proposal

    1. High-level Project Description

    Purpose: Nearly 200 million people worldwide are deaf or mute. Whether they have never been able to speak or have lost the ability to hear and distinguish speech through hearing impairment, they can communicate with hearing people only through sign language, yet few hearing people know sign language. To address this communication gap, we developed a wearable smart sign language translation system. Its main functions are hand posture acquisition, gesture recognition, and voice broadcast.

    Applications: Sign language is essential for communication between deaf people and hearing people. To speed up recognition, an FPGA is used to build the smart sign language translation system. The system recognizes a deaf person's signs and speaks them aloud, so that people who do not know sign language can understand what the signer wants to express.

    Target Users: The product can currently recognize sign language accurately and play the result through a speech synthesis module, so it can be used for communication between deaf people and hearing people. Little research has been done so far on translation systems of this kind. The product is built on an FPGA, which offers significant advantages over a microcontroller.

    2. Block Diagram

     

          System flow diagram

    Algorithm flowchart

    Wearable Smart Sign Language Translation System working diagram

     

    3. Intel FPGA Virtues in Your Project

    (1) Sign language recognition with a Bayesian algorithm. The maximum a posteriori principle is used to optimize the training model through real-time feedback, which yields accurate sign language recognition.
    (2) Good wearability. Integrating the sensors, core module, communication module, and functional modules into a single glove greatly reduces the glove's size and weight and makes it comfortable to wear.
    (3) The FPGA's efficient parallel computing units increase the speed and efficiency of gesture classification.

    4. Design Introduction

    The system uses the DE10-Nano development board as its main control platform. The board handles gesture data acquisition, sign language classification with a naive Bayes classifier, and real-time voice broadcast. The glove is built on an ordinary work glove: each of the five fingers carries one resistive bend sensor that captures the hand shape while a sign is made, and a high-precision gyroscope captures the attitude and orientation of the gesture. The naive Bayes classifier uses a nearest-feature-distance approach to optimize the training model through real-time feedback, which yields accurate sign language recognition. The FPGA's efficient parallel computing units also increase the speed and efficiency of gesture classification.
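
    The maximum a posteriori decision behind a naive Bayes classifier can be sketched in a few lines of Python. This is a minimal illustration that runs on a PC, not the project's FPGA implementation: the Gaussian feature model, the function names `fit_gaussian_nb` and `predict_map`, the gesture labels, and every sample value are assumptions made for the example. The 8-element vectors mirror the 8-dimensional data model, but splitting them into five bend values and three angles is also an assumption.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels, var_floor=1e-3):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    total = len(samples)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Floor the variance so a constant feature does not divide by zero.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[y] = (math.log(n / total), means, variances)
    return model


def predict_map(model, x):
    """Return the class with the highest log posterior (MAP decision)."""
    def log_posterior(params):
        log_prior, means, variances = params
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=lambda y: log_posterior(model[y]))


# Hypothetical training data: 5 normalized bend values + 3 angles in degrees.
train_x = [
    [0.9, 0.9, 0.8, 0.9, 0.9, 5.0, -2.0, 1.0],
    [0.8, 0.9, 0.9, 0.8, 0.9, 4.0, -1.0, 0.0],
    [0.1, 0.2, 0.1, 0.1, 0.2, 40.0, 10.0, 3.0],
    [0.2, 0.1, 0.1, 0.2, 0.1, 42.0, 12.0, 2.0],
]
train_y = ["fist", "fist", "open", "open"]

model = fit_gaussian_nb(train_x, train_y)
fist_like = predict_map(model, [0.85, 0.9, 0.85, 0.85, 0.9, 4.5, -1.5, 0.5])
open_like = predict_map(model, [0.15, 0.15, 0.1, 0.15, 0.15, 41.0, 11.0, 2.5])
print(fist_like, open_like)
```

    Working in log space avoids numeric underflow when eight small likelihoods are multiplied together, and the variance floor keeps the model usable when a feature is identical across all training samples of a class.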

    5. Function Description

                

    As shown in the figure, the system consists of the DE10-Nano development board, FLEX4.5 bend sensors, an MPU6050 six-axis sensor, and a voice broadcast module. After power-on, the system initializes. Once initialization completes, the user begins making signs. Each sign triggers data acquisition, and the captured data is sent over a serial port to the DE10-Nano board. The board analyzes and matches the data several times, selects the best-matching gesture, and sends the corresponding data to the voice broadcast module, which converts it to speech.

    1 DE10-Nano function overview

    As the control core of the system, the DE10-Nano is mainly responsible for data acquisition and processing. It analyzes the acquired data and uses a k-nearest-neighbor classification algorithm to compute the distance between the trained gesture model and the gesture captured in real time, taking the minimum distance as the classification result to achieve accurate sign language recognition. The FPGA's efficient parallel computing units also increase the speed and efficiency of gesture classification.
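
    The minimum-distance step can be illustrated as follows. The sketch assumes one stored template per gesture (the 1-nearest-neighbor special case), plain Euclidean distance, and made-up template values. `nearest_template` is a hypothetical helper, and the code runs on a PC in Python rather than in FPGA logic.

```python
import math


def nearest_template(templates, sample):
    """Return (label, distance) for the template closest to the sample."""
    best_label, best_dist = None, math.inf
    for label, reference in templates.items():
        dist = math.dist(reference, sample)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical 8-dimensional templates: 5 bend values + 3 angles in degrees.
templates = {
    "hello": [0.1, 0.1, 0.1, 0.1, 0.1, 30.0, 0.0, 0.0],
    "thanks": [0.9, 0.1, 0.1, 0.9, 0.9, 10.0, 45.0, 0.0],
    "yes": [0.9, 0.9, 0.9, 0.9, 0.9, 0.0, 0.0, 20.0],
}

sample = [0.2, 0.1, 0.2, 0.1, 0.1, 28.0, 2.0, 1.0]
label, distance = nearest_template(templates, sample)
print(label, round(distance, 3))
```

    Each template comparison is independent of the others, which is what makes this step a natural fit for parallel hardware: all distances can be computed at once and reduced with a comparator tree.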

    2 Data acquisition function

    The data acquisition part consists of resistive bend sensors and a six-axis high-precision gyroscope. Each resistive bend sensor works with a peripheral circuit: its resistance changes with the degree of bending, which changes the output voltage, and this voltage provides the gesture data. The six-axis gyroscope resolves and reports the orientation and attitude of the gesture.
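
    The resistance-to-voltage conversion can be modeled with a simple voltage divider. The source does not describe the peripheral circuit, so everything here is an assumption for illustration: a 10 kΩ fixed resistor in series with the bend sensor, a 3.3 V supply, the output taken across the fixed resistor, and sensor resistances of 10 kΩ (flat) and 30 kΩ (fully bent). The helper names are hypothetical.

```python
def divider_voltage(r_flex_ohm, r_fixed_ohm=10_000.0, v_supply=3.3):
    """Voltage across the fixed resistor of a flex/fixed series divider."""
    return v_supply * r_fixed_ohm / (r_fixed_ohm + r_flex_ohm)


def normalized_bend(v_out, v_flat, v_bent):
    """Map a measured voltage to 0.0 (flat) .. 1.0 (fully bent), clamped."""
    ratio = (v_flat - v_out) / (v_flat - v_bent)
    return min(1.0, max(0.0, ratio))


# Calibration points under the assumed resistances.
v_flat = divider_voltage(10_000.0)
v_bent = divider_voltage(30_000.0)

# A half-way resistance of 20 kOhm gives a bend value between 0 and 1.
v_mid = divider_voltage(20_000.0)
bend = normalized_bend(v_mid, v_flat, v_bent)
print(round(v_flat, 3), round(v_bent, 3), round(bend, 3))
```

    Normalizing against per-finger calibration voltages keeps the five bend features on a common 0 to 1 scale, which matters for any distance-based classifier because the features are otherwise not directly comparable.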

    3 Voice broadcast function

    The voice broadcast module is iFLYTEK's YS-XFSV2 speech synthesis module. It synthesizes arbitrary Chinese and English text and analyzes the text as it does so: common formats such as digits, numbers, times, dates, and units of measurement are recognized and handled correctly according to built-in text-matching rules; common polyphonic characters are pronounced according to their context; and text that mixes Chinese and English can be read aloud in both languages.

    6. Performance Parameters

    1. Gesture recognition rate

    The system's recognition rate is essentially the same across different gestures. The team analyzed recognition performance for different gestures and applied weights to acceleration and angle to amplify the feature differences between gestures. This improves the gesture matching path and reduces the total Euclidean distance, which lowers the false-recognition and rejection rates during matching. The recognition success rate reflects the quality of the translation system. The measured average recognition rate is currently 88.1%. This parameter is still being optimized, and the optimized value will be given in a later presentation.
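
    One way to write such a weighting is as a weighted Euclidean distance. The notation below is an assumption, since the source does not state its formula: x is the measured 8-dimensional feature vector, t^(g) is the stored template for gesture g, and the w_i are non-negative weights chosen larger for the acceleration and angle features than for the others.

```latex
\[
d_w\bigl(\mathbf{x}, \mathbf{t}^{(g)}\bigr)
  = \sqrt{\sum_{i=1}^{8} w_i \,\bigl(x_i - t^{(g)}_i\bigr)^2},
\qquad
\hat{g} = \arg\min_{g}\; d_w\bigl(\mathbf{x}, \mathbf{t}^{(g)}\bigr)
\]
```

    With every w_i equal to 1 this reduces to the ordinary Euclidean distance. Raising the weight of a feature increases the separation between gestures that differ mainly in that feature.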

    2. Gesture recognition response time

    Recognition speed is the system's response time after the user makes a gesture; it varies with the stability of the test data. Recognition speed reflects overall system performance to some extent. Among the factors that determine translation speed, the quality of the selected training templates weighs heavily. In this project the data comparison is performed in the FPGA, which takes far less time than on a conventional ARM processor. The average gesture recognition time is 125 ms.

    7. Design Architecture

    Figure: Design structure diagram

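    The sign language translation system has three main parts: an information acquisition subsystem, an information transmission subsystem, and an information broadcast subsystem. The acquisition subsystem comprises hand-posture data acquisition and finger-bend data acquisition. The block diagram of the electronics is shown in the figure.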

       

     Figure: Weighted gesture classification algorithm flow



    1 Comment

    Zhou Wenyan
    A very good proposal. Please help complete the content.
    🕒 Jun 26, 2019 02:24 PM
