Research on Sign Language Recognition Methods Based on Deep Learning
Published: 2018-03-03 17:35
Topic: sign language recognition | Focus: deep learning | Source: Jilin University, 2017 master's thesis | Document type: degree thesis
【Abstract】: Sign language is an essential communication tool for deaf people and is widely used within the deaf community, and research on its complex and varied gestures also advances gesture-based human-computer interaction. Yet precisely because sign language is complex, variable, and used in unconstrained environments, its recognition has long been difficult. Traditional approaches typically require signers to wear expensive data gloves that capture hand information, or colored gloves that simplify gesture feature extraction. Such methods can reach high accuracy under restricted conditions, but they generalize poorly: switching to a new sign language dataset usually means manually re-engineering the features. This thesis introduces deep learning methods into sign language recognition. For static sign language recognition, two models based on deep convolutional neural networks are proposed: SLR-CNN1 and SLR-CNN2. SLR-CNN1 verifies the feasibility of deep convolutional networks for sign language recognition; SLR-CNN2 further improves accuracy by introducing global average pooling into the recognition model, which greatly reduces the number of parameters and helps prevent overfitting. Extensive experiments show that deep convolutional networks can automatically learn useful sign language features, including subtle variations between gestures, and can therefore recognize signs effectively. Two static recognition models suitable for practical deployment are trained with the Caffe deep learning framework. For dynamic sign language recognition, deep convolutional networks are combined with long short-term memory (LSTM) recurrent networks, yielding two models: SLR-LSRCN1 and SLR-LSRCN2. The Caffe source code is modified so that it accepts consecutive video frames as model input. Experiments show that the combination of convolutional and recurrent networks recognizes dynamic signs effectively, and a deployable dynamic recognition model is trained on this basis. Finally, to support validation of deep learning algorithms for sign language recognition, a large labeled sample library for static sign language recognition is built by combining existing databases with a self-recorded database, making algorithm validation and experimentation more convenient. By bringing deep learning to sign language recognition, this thesis offers a new approach that is both scalable and robust.
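The thesis does not include code, but the claim that global average pooling "greatly reduces the number of parameters" can be illustrated with a minimal NumPy sketch. All shapes below (64 channels, 7×7 maps, a 30-class output) are hypothetical stand-ins, not values from the thesis: the point is that pooling each feature map to its mean lets a C×classes classifier replace a (C·H·W)×classes fully connected layer.

```python
import numpy as np

def global_average_pooling(feature_maps):
    """Collapse each HxW feature map to its mean.
    feature_maps: array of shape (channels, H, W) -> (channels,)"""
    return feature_maps.mean(axis=(1, 2))

# Hypothetical final-layer shapes for illustration (not from the thesis):
channels, h, w, n_classes = 64, 7, 7, 30

# Flatten + fully connected classifier: (C*H*W) x classes weights
fc_params = channels * h * w * n_classes      # 94,080 weights
# GAP feeds C pooled values into a C x classes layer instead
gap_params = channels * n_classes             # 1,920 weights

maps = np.random.rand(channels, h, w)
pooled = global_average_pooling(maps)
assert pooled.shape == (channels,)
print(fc_params, gap_params)  # prints "94080 1920": a 49x reduction
```

Fewer classifier weights means less to overfit, which is consistent with the abstract's observation that SLR-CNN2 resists overfitting better.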
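The dynamic models pair a per-frame convolutional network with an LSTM over the frame sequence. As a simplified sketch of that data flow (a plain tanh recurrence stands in for the LSTM, a channel-mean stub stands in for the CNN, and all shapes and weights are hypothetical), each video is a (T, C, H, W) array of consecutive frames, exactly the kind of input the modified Caffe pipeline is described as accepting:

```python
import numpy as np

rng = np.random.default_rng(0)

def cnn_features(frame):
    """Stand-in for the per-frame CNN: here just a channel-wise mean.
    A real model would run convolution layers first. (C, H, W) -> (C,)"""
    return frame.mean(axis=(1, 2))

def rnn_over_frames(frames, W_h, W_x):
    """Minimal recurrent pass (simplified tanh RNN, not a full LSTM)
    over per-frame features; returns the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for frame in frames:
        x = cnn_features(frame)
        h = np.tanh(W_h @ h + W_x @ x)
    return h

T, C, H, W = 16, 8, 7, 7                       # 16 consecutive video frames
hidden = 32
frames = rng.random((T, C, H, W))
W_h = rng.standard_normal((hidden, hidden)) * 0.1   # hidden-to-hidden weights
W_x = rng.standard_normal((hidden, C)) * 0.1        # feature-to-hidden weights
h_T = rnn_over_frames(frames, W_h, W_x)
assert h_T.shape == (hidden,)                  # summary of the whole gesture
```

The final hidden state summarizes the whole gesture and would feed a softmax classifier; an LSTM adds gating on top of this recurrence to track longer gestures.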
【Degree-granting institution】: Jilin University
【Degree level】: Master's
【Year awarded】: 2017
【CLC number】: TP391.41