Research on Signer Adaptation in Chinese Sign Language Recognition
Abstract
With the invention of the computer, pattern recognition emerged and developed. As a hot topic in pattern recognition research, sign language recognition has attracted increasing attention from researchers in recent years. Sign language recognition is the process of automatically converting sign language signals into text or speech by computer. It has important practical social significance and great theoretical research value. First, sign language recognition can build a bridge between deaf people and the hearing population, thereby promoting the harmonious development of society. Second, sign language is a relatively standardized set of gestures; compared with other gesture analysis problems, sign language recognition is relatively simple and can serve as a preliminary platform for more general gesture analysis research. Finally, sign language recognition involves computer vision, pattern recognition, machine learning, intelligent human-computer interfaces, and other research fields, so its study helps to advance research on similar problems in those fields.
After years of accumulated research, sign language recognition has achieved very good results in the signer-dependent setting. However, when a test signer's signing style differs substantially from that of everyone in the training set, system performance degrades markedly. Collecting enough training data to train signer-independent models can partially solve this problem. However, because the same sign varies considerably across signers, model training does not converge easily. Moreover, the parameter distributions of signer-independent models are relatively flat: they achieve reasonably good recognition results for most test signers, but for a specific user their performance falls clearly short of signer-dependent models. Adaptive sign language recognition uses data from a new user to revise the parameters of the signer-independent models so that the models better fit that user. This approach accords with the way humans come to know things, proceeding from the general to the specific.
This dissertation studies the adaptation problem in sign language recognition. Depending on whether the class labels of the adaptation data are known, adaptation is divided into supervised adaptation and unsupervised adaptation. Supervised adaptation requires the class labels of the adaptation data, so the user must collect adaptation data explicitly. Because explicit data collection requires the user's participation, it harms the usability of the system. The core problem of supervised adaptation is therefore how to revise the model parameters with as little adaptation data as possible. Unsupervised adaptation does not require the class labels of the adaptation data, so the data can be collected automatically while the user operates the system, without the user's involvement. However, unlabeled data must be labeled before use, to determine their classes. The core problem of unsupervised adaptation is therefore how to effectively exploit large amounts of unlabeled data to revise the model parameters.
For the supervised adaptation problem, two signer adaptation methods for sign language recognition are proposed: one based on basic-unit extraction, and one based on exemplar mean selection and Maximum A Posteriori / Iterative Vector Field Smoothing (MAP/IVFS). Because etyma-based sign language recognition achieves results comparable to word-based recognition, this dissertation proposes an etyma-based signer adaptation method. Experimental results show that, compared with the word-based method, the etyma-based adaptation method essentially preserves the original recognition rate while greatly reducing the amount of adaptation data that must be collected. Further, by analyzing the multi-stream nature of Chinese Sign Language and the similarity of segments shared across words, the model means can be clustered to obtain a lower-level coding of sign words. Based on this coding, samples of some sign words can be used to generate samples for the other words in the vocabulary; adapting the models with these generated samples improves the recognition rate. Experimental results show that this method further reduces the amount of adaptation data that must be collected. To reduce the adaptation data still further, a signer adaptation method based on exemplar mean selection and MAP/IVFS is proposed. By clustering the mean vectors of the sign-word models, a subset of exemplar mean vectors is extracted, which in turn yields a subset of exemplar sign words that captures the new user's individual characteristics. Only the new user's data for the words in this subset need be collected to adapt the corresponding models; the parameters of the unadapted models are then estimated from the adapted models and the correlations between models.
Although supervised adaptation can revise the model parameters with relatively little data, the explicit data collection process remains indispensable. Unsupervised adaptation can revise the model parameters with implicitly collected adaptation data. For the unsupervised adaptation problem, two methods are proposed: an unsupervised adaptation method that combines the Simplified Polynomial Segment Model (SPSM) with the Hidden Markov Model (HMM), and an unsupervised adaptation method based on hypothesis-comparison guided cross-validation. HMMs are well suited to sign words with clear state transitions, but describe gradually changing sign words without clear state transitions poorly, because HMMs assume that the frames within a state are independently and identically distributed. SPSMs can model the correlation between frames and are therefore suited to this other class of sign words. Combining SPSMs and HMMs to label the unlabeled data raises labeling accuracy and thus improves unsupervised adaptation performance. In traditional self-teaching adaptation, the models used to label the unlabeled data and the models to be adapted are the same, which leads to error accumulation and over-adaptation. The cross-validation-based unsupervised adaptation method separates the labeling models from the models to be adapted, avoiding error accumulation and over-adaptation. Introducing hypothesis comparison further raises labeling accuracy and improves adaptation performance.
Through its in-depth investigation of the adaptation problem, this dissertation lays the necessary groundwork for bringing sign language recognition systems into practical use, and also offers a reference for solving adaptation problems in other fields.
With the invention of the computer, pattern recognition appeared and blossomed. As a hot research area in pattern recognition, sign language recognition has received much attention from researchers. Sign language recognition is the automatic transcription of sign language into text or speech by computers. It is of great value both for real application in society and for theoretical research. First, sign language recognition can build a bridge between the hearing impaired and the hearing society, which promotes the harmonious development of society. Second, sign language is a relatively structured type of signing; compared to other types of sign analysis, recognition of sign language is relatively simple, so it can serve as a test bed for more general research on sign analysis. Moreover, sign language recognition is related to the areas of computer vision, pattern recognition, machine learning, intelligent human-computer interaction, etc., and it can help to solve similar problems in these areas.
After many years of research, signer-dependent sign language recognition systems have achieved good performance. However, performance decreases drastically when the test signer is not registered in the training data set. Collecting enough data from different signers to train signer-independent models can solve this problem to some extent. Nevertheless, such models are difficult to converge because data from different signers are extraordinarily diverse. Moreover, the parameter distributions of signer-independent models are flat: acceptable performance can be achieved on many signers, but for a specific test signer the performance of signer-independent models is not as good as that of signer-dependent models. Adaptive sign language recognition utilizes data from a new signer to tailor the initial models, and the tailored models can better fit the new signer. The method accords with the mechanism by which people perceive the world, from generality to specialization.
This dissertation aims to solve the problem of signer adaptation. According to the availability of the adaptation data's labels, signer adaptation can be classified into supervised signer adaptation and unsupervised signer adaptation. For supervised signer adaptation, the labels of the adaptation data are needed, so the data collection process is explicit. Explicit data collection needs the intervention of the user, which is not acceptable to some users. Therefore the core problem in supervised signer adaptation is to modify the parameters using as little data as possible. For unsupervised signer adaptation, the labels of the adaptation data are not needed, so the data can be collected implicitly while the user is operating the system. However, the data must be labeled before they are used for adaptation. Therefore the core problem in unsupervised signer adaptation is to modify the parameters effectively with huge amounts of unlabeled data.
For supervised signer adaptation, an adaptive sign language recognition method based on basic-unit extraction and one based on exemplar extraction and MAP/IVFS are proposed. Inspired by the fact that sign language recognition based on etyma can achieve results comparable to recognition based on words, we propose signer adaptation based on etyma. Experimental results showed that signer adaptation based on etyma could both preserve the recognition rate and greatly reduce the amount of adaptation data. Furthermore, different Chinese Sign Language (CSL) words contain similar segments. By clustering the mean vectors, the CSL words can be coded. Using representative words' samples, virtual samples for the whole vocabulary can be generated; using these data, the models can be adapted, and the adapted models achieve higher performance. Experimental results showed that the amount of adaptation data can be decreased further. To reduce the amount of adaptation data still further, signer adaptation based on exemplar extraction and MAP/IVFS is proposed. A mean-vector subset can be selected by clustering, and the corresponding word subset can be formed directly. The word subset represents the new signer's signing characteristics. Using the samples of the words in the subset, the corresponding models can be adapted. The other models can be estimated using the adapted models and the correlation among the models.
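To make the exemplar-based step concrete, the following is a minimal sketch, not the dissertation's actual implementation: sign-word model mean vectors are clustered with k-means to pick an exemplar word subset, and a selected model's Gaussian mean is then revised with the textbook MAP update mu_hat = (tau * mu0 + sum_t gamma_t * x_t) / (tau + sum_t gamma_t). The function names, the use of k-means, and the prior weight tau are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans

    def select_exemplar_words(model_means, n_exemplars=20):
        # Cluster the sign-word mean vectors; the word nearest each cluster
        # centroid is taken as an exemplar whose data the new signer records.
        km = KMeans(n_clusters=n_exemplars, n_init=10).fit(model_means)
        nearest = [int(np.argmin(np.linalg.norm(model_means - c, axis=1)))
                   for c in km.cluster_centers_]
        return sorted(set(nearest))

    def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
        # Textbook MAP re-estimation of a Gaussian mean:
        #   mu_hat = (tau * mu0 + sum_t gamma_t * x_t) / (tau + sum_t gamma_t)
        # tau is an assumed prior weight; posteriors are per-frame
        # state-occupancy probabilities for the new signer's frames.
        gamma_sum = posteriors.sum()
        weighted_sum = (posteriors[:, None] * frames).sum(axis=0)
        return (tau * prior_mean + weighted_sum) / (tau + gamma_sum)

With little adaptation data the update stays close to the signer-independent prior mean, and it moves toward the new signer's data as more frames accumulate, which is the behavior the abstract attributes to MAP adaptation.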
Though supervised signer adaptation can adapt the models with a small amount of data, the explicit data collection process is indispensable. Unsupervised signer adaptation, however, can collect the adaptation data implicitly. For unsupervised signer adaptation, a method combining the simplified polynomial segment model (SPSM) and the hidden Markov model (HMM) and a method based on hypothesis-comparison guided cross-validation (HC-CV) are proposed. HMMs are suitable for modeling words that have obvious state transitions, and unsuitable for words that have no obvious state transitions and change gradually frame by frame; this is because HMMs assume the observations in the same state are independently and identically distributed. SPSMs are suitable for modeling the other type of words, in that SPSMs can model the correlation between frames. Combining SPSMs and HMMs to label the unlabeled data decreases the noise rate of the adaptation data set, which improves the adaptation. In conventional self-teaching adaptation, the model set used to label the unlabeled data and the model set to be adapted are the same, which leads to error reinforcement and overfitting. Introducing cross-validation into unsupervised adaptation relieves these problems. Applying hypothesis comparison further improves the labeling accuracy, and in this way the adaptation becomes more effective.
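The separation of labeling models from adapted models can be outlined as follows. This is a rough sketch of the general cross-validation self-training scheme under stated assumptions, not the exact HC-CV algorithm; label and adapt are hypothetical helpers standing in for hypothesis generation (recognition) and parameter re-estimation.

    import numpy as np

    def cv_unsupervised_adapt(base_models, unlabeled, k_folds, label, adapt):
        # label(models, data)         -> hypothesized labels   (assumed helper)
        # adapt(models, data, labels) -> adapted model set     (assumed helper)
        folds = np.array_split(np.arange(len(unlabeled)), k_folds)
        # Adapt one model set per fold, using only the OTHER folds,
        # labeled by the unadapted base models.
        fold_models = []
        for i in range(k_folds):
            rest = np.concatenate([folds[j] for j in range(k_folds) if j != i])
            data = [unlabeled[t] for t in rest]
            fold_models.append(adapt(base_models, data, label(base_models, data)))
        # Each fold is then labeled by the model set adapted WITHOUT it,
        # so no model ever labels the data it was adapted on, which is
        # what breaks the error-reinforcement loop of plain self-teaching.
        labels = [None] * len(unlabeled)
        for i in range(k_folds):
            data = [unlabeled[t] for t in folds[i]]
            for t, hyp in zip(folds[i], label(fold_models[i], data)):
                labels[t] = hyp
        return adapt(base_models, unlabeled, labels)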
By solving the signer adaptation problem in CSL recognition, this work provides the necessary preparation for applying CSL recognition systems in daily life. Moreover, signer adaptation in CSL recognition can help to solve adaptation problems in other research areas.
