Research on Hand Gesture Recognition Algorithms Based on Geometric Features

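Author: Yangqing He (何阳清)
Thesis level: Master's
Discipline: Computer Software and Theory
Keywords (translated from Chinese): hand gesture recognition ; key points ; geometric moments ; edge detection ; Euclidean distance
English keywords: Gesture Recognition ; Feature Pixel ; Invariant Moment ; Edge Detection ; Euclidean Distance
Degree year: 2004
Advisor: Yuan Ge (葛元)
Discipline code: 081202
Degree-granting institution: Shanghai Maritime University
Thesis submission date: 2004-06-01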

Abstract

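Hand gestures are a natural and intuitive mode of interpersonal communication in everyday life. As computer technology develops and human-computer interaction becomes increasingly human-centered, gesture recognition has become an active research topic. However, gestures are diverse and ambiguous and vary in time and space, the human hand is a complex deformable object, and vision itself is an ill-posed problem, so vision-based gesture recognition is a highly challenging interdisciplinary research problem. Gestures are divided into dynamic and static gestures: a dynamic gesture is defined by the trajectory of the hand's motion, whereas a static gesture conveys meaning through the shape of the hand. This thesis, carried out as part of the project "Gesture Recognition and Synthesis Algorithms" funded by the Shanghai Natural Science Foundation, studies algorithms for static gesture recognition.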
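     Gesture recognition is roughly divided into three parts: image preprocessing, feature extraction, and recognition. In preprocessing, the standardized gesture images (128*128-pixel grayscale images in BMP format) are smoothed by local averaging as needed, sharpened with the Laplacian operator, and binarized with the maximum variance method; finally, the contour is extracted from the binary image by an eight-direction neighborhood search.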
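The preprocessing chain described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the 3x3 averaging window, the 4-neighbour Laplacian mask, the unchanged border pixels, and all function names are hypothetical, and the "maximum variance method" is read here as Otsu's between-class variance criterion. Contour tracing is omitted.

```python
def mean_filter3(img):
    """3x3 local averaging; border pixels are copied unchanged (assumption)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            total = sum(img[y + dy][x + dx]
                        for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = total // 9
    return out


def laplacian_sharpen(img):
    """Sharpen as g = f - laplacian(f), clipped to the range 0..255."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x])
            out[y][x] = max(0, min(255, img[y][x] - lap))
    return out


def otsu_threshold(img):
    """Threshold t that maximises between-class variance (<= t versus > t)."""
    hist = [0] * 256
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * c for i, c in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(img, t):
    """Map pixels above the threshold to 1 and the rest to 0."""
    return [[1 if v > t else 0 for v in row] for row in img]
```

A typical call order would be `binarize(pre, otsu_threshold(pre))` with `pre = laplacian_sharpen(mean_filter3(img))`, where `img` is a list of rows of integer gray levels in 0..255.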
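     In feature extraction and recognition, this thesis proposes two methods based on geometric features of gesture images: a recognition algorithm that extracts key points with HDC, and a recognition algorithm that combines geometric moments with Canny edge detection. The first algorithm recognizes gestures from the key points of the gesture contour curve. After the gesture image is binarized, its contour is extracted and treated as a curve. Applying the principle of hierarchical discrete correlation (HDC), the curve is smoothed repeatedly with a kernel to obtain its scale space, and the key points of the contour are found by tracking the motion of the curve through scale space. Recognition is then performed with the minimum distance method. The second algorithm combines geometric moments and edge detection. After binarization, the geometric moment features of the gesture image are extracted, and four of the seven moment components are taken to form the gesture's moment feature vector. Edges are detected directly on the grayscale image, and a histogram represents the edge direction features. Finally, the distance between images is computed with weights assigned to the two features, and 30 letter gestures are recognized.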
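The scale-space idea behind the HDC key point algorithm can be illustrated with a short sketch. The abstract does not give the kernel or the tracking rule, so everything specific here is an assumption: the 5-tap kernel weights, the doubling of tap spacing at each level (as in Burt-style hierarchical discrete correlation), and the rule that key points are the samples that move farthest between the original and the smoothed curve. The function names are hypothetical.

```python
import math

# Assumed symmetric 5-tap kernel; the weights sum to 1.
KERNEL = (0.05, 0.25, 0.4, 0.25, 0.05)


def hdc_level(points, level):
    """One smoothing pass over a closed curve; taps are 2**level samples apart."""
    n = len(points)
    step = 2 ** level
    out = []
    for i in range(n):
        x = sum(w * points[(i + (k - 2) * step) % n][0]
                for k, w in enumerate(KERNEL))
        y = sum(w * points[(i + (k - 2) * step) % n][1]
                for k, w in enumerate(KERNEL))
        out.append((x, y))
    return out


def scale_space(points, levels):
    """Return [curve at level 0, ..., curve after `levels` smoothing passes]."""
    space = [list(points)]
    for level in range(levels):
        space.append(hdc_level(space[-1], level))
    return space


def keypoints(points, levels=3, count=4):
    """Indices of the `count` strongest local maxima of displacement
    between the original curve and its coarsest smoothed version."""
    space = scale_space(points, levels)
    n = len(points)
    move = [math.hypot(space[-1][i][0] - space[0][i][0],
                       space[-1][i][1] - space[0][i][1]) for i in range(n)]
    peaks = [i for i in range(n)
             if move[i] > move[i - 1] and move[i] >= move[(i + 1) % n]]
    peaks.sort(key=lambda i: move[i], reverse=True)
    return sorted(peaks[:count])
```

Samples on straight stretches barely move under a symmetric kernel, while samples at sharp turns such as fingertips are pulled inward, which is why displacement is used here as a stand-in for saliency. Recognition by minimum distance would then compare the key point sets of a query contour and each template.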
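     Finally, in experiments on 30 gestures, the HDC key point algorithm achieved a recognition rate of 83.3%. The algorithm combining geometric moments with Canny edge detection draws on the strengths of both image features and achieved a recognition rate of 91.3% in experiments.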
Hand gestures are a natural and intuitive mode of human communication. With the development of computer technology, HCI (Human-Computer Interaction) is advancing and becoming human-centered, so a growing number of researchers are studying hand gesture recognition. However, vision-based recognition of hand gestures is an extremely challenging interdisciplinary problem for three reasons: (1) hand gestures are diverse and ambiguous and vary in space and time; (2) human hands are complex non-rigid objects; (3) computer vision itself is an ill-posed problem. Hand gestures include dynamic gestures, whose meaning is carried by the trajectory of the hand's motion, and static gestures, whose meaning is expressed by the shape of the hand. This thesis, part of the research project "Algorithms for Hand Gesture Recognition and Synthesis" supported by the Shanghai Natural Science Foundation, studies static hand gesture recognition.
    Hand gesture recognition consists of three parts: preprocessing of hand gesture images, feature extraction, and recognition. During preprocessing, smoothing and then sharpening are applied to standardized hand gesture images (128*128-pixel grayscale BMP images). Finally, the image is binarized and, when necessary, the contour is extracted by 8-connected boundary tracking.
    In the feature extraction and recognition part, this thesis presents two methods based on geometric features: the Algorithm Using HDC for Feature Pixels and the Algorithm Based on Invariant Moments and Edge Detection. In the former, the contour of the hand gesture is extracted after preprocessing and regarded as a curve. A scale space of the curve is then created by applying hierarchical discrete correlation (HDC). A new method based on the motion of the curve through scale space is proposed for feature detection. Finally, gesture patterns are recognized by the minimum distance between feature pixels. The latter algorithm combines two features, invariant moments and edge detection. After preprocessing, a binary image is obtained and 4 of the 7 invariant moments are extracted. Edge detection yields a histogram that describes the edge information. Finally, 30 letter gestures are recognized by computing a distance in which the two features are given different weights.
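The combination of moment features and an edge histogram can be sketched as follows. The abstract does not say which 4 of the 7 invariant moments are used or how the weights are set, so this sketch assumes the first four Hu invariants, an equal default weight, and Euclidean distance on each feature; the edge direction histogram is taken as a precomputed input rather than derived from a Canny detector. All function names are hypothetical.

```python
import math


def hu_first_four(binary):
    """First four Hu invariants of a binary image given as rows of 0/1."""
    pts = [(x, y) for y, row in enumerate(binary)
           for x, v in enumerate(row) if v]
    m00 = len(pts)  # zeroth moment = number of foreground pixels
    cx = sum(x for x, _ in pts) / m00
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        # Normalised central moment: mu_pq / m00 ** (1 + (p + q) / 2)
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        return mu / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return (
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
    )


def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def combined_distance(moments_a, hist_a, moments_b, hist_b, w_moment=0.5):
    """Weighted sum of the two feature distances; the weight is an assumption."""
    return (w_moment * euclidean(moments_a, moments_b)
            + (1 - w_moment) * euclidean(hist_a, hist_b))


def classify(query, templates, w_moment=0.5):
    """templates maps label -> (moments, hist); returns the nearest label."""
    return min(templates,
               key=lambda k: combined_distance(*query, *templates[k], w_moment))
```

In practice the two features live on very different numeric scales, so the weight (or a prior normalisation of each feature) has a large effect on which feature dominates the decision.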
    The Algorithm Using HDC for Feature Pixels achieves a recognition rate of 83.3% on 30 hand gestures. The Algorithm Based on Invariant Moments and Edge Detection achieves a recognition rate of 91.3%.
    Yangqing He (Computer Software and Theory)
    Directed by: Yuan Ge

