A Vision-Based Static Hand Gesture Recognition System
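Abstract
Human-computer interaction is becoming an increasingly important part of daily life. In recent years in particular, with the rapid development of computer technology, research into novel interaction techniques that match the way people naturally communicate has become very active and has made encouraging progress. This research includes face recognition, facial expression recognition, lip reading, head motion tracking, gaze tracking, hand gesture recognition, and body posture recognition.
     Hand gestures are a natural, intuitive, and easy-to-learn means of human-computer interaction. By input device, gesture recognition can be divided into data-glove-based recognition and vision-based recognition. Vision-based recognition uses the human hand directly as the computer's input device, so communication between person and machine no longer needs an intermediate medium, and users can simply define suitable gestures to control the machines around them. However, gestures are diverse and ambiguous and vary in time and space, the human hand is a complex deformable object, and vision itself is an ill-posed problem; vision-based gesture recognition is therefore a challenging, multidisciplinary research topic.
     This thesis designs and implements a vision-based static hand gesture recognition system. The system recognizes, in real time, 14 common static hand gestures captured by a camera and uses the recognition results to control a slide show. Its design criteria are, first, real-time performance and, second, accuracy. For gesture modeling, it adopts an appearance-based gesture model; for gesture analysis, image preprocessing and feature extraction yield eight gesture feature parameters; for gesture recognition, it uses two-stage classification (coarse classification followed by fine classification).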
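The two-stage classification named above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the abstract does not name the eight features or the classifiers, so the template values, the use of the first feature as the coarse criterion, the tolerance, and the nearest-template Euclidean fine stage are all assumptions made for the example.

```python
import math

# Hypothetical templates: (label, eight-element feature vector).
# The values are made up for illustration only.
TEMPLATES = [
    ("fist",  [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.2, 0.1]),
    ("palm",  [0.9, 0.8, 0.7, 0.9, 0.8, 0.9, 0.7, 0.8]),
    ("point", [0.2, 0.9, 0.1, 0.3, 0.8, 0.1, 0.2, 0.6]),
]


def coarse_candidates(features, templates, coarse_index=0, tolerance=0.2):
    """Coarse stage: keep templates whose coarse feature is near the input's.

    Using a single cheap feature as the filter is an assumption.
    """
    return [
        (label, ref)
        for label, ref in templates
        if abs(ref[coarse_index] - features[coarse_index]) <= tolerance
    ]


def fine_classify(features, candidates):
    """Fine stage: pick the nearest candidate by Euclidean distance."""
    if not candidates:
        return None  # nothing survived the coarse stage: reject the input
    label, _ = min(candidates, key=lambda item: math.dist(features, item[1]))
    return label


def classify(features, templates, coarse_index=0, tolerance=0.2):
    """Run the coarse filter, then the fine comparison on what remains."""
    candidates = coarse_candidates(features, templates, coarse_index, tolerance)
    return fine_classify(features, candidates)


sample = [0.15, 0.85, 0.1, 0.3, 0.75, 0.1, 0.2, 0.55]
result = classify(sample, TEMPLATES)
```

The coarse stage exists to shrink the candidate set cheaply, so the full eight-feature comparison runs on fewer templates. That split is one plausible way to serve the real-time and accuracy criteria together.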
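     The system is implemented in three parts. The gesture image preprocessing part segments the hand region from the environment using the skin-color characteristics of the human body, then obtains the gesture contour through image enhancement and Laplacian edge extraction. The gesture feature extraction part extracts the eight gesture feature parameters and assembles them into a feature vector. The real-time video processing part uses VFW (Video for Windows): a callback function processes the video stream from the camera, extracts individual static gesture images, and recognizes them in real time.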
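As a rough sketch of the preprocessing stage described above, the following pure-Python example segments skin-colored pixels and then marks the contour with a Laplacian response. The abstract does not give the skin-color model, so the RGB threshold rule used here is a commonly cited heuristic swapped in for illustration. The image enhancement step is omitted, and the 4-neighbour Laplacian kernel is one standard choice rather than necessarily the one used in the thesis.

```python
def is_skin(r, g, b):
    """Simple RGB skin heuristic (an assumption, not the thesis's model)."""
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15
        and abs(r - g) > 15 and r > g and r > b
    )


def skin_mask(image):
    """Turn rows of (r, g, b) tuples into rows of 0/1 skin flags."""
    return [[1 if is_skin(*pixel) else 0 for pixel in row] for row in image]


def laplacian_edges(mask):
    """Mark mask pixels whose 4-neighbour Laplacian response is non-zero.

    Pixels outside the image are treated as background (0).
    """
    height, width = len(mask), len(mask[0])

    def at(y, x):
        return mask[y][x] if 0 <= y < height and 0 <= x < width else 0

    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            response = (
                at(y - 1, x) + at(y + 1, x) + at(y, x - 1) + at(y, x + 1)
                - 4 * mask[y][x]
            )
            if mask[y][x] and response != 0:
                edges[y][x] = 1
    return edges


# Toy 5x5 frame: a 3x3 skin-colored block on a dark background.
SKIN, DARK = (200, 120, 90), (30, 30, 30)
image = [
    [SKIN if 1 <= y <= 3 and 1 <= x <= 3 else DARK for x in range(5)]
    for y in range(5)
]
mask = skin_mask(image)
edges = laplacian_edges(mask)
```

On this toy frame, only the outer ring of the skin block is marked as contour; the interior pixel has a zero Laplacian response and is left out.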
Human-computer interaction plays an increasingly important role in daily life. In recent years especially, with the development of computer science, research on new human-computer interaction technology has become extremely active, and progress has been achieved. This research includes face recognition, expression recognition, hand gesture recognition, pose recognition, and so on.
     Hand gestures are a natural and direct human-computer interaction method. There are two approaches to hand gesture recognition: recognition based on data gloves and recognition based on vision. With the hand taken directly as the input device, communication between human and computer needs no other intermediate media. Users can control the machines around them simply by signing with hand gestures they define themselves. However, gestures are diverse and ambiguous and differ across time and space; moreover, human hands are complex deformable objects and vision itself is ill-posed, all of which makes vision-based gesture recognition a challenging multidisciplinary research topic.
     This thesis implements a vision-based static hand gesture recognition system. The system can recognize 14 common static hand gestures input from a camera in real time and control PowerPoint with the recognition results. It is a real-time system, so both recognition time and recognition accuracy have to be considered in its design. For hand gesture modeling, the system adopts an appearance-based gesture model; for hand gesture analysis, it extracts eight features through image preprocessing and feature extraction; for recognition, it adopts two-stage classification (coarse classification and fine classification).
     The system consists of three parts. First, preprocessing of the original hand gesture image: in this part the hand area is extracted from the background
