Static Gesture Recognition Based on Hybrid Convolution Neural Network

Original title: 基于混合卷积神经网络的静态手势识别
Authors: SHI Yu-xin (石雨鑫); DENG Hong-min (邓洪敏); GUO Wei-lin (郭伟林)
Affiliation: College of Electronics and Information Engineering, Sichuan University (四川大学电子信息学院)
Keywords: Convolutional neural network; Random forest; Static gesture; Recognition
Journal code: JSJA
Journal: Computer Science (计算机科学)
Publication date: 2019-06-15
Publisher: Computer Science (计算机科学)
Year: 2019
Volume: 46
Funding: Supported by the National Natural Science Foundation of China (61174025)
Language: Chinese
File ID: JSJA2019S1034
Page count: 4
Issue: S1
CN: 50-1075/TP
Pages: 175-178

Abstract

Static gesture recognition has important application value in human-computer interaction, but complex backgrounds and the diversity of hand shapes reduce recognition accuracy. To improve accuracy, this paper proposes a recognition method based on a convolutional neural network (CNN) and a random forest (RF). The method first segments the hand from the static gesture image, then uses the convolutional network to extract feature vectors, and finally classifies these feature vectors with a random forest classifier. On the one hand, the CNN learns hierarchically and can capture more representative information from the image. On the other hand, the random forest selects samples and features at random and averages the results of its decision trees, which makes it less prone to overfitting. In experiments on a static gesture dataset, the proposed method recognizes static gestures effectively, with an average recognition rate of 94.56%. The method is further compared with two classical feature extraction methods, principal component analysis (PCA) and local binary patterns (LBP). Classification based on CNN-extracted feature vectors performs better: the recognition rate is 2.44% higher than that of the PCA-RF method and 1.74% higher than that of the LBP-RF method. Finally, on the classic MNIST dataset, the proposed method reaches a recognition rate of 97.9%, higher than the two traditional feature extraction methods.
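
The pipeline summarized in the abstract (convolutional feature extraction followed by a random forest classifier) can be illustrated with a toy sketch. This is not the authors' implementation, and every name in it is hypothetical. It assumes two hand-written 3x3 filters in place of a trained CNN, synthetic bar images in place of segmented gesture photos, binary labels, and depth-1 trees (stumps) in place of full decision trees.

```python
import random
from collections import Counter

# Hypothetical fixed 3x3 filters standing in for learned CNN kernels.
KERNELS = [
    [[1, 1, 1], [0, 0, 0], [-1, -1, -1]],  # responds to horizontal edges
    [[1, 0, -1], [1, 0, -1], [1, 0, -1]],  # responds to vertical edges
]

def conv_features(img, kernels=KERNELS):
    """One feature per kernel: valid convolution, ReLU, global max pooling."""
    feats = []
    for ker in kernels:
        k = len(ker)
        best = 0.0  # starting at 0 applies the ReLU floor
        for i in range(len(img) - k + 1):
            for j in range(len(img[0]) - k + 1):
                s = sum(ker[a][b] * img[i + a][j + b]
                        for a in range(k) for b in range(k))
                best = max(best, s)
        feats.append(best)
    return feats

def make_image(label, rng, size=6):
    """Synthetic stand-in for a segmented gesture: 0 = horizontal bar, 1 = vertical bar."""
    img = [[rng.uniform(0.0, 0.2) for _ in range(size)] for _ in range(size)]
    pos = rng.randrange(0, size - 2)
    for t in range(size):
        if label == 0:
            img[pos][t] = 1.0
        else:
            img[t][pos] = 1.0
    return img

def train_stump(X, y, feat):
    """Best threshold and polarity on one feature (a depth-1 decision tree)."""
    values = sorted({x[feat] for x in X})
    cuts = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    best = None
    for thr in cuts:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x[feat] <= thr else right) != t
                         for x, t in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, feat, thr, left, right)
    return best[1:]

def train_forest(X, y, n_trees=15, seed=0):
    """Each tree sees a bootstrap sample of rows and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        feat = rng.randrange(len(X[0]))
        forest.append(train_stump([X[i] for i in idx], [y[i] for i in idx], feat))
    return forest

def predict(forest, x):
    """Majority vote over the trees."""
    votes = Counter(left if x[feat] <= thr else right
                    for feat, thr, left, right in forest)
    return votes.most_common(1)[0][0]

rng = random.Random(42)
train = [(make_image(lbl, rng), lbl) for lbl in [0, 1] * 20]
test = [(make_image(lbl, rng), lbl) for lbl in [0, 1] * 10]
X_train = [conv_features(img) for img, _ in train]
y_train = [lbl for _, lbl in train]
forest = train_forest(X_train, y_train)
accuracy = sum(predict(forest, conv_features(img)) == lbl
               for img, lbl in test) / len(test)
print(f"toy accuracy: {accuracy:.2f}")
```

The two sources of randomness named in the abstract appear in `train_forest`: rows are resampled with replacement and each tree is restricted to a randomly chosen feature, so no single tree dominates the vote. The toy classes are cleanly separable, so the score printed here says nothing about the recognition rates reported in the paper.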
