Human action recognition based on convolutional neural networks

Original title: 基于卷积神经网络的人体动作识别
Authors: YU Hua (于华); ZHI Min (智敏)
Keywords: convolutional neural network model algorithm; deformable part model algorithm; feature extraction; feature fusion; human action recognition
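Journal: Computer Engineering and Design (SJSJ)
Affiliation: College of Computer Science and Technology, Inner Mongolia Normal University
Publication date: 2019-04-16
Publisher: Computer Engineering and Design (计算机工程与设计)
Year: 2019
Issue: v.40; No.388
Funding: Natural Science Foundation of Inner Mongolia (2018MS06008); Inner Mongolia Normal University 2017 Graduate Research Innovation Fund (CXJJS17111)
Language: Chinese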
Page: SJSJ201904042
Page count: 6
CN: 04
ISSN: 11-1775/TP
Classification number: 268-273

Abstract

To address the low accuracy of human action recognition in complex scenes, a human action recognition algorithm is proposed that fuses an improved deformable part model (DPM) algorithm with a convolutional neural network (CNN) model. In the feature extraction stage, to improve human detection accuracy, the improved DPM algorithm increases the number of part filters from 5 to 8 and is combined with the branch and bound (BB) algorithm when locating the human body. The CNN extracts features with consecutive convolutional layers; the model used is a convolutional neural network trained with gradient optimization specifically for human action recognition. The two algorithms run in parallel. In the feature fusion stage, the features extracted by the two models are fused by weighted summation. A softmax classifier then classifies the human actions. Experimental results show that, on standard datasets and a self-collected dataset, the accuracy of the proposed algorithm is about 10 percentage points higher than that of traditional machine learning methods (the journal's English abstract states about 13%).
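
The fusion and classification steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the DPM and CNN branches each yield a feature vector of the same length, that the fusion weight `w_dpm` (with the CNN weight taken as `1 - w_dpm`) is a free parameter, and that a single linear layer feeds the softmax classifier. The function names, the weight value, and the toy numbers are all hypothetical, since the abstract gives no dimensions or weights.

```python
import math


def fuse_features(dpm_feat, cnn_feat, w_dpm=0.5):
    """Weighted sum of two equal-length feature vectors (assumed fusion rule)."""
    if len(dpm_feat) != len(cnn_feat):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= w_dpm <= 1.0:
        raise ValueError("w_dpm must lie in [0, 1]")
    w_cnn = 1.0 - w_dpm  # assumption: the two weights sum to 1
    return [w_dpm * d + w_cnn * c for d, c in zip(dpm_feat, cnn_feat)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Apply a hypothetical linear layer, then softmax; return (label, probabilities)."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


if __name__ == "__main__":
    # Toy 2-dimensional features and a 2-class identity classifier (illustrative only).
    fused = fuse_features([1.0, 0.0], [0.0, 1.0], w_dpm=0.25)
    label, probs = classify(fused, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
    print(fused, label, probs)
```

In practice the fusion weight would be tuned on validation data, and the two feature vectors would first need to be projected to a common dimension if the branches produce different sizes.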
