Abstract
A deep learning network architecture is proposed that combines a convolutional neural network (CNN) with a long short-term memory (LSTM) neural network. A feature-fusion approach is adopted: the convolutional network extracts shallow and deep features, which are concatenated and then fused by convolution, and the resulting feature vectors are fed into the LSTM units. Separate networks are trained on optical-flow data and on red-green-blue (RGB) data, and the outputs of the networks are combined by weighted fusion. Experimental results show that the proposed model effectively improves the accuracy of behavior recognition.
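The two fusion steps named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. Everything concrete here is an assumption made for illustration: the function names (`concat_channels`, `pointwise_conv`, `fuse_scores`, `predict`), the use of a 1x1 convolution as the fusing convolution, the flat list-of-channels layout, and the stream weight `rgb_weight`. The abstract specifies neither the kernel size nor the fusion weights.

```python
from typing import List, Sequence


def concat_channels(shallow: Sequence[Sequence[float]],
                    deep: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate two feature maps along the channel axis.

    Each map is a list of channels; each channel is a flat list of spatial
    values. Both maps must have the same number of spatial positions.
    """
    if len(shallow[0]) != len(deep[0]):
        raise ValueError("spatial sizes must match before concatenation")
    return [list(ch) for ch in shallow] + [list(ch) for ch in deep]


def pointwise_conv(features: Sequence[Sequence[float]],
                   kernel: Sequence[Sequence[float]]) -> List[List[float]]:
    """Apply a 1x1 convolution (assumed form of the fusing convolution).

    kernel[o][c] is the weight from input channel c to output channel o, so
    each output value is a weighted sum of the input channels at one position.
    """
    n_pos = len(features[0])
    return [
        [sum(w * ch[p] for w, ch in zip(row, features)) for p in range(n_pos)]
        for row in kernel
    ]


def fuse_scores(rgb_scores: Sequence[float],
                flow_scores: Sequence[float],
                rgb_weight: float = 0.5) -> List[float]:
    """Weighted fusion of per-class scores from the RGB and optical-flow streams."""
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    if len(rgb_scores) != len(flow_scores):
        raise ValueError("both streams must score the same classes")
    flow_weight = 1.0 - rgb_weight
    return [rgb_weight * r + flow_weight * f
            for r, f in zip(rgb_scores, flow_scores)]


def predict(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring class."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Toy feature maps: 1 shallow channel and 2 deep channels, 2 positions each.
shallow_map = [[1.0, 2.0]]
deep_map = [[3.0, 4.0], [5.0, 6.0]]
stacked = concat_channels(shallow_map, deep_map)   # 3 channels
fused_map = pointwise_conv(stacked, [[1.0, 0.0, 1.0],
                                     [0.0, 0.5, 0.0]])  # 2 channels

# Toy class scores from the two separately trained streams.
fused_scores = fuse_scores([0.6, 0.4], [0.2, 0.8], rgb_weight=0.4)
predicted_class = predict(fused_scores)
```

In a real model the fused feature map would be flattened per frame and passed to the LSTM, and the stream weight would be chosen on validation data. Both of those steps are left out of this sketch.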