Multi-view facial expression recognition based on an improved convolutional neural network

Original title (Chinese): 基于改进卷积神经网络的多视角人脸表情识别
Authors: QIAN Yongsheng (钱勇生); SHAO Jie (邵洁); JI Xinxin (季欣欣); LI Xiaorui (李晓瑞); MO Chen (莫晨); CHENG Qiyu (程其玉)
Author affiliation: College of Electronics and Information Engineering, Shanghai University of Electric Power
Keywords: multi-view facial expression recognition; Multi-View Facial Expression Lightweight Network (MVFE-LightNet); residual network; depthwise separable convolution; Squeeze-and-Excitation block; spatial pyramid pooling
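Journal name (Chinese): JSGG
Journal name (English): Computer Engineering and Applications
Institution (Chinese): 上海电力学院电子与信息工程学院
Publication date: 2018-12-15
Publisher: 计算机工程与应用 (Computer Engineering and Applications)
Year: 2018
Issue: v.54; No.919
Funding: National Natural Science Foundation of China, Young Scientists Fund (No.61302151, No.61401268); Natural Science Foundation of Shanghai (No.15ZR1418400)
Language: Chinese
Page: JSGG201824003
Page count: 8
CN: 24
Classification number: 17-24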

Abstract

Facial expression recognition is an active research topic in computer vision. To address the multi-view variation and loss of facial information that occur in faces captured under natural conditions, a multi-view facial expression recognition method based on the Multi-View Facial Expression Lightweight Network (MVFE-LightNet) is proposed. First, a convolutional network built on the residual network is designed to extract expression features from different views, and depthwise separable convolutions are introduced to reduce the number of network parameters. Second, a Squeeze-and-Excitation block is embedded to learn feature weights, using feature recalibration to improve the representational power of the network, and spatial pyramid pooling is added to make the network more robust. Finally, to further improve the recognition results, the AdamW (Adam with Weight decay) optimization method is used to accelerate convergence of the network model. Experiments on the RaFD, BU-3DFE and Fer2013 expression databases show that the method achieves a high recognition rate while reducing network computation time.
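
Three of the components named in the abstract can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: the function names, layer sizes, pyramid levels and hyperparameter defaults are all hypothetical, chosen only to show why depthwise separable convolution reduces parameters, why spatial pyramid pooling yields a fixed-length feature, and how decoupled weight decay enters an Adam-style update.

```python
import math


def standard_conv_params(k, c_in, c_out):
    """Weight count of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weight count of a k x k depthwise conv plus a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def spp_output_length(channels, levels=(1, 2, 4)):
    """Feature length after pooling each channel into n x n bins per level.

    The result does not depend on the input height or width.
    """
    return channels * sum(n * n for n in levels)


def adamw_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=1e-2):
    """One scalar Adam update with decoupled weight decay (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    # Weight decay is applied to w directly, not folded into the gradient.
    w = w - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * w)
    return w, m, v


# Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
full = standard_conv_params(3, 64, 128)
light = depthwise_separable_params(3, 64, 128)
print("standard:", full, "separable:", light, "ratio:", light / full)
print("SPP feature length for 256 channels:", spp_output_length(256))
print("AdamW step:", adamw_step(1.0, 0.5, 0.0, 0.0, 1))
```

The parameter ratio of the separable to the standard layer equals 1/c_out + 1/k², so the saving grows with the kernel size and the number of output channels.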

