Dual-Modal Emotion Recognition Based on Facial Expression and Body Posture in Video Sequences
  • Authors: Jiang Mingxing (姜明星); Hu Min (胡敏); Wang Xiaohua (王晓华); Ren Fuji (任福继); Wang Haowen (王浩文)
  • Keywords: image processing; facial expression; body posture; spatiotemporal local ternary orientation pattern; cloud model
  • Journal: Laser & Optoelectronics Progress (激光与光电子学进展), CNKI journal code JGDJ
  • Affiliations: Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, School of Computer and Information, Hefei University of Technology; Information and Service Department, Anhui Institute of International Business; Graduate School of Advanced Technology & Science, University of Tokushima
  • Publication Date: 2018-01-16
  • Year: 2018
  • Volume/Issue: Vol. 55, No. 630 (Issue 07)
  • Funding: National Natural Science Foundation of China (61672202, 61502141); NSFC-Shenzhen Joint Fund (U1613217); Domestic and Overseas Visiting Study Program for Outstanding Young Backbone Talents in Universities (gxfx2017189); Natural Science Research Project of Anhui Universities (KJ2016A126)
  • Language: Chinese
  • Pages: 167-174 (8 pages)
  • CN: 31-1690/TN
  • CNKI Record ID: JGDJ201807020
Abstract

To address the feature sparseness and noise sensitivity that arise when the spatiotemporal local orientation pattern is applied to video emotion recognition, a new feature extraction algorithm, the spatiotemporal local ternary orientation pattern (SLTOP), is proposed. Considering the complementarity of facial expression and body posture features, a classification method based on cloud-weighted decision fusion is also proposed. The video frames are first preprocessed to obtain separate expression and posture sequences. SLTOP features are then extracted from each sequence, and the idea of the gray-level co-occurrence matrix is borrowed to alleviate the sparseness of the feature histograms. In the decision stage, a cloud model is introduced to perform cloud-weighted decision fusion of the expression and posture modalities, yielding the final dual-modal emotion recognition result. On the FABO database, the single-modality average recognition rates reach 92.21% for expression and 96.76% for posture. Compared with volume local binary patterns (VLBP), local binary patterns on three orthogonal planes (LBP-TOP), and the temporal-spatial local ternary pattern moment (TSLTPM), the proposed method is higher by about 18.42%, 22.01%, and 9.15% on the expression modality, and by about 26.59%, 29.53%, and 1.98% on the posture modality, respectively. Cloud-weighted fusion raises the average recognition rate to 97.54%, higher than all other results in the experiments. The proposed SLTOP is robust to noise and illumination changes, and the cloud-model-based weighted decision fusion exploits the complementary strengths of the expression and posture classifiers, producing better recognition results than the other classification methods in the comparison experiments.
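The abstract does not give the exact SLTOP coding rule, but its ternary building block, thresholding each neighbor against the center pixel with a tolerance band instead of a hard binary cut (the local ternary pattern idea), can be sketched as follows. The 3x3 neighborhood, the threshold t, and the split into upper/lower binary histograms are illustrative assumptions, not the parameters used in the paper.

import numpy as np

def local_ternary_code(patch, t=5):
    """Ternary coding of a 3x3 patch around its center pixel.

    Each neighbor is coded +1 / 0 / -1 depending on whether it is more
    than t above, within t of, or more than t below the center; the
    tolerance band is what makes the ternary code less noise-sensitive
    than a plain binary (LBP-style) code.  The threshold t and the 3x3
    neighborhood are illustrative choices, not the paper's parameters.
    """
    c = int(patch[1, 1])
    neighbors = np.array([patch[0, 0], patch[0, 1], patch[0, 2],
                          patch[1, 2], patch[2, 2], patch[2, 1],
                          patch[2, 0], patch[1, 0]], dtype=np.int32)
    codes = np.zeros(8, dtype=np.int8)
    codes[neighbors >= c + t] = 1
    codes[neighbors <= c - t] = -1
    return codes

def ternary_histograms(frame, t=5):
    """Split the ternary codes into an 'upper' and a 'lower' binary
    pattern (the usual LTP trick) and histogram each over the frame."""
    h, w = frame.shape
    upper_hist = np.zeros(256, dtype=np.int64)
    lower_hist = np.zeros(256, dtype=np.int64)
    weights = 2 ** np.arange(8)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            codes = local_ternary_code(frame[i - 1:i + 2, j - 1:j + 2], t)
            upper_hist[int(np.dot(codes == 1, weights))] += 1
            lower_hist[int(np.dot(codes == -1, weights))] += 1
    return upper_hist, lower_hist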
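The cloud-weighted fusion stage is likewise only summarized in the abstract. A hedged reconstruction of the idea is sketched below: a forward normal cloud model (expectation Ex, entropy En, hyper-entropy He, which would be estimated from each modality's training scores) gives a certainty degree for each classifier's output, and those certainties weight the per-class scores before the final decision. The cloud parameters, the use of the top score, and the normalization are illustrative assumptions, not the paper's exact rule.

import numpy as np

def cloud_certainty(x, Ex, En, He, n_drops=1000, seed=0):
    """Average certainty degree of value x under a normal cloud model
    (expectation Ex, entropy En, hyper-entropy He).

    Following the forward normal cloud generator, the entropy is
    perturbed by the hyper-entropy for every drop, and the certainty of
    x is exp(-(x - Ex)^2 / (2 * En'^2)) averaged over the drops.  The
    parameters here are placeholders, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    En_prime = np.clip(np.abs(rng.normal(En, He, size=n_drops)), 1e-6, None)
    return float(np.mean(np.exp(-(x - Ex) ** 2 / (2.0 * En_prime ** 2))))

def cloud_weighted_fusion(face_scores, posture_scores, face_cloud, posture_cloud):
    """Fuse per-class scores of the expression and posture classifiers.

    Each modality's weight is the cloud certainty of its own top score,
    so the more trustworthy classifier dominates the fused decision.
    This is a hedged reconstruction of the idea, not the exact rule
    used in the paper.
    """
    w_face = cloud_certainty(np.max(face_scores), *face_cloud)
    w_posture = cloud_certainty(np.max(posture_scores), *posture_cloud)
    total = w_face + w_posture
    if total == 0:
        w_face = w_posture = 0.5
    else:
        w_face, w_posture = w_face / total, w_posture / total
    fused = w_face * np.asarray(face_scores) + w_posture * np.asarray(posture_scores)
    return int(np.argmax(fused)), fused

# Example with made-up scores and cloud parameters (Ex, En, He):
# label, fused = cloud_weighted_fusion([0.7, 0.2, 0.1], [0.5, 0.4, 0.1],
#                                      (0.8, 0.1, 0.02), (0.9, 0.05, 0.01))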
