Multimodal Emotion Recognition for the Fusion of Speech and EEG Signals
  • Authors: MA Jianghe; SUN Ying; ZHANG Xueying (College of Information and Computer, Taiyuan University of Technology)
  • Keywords: speech signals; electroencephalograph (EEG) signals; feature fusion; decision fusion
  • Journal: Journal of Xidian University (西安电子科技大学学报)
  • Journal code: XDKD
  • Online publication date: 2018-07-27
  • Year: 2019
  • Volume: 46
  • Issue: 01
  • Pages: 149-156 (8 pages)
  • Record number: XDKD201901026
  • CN: 61-1076/TN
  • Funding: National Natural Science Foundation of China (61371193); Shanxi Province Youth Science and Technology Research Fund (2013021016-2)
  • Language: Chinese
Abstract

To construct an effective emotion recognition system, four emotions (joy, sadness, anger, and neutrality) were induced by sound stimulation, and the corresponding speech and EEG signals were collected. First, the nonlinear geometric features and nonlinear attribute features of the EEG and speech signals were extracted by phase space reconstruction and, combined with the basic features of each signal, used to perform emotion recognition on each modality separately. Then, a feature fusion algorithm based on the restricted Boltzmann machine was constructed to realize multimodal emotion recognition at the feature level. Finally, a multimodal emotion recognition system was built at the decision level using a quadratic decision algorithm. Experimental results show that the overall recognition rate of the feature-fusion system is 1.08% and 2.75% higher than that of speech signals alone and EEG signals alone, respectively, while the decision-fusion system improves on speech and EEG signals by 6.52% and 8.19%, respectively. The system based on decision fusion therefore outperforms the one based on feature fusion overall. Fusing emotional data from different sources, such as speech and EEG signals, thus yields a more effective emotion recognition system.
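The abstract's first step, phase space reconstruction, is standardly done with a Takens delay embedding. The paper's actual embedding dimension and delay are not given in this record, so the values below (`dim=3`, `tau=5`) and the sine-wave signal are purely illustrative; this is a minimal sketch of the embedding itself, from which nonlinear features would then be computed.

```python
import numpy as np

def delay_embed(x, dim=3, tau=2):
    """Reconstruct a phase-space trajectory from a 1-D signal x via
    Takens delay embedding: row t is [x[t], x[t+tau], ..., x[t+(dim-1)*tau]]."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau  # number of embedded points
    if n <= 0:
        raise ValueError("signal too short for the chosen dim/tau")
    return np.column_stack([x[j * tau : j * tau + n] for j in range(dim)])

# Toy example: a short sine wave stands in for one speech/EEG frame.
signal = np.sin(np.linspace(0, 4 * np.pi, 100))
traj = delay_embed(signal, dim=3, tau=5)
print(traj.shape)  # (90, 3)
```

Each row of `traj` is one point of the reconstructed attractor; nonlinear descriptors such as correlation dimension or the largest Lyapunov exponent are then estimated on this trajectory.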
