Emotion Classification Algorithm Based on Emotion-specific Word Embedding (基于情绪特定词向量的情绪分类算法)
  • Authors: ZHANG Lu (张璐); SHEN Chen-lin (沈忱林); LI Shou-shan (李寿山)
  • Affiliation: School of Computer Science & Technology, Soochow University (苏州大学计算机科学与技术学院)
  • Keywords: Sentiment analysis; Emotion classification; Word embedding
  • Journal: Computer Science (计算机科学); journal code: JSJA
  • Publication date: 2019-06-15
  • Year: 2019
  • Volume/Issue: Vol. 46, Supplement S1
  • Pages: 103-107 (5 pages)
  • CN: 50-1075/TP
  • Funding: Supported by the National Natural Science Foundation of China (61331011, 61375073)
  • Language: Chinese
  • Record ID: JSJA2019S1019
Abstract
Emotion analysis is a hot research topic in natural language processing (NLP); it infers people's subjective feelings by analyzing the text they publish. Emotion classification is a fundamental task in emotion analysis, which aims to determine the emotion category of a piece of text. The representation of words is a critical prerequisite for emotion classification. Many existing word embedding algorithms model only the contextual (syntactic and semantic) information of words and ignore the emotion information associated with them; as a result, words with opposite emotions but similar contexts tend to be mapped to similar vectors. To address this problem, this paper learns emotion-specific word embeddings from a heterogeneous network composed of two basic networks, i.e., a document-word network and an emoticon-word network. Finally, an LSTM classifier is trained on the labeled data. Experimental results demonstrate the effectiveness of the proposed emotion-specific word embedding algorithm.
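The abstract describes a two-stage pipeline: emotion-specific word vectors are first learned from a heterogeneous network built from a document-word network and an emoticon-word network, and an LSTM classifier is then trained on labeled texts represented with those vectors. The sketch below (PyTorch) illustrates only the second, classification stage and is not the authors' implementation; the class name EmotionLSTM, the hyper-parameters, and the random matrix standing in for the learned emotion-specific embeddings are assumptions made for illustration.

# Minimal illustrative sketch of an LSTM emotion classifier on top of
# pre-trained word vectors. In the paper's setting, `pretrained_vectors`
# would be the emotion-specific embeddings learned from the heterogeneous
# (document-word + emoticon-word) network; here a random matrix stands in.
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    def __init__(self, pretrained_vectors, num_classes, hidden_size=128):
        super().__init__()
        # Initialize the embedding layer from the pre-trained vectors and
        # allow fine-tuning during classifier training.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=False)
        self.lstm = nn.LSTM(pretrained_vectors.size(1), hidden_size, batch_first=True)
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer word indices
        embedded = self.embedding(token_ids)            # (batch, seq_len, dim)
        _, (last_hidden, _) = self.lstm(embedded)       # last_hidden: (1, batch, hidden)
        return self.classifier(last_hidden.squeeze(0))  # (batch, num_classes)

# Toy usage: 4 texts of 20 tokens each, 7 emotion classes (illustrative numbers).
vocab_size, embed_dim, num_classes = 10000, 100, 7
pretrained_vectors = torch.randn(vocab_size, embed_dim)
model = EmotionLSTM(pretrained_vectors, num_classes)
logits = model(torch.randint(0, vocab_size, (4, 20)))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, num_classes, (4,)))
loss.backward()

In the paper's experiments the labeled texts would come from an annotated microblog emotion corpus; the batch of random indices above merely demonstrates that the model runs end to end.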
References
[1] MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed representations of words and phrases and their compositionality [J].Advances in Neural Information Processing Systems,2013,26:3111-3119.
    [2] COLLOBERT R,WESTON J,BOTTOU L,et al.Natural language processing (almost) from scratch[J].Journal of Machine Learning Research,2011,12(1):2493-2537.
    [3] TURIAN J,RATINOV L,BENGIO Y.Word representations:a simple and general method for semi-supervised learning[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2010:384-394.
    [4] MIKOLOV T,CHEN K,CORRADO G,et al.Efficient estimation of word representations in vector space [J].arXiv:1301.3781.
    [5] TANG J,QU M,WANG M,et al.LINE:large-scale information network embedding[C]//Proceedings of the International World Wide Web Conference.2015:1067-1077.
    [6] HUANG L,LI S S,ZHOU G D.Research on microblog emotion recognition based on syntactic information[J].Computer Science,2017,44(2):244-249.(in Chinese)
    [7] LIU H H,LI S S,ZHOU G D,et al.Joint modeling of news reader’s and comment writer’s emotions[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2013:511-515.
    [8] ABDUL-MAGEED M,UNGAR L.EmoNet:fine-grained emotion detection with gated recurrent neural networks[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2017:718-728.
    [9] LI S S,HUANG L,WANG R,et al.Sentence-level emotion classification with label and context dependence[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2015:1045-1053.
    [10] KOZAREVA Z,NAVARRO B,VAZQUEZ S,et al.UA-ZBSA:a headline emotion classification through web information[C]//Proceedings of the International Workshop on Semantic Evaluations.2007:334-337.
    [11] WEN S,WAN X.Emotion classification in microblog texts using class sequential rules[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2014:187-193.
    [12] LI S S,HUANG L,WANG R,et al.Sentence-level emotion classification with label and context dependence[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2015:1045-1053.
    [13] ALM C O,ROTH D,SPROAT R.Emotions from text:machine learning for text-based emotion prediction[C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing.2005:579-586.
    [14] LI C X,WU H M,JIN Q.Emotion classification of Chinese microblog text via fusion of BoW and eVector feature representations[C]//Communications in Computer and Information Science.2014:217-228.
    [15] LI S S,XU J,ZHANG D,et al.Two-view label propagation to semi-supervised reader emotion classification[C]//Proceedings of the International Conference on Computational Linguistics.2016:2647-2655.
    [16] BENGIO Y,DUCHARME R,VINCENT P,et al.A neural probabilistic language model[J].Journal of Machine Learning Research,2003,3:1137-1155.
    [17] MNIH A,HINTON G.A scalable hierarchical distributed language model[C]//Proceedings of the International Conference on Neural Information Processing Systems.2008:1081-1088.
    [18] SOCHER R,BAUER J,MANNING C D,et al.Parsing with compositional vector grammars[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2013:455-465.
    [19] TANG D Y,QIN B,LIU T,et al.Learning sentence representation for emotion classification on microblogs[C]//Proceedings of the Conference on Natural Language Processing and Chinese Computing.2013:212-223.
    [20] XU R F,CHEN T,XIA Y Q,et al.Word embedding composition for data imbalances in sentiment and emotion classification [J].Cognitive Computation,2015,7(2):226-240.
    [21] WANG Z Q,ZHANG Y,LEE S Y M,et al.A bilingual attention network for code-switched emotion prediction[C]//Proceedings of the International Conference on Computational Linguistics.2016:1624-1634.
    [22] LABUTOV I,LIPSON H.Re-embedding words[C]//Proceedings of the Meeting of the Association for Computational Linguistics.2013:489-493.
    [23] HUANG L,LI S S,ZHOU G D.Emotion corpus construction on microblog[C]//Proceedings of the Chinese Lexical Semantics Workshop.2015:204-212.
    [24] NIU F,RECHT B,RE C,et al.Hogwild:a lock-free approach to parallelizing stochastic gradient descent[C]//Proceedings of the International Conference on Neural Information Processing Systems.2011:693-701.
    [25] TANG J,QU M,MEI Q Z.PTE:predictive text embedding through large-scale heterogeneous text networks[C]//Proceedings of the Conference on Knowledge Discovery and Data Mining.2015:1165-1174.
    [26] HOCHREITER S,SCHMIDHUBER J.Long short-term memory [J].Neural Computation,1997,9(8):1735-1780.
    [27] GRAVES A.Generating sequences with recurrent neural networks[J].arXiv:1308.0850.
    1) http://weibo.com/
    2) https://radimrehurek.com/gensim
