Abstract
This paper proposes a sentiment classification method for movie review data based on Word2vec and multiple classifiers. First, the review data are preprocessed and a Word2vec model is trained to represent words as word vectors. Second, a random forest and a multinomial naive Bayes model are combined to classify the sentiment of the reviews. Finally, experiments are conducted on the public movie review dataset from a Kaggle competition. The results show that Word2vec effectively captures word semantics and significantly improves the performance of the sentiment classification algorithm.
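The abstract does not say how word vectors are turned into review-level features or how the two classifiers are combined. The following is a minimal sketch under two assumptions: the word vectors of a review are averaged into one feature vector, and the classifiers' positive-class probabilities are merged by weighted soft voting. The names `TOY_VECTORS`, `review_vector`, and `soft_vote` are hypothetical, and the three-dimensional vectors are hand-written stand-ins for what a trained Word2vec model would provide.

```python
from typing import Dict, List, Sequence

# Hypothetical toy vectors; a trained Word2vec model would supply real ones.
TOY_VECTORS: Dict[str, List[float]] = {
    "great": [0.9, 0.1, 0.0],
    "boring": [-0.8, 0.2, 0.1],
    "movie": [0.0, 0.5, 0.5],
}


def review_vector(tokens: Sequence[str],
                  vectors: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of known tokens (assumed pooling strategy).

    Tokens missing from the vocabulary are skipped; if no token is known,
    a zero vector of the same dimensionality is returned.
    """
    dim = len(next(iter(vectors.values())))
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def soft_vote(prob_a: float, prob_b: float, weight_a: float = 0.5) -> int:
    """Combine two positive-class probabilities by weighted average.

    Returns 1 (positive) if the combined probability is at least 0.5,
    otherwise 0 (negative). The weighting scheme is an assumption.
    """
    combined = weight_a * prob_a + (1.0 - weight_a) * prob_b
    return 1 if combined >= 0.5 else 0
```

In such a setup, `prob_a` and `prob_b` would come from the random forest and the multinomial naive Bayes model respectively. Multinomial naive Bayes expects non-negative features such as word counts, so it would typically be fed a different representation than raw Word2vec averages.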