Text Topic Summary and Sentiment Analysis Based on Machine Learning and Sentiment Lexicon
  • Original title (Chinese): 基于机器学习与情感词典的文本主题概括及情感分析
  • Authors: SONG Zu-kang (宋祖康); YAN Rui-xia (阎瑞霞); GU Li-qiong (辜丽琼)
  • Affiliation: School of Management, Shanghai University of Engineering and Technology (上海工程技术大学管理学院)
  • Keywords: topic summary; Word2vec; K-Means; sentiment analysis
  • Journal abbreviation: RJDK
  • Journal (English): Software Guide
  • Publication date: 2019-01-25 15:40
  • Publisher: 软件导刊 (Software Guide)
  • Year: 2019
  • Issue: v.18; No.198
  • Funding: National Natural Science Foundation of China (71301100); Shanghai Municipal Education Commission Scientific Research Innovation Project (14YZ140)
  • Language: Chinese
  • Page: RJDK201904002
  • Page count: 5
  • CN: 04
  • ISSN: 42-1671/TP
  • Classification number: 10-14
Abstract
As a major carrier of social networking, Weibo has become an important platform for information dissemination, through which the public expresses sentiment and public opinion spreads. Summarizing the topics of Weibo posts and comments and analyzing their sentiment therefore has practical value for network governance, public opinion monitoring, and the guidance of public sentiment. This paper proposes a topic summary and sentiment analysis model based on machine learning and text analysis. Taking the death of a postgraduate student at Wuhan University of Technology as the topic, the texts are converted into word vectors with Word2vec, and a machine-learning clustering method is used to summarize the topics of each stage in the life cycle of the public opinion event. A dictionary-based text analysis method is then applied to the comment texts for multi-category sentiment analysis, and the most prominent sentiment categories are analyzed at a finer granularity. The result is a multi-category, fine-grained model of public sentiment change based on topic and sentiment analysis. For a given public opinion event, the model identifies the focus of public attention and the pattern of emotional change at each stage, which supports the study of how topics and sentiment co-evolve.
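The two analysis steps named in the abstract, clustering word vectors into topics and counting sentiment categories with a dictionary, can be illustrated with a minimal sketch. It is not the authors' implementation. The two-dimensional vectors stand in for real Word2vec output, the lexicon entries and category names are invented for illustration, the comment is assumed to be already tokenized, and the farthest-first initialization is chosen only to keep the toy example deterministic. K-Means is used because it appears in the record's keywords.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Plain K-Means; returns one cluster label per input vector."""
    # Farthest-first initialization (an assumption made for determinism).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(
            vectors, key=lambda v: min(squared_distance(v, c) for c in centroids)
        )
        centroids.append(list(farthest))
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def emotion_profile(tokens, lexicon):
    """Count lexicon hits per emotion category for one tokenized comment."""
    return Counter(lexicon[t] for t in tokens if t in lexicon)


# Hypothetical words with made-up 2-D vectors standing in for Word2vec output.
words = ["school", "advisor", "student", "grief", "sorrow", "mourning"]
vectors = [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05], [5.0, 5.1], [5.1, 5.0], [4.9, 5.0]]

labels = kmeans(vectors, k=2)
topics = {}
for word, label in zip(words, labels):
    topics.setdefault(label, []).append(word)

# Hypothetical sentiment lexicon mapping a word to an emotion category.
TOY_LEXICON = {"angry": "anger", "furious": "anger", "sad": "sadness", "afraid": "fear"}
profile = emotion_profile(["so", "angry", "and", "sad", "furious"], TOY_LEXICON)

print(topics)
print(profile)
```

In a real setting, the words in each cluster would serve as the topic summary for one life-cycle stage, and the per-comment profiles would be aggregated by stage to follow how each sentiment category changes over time.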
