Research on Rating Prediction of Movie Reviews Based on Text Summaries (基于文本摘要的影评评分预测研究)
  • Title (English): Research on Rating Prediction of Movie Reviews Based on Text Summaries
  • Authors: QIU Xiulian (邱秀连); ZOU Luobin (邹珞彬); WANG Zheng (王峥)
  • Keywords: text summarization; TextRank; support vector machine (SVM); sentiment analysis; rating prediction
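  • Journal code: JSSG
  • Journal: Computer & Digital Engineering (计算机与数字工程)
  • Affiliations: Nanjing Fiberhome Software Technology Co., Ltd.; Wuhan Research Institute of Posts and Telecommunications
  • Publication date: 2019-01-20
  • Publisher: Computer & Digital Engineering (计算机与数字工程)
  • Year: 2019
  • Issue: v.47; No.351
  • Language: Chinese
  • Pages: JSSG201901032
  • Page count: 6
  • CN: 01
  • ISSN: 42-1372/TP
  • Classification number: 151-156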
Abstract
Movies are one of the world's most important forms of entertainment, and people often rely on ratings or reviews to judge whether a film is worth watching. Tens of millions of movie reviews are now available online, and they need to be explored, analyzed, and summarized so that viewers can make better choices. This paper studies rating prediction for long movie reviews on Douban. To that end, it proposes an improved TextRank summarization algorithm that combines features such as topic similarity and sentence position with sentence-level sentiment features obtained through sentiment analysis, and it predicts ratings with an SVM classifier built on a bag-of-words model. Experiments show that review summaries at compression rates between 20% and 50% achieve accuracy roughly equal to or higher than that of the full reviews, which indicates that text summarization is suitable for rating prediction on long movie reviews.
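The abstract describes the summarization step only in outline, so the following is a minimal sketch rather than the authors' method. It assumes plain TextRank over a sentence-similarity graph, then rescales each score with a position feature and a sentiment feature before keeping a fixed fraction of sentences (the compression rate). The names `SENTIMENT_WORDS`, `pos_weight`, and `senti_weight`, the multiplicative combination, and the toy lexicon are all hypothetical. Topic similarity and the SVM rating classifier are not shown, and Chinese text would need a word segmenter in place of `tokenize`.

```python
import math
import re

# Hypothetical toy lexicon; a real system would use a full sentiment dictionary.
SENTIMENT_WORDS = {"great", "brilliant", "boring", "awful", "love", "hate"}


def split_sentences(text):
    """Split text after Chinese or English sentence terminators."""
    pieces = re.findall(r"[^。！？.!?]+[。！？.!?]*", text)
    return [p.strip() for p in pieces if p.strip()]


def tokenize(sentence):
    """Lowercased word tokens (an assumption that only suits English text)."""
    return re.findall(r"\w+", sentence.lower())


def similarity(a, b):
    """TextRank sentence similarity: shared words over summed log-lengths."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    return len(set(a) & set(b)) / denom if denom > 0 else 0.0


def textrank(tokens, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence similarity graph."""
    n = len(tokens)
    w = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out = [sum(row) for row in w]  # total outgoing weight per sentence
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                w[j][i] / out[j] * scores[j] for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, ratio=0.3, pos_weight=0.2, senti_weight=0.5):
    """Keep roughly `ratio` of the sentences, in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    base = textrank(tokens)
    final = []
    for i, (toks, score) in enumerate(zip(tokens, base)):
        # Assumed position feature: favor the first and last sentence.
        position = 1.0 if i in (0, len(sentences) - 1) else 0.0
        # Assumed sentiment feature: share of tokens found in the lexicon.
        sentiment = (sum(t in SENTIMENT_WORDS for t in toks) / len(toks)
                     if toks else 0.0)
        final.append(
            score * (1 + pos_weight * position + senti_weight * sentiment))
    keep = max(1, math.ceil(ratio * len(sentences)))
    top = sorted(range(len(sentences)),
                 key=lambda i: final[i], reverse=True)[:keep]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(review, ratio=0.2)` through `summarize(review, ratio=0.5)` covers the compression range the abstract reports. The resulting sentences could then be joined and passed to a bag-of-words classifier in place of the full review.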
