Abstract
Supervised methods for review helpfulness prediction depend on training datasets that are difficult to construct, while unsupervised methods do not take sentiment information into account. To address both problems, an unsupervised model that combines semantic and sentiment information is proposed. The helpfulness score of each opinion is calculated from the degree to which reviews, and the replies posted under them, support that opinion; the helpfulness score of a review is then derived from the scores of its opinions. To extract the opinions, a review summarization method that combines syntactic analysis with an improved Latent Dirichlet Allocation (LDA) model is also proposed: two kinds of constraints, must-link and cannot-link, are built from the syntactic analysis results to guide topic learning, which improves the accuracy of the model while maintaining recall. On the experimental dataset, the proposed method achieves an F1 value of about 70% and a ranking accuracy of about 90%, and a worked example shows that its results are readily interpretable.
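The abstract does not state the scoring formulas, so the following Python sketch is only one possible reading of the two-step scoring idea, not the method as published. It assumes each review or reply has already been reduced to a mapping from an opinion identifier to a sentiment polarity in [-1, 1]. The names `Text`, `Review`, `opinion_support`, `review_helpfulness` and the `reply_weight` parameter are all hypothetical, as is the choice to average opinion scores.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Text:
    """A review or reply, reduced to opinion id -> polarity in [-1, 1] (assumed input)."""
    opinions: Dict[str, float]


@dataclass
class Review:
    text: Text
    replies: List[Text] = field(default_factory=list)


def opinion_support(opinion: str, polarity: float, texts: List[Text]) -> float:
    """Hypothetical support score: net agreement of `texts` with the given polarity.

    A text that mentions the opinion with the same sign adds to the score,
    one with the opposite sign subtracts from it.
    """
    total = 0.0
    for t in texts:
        if opinion in t.opinions:
            total += polarity * t.opinions[opinion]
    return total


def review_helpfulness(review: Review, all_reviews: List[Review],
                       reply_weight: float = 0.5) -> float:
    """Average, over the review's opinions, of support from other reviews
    plus weighted support from the review's own replies (an assumed aggregation)."""
    others = [r.text for r in all_reviews if r is not review]
    scores = []
    for opinion, polarity in review.text.opinions.items():
        score = opinion_support(opinion, polarity, others)
        score += reply_weight * opinion_support(opinion, polarity, review.replies)
        scores.append(score)
    return sum(scores) / len(scores) if scores else 0.0


def rank_reviews(reviews: List[Review]) -> List[Review]:
    """Order reviews from most to least helpful under the sketch above."""
    return sorted(reviews, key=lambda r: review_helpfulness(r, reviews), reverse=True)
```

Under these assumptions, a review whose opinions are echoed by other reviews and endorsed in its replies ranks above one whose opinions are contradicted. Because the score is built from per-opinion terms, each ranking can be traced back to the opinions that produced it.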