Research on Product Review Mining Based on a Product Feature Set
Abstract
With the rise of Web 2.0 and the rapid growth of e-commerce, more and more consumers shop online and post product reviews. These reviews are an important source of product information for potential customers and, to some extent, influence their purchasing behavior. Because the reviews are unstructured and scattered, product review mining applies natural language processing (NLP) techniques to analyze them automatically, helping companies and individuals obtain this information conveniently and effectively.
This thesis studies feature-based product review mining. After analyzing the shortcomings of existing product feature identification methods, it proposes building a product feature set so that review information can be mined and summarized more effectively. First, product feature words are extracted manually from the product manual and a small number of review texts, and a feature set for the product category is built from them. Pointwise mutual information (PMI) is then used to identify new feature words in new review texts, so that the feature set is extended dynamically. Second, the positive and negative evaluation words in HowNet form a seed set of sentiment words; the synonym and antonym sets in WordNet are used to predict the sentiment orientation of opinion words in the reviews, which extends the seed set. Third, based on the number of feature words, sentiment words, and negation words in a review sentence, a sentiment score is computed for each product feature using conjunctions and a proximity principle, and the hierarchical structure of the feature set is used to aggregate the scores upward from the lowest layer, giving opinion scores at every level of the product. Finally, all user reviews of the Canon PowerShot SD780 IS camera on www.Amazon.com are taken as a sample, and the opinion mining results for this camera are obtained with the methods above. The results are presented both locally and overall, using the product feature set and product evaluation indicators.
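The PMI step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how co-occurrence is counted or which threshold is applied, so the sentence-level counting, the function names `pmi` and `extend_feature_set`, the toy sentences, and the threshold value are all assumptions made for illustration, not the thesis's actual procedure.

```python
import math
from typing import Iterable, List, Set


def pmi(candidate: str, known_feature: str, sentences: List[Set[str]]) -> float:
    """PMI(x, y) = log2(P(x, y) / (P(x) * P(y))), estimated from sentence counts."""
    n = len(sentences)
    count_x = sum(1 for s in sentences if candidate in s)
    count_y = sum(1 for s in sentences if known_feature in s)
    count_xy = sum(1 for s in sentences if candidate in s and known_feature in s)
    if count_xy == 0:
        # The two words never co-occur, so the association is minimal.
        return float("-inf")
    return math.log2((count_xy / n) / ((count_x / n) * (count_y / n)))


def extend_feature_set(
    features: Set[str],
    candidates: Iterable[str],
    sentences: List[Set[str]],
    threshold: float,
) -> Set[str]:
    """Return a copy of the feature set, adding each candidate whose best PMI
    with any known feature reaches the (assumed) threshold."""
    extended = set(features)
    for word in candidates:
        best = max(pmi(word, f, sentences) for f in features)
        if best >= threshold:
            extended.add(word)
    return extended


# Toy review sentences, each reduced to a set of tokens (invented data).
sentences = [
    {"the", "lens", "zoom", "is", "sharp"},
    {"lens", "zoom", "works", "well"},
    {"battery", "life", "is", "short"},
    {"the", "battery", "drains", "fast"},
]

print(pmi("zoom", "lens", sentences))
print(extend_feature_set({"lens", "battery"}, ["zoom", "the"], sentences, 0.5))
```

In this toy corpus "zoom" always appears with "lens", so it is added, while "the" is spread evenly across sentences and is not. A real system would also need candidate filtering (for example by part of speech), since opinion words can co-occur strongly with feature words.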
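The scoring and bottom-up aggregation steps can be sketched in the same spirit. The abstract gives no formulas, so everything below is an assumption: the tiny lexicons, the hypothetical `HIERARCHY` dictionary, the rule that the nearest opinion word determines polarity, the rule that a directly preceding negation word flips it, and the use of a plain average when rolling scores up. Conjunction handling, which the abstract mentions, is left out.

```python
from typing import Dict, List, Optional

# Hypothetical lexicons and feature hierarchy, invented for illustration only.
POSITIVE = {"sharp", "great", "good"}
NEGATIVE = {"short", "blurry", "bad"}
NEGATIONS = {"not", "never", "no"}
HIERARCHY = {"camera": ["lens", "battery"], "lens": ["zoom"], "battery": [], "zoom": []}


def score_feature(tokens: List[str], feature: str) -> int:
    """Score one feature mention with the nearest opinion word (proximity),
    flipping the sign if a negation word directly precedes that opinion word."""
    if feature not in tokens:
        return 0
    pos = tokens.index(feature)
    opinion_idx = [i for i, t in enumerate(tokens) if t in POSITIVE or t in NEGATIVE]
    if not opinion_idx:
        return 0
    nearest = min(opinion_idx, key=lambda i: abs(i - pos))
    polarity = 1 if tokens[nearest] in POSITIVE else -1
    if nearest > 0 and tokens[nearest - 1] in NEGATIONS:
        polarity = -polarity
    return polarity


def aggregate(node: str, own: Dict[str, float]) -> Optional[float]:
    """Roll scores up the hierarchy: a node's score is the mean of its own score
    (if it has one) and the aggregated scores of its children.
    Averaging is an assumed aggregation rule."""
    values = [own[node]] if node in own else []
    for child in HIERARCHY.get(node, []):
        child_score = aggregate(child, own)
        if child_score is not None:
            values.append(child_score)
    return sum(values) / len(values) if values else None


own_scores = {
    "zoom": score_feature("the zoom is not sharp".split(), "zoom"),
    "battery": score_feature("battery life is great".split(), "battery"),
}
print(own_scores)
print(aggregate("lens", own_scores), aggregate("camera", own_scores))
```

Here the negated "sharp" gives "zoom" a negative score, which propagates to "lens", while the positive "battery" score offsets it at the "camera" level. A weighted sum or a count of positive and negative mentions would fit the same structure if averaging is not the intended rule.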
