Sentiment Orientation Analysis of Tourist Attraction Review Texts Based on Dependency Relations
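Abstract
As living standards rise, tourism has become an important part of people's lives, and online reviews of tourist attractions are multiplying accordingly. These reviews are a valuable information resource for both prospective visitors and attraction operators. Before a trip, ordinary visitors can read online reviews to learn how others regard an attraction and plan their itinerary. Operators, in turn, can use the reviews to understand visitors' opinions and attitudes and so improve the service quality of their attractions. Reading a large volume of reviews one by one, however, takes a great deal of time and effort, and the reader may well get "lost" in them, unable to identify and use the valuable opinion information they contain. To mine the opinion information that interests visitors accurately and efficiently, sentiment orientation analysis of the text is one of the key problems to be solved.
     Using the dependency relations between word pairs, this thesis studies sentiment orientation classification of review texts and the extraction of feature-opinion pairs. The main work is as follows:
     (1) Rule-based chunk acquisition
     To extract information useful for sentiment orientation classification, the thesis uses the dependency relations between word pairs to construct rules for obtaining chunks that carry sentiment orientation. Experimental results show that obtaining chunks with a rule-based method is feasible.
     (2) Sentiment orientation classification of review texts based on chunk features
     To address feature selection for sentiment orientation classification of tourist attraction reviews, the thesis combines the obtained chunks with sentiment words as classification features. Classification experiments on tourist attraction reviews show that using chunk information improves the performance of text sentiment orientation classification.
     (3) Extraction of feature-opinion pairs
     Feature-opinion pair extraction is one of the important research topics in opinion mining. Based on a dependency grammar analysis of sentences, the thesis studies how to extract feature-opinion pairs from review texts. Using the dependency relations between word pairs, it first constructs rules for obtaining chunks that contain evaluation objects and opinion words, together with an algorithm for identifying candidate evaluation objects. On this basis, it designs an algorithm for extracting feature-opinion pairs with sentiment orientation. Experiments verify the effectiveness of the method.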
With the improvement of living standards, tourism has become an important part of people's lives. Meanwhile, online reviews of scenic spots are growing in number. These reviews are a significant information resource for potential visitors and for the managers of local scenic spots. Before traveling, visitors can read online reviews to understand the views of other visitors and plan their trips. Managers of scenic spots can use the reviews to understand visitors' opinions and attitudes and so improve the service quality of their attractions. However, reading a large number of reviews manually takes a lot of time and energy, and readers may get "lost" in them, unable to identify and use the valuable opinion information. To mine the opinion information that interests visitors accurately and efficiently, text sentiment orientation analysis is one of the key problems to be solved.
     Based on the dependency relations between word pairs, this thesis studies sentiment orientation classification of review texts and methods for extracting feature-opinion pairs from them. The main work of the thesis is as follows:
     (1) Rule-based chunk acquisition
     To extract information useful for sentiment orientation classification, this thesis uses the dependency relations between word pairs to construct rules for obtaining chunks that carry sentiment orientation. Experimental results show that the rule-based method of obtaining chunks is feasible.
     (2) Sentiment orientation classification of review texts based on chunk features
     The thesis combines the obtained chunks with sentiment words as features for sentiment orientation classification. Classification experiments on scenic spot reviews show that using chunk information can improve the performance of text sentiment orientation classification.
     (3) Feature-opinion pair extraction
     Feature-opinion pair extraction is one of the key research topics in opinion mining. This thesis studies a method for extracting feature-opinion pairs from review texts based on dependency grammar. Using the dependency relations between word pairs, we construct rules for obtaining chunks that contain an evaluation object and an opinion word, as well as an algorithm for identifying candidate evaluation objects. On this basis, we design an algorithm for extracting feature-opinion pairs with sentiment orientation. Experimental results show that the method is effective.
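
The abstract does not list the actual rules, so the following is only a minimal sketch of how rule-based extraction of feature-opinion pairs over dependency arcs might look. Everything in it is an assumption made for illustration: the `Token` structure, the two rules, the toy lexicon and negation list, and the arc labels (`SBV`, `ATT`, `ADV`, `HED`, written in the style of an LTP-like parser). The input is an already parsed sentence, so no parser is called.

```python
from typing import NamedTuple

class Token(NamedTuple):
    word: str
    pos: str    # coarse part of speech: "n" noun, "a" adjective, "d" adverb
    head: int   # index of the head token, -1 for the root
    rel: str    # dependency label on the arc to the head (assumed label set)

# Toy sentiment lexicon and negation list (hypothetical, for illustration only).
LEXICON = {"beautiful": 1, "clean": 1, "crowded": -1, "expensive": -1}
NEGATIONS = {"not"}

def polarity(tokens, i):
    """Lexicon polarity of token i, flipped once per negating ADV child."""
    score = LEXICON[tokens[i].word]
    for t in tokens:
        if t.head == i and t.rel == "ADV" and t.word in NEGATIONS:
            score = -score
    return score

def extract_pairs(tokens):
    """Return (feature, opinion word, polarity) triples found by two rules."""
    pairs = []
    for i, t in enumerate(tokens):
        if t.head < 0:
            continue  # the root has no incoming arc to inspect
        head = tokens[t.head]
        # Rule 1 (assumed): a noun is the subject of an opinion predicate.
        if t.rel == "SBV" and t.pos == "n" and head.word in LEXICON:
            pairs.append((t.word, head.word, polarity(tokens, t.head)))
        # Rule 2 (assumed): an opinion word modifies a noun attributively.
        elif t.rel == "ATT" and t.word in LEXICON and head.pos == "n":
            pairs.append((head.word, t.word, polarity(tokens, i)))
    return pairs

# "scenery very beautiful": subject arc into an opinion predicate
sentence = [
    Token("scenery", "n", 2, "SBV"),
    Token("very", "d", 2, "ADV"),
    Token("beautiful", "a", -1, "HED"),
]
# "ticket not expensive": the negating adverb flips the polarity
negated = [
    Token("ticket", "n", 2, "SBV"),
    Token("not", "d", 2, "ADV"),
    Token("expensive", "a", -1, "HED"),
]
print(extract_pairs(sentence))  # [('scenery', 'beautiful', 1)]
print(extract_pairs(negated))   # [('ticket', 'expensive', 1)]
```

Under the same assumptions, the extracted triples could also be counted alongside plain sentiment words as features for a document-level classifier, which is the kind of combination the second contribution describes.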
