Research on and Application of Orientation Analysis for Education Resource Reviews
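Abstract
User reviews of a product posted online are valuable to study, both for manufacturers and for potential consumers. Likewise, user reviews of online education resources are valuable both to the providers of those resources and to learners. From review information, providers can learn where they need to improve and can adopt the strengths of other education resources to offset their own weaknesses; learners can obtain reference information and suggestions and, by weighing and selecting among resources, find the ones that suit them. Studying the orientation (sentiment polarity) of reviews of education resources is therefore worthwhile.
     Online learners can express their views on education resources through forums, BBS, message boards, blogs, and similar channels, all of which are important sources of review information. A single education resource, however, can attract hundreds or even thousands of reviews, and browsing them manually is time-consuming and inefficient. Information extraction techniques are therefore needed to derive the orientation toward an education resource automatically from user reviews.
     Extracting the orientation of review information involves two key steps: semantic polarity analysis and opinion extraction. Computer techniques are needed to automatically analyze sentences or documents that carry sentiment, extract the topics or features that users implicitly care about, and analyze their semantic orientation and strength. For semantic polarity analysis of Chinese reviews, methods based on statistical analysis are the most widely used, and many research institutions have already applied natural language processing techniques to this task and made valuable progress. This thesis attempts to apply natural language processing techniques to the semantic polarity analysis of review sentences about education resources, so that users can find high-quality education resources that interest them.
     This thesis statistically analyzes the resource types and media types of education resources and extracts feature words for education resources along with the benchmark words used to judge word polarity. For computing semantic similarity between words, it focuses on the hypernym-hyponym relations and the converse and antonym relations among sememes, and draws on the sentiment analysis word set most recently released by HowNet (《知网》). With the improved benchmark word set and word similarity algorithm, the computed word polarities become more accurate and reliable to a certain degree. For opinion extraction, the thesis uses the lexical and syntactic analysis output of the LTP system, passes the polarity of opinion words to feature words through dependency relation pairs, and gives an algorithm for computing the overall orientation of a sentence, which yields the sentence's final orientation.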
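     As a rough illustration of judging word polarity with benchmark words, the sketch below scores a word by its average similarity to positive benchmark words minus its average similarity to negative ones. This is one common formulation, assumed here rather than taken from the thesis. The names `word_polarity`, `toy_similarity`, and `TOY_SIM`, as well as the seed words and similarity values, are hypothetical; a real system would plug in a HowNet-based similarity function instead of the toy lookup table.

```python
from typing import Callable, Dict, Iterable, Tuple

Similarity = Callable[[str, str], float]

# Hypothetical similarity values standing in for a HowNet-based measure.
TOY_SIM: Dict[Tuple[str, str], float] = {
    ("clear", "good"): 0.8, ("clear", "excellent"): 0.7,
    ("clear", "bad"): 0.2, ("clear", "poor"): 0.1,
    ("boring", "good"): 0.1, ("boring", "excellent"): 0.1,
    ("boring", "bad"): 0.7, ("boring", "poor"): 0.6,
}


def toy_similarity(a: str, b: str) -> float:
    """Symmetric lookup; identical words score 1.0, unknown pairs 0.0."""
    if a == b:
        return 1.0
    return TOY_SIM.get((a, b), TOY_SIM.get((b, a), 0.0))


def word_polarity(word: str,
                  positive_seeds: Iterable[str],
                  negative_seeds: Iterable[str],
                  sim: Similarity) -> float:
    """Mean similarity to positive seeds minus mean similarity to negative seeds.

    A result above zero suggests a positive word, below zero a negative one.
    """
    pos = list(positive_seeds)
    neg = list(negative_seeds)
    pos_score = sum(sim(word, p) for p in pos) / len(pos)
    neg_score = sum(sim(word, n) for n in neg) / len(neg)
    return pos_score - neg_score


# Hypothetical benchmark (seed) words.
POSITIVE_SEEDS = ["good", "excellent"]
NEGATIVE_SEEDS = ["bad", "poor"]

for w in ("clear", "boring"):
    print(w, word_polarity(w, POSITIVE_SEEDS, NEGATIVE_SEEDS, toy_similarity))
```

     Passing the similarity function as a parameter keeps the scoring rule independent of whichever lexical resource supplies the similarities.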
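     In the experiments, the automatically extracted opinions are compared with human judgments, and relatively high precision and recall are obtained. An application system for mining reviews of education resources is also designed to judge the orientation of review sentences.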
User reviews of a product published online have significant research value for both manufacturers and potential consumers. Similarly, user reviews of online education resources are of great research value to both providers of education resources and learners. From review information, providers can learn what needs improvement and can absorb the strengths of other education resources to make up for their own shortcomings; learners can obtain references and recommendations and, through comprehensive consideration and selection, find education resources that suit them. Researching the review orientation of education resources is therefore highly significant.
     Through forums, BBS, message boards, and blogs, e-learning users can publish reviews of education resources, and these are important sources of review information. A single education resource sometimes attracts hundreds or thousands of reviews, so relying only on manual browsing is time-consuming and inefficient. We therefore need information extraction technology to automatically extract the orientation toward education resources from users' reviews.
     Extracting the orientation of review information has two key steps: semantic polarity analysis and opinion mining. We need to use computer technology to automatically analyze sentences and documents that carry emotional orientation, extract the topics or features that users subconsciously need, and analyze their semantic orientation and intensity. For semantic polarity analysis of Chinese review information, most methods are based on statistical analysis, and many research institutions have made valuable progress using natural language processing technology to analyze the semantic orientation of reviews. This paper tries to use natural language processing technology to analyze the semantic orientation of reviews of education resources, so that users can obtain high-quality education resources that interest them.
     This paper statistically analyzes the resource types and media types of education resources and extracts the features of education resources and the benchmark words used for judging word polarity. When computing word semantic similarity, we analyze in depth the hypernym-hyponym relations and the converse and antonym relations among sememes, and adopt the sentiment word set recently published by HowNet. With the improved benchmark words and word semantic similarity algorithm, the computed semantic polarity of words is, to a certain degree, more accurate and reliable. In opinion mining, using the lexical analysis and parsing results of the LTP system, we pass the polarity of opinion words to features through dependency relation pairs and finally obtain the sentence orientation with an algorithm for computing the overall orientation of a sentence.
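     The idea of passing opinion polarity to features through dependency pairs can be sketched as follows. The dependency triples are written by hand rather than produced by LTP, the relation labels `SBV` (subject-verb), `ATT` (attribute), and `ADV` (adverbial) are assumed to follow LTP-style naming, and the negation handling and the summing rule for sentence orientation are illustrative assumptions, not the algorithm given in the paper. All names (`Dependency`, `feature_polarities`, `sentence_orientation`, `NEGATIONS`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class Dependency:
    head: str        # governing word
    dependent: str   # governed word
    relation: str    # assumed LTP-style label, e.g. "SBV", "ATT", "ADV"


NEGATIONS = {"not", "never"}  # hypothetical negation word list


def feature_polarities(deps: Iterable[Dependency],
                       features: Set[str],
                       opinion_polarity: Dict[str, float]) -> Dict[str, float]:
    """Pass each opinion word's polarity to the feature word it is linked to."""
    deps = list(deps)
    # Opinion words modified by a negation adverb get their polarity flipped.
    negated = {d.head for d in deps
               if d.relation == "ADV" and d.dependent in NEGATIONS}
    scores: Dict[str, float] = {}
    for d in deps:
        if d.relation == "SBV" and d.dependent in features and d.head in opinion_polarity:
            opinion, feature = d.head, d.dependent      # "video (is) clear"
        elif d.relation == "ATT" and d.head in features and d.dependent in opinion_polarity:
            opinion, feature = d.dependent, d.head      # "clear video"
        else:
            continue
        score = opinion_polarity[opinion]
        if opinion in negated:
            score = -score
        scores[feature] = scores.get(feature, 0.0) + score
    return scores


def sentence_orientation(feature_scores: Dict[str, float]) -> str:
    """Assumed rule: the sign of the summed feature scores decides the label."""
    total = sum(feature_scores.values())
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


# Hand-written parse of "the video is clear and the exercises are not boring".
DEPS = [
    Dependency("clear", "video", "SBV"),
    Dependency("boring", "exercises", "SBV"),
    Dependency("boring", "not", "ADV"),
]
FEATURES = {"video", "exercises"}
OPINIONS = {"clear": 0.5, "boring": -0.5}

SCORES = feature_polarities(DEPS, FEATURES, OPINIONS)
print(SCORES, sentence_orientation(SCORES))
```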
     In the experiment, we compare the automatically extracted opinions with manual judgments and obtain high precision and recall. We also design an application system for mining reviews of education resources, which judges the orientation of review sentences.
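     For readers unfamiliar with the evaluation measures, the comparison against manual judgments can be sketched as below, treating each opinion as a (feature, polarity) pair. The function name `precision_recall` and the sample pairs are hypothetical and are not data from the paper.

```python
from typing import Iterable, Tuple

Opinion = Tuple[str, str]  # (feature word, polarity label)


def precision_recall(extracted: Iterable[Opinion],
                     manual: Iterable[Opinion]) -> Tuple[float, float]:
    """Precision: share of extracted opinions that match the manual judgments.
    Recall: share of manually judged opinions that were extracted."""
    extracted_set, manual_set = set(extracted), set(manual)
    correct = len(extracted_set & manual_set)
    precision = correct / len(extracted_set) if extracted_set else 0.0
    recall = correct / len(manual_set) if manual_set else 0.0
    return precision, recall


# Made-up example pairs for illustration only.
EXTRACTED = [("video", "positive"), ("exercises", "negative"),
             ("teacher", "positive"), ("price", "negative")]
MANUAL = [("video", "positive"), ("exercises", "negative"),
          ("teacher", "negative"), ("slides", "positive")]

print(precision_recall(EXTRACTED, MANUAL))
```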
