Research on Key Techniques for Sentiment Orientation Analysis Based on Web Review Information
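Abstract
The rapid rise of social networks and the steady growth in the number of Internet users have made new media, represented by the Internet, an important tool for the public to voice demands, criticize social ills, offer suggestions, and communicate, and an important channel for exercising the rights to know, to participate, to express, and to supervise. At the same time, users have changed from passive recipients of information into producers of it, so large volumes of user-generated reviews accumulate on the Internet. These reviews also carry users' emotional attitudes, political leanings, and similar information. Mining the sentiment carried by user-generated content and analyzing users' sentiment orientation is therefore of great significance for product recommendation, public opinion discovery, and information prediction.
     Researchers have done a great deal of work on sentiment orientation analysis and have advanced the field. However, users' sentiment is mostly embedded in the text they produce, and natural language processing is itself highly challenging. In addition, the sentiment expressed in a review can change with context. As a result, the following problems remain to be solved:
     Corpora for orientation analysis are very unevenly distributed. Corpora for some domains are easy to collect from the Internet, while corpora for other domains are scarce. The first problem is how to handle this imbalance so that a constructed sentiment lexicon is highly portable across domains and supports cross-domain orientation analysis.
     Sentiment words depend not only on the domain but also on the context. The same sentiment word can show different orientations in different contexts, which sharply reduces system precision. Resolving the context dependence of sentiment words is the key to improving orientation analysis.
     Language phenomena are complex. One open problem is how to capture the influence of comparative words, negation words, and sentence patterns on sentence orientation, and whether a reasonable sentence-level model can be built that captures these factors and improves sentence orientation analysis.
     Flat topic models have difficulty describing the relationship between topics and attributes in review text, which makes it hard to grasp the global sentiment orientation toward a review topic. An important problem is whether a suitable representation model for review text can be built that describes both the vertical hierarchy and the horizontal associations between topics and subtopics, and ultimately describes users' global sentiment orientation.
     This work defines its research content around these problems. The main work is as follows: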
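     (1) Automatic cross-domain expansion of sentiment words, to address the unbalanced distribution of data across domains. A cross-domain orientation analysis method is proposed for the corpus imbalance problem. It uses labeled information in the source domain to analyze the orientation of out-of-vocabulary words in the target domain, so that sentiment words can be expanded automatically in an unlabeled domain. The method first divides sentiment words into dependent and independent sentiment words. On this basis it extends the two assumptions of conventional orientation analysis and builds the relationship between the source and target domains. The method has two steps: sentiment word extraction and sentiment word orientation assignment. The extraction step combines part-of-speech information with an improved pointwise mutual information (PMI) measure to compute the dependence strength between candidate sentiment words and evaluation objects, yielding the sentiment word set of the target domain. The assignment step builds word-word, word-object, and word-document relationships and uses them to compute the orientation strength of each sentiment word, which completes the cross-domain expansion.
     (2) Orientation analysis of evaluation phrases, which opens a new route to resolving the context dependence of sentiment word orientation. A phrase orientation analysis method based on the implicit sentiment orientation of evaluation objects is proposed. The method reduces the context of a sentiment word to its evaluation object and quantifies the implicit sentiment of that object. On this basis it builds the relationships among evaluation objects, sentiment words, and evaluation phrases. Finally, it constructs an objective function for phrase orientation analysis from heuristic rules. Experiments show that taking the implicit sentiment orientation of evaluation objects into account effectively improves the recognition of evaluation phrase orientation.
     (3) Orientation analysis of negative sentences, to address the blurred scope of negation words. The work analyzes the main factors that affect the orientation of negative sentences and the scope of negation words, and recasts the scope problem as a problem of negation word position. On this basis it proposes a negative sentence orientation analysis method based on cascaded HMMs. The method has three levels. HMM-1 and HMM-2 identify the evaluation objects contained in a negative sentence, and phrase orientation is computed from them. To quantify the effect of negation words on sentence orientation, the negation words in the sentence act as trigger conditions that correct the orientation of the evaluation phrases. Finally, the global orientation of the sentence is computed according to its sentence pattern. The method took part in the fourth national orientation analysis evaluation in 2012, and its submitted results were the best among all submissions.
     (4) Construction of a review text model, to address the difficulty of correlation detection in flat topic models and to lay a foundation for capturing the global orientation of topic features. A review text modeling method that incorporates an extended Information Bottleneck (IB) theory is proposed. The method treats review text as a hierarchy. It first divides the text into independent semantic units, and then divides each semantic unit into topic attributes and semantic unit attributes. Topic attributes are used for global correlation within the same topic or product, while semantic unit attributes are used to distinguish the relationships between topics or sub-attributes. For semantic unit division, the conventional Information Bottleneck method is extended according to the characteristics of review text. For correlation detection between related topics or products, a weighted KL method is used. To test the feasibility of this idea, experiments were run on the TDT4 data set. The results show that the model captures the correlation between reviews of the same topic or product fairly accurately.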
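     The extraction step in item (1) scores how strongly a candidate sentiment word depends on an evaluation object. The sketch below illustrates that idea with plain document-level PMI only. It is not the improved PMI or the part-of-speech filtering described in the abstract, whose details are not given here, and the function name `pmi_scores` and the toy reviews are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def pmi_scores(documents, candidates, targets):
    """Plain PMI between candidate words and evaluation objects.

    Assumption: co-occurrence is counted once per document, and
    probabilities are estimated as document frequencies.
    """
    n_docs = len(documents)
    word_df = Counter()   # documents containing each word
    pair_df = Counter()   # documents containing both words of a pair
    for doc in documents:
        words = set(doc)
        present_c = [c for c in candidates if c in words]
        present_t = [t for t in targets if t in words]
        for w in set(present_c) | set(present_t):
            word_df[w] += 1
        for c, t in product(present_c, present_t):
            pair_df[(c, t)] += 1

    scores = {}
    for (c, t), joint in pair_df.items():
        p_joint = joint / n_docs
        p_c = word_df[c] / n_docs
        p_t = word_df[t] / n_docs
        scores[(c, t)] = math.log2(p_joint / (p_c * p_t))
    return scores


# Toy, hypothetical reviews that are already tokenized.
reviews = [
    ["screen", "clear"],
    ["screen", "clear"],
    ["battery", "weak"],
    ["battery", "clear"],
]
scores = pmi_scores(reviews, ["clear", "weak"], ["screen", "battery"])
for pair, value in sorted(scores.items()):
    print(pair, round(value, 3))
```

     A positive score means the pair co-occurs more often than chance would predict, and pairs that never co-occur receive no score. A real system would also need thresholds and smoothing for rare words.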
With the rapid rise of social networks and the growing number of Internet users, emerging media, represented by the Internet, have become an indispensable tool for the public to express aspirations, criticize current problems, make recommendations, and communicate effectively, as well as an important channel for the public to exercise their rights to know, to participate, to express, and to supervise. Users have thus turned from recipients of information into producers of it, which has led to a large accumulation of user-generated reviews on the network. User-generated information also carries emotional attitudes, political tendencies, and the like. Mining the sentiment carried by user-generated content and analyzing users' sentiment orientation is of great significance to product recommendation, public opinion discovery, and information prediction.
     So far, researchers have done a great deal of work in the field of orientation analysis and have advanced it. However, users' sentiment is mostly embedded in user-generated text, and natural language processing is itself a very challenging task. In addition, users' sentiment may change with context. These factors leave several problems in orientation analysis that urgently need to be solved:
     (1) Corpus distribution is unbalanced in orientation analysis. Corpora for some domains are easily available via the Internet, while corpora for other domains are difficult to obtain. How to handle the unbalanced distribution of corpora, so that the constructed sentiment lexicon is highly portable across domains and supports cross-domain orientation analysis, is the primary problem to be solved.
     (2) Sentiment words are not only domain dependent but also context dependent. The same sentiment word can show different orientations in different contexts, which significantly reduces system accuracy. Dealing with the context dependence of sentiment words is the key to improving orientation analysis.
     (3) Sentences may contain negation words, comparative words, sentiment words with different orientations, and other complex language phenomena. Whether a reasonable sentence orientation analysis model can be built that captures the various factors influencing sentence orientation, and thereby improves sentence orientation analysis, is one of the problems confronting the field.
     (4) Flat topic models have difficulty describing the relationships between topics and attributes in review text, which makes it hard to grasp the global sentiment orientation toward a given review topic. Whether an appropriate review text representation model can be built that describes the vertical hierarchy and the horizontal correlations in review text, and eventually describes users' overall sentiment orientation, is an important open issue.
     In response to the above problems, this work establishes its research content and makes progress on the following aspects. The major work is as follows:
     (1) Research on the automatic expansion of sentiment words across domains, to deal with the unbalanced distribution of data in different domains. To address the corpus imbalance in orientation analysis, this work proposes a cross-domain sentiment analysis method. The method uses the labeled information in the source domain to analyze the sentiment orientation of unknown words in the target domain.
     The method first divides sentiment words into two categories, dependent sentiment words and independent sentiment words. On this basis, the two assumptions of conventional orientation analysis are extended, and the relationship between the source domain and the target domain is constructed so that sentiment words can be expanded. The method has two steps: sentiment word extraction and sentiment word orientation assignment. The extraction step combines part-of-speech information with an improved mutual information measure to calculate the dependence strength between candidate sentiment words and evaluation objects, and obtains the sentiment word set of the target domain.
     For orientation assignment, the relationships between words and words, words and evaluation objects, and words and documents are constructed. These relationships are used to calculate the sentiment orientation of each sentiment word, which completes the cross-domain expansion of sentiment words.
     (2) Research on the orientation analysis of evaluation phrases. An orientation analysis method for evaluation phrases, based on the sentiment expectations of evaluation objects, is put forward. To address the context dependence of sentiment words, the context of a sentiment word is first reduced to its evaluation objects, whose potential sentiment is used to quantify the impact of evaluation objects on phrase orientation. On this basis, the relationships among evaluation objects, sentiment words, and evaluation phrases are constructed. Finally, the objective function of phrase orientation analysis is constructed from heuristic rules. Experiments show that taking the sentiment expectations of evaluation objects into account effectively improves the orientation recognition of evaluation phrases.
     (3) Research on the orientation analysis of negative sentences. For the negation phenomena in sentence orientation analysis, this work analyzes the main factors influencing the orientation of negative sentences and the scope of negation words, and on this basis puts forward a negative sentence orientation analysis method based on cascaded HMMs. The method is divided into three levels, of which HMM-1 and HMM-2 are applied to identify the evaluation objects contained in negative sentences and to define the potential sentiment orientation of each evaluation object. The negation words contained in the sentence then serve as trigger conditions to correct the sentiment orientation of the evaluation phrases. Finally, the global orientation of the sentence is computed according to sentence rules. The method took part in Task 1 of the fourth national orientation analysis evaluation in 2012, which was Chinese negative sentence orientation analysis, and obtained the best evaluation results among all submitted results.
     (4) Research on the construction of a review text model, in order to capture the overall sentiment orientation of network users toward a particular topic or product and to overcome the difficulty of capturing global information from evaluation attributes alone. This work builds a model for correlation detection in review text. In this model, review text is seen as a hierarchy. The review text is first divided into individual semantic units, and each semantic unit is further divided into two parts: topic attributes and semantic unit attributes. The topic attributes are used for global correlation within the same topic or product, and the semantic unit attributes are used to distinguish the relationships between topics or sub-attributes. For the division of semantic units, the conventional Information Bottleneck (IB) method is extended according to the characteristics of review text. For the correlation detection of related topics or products, a weighted KL method is adopted. To verify the feasibility of this idea, tests were conducted on the TDT4 data set, and the results show that the model captures the correlation between reviews of the same topic or product fairly accurately.
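     Item (4) mentions a weighted KL method for correlation detection without giving its form. The sketch below shows one possible reading under stated assumptions: each text is turned into an add-one smoothed term distribution, each term may carry a weight, and the two directions of the KL divergence are averaged so the score is symmetric. The smoothing, the symmetrization, the weighting scheme, and all function names are assumptions for illustration, not the formulation used in this work.

```python
import math
from collections import Counter


def term_distribution(tokens, vocabulary, smoothing=1.0):
    """Add-k smoothed term distribution over a shared vocabulary."""
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocabulary)
    return {t: (counts[t] + smoothing) / total for t in vocabulary}


def weighted_kl(p, q, weights=None):
    """sum_t w_t * p(t) * log(p(t) / q(t)); w_t defaults to 1."""
    weights = weights or {}
    return sum(
        weights.get(t, 1.0) * p[t] * math.log(p[t] / q[t])
        for t in p
    )


def symmetric_weighted_kl(tokens_a, tokens_b, weights=None):
    """Average of both KL directions; smaller means more closely related."""
    vocabulary = sorted(set(tokens_a) | set(tokens_b))
    p = term_distribution(tokens_a, vocabulary)
    q = term_distribution(tokens_b, vocabulary)
    return 0.5 * (weighted_kl(p, q, weights) + weighted_kl(q, p, weights))


# Toy, hypothetical token lists.
review_a = ["phone", "screen", "screen"]
review_b = ["phone", "screen", "screen"]
review_c = ["election", "vote", "vote"]

print(symmetric_weighted_kl(review_a, review_b))
print(symmetric_weighted_kl(review_a, review_c))
print(symmetric_weighted_kl(review_a, review_c, {"screen": 2.0}))
```

     Identical texts score zero, and raising the weight of a term on which two texts disagree increases their distance. In practice the weights might come from topic attributes, but that choice is left open here.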
引文
[1] Pak A, Paroubek P. Twitter as a Corpus for Sentiment Analysis and Opinion Mining[C]//Proceedings of the7th Int. Conf. Language Resources and Evaluation, European LanguageResources Association LREC.2010:1320-1326.
    [2] Wang H, Can D, Kazemzadeh A, et al. A system for real-time twitter sentiment analysis of2012us presidential election cycle[C].Proceedings of the ACL2012System Demonstra-tions. Association for Computational Linguistics,2012:115-120.
    [3] Tumasjan A, Sprenger T O, Sandner P G, et al. Predicting Elections with Twitter: What140Characters Reveal about Political Sentiment[J]. ICWSM,2010,10:178-185.
    [4] Bollen J, Mao H, Zeng X. Twitter mood predicts the stock market[J]. Journal of Computa-tional Science,2011,2(1):1-8.
    [5] Rui H, Liu Y, Whinston A B. Whose and What Chatter Matters? The effect of tweets onMovie Sales[J]. Decision Support Systems.2013,2:1-26.
    [6] Lakkaraju H, Ajmera J. Attention prediction on social media brand pages[C].Proceedings ofthe20th ACM international conference on Information and knowledge management. ACM,2011:2157-2160.
    [7]万源.基于语义统计分析的网络舆情挖掘技术研究[D].武汉,武汉理工大学博士论文,2012.
    [8]张长利.面向特定领域的互联网舆情分析技术研究[D].吉林,吉林大学博士论文,2011.
    [9]曹树金,周小又,陈桂鸿.网络舆情监控系统中的主题贴自动标引及情感倾向分析研究[J],图书情感报知识,2012,(1):66-73.
    [10]黄九鸣.面向舆情分析和属性发现的网络文本挖掘技术研究[D].长沙,国防科技术大学博士论文,2011.
    [11] Hu N, Koh N S, Reddy S K. Ratings lead you to the product, reviews help you clinch it?The mediating role of online review sentiments on product sales. Decision Support Systems,2014,57:42-53.
    [12] Zhang W, Xu H, Wan W. Weakness Finder: Find product weakness from Chinese reviews byusing aspects based sentiment analysis. Expert Systems with Applications2012,39:10283-10291.
    [13] Zhang M, Ye X. A generation model to unify topic relevance and lexicon-based sentimentfor opinion retrieval[C].Proceedings of the31st annual international ACM SIGIR confe-rence on Research and development in information retrieval. ACM,2008:411-418.
    [14] Li S T, Tsai F C. A fuzzy conceptualization model for text mining with application in opi-nion polarity classification. Knowledge-Based Systems2013,39:23-33.
    [15] Das S. The finance web: internet information and markets[J]. IEEE Intelligent Systems,2010,25(2):74-78.
    [16] Ma Z, Sheng O R L, Pant G. Discovering company revenue relations from news: A net-work approach[J]. Decision Support Systems,2009,47(4):408-414.
    [17] Schumaker R P, Zhang Y, Huang C N, et al. Evaluating sentiment in financial news articles.Decision Support Systems53(2012):458-464.
    [18] Yang J, Adamic L, Ackerman M, et al. The way i talk to you: sentiment expression in anorganizational context[C].Proceedings of the SIGCHI Conference on Human Factors inComputing Systems. ACM,2012:551-554.
    [19] Tan C, Lee L, Tang J, et al. User-level sentiment analysis incorporating social net-works[C].Proceedings of the17th ACM SIGKDD international conference on Knowledgediscovery and data mining. ACM,2011:1397-1405.
    [20] Laver M, Benoit K, Garry J. Extracting policy positions from political texts using words asdata[J]. American Political Science Review,2003,97(02):311-331.
    [21] Kim S M, Hovy E. Determining the sentiment of opinions[C].Proceedings of the20th in-ternational conference on Computational Linguistics. Association for Computational Lin-guistics,2004:1367-1385.
    [22] Yu H, Hatzivassiloglou V. Towards answering opinion questions: Separating facts fromopinions and identifying the polarity of opinion sentences[C].Proceedings of the2003con-ference on Empirical methods in natural language processing. Association for Computa-tional Linguistics,2003:129-136.
    [23] Hatzivassiloglou V, McKeown K R. Predicting the semantic orientation of adjec-tives[C].Proceedings of the35th Annual Meeting of the Association for ComputationalLinguistics and Eighth Conference of the European Chapter of the Association for Compu-tational Linguistics. Association for Computational Linguistics,1997:174-181.
    [24] Wiebe J. Learning subjective adjectives from corpora[C]. Proceedings of AAAI’00.2000:735-740.
    [25] Turney P D, Littman M L. Measuring praise and criticism: Inference of semantic orientationfrom association[J]. ACM Transactions on Information Systems (TOIS),2003,21(4):315-346.
    [26] Turney P, Littman M L. Unsupervised learning of semantic orientation from a hun-dred-billion-word corpus[C]. Proceedings of ERC1094, National Research Council of Can-da.2002,217-222.
    [27] Pang B, Lee L. Opinion mining and sentiment analysis [J]. Foundations and trends in in-formation retrieval,2008,2(1-2):1-135.
    [28] Bak J Y, Kim S, Oh A. Self-disclosure and relationship strength in twitter conversa-tions[C].Proceedings of the50th Annual Meeting of the Association for ComputationalLinguistics: Short Papers-Volume2. Association for Computational Linguistics,2012:60-64.
    [29] Hui P, Gregory M. Quantifying sentiment and influence in blogspaces[C].Proceedings ofthe First Workshop on Social Media Analytics. ACM,2010:53-61.
    [30] Gamon M, Aue A. Automatic identification of sentiment vocabulary: exploiting low associ-ation with known sentiment terms[C].Proceedings of the ACL Workshop on Feature Engi-neering for Machine Learning in Natural Language Processing. Association for Computa-tional Linguistics,2005:57-64.
    [31] Kim S M, Hovy E. Determining the sentiment of opinions[C].Proceedings of the20th in-ternational conference on Computational Linguistics. Association for Computational Lin-guistics,2004:1367-1374.
    [32] Esuli A, Sebastiani F. Sentiwordnet: A publicly available lexical resource for opinion min-ing[C].Proceedings of the5th Conference on Language Resources and Evaluation, LREC.2006,6:417-422.
    [33] Lu Y, Castellanos M, Dayal U, et al. Automatic construction of a context-aware sentimentlexicon: an optimization approach[C].Proceedings of the20th international conference onWorld wide web. ACM,2011:347-356.
    [34] Wu Q, Tan S. A two-stage framework for cross-domain sentiment classification[J]. ExpertSystems with Applications,2011,38(11):14269-14275.
    [35] Choi Y, Cardie C. Adapting a polarity lexicon using integer linear programming for do-main-specific sentiment classification[C].Proceedings of the2009Conference on EmpiricalMethods in Natural Language Processing: Volume2-Volume2. Association for Computa-tional Linguistics,2009:590-598.
    [36] Ding X, Liu B, Yu P S. A holistic lexicon-based approach to opinion min-ing[C].Proceedings of the2008International Conference on Web Search and Data Mining.ACM,2008:231-240.
    [37] Wilson T, Wiebe J, Hoffmann P. Recognizing contextual polarity: An exploration of fea-tures for phrase-level sentiment analysis[J]. Computational linguistics,2009,35(3):399-433.
    [38] Riloff E, Wiebe J. Learning extraction patterns for subjective expressions[C].Proceedings ofthe2003conference on Empirical methods in natural language processing. Association forComputational Linguistics,2003:105-112.
    [39] Gindl S, Weichselbraun A, Scharl A. Cross-domain contextualisation of sentiment lexicons
    [C].Proceedings of the2010conference ECAI,2010:771-776.
    [40] Weichselbraun A, Gindl S, Scharl A. Using games with a purpose and bootstrapping tocreate domain-specific sentiment lexicons[C].Proceedings of the20th ACM internationalconference on Information and knowledge management. ACM,2011:1053-1060.
    [41] Pang B, Lee L. A sentimental education: Sentiment analysis using subjectivity summariza-tion based on minimum cuts[C].Proceedings of the42nd annual meeting on Association forComputational Linguistics. Association for Computational Linguistics,2004:271-278.
    [42] Eguchi K, Lavrenko V. Sentiment retrieval using generative models[C].Proceedings of the2006conference on empirical methods in natural language processing. Association forComputational Linguistics,2006:345-354.
    [43] Zhao J, Liu K, and Wang G. Adding redundant features for CRF-based sentence sentimentclassification[C].Proceedings of the2008Conference on Empirical Methods in NaturalLanguage Processing,2008:11-126.
    [44] Yu H, Hatzivassiloglou V. Towards answering opinion questions: Separating facts fromopinions and identifying the polarity of opinion sentences[C].Proceedings of the2003con-ference on Empirical methods in natural language processing. Association for Computa-tional Linguistics,2003:129-136.
    [45] Hu M, Liu B. Mining and summarizing customer reviews[C].Proceedings of the tenth ACMSIGKDD international conference on Knowledge discovery and data mining. ACM,2004:168-177.
    [46] Wang C, Lu J, Zhang G. A semantic classification approach for online product reviews[C].Proceedings of the2005IEEE/WIC/ACM International Conference on Web Intelligence.2005,273-286.
    [47] Nakagawa T, Inui K, Kurohashi S. Dependency tree-based sentiment classification usingCRFs with hidden variables[C].Human Language Technologies: The2010Annual Confe-rence of the North American Chapter of the Association for Computational Linguistics.Association for Computational Linguistics,2010:786-794.
    [48] Pang B, Lee L. Seeing stars: Exploiting class relationships for sentiment categorization withrespect to rating scales[C].Proceedings of the43th Annual Meeting on Association forComputational Linguistics,2005:115-124.
    [49] Snyder B, Barzilay R. Multiple Aspect Ranking Using the Good Grief Algo-rithm[C].HLT-NAACL.2007:300-307.
    [50] Socher R, Pennington J, Huang E H, et al. Semi-supervised recursive autoencoders for pre-dicting sentiment distributions[C].Proceedings of the Conference on Empirical Methods inNatural Language Processing. Association for Computational Linguistics,2011:151-161.
    [51] Zhao Y Y, Qin B, Liu T. Integrating intra-and inter-document evidences for improving sen-tence sentiment classification[J]. Acta Automatica Sinica,2010,36(10):1417-1425.
    [52] Whitelaw C, Garg N, Argamon S. Using appraisal groups for sentiment analy-sis[C].Proceedings of the14th ACM international conference on Information and know-ledge management. ACM,2005:625-631.
    [53] Pang B, Lee L, Vaithyanathan S. Thumbs up?: sentiment classification using machinelearning techniques[C].Proceedings of the ACL-02conference on Empirical methods innatural language processing-Volume10. Association for Computational Linguistics,2002:79-86.
    [54] Mullen T, Collier N. Sentiment Analysis using Support Vector Machines with Diverse In-formation Sources[C].EMNLP.2004,4:412-418.
    [55] McCallum A, Nigam K. A comparison of event models for naive bayes text classifica-tion[C].AAAI-98workshop on learning for text categorization.1998,752:41-48.
    [56] Lanquillon C. Learning from labeled and unlabeled documents: A comparative study onsemi-supervised text classification[M].Principles of Data Mining and Knowledge Discovery.Springer Berlin Heidelberg,2000:490-497.
    [57] Aue A, Gamon M. Customizing sentiment classifiers to new domains: A casestudy[C].Proceedings of recent advances in natural language processing (RANLP).2005,1(3.1):93-98.
    [58] Daumé III H, Marcu D. Domain Adaptation for Statistical Classifiers[J]. Journal of ArtificialIntelligence Research,2006,26,101-126.
    [59] Blitzer J, Dredze M, Pereira F. Biographies, bollywood, boom-boxes and blenders: Domainadaptation for sentiment classification[C].ACL.2007,7:440-447.
    [60] Tan S, Cheng X, Wang Y, et al. Adapting naive bayes to domain adaptation for sentimentanalysis[M].Advances in Information Retrieval. Springer Berlin Heidelberg,2009:337-349.
    [61] Li S, Wang Z, Zhou G, et al. Semi-supervised learning for imbalanced sentiment classifica-tion[C]. Proceedings of the Twenty-Second International Joint Conference on Artificial In-telligence,2011,1826-1831.
    [62] Wang S, Li D, Zhao L, et al. Sample cutting method for imbalanced text sentiment classifi-cation based on BRC[J]. Knowledge-Based Systems,2013,37:451-461.
    [63]邹嘉彦.评述新闻报道或文章色彩——正负两极性自动分类研究[C].自然语言理解与大规模内容计算——全国第八届计算语言学联合学术会议.北京:清华大学出版社,2005:21-23.
    [64]徐琳宏,林鸿飞,杨志豪.基于语义理解的文本倾向性识别机制[C],第三届学生计算语言学研究研讨会论文集.2006:91-100.
    [65]姚天昉,聂青阳,李建超,李林林,娄德成,陈珂,付宇.一个用于汉语汽车评论意见挖掘系统[C].中文信息处理前沿进展.中国中文信息学会成立二十五周年学术会议.北京:清华大学出版社,2006:260-281.
    [66] Yuen R W M, Chan T Y W, Lai T B Y, et al. Morpheme-based derivation of bipolar seman-tic orientation of Chinese words[C].Proceedings of the20th international conference onComputational Linguistics. Association for Computational Linguistics,2004:1008.
    [67]朱嫣岚,闵锦,周雅倩,黄萱菁,吴立德.基于HowNet的词汇语义倾向计算[J],中文信息学报,2006,21(1):14-20.
    [68]徐琳宏,林鸿飞,杨志豪.基于语义理解的文本倾向性识别机制[C],第三届学生计算语言学研讨人论文集.2006:91-100.
    [69]王根,赵军.中文褒贬义词语倾向性的分析[C].第三届学生计算语言研讨会论文集.2006:81-85.
    [70]王素格,李德玉,魏英杰.基于赋权粗糙隶属度的文本情感分类方法[J].计算机研究与发展,2011,48(5):855-861.
    [71]赵妍妍,秦兵,车万翔,刘挺.基于句法路径的情感评价单元识别[J].软件学报,2011,22(5):887-898.
    [72] Wu Y, Wen M. Disambiguating dynamic sentiment ambiguous adjectives[C].Proceedingsof the23rd International Conference on Computational Linguistics. Association for Com-putational Linguistics,2010:1191-1199.
    [73]廖祥文,李艺红.基于N-gram超核的中文倾向性句子识别[J].中文信息学报,2011,25(5):89-100.
    [74] Shi H, Chen W, Li X. Opinion Sentence Extraction and Sentiment Analysis for ChineseMicroblogs[M].Natural Language Processing and Chinese Computing. Springer BerlinHeidelberg,2013:417-423.
    [75]廖祥文,曹冬林,方滨兴等.基于概率推理模型博客倾向性检索研究[J].计算机研究与发展,2009,46(9):1530-1536.
    [76]王素格,李德玉,魏英杰.基于赋权粗糙隶属度的文本情感分类方法[J].计算机研究与发展,2011,48(5):855-861.
    [77]杨江,彭石玉,侯敏.基于浅层篇章结构的评论文倾向性分析[J].中文信息学报,2011,25(2):83-88.
    [78]林政,谭松波,程学旗.基于情感关键句抽取的情感分类研究[J].计算机研究与发展,2012,49(11):2376-2382.
    [79] Wiebe J M. Tracking point of view in narrative[J]. Computational Linguistics,1994,20(2):233-287.
    [80] Tong R M. An operational system for detecting and tracking opinions in on-line discus-sion[C].Working Notes of the ACM SIGIR2001Workshop on Operational Text Classifica-tion.2001:1-6.
    [81] Ku L W, Liang Y T, Chen H H. Opinion Extraction, Summarization and Tracking in Newsand Blog Corpora[C].AAAI Spring Symposium: Computational Approaches to AnalyzingWeblogs.2006,100-107.
    [82] Li S, Lee S Y M, Chen Y, et al. Sentiment classification and polarity shift-ing[C].Proceedings of the23rd International Conference on Computational Linguistics.Association for Computational Linguistics,2010:635-643.
    [83] Esuli A, Sebastiani F. Determining the semantic orientation of terms through gloss classifi-cation[C].Proceedings of the14th ACM international conference on Information andknowledge management. ACM,2005:617-624.
    [84] Xu T, Peng Q, Cheng Y. Identifying the semantic orientation of terms using S-HAL for sen-timent analysis[J]. Knowledge-Based Systems,2012,35:279-289.
    [85] Wilson T, Wiebe J, Hoffmann P. Recognizing contextual polarity in phrase-level sentimentanalysis[C].Proceedings of the conference on human language technology and empiricalmethods in natural language processing. Association for Computational Linguistics,2005:347-354.
    [86] Li S, Lee S Y M, Chen Y, et al. Sentiment classification and polarity shifting[C].Proceedingsof the23rd International Conference on Computational Linguistics. Association for Com-putational Linguistics,2010:635-643.
    [87] Du W, Tan S. Building domain-oriented sentiment lexicon by improved information bottle-neck[C].Proceedings of the18th ACM conference on Information and knowledge manage-ment. ACM,2009:1749-1752.
    [88] Moreo A, Romero M, Castro J L, et al. Lexicon-based comments-oriented news sentimentanalyzer system[J]. Expert Systems with Applications,2012,39(10):9166-9180.
    [89] Kamps J, Marx M J, Mokken R J, et al. Using wordnet to measure semantic orientations ofadjectives[C]. In Proceedings of LREC,2004:1115-1118.
    [90] Popescu A M, Etzioni O. Extracting product features and opinions from reviews[C]. In:Proceedings of the conference on Human Language Technology and Empirical Methods inNatural Language Processing.2005:339-346.
    [91] Wang S, Li D, Zhao L, et al. Sample cutting method for imbalanced text sentiment classifi-cation based on BRC[J]. Knowledge-Based Systems,2013,37:451-461.
    [92] Tan S, Wu Q. A random walk algorithm for automatic construction of domain-oriented sen-timent lexicon[J]. Expert Systems with Applications,2011,38(10):12094-12100.
    [93] Qiu G, Liu B, Bu J, et al. Expanding Domain Sentiment Lexicon through Double Propaga-tion[C]. Proceedings of the Twenty-First international Joint Conference on Artificial Intel-ligence(IJCAI-09),2009,9:1199-1204.
    [94] Wilson T, Wiebe J, Hoffmann P. Recognizing contextual polarity: An exploration of fea-tures for phrase-level sentiment analysis[J]. Computational linguistics,2009,35(3):399-433.
    [95] Wiebe J. Learning subjective adjectives from corpora[C].AAAI/IAAI.2000:735-740.
    [96] Yu L C, Wu J L, Chang P C, et al. Using a contextual entropy model to expand emotionwords and their intensity for the sentiment classification of stock market news[J]. Know-ledge-Based Systems,2013,41:89-97.
    [97] Lambov D, Pais S, Dias G. Merged agreement algorithms for domain independent senti-ment analysis[J]. Procedia-Social and Behavioral Sciences,2011,27:248-257.
    [98] Qu L, Ifrim G, Weikum G. The bag-of-opinions method for review rating prediction fromsparse text patterns[C].Proceedings of the23rd International Conference on ComputationalLinguistics. Association for Computational Linguistics,2010:913-921.
    [99] Ding X, Liu B, Yu P S. A holistic lexicon-based approach to opinion min-ing[C].Proceedings of the2008International Conference on Web Search and Data Mining.ACM,2008:231-240.
    [100] Tan S, Wang Y, Wu G, et al. Using unlabeled data to handle domain-transfer problem ofsemantic detection[C].Proceedings of the2008ACM symposium on Applied computing.ACM,2008:896-903.
    [101] Wu Q, Tan S, Cheng X. Graph ranking for sentiment transfer[C].Proceedings of theACL-IJCNLP2009Conference Short Papers. Association for Computational Linguistics,2009:317-320.
    [102] Du W, Tan S, Cheng X, et al. Adapting information bottleneck method for automatic con-struction of domain-oriented sentiment lexicon[C].Proceedings of the third ACM interna-tional conference on Web search and data mining. ACM,2010:111-120.
    [103] Chen W, Zhou J. A Text Classifier with Domain Adaptation for Sentiment Classifica-tion[M].Information Retrieval Technology. Springer Berlin Heidelberg,2010:61-72.
    [104] He Y, Lin C, Alani H. Automatically extracting polarity-bearing topics for cross-domainsentiment classification[C].Proceedings of the49th Annual Meeting of the Association forComputational Linguistics: Human Language Technologies-Volume1. Association forComputational Linguistics,2011:123-131.
    [105] Guo H, Zhu H, Guo Z, et al. Domain customization for aspect-oriented opinion analysiswith multi-level latent sentiment clues[C].Proceedings of the20th ACM international con-ference on Information and knowledge management. ACM,2011:2493-2496.
    [106] Shamshurin I. Extracting domain-specific opinion words for sentiment analy-sis[M].Advances in Computational Intelligence. Springer Berlin Heidelberg,2013:58-68.
    [107] Gindl S, Weichselbraun A, Scharl A. Cross-domain contextualisation of sentiment lex-icons[C]. In Proceedings of the2010conference ECAI,2010:771-776.
    [108] Weichselbraun A, Gindl S, Scharl A. Using games with a purpose and bootstrapping tocreate domain-specific sentiment lexicons[C].Proceedings of the20th ACM internationalconference on Information and knowledge management. ACM,2011:1053-1060.
    [109] Yang Y Z, Liu P Y, Fei S D, Du W T. A Method of Phrase Sentiment Analysis Based onPotential Sentiment Expectation of Context Aspects. ICIC-EL.2013,7(10):2855-2860.
    [110] Cohen J. A coefficient of agreement for nominal scales[C]. Educational and Psychologicalmeasurements20,1960:37-46.
    [111] Kanayama H, Nasukawa T. Fully automatic lexicon expansion for domain-oriented senti-ment analysis[C].Proceedings of the2006Conference on Empirical Methods in NaturalLanguage Processing. Association for Computational Linguistics,2006:355-363.
    [112] Jo Y, Oh A H. Aspect and sentiment unification model for online review analy-sis[C].Proceedings of the fourth ACM international conference on Web search and datamining. ACM,2011:815-824.
    [113] Ding X, Liu B, Zhang L. Entity discovery and assignment for opinion mining applica-tions[C].Proceedings of the15th ACM SIGKDD international conference on Knowledgediscovery and data mining. ACM,2009:1125-1134.
    [114] Esuli A, Sebastiani F. Determining Term Subjectivity and Term Orientation for OpinionMining[C].EACL.2006,6:193-200.
    [115] Jiang L, Yu M, Zhou M, et al. Target-dependent twitter sentiment classifica-tion[C].Proceedings of the49th Annual Meeting of the Association for Computational Lin-guistics: Human Language Technologies-Volume1. Association for Computational Lin-guistics,2011:151-160.
    [116] Zhang L, Liu B. Identifying noun product features that imply opinions[C].Proceedings ofthe49th Annual Meeting of the Association for Computational Linguistics: Human Lan-guage Technologies: short papers-Volume2. Association for Computational Linguistics,2011:575-580.
    [117] Fei Z, Liu J, Wu G. Sentiment classification using phrase patterns[C].Computer and In-formation Technology,2004. CIT'04. The Fourth International Conference on. IEEE,2004:1147-1152.
    [118] Pang B, Lee L. A sentimental education: Sentiment analysis using subjectivity summariza-tion based on minimum cuts[C].Proceedings of the42nd annual meeting on Association forComputational Linguistics. Association for Computational Linguistics,2004:271-279.
    [119] Wilson T, Wiebe J, Hoffmann P. Recognizing contextual polarity: An exploration of features for phrase-level sentiment analysis[J]. Computational Linguistics, 2009, 35(3): 399-433.
    [120] Nam S H, Na S H, Kim J, et al. Partially Supervised Phrase-Level Sentiment Classification[M]. Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy. Springer Berlin Heidelberg, 2009: 225-235.
    [121] Yessenalina A, Cardie C. Compositional matrix-space models for sentiment analysis[C]. Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2011: 172-182.
    [122] Drury B, Almeida J J. Identification of fine grained feature based event and sentiment phrases from business news stories[C]. Proceedings of the International Conference on Web Intelligence, Mining and Semantics. ACM, 2011: 27.
    [123] Chen L, Qi L, Wang F. Comparison of feature-level learning methods for mining online consumer reviews[J]. Expert Systems with Applications, 2012, 39(10): 9588-9601.
    [124] Wu Y, Wen M. Disambiguating dynamic sentiment ambiguous adjectives[C]. Proceedings of the 23rd International Conference on Computational Linguistics. Association for Computational Linguistics, 2010: 1191-1199.
    [125] Brody S, Elhadad N. An unsupervised aspect-sentiment model for online reviews[C]. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2010: 804-812.
    [126] Turney P D. Thumbs up or thumbs down?: semantic orientation applied to unsupervised classification of reviews[C]. Proceedings of the 40th annual meeting on association for computational linguistics. Association for Computational Linguistics, 2002: 417-424.
    [127] Meena A, Prabhakar T V. Sentence level sentiment analysis in the presence of conjuncts using linguistic analysis[M]. Advances in Information Retrieval. Springer Berlin Heidelberg, 2007: 573-580.
    [128] Martín-Valdivia M T, Martínez-Cámara E, Perea-Ortega J M, et al. Sentiment polarity detection in Spanish reviews combining supervised and unsupervised approaches[J]. Expert Systems with Applications, 2013, 40(10): 3934-3942.
    [129] Das D, Bandyopadhyay S. Sentence-Level Emotion and Valence Tagging[J]. Cognitive Computation, 2012, 4(4): 420-435.
    [130] Moilanen K, Pulman S. Sentiment composition[C]. Proceedings of the Recent Advances in Natural Language Processing International Conference. 2007: 378-382.
    [131] Benamara F, Chardon B, Mathieu Y, et al. How do negation and modality impact on opinions?[C]. Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics. Association for Computational Linguistics, 2012: 10-18.
    [132] Jia L, Yu C, Meng W. The effect of negation on sentiment analysis and retrieval effectiveness[C]. Proceedings of the 18th ACM conference on Information and knowledge management. ACM, 2009: 1827-1830.
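    [133] Du W, Liu P, Fei S, Zhang Z. Chinese comparative sentence identification based on an associated feature vocabulary[J]. Journal of Computer Applications, 2013, 33(6): 1591-1594. (in Chinese)
    [134] Zhang C, Feng C, Liu Q, Shi C, Huang H, Zhou H. Chinese comparative sentence identification algorithm based on multi-feature fusion[J]. Journal of Chinese Information Processing, 2013, 27(6): 110-116. (in Chinese)
    [135] Yang Y, Lin H. Sentiment analysis of conditional sentences based on product attributes[J]. Journal of Chinese Information Processing, 2011, 25(3): 86-92. (in Chinese)
    [136] Fine S, Singer Y, Tishby N. The hierarchical hidden Markov model: Analysis and applications[J]. Machine Learning, 1998, 32(1): 41-62.
    [137] Zhang L, Li M, Chen L, et al. Product review feature and sentiment classification based on a two-layer HHMM[J]. Journal of Sichuan University (Engineering Science Edition), 2013, 45(2): 94-102. (in Chinese)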
    [138] Choi Y, Oh H J, Myaeng S H. A generate-and-test method of detecting negative-sentiment sentences[M]. Computational Linguistics and Intelligent Text Processing. Springer Berlin Heidelberg, 2012: 500-512.
    [139] Mei Q, Ling X, Wondra M, et al. Topic sentiment mixture: modeling facets and opinions in weblogs[C]. Proceedings of the 16th international conference on World Wide Web. ACM, 2007: 171-180.
    [140] Zeng J, Zhang S. Incorporating topic transition in topic detection and tracking algorithms[J]. Expert Systems with Applications, 2009, 36(1): 227-232.
    [141] Wu X, Ide I, Satoh S. News topic tracking and re-ranking with query expansion based on near-duplicate detection[M]. Advances in Multimedia Information Processing-PCM 2009. Springer Berlin Heidelberg, 2009: 755-766.
    [142] Silva I S, Gomide J, Veloso A, et al. Effective sentiment stream analysis with self-augmenting training and demand-driven projection[C]. Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. ACM, 2011: 475-484.
    [143] Steyvers M, Smyth P, Rosen-Zvi M, et al. Probabilistic author-topic models for information discovery[C]. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004: 306-315.
    [144] Liu Y, Niculescu-Mizil A, Gryc W. Topic-link LDA: joint models of topic and author community[C]. Proceedings of the 26th Annual International Conference on Machine Learning. ACM, 2009: 665-672.
    [145] Hofmann T. Probabilistic latent semantic indexing[C]. Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 1999: 50-57.
    [146] Blei D M, Ng A Y, Jordan M I. Latent Dirichlet allocation[J]. Journal of Machine Learning Research, 2003, 3: 993-1022.
    [147] Blei D M, Griffiths T L, Jordan M I, et al. Hierarchical Topic Models and the Nested Chinese Restaurant Process[C]. NIPS. 2003, 16.
    [148] Zhai C X, Velivelli A, Yu B. A cross-collection mixture model for comparative text mining[C]. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2004: 743-748.
    [149] Lacoste-Julien S, Sha F, Jordan M I. DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification[C]. NIPS. 2008, 83: 85.
    [150] Chen E, Lin Y, Xiong H, et al. Exploiting probabilistic topic models to improve text categorization under class imbalance[J]. Information Processing & Management, 2011, 47(2): 202-214.
    [151] Zeng J, Wu C, Wang W. Multi-grain hierarchical topic extraction algorithm for text mining[J]. Expert Systems with Applications, 2010, 37(4): 3202-3208.
    [152] Li W, McCallum A. Pachinko allocation: DAG-structured mixture models of topic correlations[C]. Proceedings of the 23rd international conference on Machine learning, 2006: 577-584.
    [153] Mimno D, Li W, McCallum A. Mixtures of hierarchical topics with pachinko allocation[C]. Proceedings of the 24th international conference on Machine learning. ACM, 2007: 633-640.
    [154] Du L, Buntine W, Jin H. A segmented topic model based on the two-parameter Poisson-Dirichlet process[J]. Machine Learning, 2010, 81(1): 5-19.
    [155] Wu H, Bu J, Chen C, et al. Locally discriminative topic modeling[J]. Pattern Recognition, 2012, 45(1): 617-625.
    [156] Kumaran G, Allan J. Text classification and named entities for new event detection[C]. Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2004: 297-304.
    [157] Allan J, Carbonell J G, Doddington G, et al. Topic detection and tracking pilot study final report[C]. Proceedings of the Broadcast News Transcription and Understanding Workshop, 1998: 194-218.
    [158] Naptali W, Tsuchiya M, Nakagawa S. Topic-dependent language model with voting on noun history[J]. ACM Transactions on Asian Language Information Processing (TALIP), 2010, 9(7): 1-31.
    [159] Shi J, Fan M, Li W. Topic analysis based on LDA model[J]. Acta Automatica Sinica, 2009, 35(12): 1586-1592. (in Chinese)
    [160] Nallapati R, Feng A, Peng F, et al. Event threading within news topics[C]. Proceedings of the thirteenth ACM international conference on Information and knowledge management. ACM, 2004: 446-453.
    [161] Chemudugunta C, Smyth P, Steyvers M. Combining concept hierarchies and statistical topic models[C]. Proceedings of the 17th ACM conference on Information and knowledge management. ACM, 2008: 1469-1470.
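    [162] Hong Y, Zhang Y, Fan J, et al. Chinese topic link detection based on semantic domain language model[J]. Journal of Software, 2008, 19(9): 2265-2275. (in Chinese)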
    [163] Wang L, Li F. Story link detection based on event words[M]. Computational Linguistics and Intelligent Text Processing. Springer Berlin Heidelberg, 2011: 202-211.
    [164] Hu Y, Bai L, Zhang W. A method for topic evolution modeling and analysis[J]. Acta Automatica Sinica, 2012, 38(10): 1690-1697. (in Chinese)
    [165] Zhu T, Wang B, Wu B, et al. Topic correlation and individual influence analysis in online forums[J]. Expert Systems with Applications, 2012, 39(4): 4222-4232.
    [166] Garrido G, Penas A, Cabaleiro B, et al. Temporally anchored relation extraction[C]. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012: 107-116.
    [167] Chambers N. Labeling documents with timestamps: Learning from their time expressions[C]. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, 2012: 98-106.
    [168] Lakshmi K, Mukherjee S. Using cohesion-model for story link detection system[J]. IJCSNS International Journal of Computer Science and Network Security, 2007, 7(3): 59-66.
    [169] Zhang K, Zi J, Wu L G. New event detection based on indexing-tree and named entity[C]. Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval. ACM, 2007: 215-222.
    [170] Nomoto T. Two-tier similarity model for story link detection[C]. Proceedings of the 19th ACM international conference on Information and knowledge management. ACM, 2010: 789-798.
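    [171] Zhang K, Li J, Wu G, Wang K. Within-topic event detection based on keyword elements[J]. Journal of Computer Research and Development, 2009, 46(2): 245-252. (in Chinese)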
    [172] Tishby N, Pereira F C, Bialek W. The information bottleneck method[C]. Proceedings of the 37th Allerton Conference on Communication, Control and Computing. Illinois, USA, 1999: 368-377.
