Research on Knowledge Discovery of Opinions in Online Reviews
Abstract
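In China today there are, objectively, two arenas of public opinion: one formed by mainstream media such as newspapers, radio, and television, and a grassroots one built on the Internet and the Web 2.0 applications that have emerged in recent years. In the Web 2.0 environment, Internet-based public opinion platforms have gone beyond website news comments and BBS forums to include news aggregation (RSS), Wikipedia (Wiki), instant messaging (IM) tools such as QQ, blogs and microblogs, podcasts, and comprehensive commerce platforms such as Taobao and EachNet. As a result, the volume of review information online has grown rapidly. The number of Internet users in China has now reached a plateau, mobile phones have become the main driver of new users, and micro-content such as microblogs and online communities has become the main source of online review opinions. Timeliness, openness, interactivity, thoughtfulness, and a grassroots character have become new features of online review information. These features deeply affect every area of people's lives, have changed the mechanisms by which public opinion forms, evolves, and aggregates, and have widened the space in which public opinion spreads.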
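     In the Web 2.0 environment, people generally feel that obtaining opinions has become as important as obtaining information, yet extracting valuable opinion information is increasingly difficult. There are two reasons. First, because commenters write from different angles and with different purposes, review opinions often mix positive and negative views, so extracting review information accurately takes considerable time and effort. Second, the information sources of the grassroots public opinion arena built on Web 2.0 applications are heavily polluted: the subjective information in online reviews is varied, cluttered, and of uneven quality. Traditional techniques for analyzing online public opinion, which mainly target web pages and forums, have limited capacity to handle Web 2.0 applications, which are more dynamic and structurally more complex. They cannot capture these deeper elements of public opinion or distinguish true information from false, and this weakens the analysis of online review information. For these reasons, studying the analysis of Web 2.0 online review information helps uncover the opinion information behind online reviews, provides deeper and richer information support for decision-making and forecasting, and enriches the theoretical system of online review information analysis.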
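     Taking the information sources of the grassroots public opinion arena built on Web 2.0 applications as its logical starting point, this thesis draws on theories and methods including text mining, opinion mining, knowledge discovery, the LDA topic model, and ontology learning. From the perspective of topic clustering, it studies in depth the analysis patterns for online review information and the theory, techniques, methods, and applications of opinion mining.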
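     The main research work of the thesis is as follows:
     (1) A comprehensive and systematic analysis and review is given of the state of research in China and abroad, the research hotspots and frontiers, and the application progress related to the topic. The theories and methods relevant to knowledge discovery of opinions in online reviews are sorted out and analyzed, laying a solid theoretical and methodological foundation for this research.
     (2) The discovery of feature-sentiment associations from explicit opinions is taken as the basis for opinion mining on unstructured review text. Semi-structured explicit opinions provided by websites are used to extract the features of the reviewed object, the sentiment polarity, and the pairings between the two, and an opinion knowledge base is constructed from them. This addresses, to some extent, the context sensitivity of sentiment words, and the knowledge base then supports the complete mining of unstructured review text.
     (3) The main tasks and solutions for online review knowledge discovery based on LDA (Latent Dirichlet Allocation) topic clustering are proposed, including methods for clustering similar review texts, extracting the main opinions of reviews, and determining deep opinions.
     (4) From a cognitive perspective, the thesis analyzes the regularities of online review knowledge discovery oriented toward implicit cognition. On this basis, with domain knowledge at the core, it merges general mining based on opinion words with deep mining based on topics and builds a multi-base fusion pattern for knowledge discovery of online review opinions.
     (5) An empirical study is carried out using opinion mining of online reviews in the education domain as an example, providing a useful reference for applied research.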
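     The abstract does not give the thesis's own LDA pipeline, so the following is only a minimal sketch of the general idea behind item (3): fit an LDA topic model to tokenized reviews and group each review under its dominant topic. It uses a plain collapsed Gibbs sampler written with the Python standard library. The toy reviews, the hyperparameters `alpha` and `beta`, and the function name `lda_gibbs` are illustrative assumptions, not taken from the thesis.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    n_dk = [[0] * num_topics for _ in docs]       # tokens of doc d assigned to topic k
    n_kw = [[0] * V for _ in range(num_topics)]   # times word w is assigned to topic k
    n_k = [0] * num_topics                        # total tokens assigned to topic k
    z = []                                        # current topic of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            k = rng.randrange(num_topics)         # random initial assignment
            z[d].append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, old = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][old] -= 1
                n_kw[old][wi] -= 1
                n_k[old] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (n_dk[d][k] + alpha) * (n_kw[k][wi] + beta) / (n_k[k] + V * beta)
                    for k in range(num_topics)
                ]
                new = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = new
                n_dk[d][new] += 1
                n_kw[new][wi] += 1
                n_k[new] += 1
    # Smoothed document-topic (theta) and topic-word (phi) estimates.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][wi] + beta) / (n_k[k] + V * beta) for wi in range(V)]
        for k in range(num_topics)
    ]
    return theta, phi, vocab


# Hypothetical, already tokenized education-domain reviews (assumption for illustration).
reviews = [
    ["teacher", "patient", "lecture", "clear"],
    ["lecture", "clear", "teacher", "helpful"],
    ["tuition", "expensive", "fee", "high"],
    ["fee", "high", "tuition", "refund"],
]
theta, phi, vocab = lda_gibbs(reviews, num_topics=2)

# Cluster each review under its most probable topic.
clusters = [max(range(2), key=lambda k: row[k]) for row in theta]
# Three highest-probability words per topic, as a rough label for each cluster.
top_words = [
    [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:3]]
    for row in phi
]
print(clusters, top_words)
```

     On a corpus this small the topic assignments depend on the random seed, so the sketch only shows the mechanics. A real study would work on segmented Chinese text, a much larger corpus, and a tuned number of topics.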
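     The innovative results of the thesis fall into three areas:
     (1) An ontology-based opinion knowledge base is constructed, and an opinion mining pattern based on this opinion ontology knowledge base is proposed. This helps with implicit opinion identification and the context sensitivity of sentiment words, and can help improve the dynamic extensibility of domain dictionaries.
     (2) From the perspective of topic clustering, using the LDA topic model combined with algorithms for opinion separation and opinion summary integration, methods are proposed for identifying the main opinions in online reviews and for discovering deep opinions.
     (3) General mining based on opinion words is merged with deep mining based on topics, with domain knowledge complementing both, to build an opinion, domain knowledge, and topic multi-base fusion pattern for knowledge discovery of online review opinions.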
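     To make the context-sensitivity point concrete, here is a small sketch of how a knowledge base of feature-sentiment pairs can resolve a sentiment word whose polarity depends on the feature it describes. The thesis builds its knowledge base as an ontology from semi-structured explicit opinions. The flat dictionaries, the example entries, and the function name `polarity` below are hypothetical simplifications for illustration only.

```python
from typing import Optional

# Hypothetical pair-level entries: the same opinion word ("high") carries a
# different polarity depending on the feature it is paired with.
PAIR_POLARITY = {
    ("tuition", "high"): -1,
    ("quality", "high"): +1,
    ("class size", "small"): +1,
}

# Hypothetical general lexicon for context-independent opinion words.
GENERAL_LEXICON = {"good": +1, "excellent": +1, "bad": -1}


def polarity(feature: str, opinion_word: str) -> Optional[int]:
    """Return +1 or -1, or None when neither source knows the word."""
    key = (feature.lower(), opinion_word.lower())
    if key in PAIR_POLARITY:
        # A pair-level entry overrides the general lexicon.
        return PAIR_POLARITY[key]
    return GENERAL_LEXICON.get(opinion_word.lower())


print(polarity("tuition", "high"))   # pair-level entry
print(polarity("quality", "high"))   # same word, different feature
print(polarity("teacher", "good"))   # falls back to the general lexicon
print(polarity("teacher", "high"))   # unknown pairing
```

     Looking up the pair first and falling back to a general lexicon keeps context-independent words cheap to handle while letting the knowledge base decide the ambiguous ones.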
There are, objectively, two fields of public opinion today. One is the social public opinion field of the mainstream media, including newspapers, radio, and television. The other is the grassroots opinion platform based on the Internet and Web 2.0 applications. In the Web 2.0 environment, many new forms of communication have joined the original news commentary and BBS on this platform: news aggregation (RSS), Wikipedia (Wiki), instant messaging (IM) tools such as QQ, microblogs, podcasts, and commerce platforms such as Taobao and eBay. As a result, the volume of reviews on the platform has increased rapidly. The number of Internet users has recently reached a plateau, with mobile phones playing the most important role in bringing in new users, and microblogs and online communities have become the main sources of online review opinions. The new features of this information, namely timeliness, openness, interactivity, thoughtfulness, and a grassroots character, influence many aspects of people's lives. They have also changed the mechanisms by which public opinion forms, evolves, and aggregates, and have expanded the space in which public opinion spreads.
     In the Web 2.0 environment, obtaining opinions has become as important as obtaining information, yet it has become much harder to extract valuable opinion information. There are two reasons. First, people comment from different angles and with different purposes, so review opinions often mix positive and negative views, and capturing review information accurately costs a lot of time and energy. Second, the information sources of the Web 2.0 public opinion platform are heavily polluted: the subjective information in online reviews is varied, cluttered, and of uneven quality. Traditional Web public opinion analysis technology, which mainly targets web pages and BBS, has limited capacity to analyze Web 2.0 applications. It cannot capture these deeper elements of public opinion or tell true information from false, which weakens the analysis of online review information. It is therefore necessary to study the analysis of Web 2.0 online review information. Such research helps uncover the opinion information hidden behind online reviews, provides deeper and richer support for decision-making and forecasting, and enriches the theoretical system of online review information analysis.
     Taking the information sources of the Web 2.0 public opinion platform as its logical starting point, this thesis studies in depth, from the perspective of topic clustering, the analysis models for online reviews and the theory, technology, and methods of opinion mining. It makes integrated use of text mining, opinion mining, knowledge discovery, the LDA topic model, ontology learning, and visualization technology.
     The main research work of this thesis is as follows:
     (1) A comprehensive analysis is made of the literature at home and abroad, covering the hotspots, frontiers, and application progress of the selected topic.
     (2) Opinion mining from unstructured text is based on discovering aspect-sentiment relationships. Semi-structured explicit opinions are used to extract the aspects of the reviewed object, the sentiment polarity, and the pairings between them, and an opinion knowledge base is constructed. This addresses the context sensitivity of sentiment words to some extent and supports the whole opinion mining process.
     (3) A method of knowledge discovery from online reviews based on the LDA topic clustering model is put forward, including clustering of similar review texts, recognition of main opinions, and judgment of deep opinions.
     (4) From a cognitive perspective, the regularities of knowledge discovery from online reviews oriented toward implicit cognition are analyzed, and a discovery mode for online reviews based on LDA and opinion mining is constructed. Opinion mining of online reviews in the education field is used as a case study, which provides a valuable reference for applied research.
     The innovations of this research fall into the following three aspects:
     (1) An ontology-based opinion knowledge base is constructed, and an opinion mining method based on it is put forward, which helps resolve the context sensitivity of sentiment words and improves the dynamic extensibility of the domain dictionary.
     (2) From the perspective of topic clustering, using the LDA topic model combined with algorithms for opinion separation and opinion integration, methods for main opinion recognition and deep opinion judgment are proposed.
     (3) General mining based on opinion words is combined with deep mining based on topics, with domain knowledge complementing both, to construct an opinion, domain knowledge, and topic multi-base integration mode for knowledge discovery from online reviews.
