Sentiment Analysis of Chinese Medicine Text Based on Text Mining
  • Original title: 基于文本挖掘的中医文本情感分析
  • Authors: DOU Pengwei (窦鹏伟); WANG Zhen (王珍); SHE Kankan (佘侃侃); FAN Wenling (樊文玲); WANG Xudong (王旭东)
  • Affiliation: Institute of Information Technology, Nanjing University of Chinese Medicine (南京中医药大学信息技术学院)
  • Keywords: text mining; Chinese medicine text; Chinese word segmentation; sentiment analysis
  • Journal code: ZYHS
  • Journal: Chinese Archives of Traditional Chinese Medicine (中华中医药学刊)
  • Publication date: 2017-05-10
  • Publisher: Chinese Archives of Traditional Chinese Medicine (中华中医药学刊)
  • Year: 2017
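  • Volume: 35
  • Funding: National Natural Science Foundation of China (81274095); Major Program of the National Social Science Fund of China (12&ZD114); Youth Fund of the Natural Science Foundation of Jiangsu Province (BK20140958); Natural Science Foundation of the Higher Education Institutions of Jiangsu Province (14KJB520032); Nanjing University of Chinese Medicine key cultivated discipline project "Software Engineering"
  • Language: Chinese
  • Record ID: ZYHS201705037
  • Page count: 4
  • Issue: 05
  • CN: 21-1546/R
  • Pages: 136-139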
Abstract
Sentiment analysis of evaluative texts about Chinese medicine has significant academic value: it is an effective way to gauge how widely the development of Chinese medicine is accepted by society and to explore the public's emotional orientation toward it. This study investigates sentiment analysis methods based on text mining, improves the dictionary-based Chinese word segmentation method, and applies a sentiment analysis method based on fine-grained lexical weights to Chinese medicine texts. Taking a typical Chinese medicine commentary text as an example, the study identifies and parses the comment sentences and computes their sentiment. A comparison with other sentiment analysis methods verifies that the algorithm based on fine-grained lexical sentiment weights is effective for sentiment analysis of Chinese medicine texts.
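The abstract names two steps, dictionary-based segmentation followed by weighted lexical sentiment scoring, without giving their details. The following is only a minimal sketch of what such a pipeline can look like, not the authors' improved method: it uses plain forward maximum matching for segmentation and a simple weighted sum for scoring. The vocabulary, the weights, and the names `segment` and `score` are all hypothetical and chosen purely for illustration.

```python
# Toy lexicons. Every word and weight here is an illustrative assumption,
# not a value taken from the paper.
SENTIMENT = {"有效": 0.5, "无知": -0.5, "拯救": 0.5}   # word -> polarity weight
DEGREE = {"非常": 2.0, "稍微": 0.5}                    # degree adverb -> multiplier
NEGATION = {"不"}                                      # negators flip the sign
VOCAB = set(SENTIMENT) | set(DEGREE) | NEGATION | {"中医", "治疗"}
MAX_LEN = max(len(word) for word in VOCAB)


def segment(text):
    """Forward maximum matching: at each position take the longest
    dictionary word; fall back to a single character if none matches."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens


def score(tokens):
    """Sum sentiment weights, scaling each sentiment word by the negators
    and degree adverbs seen since the previous sentiment word."""
    total = 0.0
    modifier = 1.0
    for token in tokens:
        if token in NEGATION:
            modifier *= -1.0
        elif token in DEGREE:
            modifier *= DEGREE[token]
        elif token in SENTIMENT:
            total += modifier * SENTIMENT[token]
            modifier = 1.0  # modifiers apply only to the next sentiment word
    return total


if __name__ == "__main__":
    sentence = "中医治疗非常有效"
    tokens = segment(sentence)
    print(tokens, score(tokens))
```

A positive total suggests a favorable sentence and a negative total an unfavorable one. A realistic system would need a much larger domain dictionary and some handling of segmentation ambiguity, which forward maximum matching alone does not resolve.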
