Semi-Supervised Sentiment Classification Based on Sentiment Feature Clustering (基于情感特征聚类的半监督情感分类)
  • English title: Semi-Supervised Sentiment Classification Based on Sentiment Feature Clustering
  • Authors: Li Suke (李素科); Jiang Yanbing (蒋严冰)
  • Keywords: semi-supervised learning; sentiment feature clustering; sentiment classification; opinion mining; Web mining; data mining
  • Journal code: JFYZ
  • Journal: Journal of Computer Research and Development
  • Institution: School of Software and Microelectronics, Peking University
  • Publication date: 2013-12-15
  • Publisher: 计算机研究与发展 (Journal of Computer Research and Development)
  • Year: 2013
  • Issue: v.50
  • Funding: National Natural Science Foundation of China (61170002)
  • Language: Chinese
  • Page: JFYZ201312012
  • Page count: 8
  • CN: 12
  • ISSN: 11-1777/TP
  • Classification number: 92-99
Abstract
Sentiment classification is an important aspect of opinion mining. This paper proposes a semi-supervised sentiment classification method based on sentiment feature clustering; the method requires sentiment labels for only a small number of training instances. First, common text features and sentiment features are extracted from consumer reviews. The common text features are used to train a first sentiment classifier. A spectral clustering algorithm then maps the sentiment features into extended features, and the common text features together with the extended features are used to train a second sentiment classifier. The two classifiers select instances from the unlabeled dataset and add them to the training set, which is then used to train the final sentiment classifier. Experimental results show that, on the same dataset, the proposed method achieves higher sentiment classification accuracy than both the self-learning SVM-based method and the co-training SVM-based method.
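The two-classifier selection loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for the SVM, the hand-written `SENTIMENT_CLUSTERS` table stands in for the output of spectral clustering, and the selection rule (add an unlabeled instance when the more confident of the two classifiers has a margin of at least `min_margin`) is hypothetical, since the abstract does not specify it. All names and the toy data are invented for illustration.

```python
import math
from collections import Counter

# Assumption: stand-in for spectral clustering output (sentiment word -> cluster id).
SENTIMENT_CLUSTERS = {"great": "C_POS", "excellent": "C_POS", "love": "C_POS",
                      "poor": "C_NEG", "terrible": "C_NEG", "broken": "C_NEG"}

def common_view(tokens):
    """Common text features: plain bag of words."""
    return Counter(tokens)

def extended_view(tokens):
    """Common features plus one extended feature per sentiment-word cluster hit."""
    feats = Counter(tokens)
    for tok in tokens:
        if tok in SENTIMENT_CLUSTERS:
            feats[SENTIMENT_CLUSTERS[tok]] += 1
    return feats

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class CentroidClassifier:
    """Nearest-centroid classifier (assumption: simple stand-in for an SVM)."""
    def fit(self, bags, labels):
        self.centroids = {}
        for bag, label in zip(bags, labels):
            self.centroids.setdefault(label, Counter()).update(bag)
        return self

    def predict_with_margin(self, bag):
        scores = {y: cosine(bag, c) for y, c in self.centroids.items()}
        ranked = sorted(scores, key=scores.get, reverse=True)
        return ranked[0], scores[ranked[0]] - scores[ranked[1]]

def semi_supervised_train(labeled, unlabeled, rounds=3, min_margin=0.05):
    train, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        texts, ys = [t for t, _ in train], [y for _, y in train]
        clf_a = CentroidClassifier().fit([common_view(t) for t in texts], ys)
        clf_b = CentroidClassifier().fit([extended_view(t) for t in texts], ys)
        remaining = []
        for tokens in pool:
            preds = [clf_a.predict_with_margin(common_view(tokens)),
                     clf_b.predict_with_margin(extended_view(tokens))]
            label, margin = max(preds, key=lambda p: p[1])
            if margin >= min_margin:      # hypothetical selection rule
                train.append((tokens, label))
            else:
                remaining.append(tokens)
        if len(remaining) == len(pool):   # nothing new was selected; stop early
            break
        pool = remaining
    texts, ys = [t for t, _ in train], [y for _, y in train]
    final = CentroidClassifier().fit([extended_view(t) for t in texts], ys)
    return final, train

# Toy data, invented for illustration only.
labeled = [(["great", "phone", "love", "it"], "pos"),
           (["terrible", "battery", "broken", "screen"], "neg")]
unlabeled = [["excellent", "camera"], ["poor", "battery"], ["arrived", "yesterday"]]

final_clf, train_set = semi_supervised_train(labeled, unlabeled)
print(final_clf.predict_with_margin(extended_view(["excellent", "screen"]))[0])
```

In this toy run the review `["excellent", "camera"]` shares no word with the labeled data, so only the classifier that sees the extended cluster features can label it. That is the intuition behind mapping sentiment features into extended features, though the paper's actual features and classifiers differ from this sketch.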
