Hybrid sentiment analysis framework for a morphologically rich language
  • Authors: Miljana Mladenović; Jelena Mitrović…
  • Journal: Journal of Intelligent Information Systems
  • Publication year: 2016
  • Publication date: June 2016
  • Year: 2016
  • Volume: 46
  • Issue: 3
  • Pages: 599-620
  • Full-text size: 1,055 KB
  • Journal category: Computer Science
  • Journal subjects: Data Structures, Cryptology and Information Theory
    Artificial Intelligence and Robotics
    Document Preparation and Text Processing
    Business Information Systems
  • Publisher: Springer Netherlands
  • ISSN: 1573-7675
  • Volume (sort order): 46
Abstract
This paper presents the process of building a Sentiment Analysis Framework for Serbian (SAFOS). We created a hybrid method that uses a sentiment lexicon and Serbian WordNet (SWN) synsets annotated with sentiment polarity scores in the process of feature selection. Because stemming in morphologically rich languages (MRLs) may cause words to lose their sentiment meaning or be assigned an incorrect one, we decided to expand the sentiment lexicon, as well as the lexicon generated using SWN, by adding morphological forms of emotional terms and phrases. This was done using Serbian Morphological Electronic Dictionaries. A new feature reduction method for document-level sentiment polarity classification using maximum entropy modeling is proposed. It is based on mapping a large number of related feature candidates (sentiment words, phrases and their inflectional forms) to a few concepts and using those concepts as features. Testing was performed using 10-fold cross-validation and on test sets containing news and movie reviews. The results of all experiments show that sentiment feature mapping for feature set reduction achieves better results than the basic set of features. For both test sets, the best classification accuracy scores were achieved with the combination of unigram and bigram features reduced by sentiment feature mapping (accuracy of 78.3% for the movie reviews test set and 79.2% for the news test set). In 10-fold cross-validation, the best average accuracy score of 95.6% was obtained using unigrams as features, reduced by the mapping procedure.

Keywords: Sentiment analysis; Machine learning; Maximum entropy classifier; Feature selection; Morphological e-dictionaries
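The feature mapping idea described in the abstract can be illustrated with a minimal sketch. This is not the SAFOS implementation: the tiny lexicon, the concept labels POS and NEG, and the function name map_features are all assumptions made for illustration. The sketch only shows how many inflected forms and phrases can collapse into a few concept features before a classifier (for example, a maximum entropy model) is trained.

```python
from collections import Counter

# Toy lexicon (an assumption for illustration, not the SAFOS lexicon):
# inflected forms and phrases, stored as token tuples, mapped to a concept.
SENTIMENT_MAP = {
    ("dobar",): "POS",          # "good", masculine
    ("dobra",): "POS",          # "good", feminine
    ("dobro",): "POS",          # "good", neuter
    ("loš",): "NEG",            # "bad", masculine
    ("loša",): "NEG",           # "bad", feminine
    ("loše",): "NEG",           # "bad", neuter
    ("nije", "dobar"): "NEG",   # "is not good" (two-word phrase)
}


def map_features(tokens, max_n=2):
    """Count features, replacing lexicon hits with their concept label.

    The longest lexicon match starting at each position wins; tokens that
    match nothing are kept as ordinary unigram features.
    """
    features = Counter()
    i = 0
    while i < len(tokens):
        for n in range(max_n, 0, -1):
            gram = tuple(tokens[i:i + n])
            if len(gram) == n and gram in SENTIMENT_MAP:
                features[SENTIMENT_MAP[gram]] += 1
                i += n
                break
        else:
            # No lexicon entry starts here: keep the raw token.
            features[tokens[i]] += 1
            i += 1
    return features


if __name__ == "__main__":
    print(map_features("film nije dobar".split()))
    print(map_features("dobra glumica loša priča".split()))
```

In this sketch the six inflected adjectives and the one phrase reduce to two features, so the feature space shrinks without stemming. The resulting counts could then be passed to any maximum entropy (multinomial logistic regression) implementation.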
