A Domain Dictionary Construction Method for Mining Evaluation Objects in Product Reviews
  • Published English title: A Method on Domain Dictionary Construction for Object Mining on Commodity Comments
  • Authors: SHI Yuxin (石玉鑫); YANG Zeqing (杨泽青); ZHAO Zhibin (赵志滨); YAO Lan (姚兰)
  • Affiliation: School of Computer Science and Engineering, Northeastern University (东北大学计算机科学与工程学院)
  • Keywords: domain dictionary; object mining; commodity comment; LDA; PMI
  • Journal name (Chinese): ZGGC
  • Journal name (English): Software Engineering
  • Publication date: 2019-01-05
  • Publisher: Software Engineering (软件工程)
  • Year: 2019
  • Issue: v.22; No.235
  • Funding: supported by the National Key R&D Program of China under grant 2018YFB1004700
  • Language: Chinese
  • Pages: ZGGC201901001
  • Page count: 7
  • CN: 01
  • ISSN: 21-1603/TP
  • Classification number: 5-11
Abstract
Mining the evaluation objects in product reviews reveals which product attributes users care about most, which helps companies improve their products and helps users choose among them. Mining product evaluation objects is therefore of significant value. This paper proposes a domain dictionary construction method for mining product evaluation objects. First, it proposes a method for building a basic domain dictionary based on the LDA model. It then proposes two methods for expanding the domain dictionary, one based on the PMI values between words and one based on dependency parsing. Using a dataset of real reviews of laundry detergent products from JD.com, the constructed dictionaries are applied in experiments on first-level-label and second-level-label evaluation object mining. The experimental results show that the proposed method performs well on evaluation object mining, and that the expanded dictionary improves second-level-label mining more than it improves first-level-label mining.
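The PMI-based expansion step named in the abstract can be illustrated with a minimal sketch. The abstract does not say how PMI is estimated or how candidates are selected, so everything here is an assumption made for illustration: PMI is computed from document-level co-occurrence counts, a word joins the dictionary when its PMI with any seed word reaches a threshold, and the names `pmi_table`, `expand_dictionary`, and `threshold` are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Return PMI for every word pair that co-occurs in at least one document.

    Assumption: probabilities are estimated as document frequencies divided
    by the number of documents; the paper may use a different estimate.
    """
    n = len(docs)
    word_df = Counter()   # number of documents containing each word
    pair_df = Counter()   # number of documents containing both words
    for tokens in docs:
        vocab = sorted(set(tokens))
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))  # keys are alphabetically ordered pairs
    table = {}
    for (a, b), n_ab in pair_df.items():
        # PMI(a, b) = log2( p(a, b) / (p(a) * p(b)) )
        table[(a, b)] = math.log2(n_ab * n / (word_df[a] * word_df[b]))
    return table


def expand_dictionary(seeds, docs, threshold=0.5):
    """Add every word whose PMI with some seed word is at least `threshold`.

    This is a single pass over the original seeds, not an iterative expansion.
    """
    seeds = set(seeds)
    expanded = set(seeds)
    for (a, b), score in pmi_table(docs).items():
        if score < threshold:
            continue
        if a in seeds and b not in seeds:
            expanded.add(b)
        elif b in seeds and a not in seeds:
            expanded.add(a)
    return expanded


# Toy, already-tokenized reviews (hypothetical data, not from the paper).
reviews = [
    ["scent", "fragrance", "nice"],
    ["scent", "fragrance", "lasting"],
    ["price", "cheap", "nice"],
    ["price", "cheap", "delivery"],
]
print(sorted(expand_dictionary({"scent"}, reviews)))
```

On a corpus this small, rare words get inflated PMI scores, so a real pipeline would likely add a minimum co-occurrence count before accepting a candidate.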