Gated Dynamic Attention Mechanism towards Aspect Extraction
  • English Title: Gated Dynamic Attention Mechanism towards Aspect Extraction
  • Authors: CHENG Meng; HONG Yu; TANG Jian; ZHANG Jiashuo; ZOU Bowei; YAO Jianmin
  • Affiliation: School of Computer Science and Technology, Soochow University
  • Keywords: Attention Mechanism; Aspect Extraction; Conditional Random Field; Sentiment Analysis
  • Journal: Pattern Recognition and Artificial Intelligence (模式识别与人工智能; database code MSSB)
  • Publication Date: 2019-02-15
  • Year: 2019
  • Issue: Vol. 32, No. 188 (Issue 2)
  • Funding: National Key R&D Program of China (No. 2017YFB1002104); National Natural Science Foundation of China (No. 61672367, 61672368)
  • Language: Chinese
  • Record ID: MSSB201902011
  • Pages: 90-98 (9 pages)
  • CN: 34-1089/TP
Abstract
In current aspect extraction research, attention modeling and training are rigid: a single attention distribution is computed once for the whole sentence. However, the words in a sentence carry different contextual semantics, and a uniform attention distribution lacks dynamic adaptability. Therefore, a gated dynamic attention mechanism for aspect extraction is proposed. A bidirectional long short-term memory (BiLSTM) network is exploited to capture the hidden representation of each word in the target sentence. When the attention model performs word-level aspect prediction, an attention distribution adapted to each target word is computed from the word and its context, so the attention weights are adjusted automatically as the context changes. A gate then regulates how much of the attention vector flows to the next layer of neurons, and finally a conditional random field labels the aspects. Experiments on the official SemEval 2014-2016 datasets verify the effectiveness of the proposed method, and F1 scores are consistently improved.
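The per-word attention-and-gating step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parameter names (`Wa`, `Wg`, `bg`), the bilinear scoring form, and the convex gating between the context vector and the hidden state are assumptions chosen for clarity.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_dynamic_attention(H, Wa, Wg, bg):
    """For each target word, compute a word-specific attention distribution
    over the sentence, then gate the resulting context vector before it
    flows to the next layer.

    H  : (n, d) hidden states, e.g. from a BiLSTM (given)
    Wa : (d, d) bilinear attention parameters (hypothetical)
    Wg : (2d, d) gate parameters (hypothetical)
    bg : (d,)  gate bias (hypothetical)
    """
    n, d = H.shape
    out = np.zeros((n, d))
    for t in range(n):
        # score every word against the current target word t
        scores = H @ Wa @ H[t]                 # (n,)
        alpha = softmax(scores)                # target-specific attention
        ctx = alpha @ H                        # (d,) context vector
        # gate decides how much attention information passes on
        g = sigmoid(np.concatenate([H[t], ctx]) @ Wg + bg)
        out[t] = g * ctx + (1.0 - g) * H[t]
    return out
```

Because the scores are recomputed for every target word, each position receives its own attention distribution rather than one fixed distribution per sentence; the gated outputs would then feed a CRF layer for BIO-style aspect tagging.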
References
[1] PONTIKI M, GALANIS D, PAVLOPOULOS J, et al. SemEval-2014 Task 4: Aspect Based Sentiment Analysis // Proc of the 8th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2014: 27-35.
    [2] HU M Q, LIU B. Mining and Summarizing Customer Reviews // Proc of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2004: 168-177.
    [3] ZHUANG L, JING F, ZHU X Y. Movie Review Mining and Summarization // Proc of the 15th ACM International Conference on Information and Knowledge Management. New York, USA: ACM, 2006: 43-50.
    [4] BLAIR-GOLDENSOHN S, HANNAN K, MCDONALD R, et al. Building a Sentiment Summarizer for Local Service Reviews // Proc of the Workshop on NLP in the Information Explosion Era. Stroudsburg, USA: ACL, 2008, XIV: 339-348.
    [5] WANG B, WANG H F. Bootstrapping Both Product Features and Opinion Words from Chinese Customer Reviews with Cross-Inducing[C/OL]. [2018-05-30]. http://www.aclweb.org/anthology/I/I08/I08-1038.pdf.
    [6] MEI Q Z, LING X, WONDRA M, et al. Topic Sentiment Mixture: Modeling Facets and Opinions in Weblogs // Proc of the 16th International Conference on World Wide Web. New York, USA: ACM, 2007: 171-180.
    [7] TITOV I, MCDONALD R. Modeling Online Reviews with Multi-grain Topic Models // Proc of the 17th International Conference on World Wide Web. New York, USA: ACM, 2008: 111-120.
    [8] LIN C H, HE Y L. Joint Sentiment/Topic Model for Sentiment Analysis // Proc of the 18th ACM Conference on Information and Knowledge Management. New York, USA: ACM, 2009: 375-384.
    [9] MUKHERJEE A, LIU B. Aspect Extraction through Semi-supervised Modeling // Proc of the 50th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, USA: ACL, 2012: 339-348.
    [10] JIN W, HO H H, SRIHARI R K. OpinionMiner: A Novel Machine Learning System for Web Opinion Mining and Extraction // Proc of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, USA: ACM, 2009: 1195-1204.
    [11] JAKOB N, GUREVYCH I. Extracting Opinion Targets in a Single-and Cross-Domain Setting with Conditional Random Fields // Proc of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: ACL, 2010: 1035-1045.
    [12] LI F T, HAN C, HUANG M L, et al. Structure-Aware Review Mining and Summarization // Proc of the 23rd International Conference on Computational Linguistics. Stroudsburg, USA: ACL, 2010: 653-661.
    [13] XU L H, LIU K, ZHAO J. Joint Opinion Relation Detection Using One-Class Deep Neural Network // Proc of the 25th International Conference on Computational Linguistics. Stroudsburg, USA: ACL, 2014: 677-687.
    [14] LIU P F, JOTY S, MENG H L. Fine-Grained Opinion Mining with Recurrent Neural Networks and Word Embeddings // Proc of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: ACL, 2015: 1433-1443.
    [15] LI X, LAM W. Deep Multi-task Learning for Aspect Term Extraction with Memory Interaction // Proc of the Conference on Empirical Methods in Natural Language Processing. Stroudsburg, USA: ACL, 2017: 2886-2892.
    [16] TOH Z Q, SU J. NLANGP at SemEval-2016 Task 5: Improving Aspect Based Sentiment Analysis Using Neural Network Features // Proc of the 10th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2016: 282-288.
    [17] BAHDANAU D, CHO K, BENGIO Y. Neural Machine Translation by Jointly Learning to Align and Translate[C/OL]. [2018-05-30]. https://arxiv.org/pdf/1409.0473v7.pdf.
    [18] WANG W Y, PAN S J, DAHLMEIER D, et al. Coupled Multi-layer Attentions for Co-extraction of Aspect and Opinion Terms // Proc of the 31st AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2017: 3316-3322.
    [19] YANG Z C, YANG D Y, DYER C, et al. Hierarchical Attention Networks for Document Classification // Proc of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, USA: ACL, 2016: 1480-1489.
    [20] SCHUSTER M, PALIWAL K K. Bidirectional Recurrent Neural Networks. IEEE Transactions on Signal Processing, 1997, 45(11): 2673-2681.
    [21] COLLINS M. The Forward-Backward Algorithm[C/OL]. [2018-05-30]. http://www.cs.columbia.edu/~mcollins/fb.pdf.
    [22] HUANG Z H, XU W, YU K. Bidirectional LSTM-CRF Models for Sequence Tagging[C/OL]. [2018-05-30]. https://arxiv.org/pdf/1508.01991.pdf.
    [23] PONTIKI M, GALANIS D, PAPAGEORGIOU H, et al. SemEval-2015 Task 12: Aspect Based Sentiment Analysis // Proc of the 9th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2015: 486-495.
    [24] PONTIKI M, GALANIS D, PAPAGEORGIOU H, et al. SemEval-2016 Task 5: Aspect Based Sentiment Analysis // Proc of the 10th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2016: 19-30.
    [25] GREFF K, SRIVASTAVA R K, KOUTNÍK J, et al. LSTM: A Search Space Odyssey. IEEE Transactions on Neural Networks and Learning Systems, 2017, 28(10): 2222-2232.
    [26] CHERNYSHEVICH M. IHS R&D Belarus: Cross-Domain Extraction of Product Features Using Conditional Random Fields // Proc of the 8th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2014: 309-313.
    [27] TOH Z Q, WANG W T. DLIREC: Aspect Term Extraction and Term Polarity Classification System // Proc of the 8th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2014: 235-240.
    [28] YIN Y C, WEI F R, DONG L, et al. Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction // Proc of the 25th International Joint Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2016: 2979-2985.
    [29] VICENTE I S, SARALEGI X, AGERRI R. EliXa: A Modular and Flexible ABSA Platform // Proc of the 9th International Workshop on Semantic Evaluation. Stroudsburg, USA: ACL, 2015: 748-752.
    [30] ZAREMBA W, SUTSKEVER I, VINYALS O. Recurrent Neural Network Regularization[C/OL]. [2018-05-30]. https://arxiv.org/pdf/1409.2329.pdf.
