A Text Modeling Method Based on Self-Attention and Dynamic Routing
  • Author: SHEN Wei-yu (School of Information Technology and Network Security, People's Public Security University of China)
  • Keywords: capsule network; dynamic routing; attention mechanism; text modeling
  • Journal: Software Guide (软件导刊); journal code: RJDK
  • Publication date: 2019-01-15
  • Year/Issue: 2019, v.18, No.195
  • Language: Chinese
  • Pages: 62-66+70 (6 pages)
  • CN: 42-1671/TP
  • Record number: RJDK201901014
Abstract
Unlike RNNs and CNNs, dynamic routing and the attention mechanism offer new ways of capturing long-range and local dependencies in text sequences. To encode text more effectively, retaining as many textual features as possible and increasing feature diversity, this paper builds on the ideas of dynamic routing and self-attention to integrate the linguistic feature extraction abilities of capsule networks and self-attention networks into a deep network model, CapSA (Capsule-Self-Attention). The model is evaluated with text classification experiments on three datasets from different domains. The results show that text classifiers based on CapSA achieve higher F1 scores than several RNN- and CNN-based models, indicating that CapSA has stronger text modeling ability.
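To make the combination concrete, below is a minimal sketch of the two ingredients the abstract describes: a multi-hop self-attention encoder (in the style of Lin et al.'s structured self-attention) whose output vectors are treated as input capsules, followed by a capsule layer with dynamic routing (in the style of Sabour et al.). It assumes PyTorch; the class name CapSASketch and all layer sizes are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1, eps=1e-8):
    # Capsule non-linearity: preserves direction, maps the norm into [0, 1).
    sq = (s ** 2).sum(dim=dim, keepdim=True)
    return (sq / (1.0 + sq)) * s / torch.sqrt(sq + eps)

class SelfAttentiveEncoder(nn.Module):
    # Multi-hop self-attention: each hop yields one weighted sum of token vectors.
    def __init__(self, emb_dim, att_dim, hops):
        super().__init__()
        self.w1 = nn.Linear(emb_dim, att_dim, bias=False)
        self.w2 = nn.Linear(att_dim, hops, bias=False)

    def forward(self, h):                                   # h: (batch, seq, emb)
        a = F.softmax(self.w2(torch.tanh(self.w1(h))), dim=1)  # attend over seq
        return a.transpose(1, 2) @ h                        # (batch, hops, emb)

class CapsuleLayer(nn.Module):
    # Routes in_caps input capsules to out_caps output capsules.
    def __init__(self, in_caps, in_dim, out_caps, out_dim, iters=3):
        super().__init__()
        self.iters = iters
        # One (in_dim x out_dim) transform per (input, output) capsule pair.
        self.W = nn.Parameter(0.01 * torch.randn(in_caps, out_caps, in_dim, out_dim))

    def forward(self, u):                                   # u: (batch, in_caps, in_dim)
        u_hat = torch.einsum('bid,iodk->biok', u, self.W)   # prediction vectors
        b = torch.zeros(u.size(0), u_hat.size(1), u_hat.size(2), device=u.device)
        for _ in range(self.iters):                         # dynamic routing
            c = F.softmax(b, dim=-1)                        # coupling coefficients
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))     # (batch, out_caps, out_dim)
            b = b + (u_hat * v.unsqueeze(1)).sum(dim=-1)         # agreement update
        return v

class CapSASketch(nn.Module):  # hypothetical name; sizes are placeholders
    def __init__(self, vocab=5000, emb=128, hops=8, n_classes=3):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        self.att = SelfAttentiveEncoder(emb, att_dim=64, hops=hops)
        self.caps = CapsuleLayer(hops, emb, n_classes, out_dim=16)

    def forward(self, tokens):                              # tokens: (batch, seq) ids
        v = self.caps(self.att(self.emb(tokens)))
        return v.norm(dim=-1)                               # capsule length per class

out = CapSASketch()(torch.randint(0, 5000, (2, 20)))
print(out.shape)                                            # torch.Size([2, 3])

In this sketch each attention hop extracts one view of the sentence, and routing iteratively re-weights how strongly each view contributes to each class capsule, whose vector length acts as the class score; the actual CapSA architecture and hyperparameters may differ.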
