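Abstract
Event detection currently faces two main difficulties: polysemy and sentences that contain multiple events. To address them, this paper proposes a recurrent and convolutional neural network with attention based on language models (LM-ARCNN). The model uses a language model to compute word embeddings for the input sentence, feeds these embeddings into a long short-term memory network to obtain sentence-level features, and applies an attention mechanism to capture the sentence-level features most relevant to the trigger word. Both sets of features are then fed into a convolutional neural network with multiple max-pooling layers, which extracts more useful context chunks. Experiments on the ACE2005 English corpus show that the model reaches an F1 score of 74.4%, which is 0.4 percentage points higher than the previous best model, the document embedding enhanced model (DEEB).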
The main difficulties in event detection currently lie in polysemy and in sentences containing multiple events. To overcome these difficulties, we propose a novel recurrent and convolutional neural network with attention based on language models (LM-ARCNN). The model first obtains word embeddings from a language model (ELMo) and feeds them into a long short-term memory network (LSTM) to capture sentence-level features. It then applies an attention mechanism to these sentence-level features to find those most closely related to the candidate trigger words. Finally, it feeds both the sentence-level features and the attention features into a dynamic multi-pooling convolutional neural network (DMCNN), whose dynamic multi-pooling layer pools according to the event trigger to retain more of the crucial context chunks. Experiments on the ACE2005 English corpus show that the model achieves state-of-the-art performance, with an F1 score of 74.4%.
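Two of the steps described above, trigger-oriented attention and dynamic multi-pooling, can be illustrated with a minimal pure-Python sketch. It is not the paper's implementation: the function names `trigger_attention` and `dynamic_multi_pooling` are hypothetical, the dot-product scoring function is an assumption (the abstract does not state which scoring function is used), and the language model, LSTM, and convolution stages are omitted, so the inputs are toy hidden states and toy feature maps.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def trigger_attention(hidden, trigger_idx):
    """Weight every hidden state by its similarity to the candidate trigger.

    Assumption: similarity is a plain dot product with the trigger's own
    hidden state; the attention feature of a token is its hidden state
    scaled by the resulting weight.
    """
    query = hidden[trigger_idx]
    weights = softmax([dot(h, query) for h in hidden])
    features = [[w * x for x in h] for w, h in zip(weights, hidden)]
    return features, weights


def dynamic_multi_pooling(feature_maps, trigger_idx):
    """Max-pool each feature map separately on both sides of the trigger.

    The left segment runs up to and including the trigger position, the
    right segment covers the rest. Assumption: an empty right segment
    (trigger is the last token) contributes 0.0.
    """
    pooled = []
    for fmap in feature_maps:
        left = fmap[: trigger_idx + 1]
        right = fmap[trigger_idx + 1:]
        pooled.append(max(left))
        pooled.append(max(right) if right else 0.0)
    return pooled


# Toy example: three tokens with 2-dimensional hidden states, trigger at index 1.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attention_features, attention_weights = trigger_attention(hidden_states, 1)

# Toy example: two convolutional feature maps over four tokens, trigger at index 1.
maps = [[0.1, 0.9, 0.3, 0.5], [0.7, 0.2, 0.4, 0.8]]
pooled = dynamic_multi_pooling(maps, 1)  # [0.9, 0.5, 0.7, 0.8]
```

Pooling each side of the trigger separately keeps one maximum per segment instead of one per sentence, which is how a dynamic multi-pooling layer can retain context from both before and after the trigger.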