A Joint Deep Learning and Active Learning Method for Event Extraction

English title: Combining Deep Learning and Active Learning for Event Extraction
Authors: 邱盈盈 (QIU Yingying); 洪宇 (HONG Yu); 周文瑄 (ZHOU Wenxuan); 姚建民 (YAO Jianmin); 朱巧明 (ZHU Qiaoming)
English authors: QIU Yingying; HONG Yu; ZHOU Wenxuan; YAO Jianmin; ZHU Qiaoming; Provincial Key Laboratory of Computer Information Processing Technology, Soochow University
Keywords: event extraction; deep learning; active learning; recurrent neural network
English keywords: event extraction; deep learning; active learning; RNN
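Chinese journal name: MESS
English journal name: Journal of Chinese Information Processing
Institution: Jiangsu Provincial Key Laboratory of Computer Information Processing Technology, Soochow University
Publication date: 2018-06-15
Publisher: 中文信息学报 (Journal of Chinese Information Processing)
Year: 2018
Issue: v.32
Funding: National Natural Science Foundation of China (61373097, 61672367, 61672368); Jiangsu Provincial Science and Technology Program (BK20151222); Ministry of Education-China Mobile Research Fund (MCM20150602)
Language: Chinese
Page: MESS201806012
Page count: 9
CN: 06
ISSN: 11-2325/N
Classification number: 103-111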

Abstract

Event extraction aims to extract event information from unstructured text and present it in structured form. Event extraction methods built on supervised learning are often limited by training corpora that are small, imbalanced across classes, and uneven in quality. In addition, traditional methods based on feature engineering tend to suffer from error propagation, and the feature engineering itself is complex. To address these issues, this paper proposes an event extraction method that combines deep learning with active learning. The method incorporates the confidence of an RNN model's trigger classification into the query function of active learning, so as to improve the efficiency of corpus annotation during active learning and, in turn, the final performance. Experimental results show that this joint learning method helps improve event extraction performance, but they also show that the joint scheme leaves considerable room for improvement and warrants further exploration.
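
As a rough illustration of how classifier confidence can drive an active learning query function, the sketch below ranks unlabeled trigger candidates by least-confidence uncertainty and picks the most uncertain ones for annotation. The abstract does not give the paper's exact query function, so this is an assumption rather than the authors' formulation; the names `least_confidence` and `select_queries`, the candidate ids, and the probability values are all hypothetical.

```python
from typing import Dict, List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probs)


def select_queries(pool: Dict[str, Sequence[float]], batch_size: int) -> List[str]:
    """Return the ids of the `batch_size` most uncertain candidates."""
    ranked = sorted(pool, key=lambda cid: least_confidence(pool[cid]), reverse=True)
    return ranked[:batch_size]


# Hypothetical softmax outputs of a trigger classifier over three labels
# (for example Attack, Die, None); the numbers are made up for illustration.
pool = {
    "cand_a": [0.90, 0.05, 0.05],
    "cand_b": [0.40, 0.35, 0.25],
    "cand_c": [0.55, 0.40, 0.05],
}

queries = select_queries(pool, batch_size=2)
print(queries)
```

In an active learning loop, the selected candidates would be sent to an annotator, added to the training set, and the classifier retrained before the next round of selection.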

References

[1] Shasha Liao, Ralph Grishman. Using prediction from sentential scope to build a pseudo co-testing learner for event extraction[C]//Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP), Chiang Mai, Thailand, 2011: 714-722.
[2] Yu Hong, Jianfeng Zhang, Bin Ma, et al. Using cross-entity inference to improve event extraction[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), Portland, USA, 2011: 1127-1136.
[3] Heng Ji, Ralph Grishman. Refining event extraction through cross-document inference[C]//Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL), Columbus, USA, 2008: 254-262.
[4] Qi Li, Heng Ji, Liang Huang. Joint event extraction via structured prediction with global features[C]//Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, 2013: 73-82.
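[5] Xiao Sheng, He Yanxiang. Event hypergraph model and event type recognition[J]. Journal of Chinese Information Processing, 2013, 27(01): 30-38. (in Chinese)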
[6] Ofer Bronstein, Ido Dagan, Qi Li, et al. Seed-based event trigger labeling: How far can event descriptions get us?[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP), Beijing, China, 2015: 372-376.
[7] Thien Huu Nguyen, Ralph Grishman. Event detection and domain adaptation with convolutional neural networks[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP), Beijing, China, 2015: 365-371.
[8] Hu B, Lu Z, et al. Convolutional neural network architectures for matching natural language sentences[C]//Proceedings of the Advances in Neural Information Processing Systems (NIPS), Quebec, Canada, 2014: 2042-2050.
[9] Yubo Chen, Liheng Xu, Kang Liu, et al. Event extraction via dynamic multi-pooling convolutional neural networks[C]//Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP), Beijing, China, 2015: 167-176.
[10] David Yarowsky. Unsupervised word sense disambiguation rivaling supervised methods[C]//Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics (ACL), Cambridge, USA, 1995: 189-196.
[11] Prashant Gupta, Heng Ji. Predicting unknown time arguments based on cross-event propagation[C]//Proceedings of the ACL-IJCNLP 2009 Conference Short Papers, Suntec, Singapore, 2009: 369-372.
[12] Shasha Liao, Ralph Grishman. Using document level cross-event inference to improve event extraction[C]//Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL), Uppsala, Sweden, 2010: 789-797.
[13] Shasha Liao, Ralph Grishman. Can document selection help semi-supervised learning? A case study on event extraction[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL), Portland, Oregon, USA, 2011: 260-265.
[14] Ralph Grishman, David Westbrook, Adam Meyers. NYU’s English ACE 2005 system description[C]//Proceedings of ACE 2005 Evaluation Workshop, Gaithersburg, USA, 2005: 5-19.
[15] Peifeng Li, Qiaoming Zhu, Guodong Zhou. Employing event inference to improve semi-supervised Chinese event extraction[C]//Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics (Coling), Dublin, Ireland, 2014: 2161-2171.
[16] Shasha Liao, Ralph Grishman. Filtered ranking for bootstrapping in event extraction[C]//Proceedings of the 23rd International Conference on Computational Linguistics (Coling), Beijing, China, 2010: 680-688.
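[17] Xu Xia, Li Peifeng, Zhu Qiaoming. A semi-supervised approach to Chinese event extraction[J]. Journal of Chinese Information Processing, 2016, 30(02): 168-174. (in Chinese)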
[18] Mesnil G, He X, Deng L, et al. Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding[C]//Proceedings of Interspeech, Lyon, France, 2013: 3771-3775.
[19] George A. Miller. WordNet: A lexical database for English[J]. Communications of the ACM, 1995, 38(11): 39-41.
[20] David D. Lewis, Jason Catlett. Heterogeneous uncertainty sampling for supervised learning[C]//Proceedings of the International Conference on Machine Learning (ICML), Morgan Kaufmann, 1994: 148-156.
[21] Minling Zhang, Zhihua Zhou. ML-KNN: A lazy learning approach to multi-label learning[J]. Pattern Recognition, 2007, 40(7): 2038-2048.
[22] Schein A I, Ungar L H. Active learning for logistic regression: An evaluation[J]. Machine Learning, 2007, 68(3): 235-265.
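(1) From document APW_ENG_20030502.0686.sgm in the ACE 2005 event extraction corpus.
(2) From document APW_ENG_20030502.0768.sgm in the ACE 2005 event extraction corpus.
(3) The parallel extraction model is also known as the joint model; the term "parallel model" is used here to distinguish it from the method proposed in this paper.
(1) http://mallet.cs.umass.edu/
(1) https://code.google.com/p/word2vec
(1) http://www.deeplearning.net/software/theano/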
