Survey on Deep-learning-based Machine Reading Comprehension

Original title: 基于深度学习的机器阅读理解综述
Authors: LI Zhou-jun (李舟军); WANG Chang-bao (王昌宝)
Affiliation: School of Computer Science and Engineering, Beihang University
Keywords: natural language processing; machine reading comprehension; deep learning; word vector; attention mechanism
Journal: Computer Science (计算机科学; journal code JSJA)
Publication date: 2019-07-15
Publisher: Computer Science (计算机科学)
Year: 2019
Volume: 46
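Funding: Supported by the National Natural Science Foundation of China (U1636211, 61672081), the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016001), and the National Key Research and Development Program of China (2016QY04W0802)
Language: Chinese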
Record ID: JSJA201907002
Page count: 6
Issue: 07
CN: 50-1075/TP
Pages: 13-18

Abstract

Reading comprehension is one of the most critical abilities in human intelligence, and machine reading comprehension, regarded as the crown jewel of natural language processing, has long been a research focus of the field. In recent years, with the rapid development of deep learning methods, machine reading comprehension has made considerable progress. This survey first outlines the research background and development history of deep-learning-based machine reading comprehension. It then reviews in detail the research progress in three key techniques: word vectors, attention mechanisms, and answer prediction. On this basis, it analyzes the problems currently facing machine reading comprehension research. Finally, it discusses the future development trends of the technology.

References

[1] HERMANN K M,KOCISKY T,GREFENSTETTE E,et al.Teaching machines to read and comprehend[C]//Advances in Neural Information Processing Systems.Cambridge:MIT Press,2015:1693-1701.
    [2] RAJPURKAR P,ZHANG J,LOPYREV K,et al.SQuAD:100 000+ Questions for Machine Comprehension of Text[C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.Stroudsburg:ACL,2016:2383-2392.
    [3] NGUYEN T,ROSENBERG M,SONG X,et al.MS MARCO:A human generated machine reading comprehension dataset [DB/OL].[2018-08-28].https://arxiv.org/abs/1611.09268.
    [4] HIRSCHMAN L,LIGHT M,BRECK E,et al.Deep read:A reading comprehension system[C]//Proceedings of the 37th annual meeting of the Association for Computational Linguistics on Computational Linguistics.Stroudsburg:ACL,1999:325-332.
    [5] RILOFF E,THELEN M.A rule-based question answering system for reading comprehension tests[C]//Proceedings of the 2000 ANLP/NAACL Workshop on Reading comprehension tests as evaluation for computer-based language understanding systems-Volume 6.Stroudsburg:ACL,2000:13-19.
    [6] POON H,CHRISTENSEN J,DOMINGOS P,et al.Machine reading at the university of washington[C]//Proceedings of the NAACL HLT 2010 First International Workshop on Formalisms and Methodology for Learning by Reading.Stroudsburg:ACL,2010:87-95.
    [7] RICHARDSON M,BURGES C J C,RENSHAW E.Mctest:A challenge dataset for the open-domain machine comprehension of text[C]//Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing.Stroudsburg:ACL,2013:193-203.
    [8] KADLEC R,SCHMID M,BAJGAR O,et al.Text understanding with the attention sum reader network [DB/OL].[2018-08-28].https://arxiv.org/abs/1603.01547.
    [9] CHEN D,BOLTON J,MANNING C D.A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task[C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).Stroudsburg:ACL,2016:2358-2367.
    [10] CUI Y,CHEN Z,WEI S,et al.Attention-over-Attention Neural Networks for Reading Comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).Stroudsburg:ACL,2017:593-602.
    [11] DHINGRA B,LIU H,YANG Z,et al.Gated-Attention Readers for Text Comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).Stroudsburg:ACL,2017:1832-1846.
    [12] WANG S,JIANG J.Machine comprehension using match-lstm and answer pointer [DB/OL].[2018-08-28].https://arxiv.org/abs/1608.07905.
    [13] WANG S,JIANG J.Learning Natural Language Inference with LSTM[C]//Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.Stroudsburg:ACL,2016:1442-1451.
    [14] VINYALS O,FORTUNATO M,JAITLY N.Pointer networks[C]//Advances in Neural Information Processing Systems.Cambridge:MIT Press,2015:2692-2700.
    [15] SEO M,KEMBHAVI A,FARHADI A,et al.Bidirectional attention flow for machine comprehension [DB/OL].[2018-08-28].https://arxiv.org/abs/1611.01603.
    [16] XIONG C,ZHONG V,SOCHER R.Dynamic coattention network for question answering:U.S.Patent Application 15/421,193[P].2018-05-10.
    [17] WANG W,YANG N,WEI F,et al.Gated self-matching networks for reading comprehension and question answering[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).Stroudsburg:ACL,2017:189-198.
    [18] JOSHI M,CHOI E,WELD D,et al.TriviaQA:A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).Stroudsburg:ACL,2017:1601-1611.
    [19] HE W,LIU K,LYU Y,et al.DuReader:a Chinese Machine Reading Comprehension Dataset from Real-world Applications[DB/OL].[2018-08-28].https://arxiv.org/abs/1711.05073.
    [20] TAN C,WEI F,YANG N,et al.S-net:From answer extraction to answer generation for machine reading comprehension [DB/OL].[2018-08-28].https://arxiv.org/abs/1706.04815.
    [21] CLARK C,GARDNER M.Simple and Effective Multi-Paragraph Reading Comprehension[C]//Meeting of the Association for Computational Linguistics.Stroudsburg:ACL,2018:845-855.
    [22] WANG Y,LIU K,LIU J,et al.Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification[C]//Meeting of the Association for Computational Linguistics.Stroudsburg:ACL,2018:1918-1927.
    [23] KOCISKY T,SCHWARZ J,BLUNSOM P,et al.The NarrativeQA Reading Comprehension Challenge[J].Transactions of the Association for Computational Linguistics,2018,6:317-328.
    [24] DEERWESTER S,DUMAIS S T,FURNAS G W,et al.Indexing by latent semantic analysis[J].Journal of the American society for Information Science,1990,41(6):391-407.
    [25] LUND K,BURGESS C.Producing high-dimensional semantic spaces from lexical co-occurrence[J].Behavior Research Methods,Instruments,& Computers,1996,28(2):203-208.
    [26] ROHDE D L T,GONNERMAN L M,PLAUT D C.An improved model of semantic similarity based on lexical co-occurrence[J].Communications of the ACM,2006,8(627-633):116.
    [27] BENGIO Y,DUCHARME R,VINCENT P,et al.A neural probabilistic language model[J].Journal of Machine Learning Research,2003,3:1137-1155.
    [28] MIKOLOV T,SUTSKEVER I,CHEN K,et al.Distributed representations of words and phrases and their compositionality[C]//Advances in Neural Information Processing Systems.Cambridge:MIT Press,2013:3111-3119.
    [29] MIKOLOV T,CHEN K,CORRADO G,et al.Efficient estimation of word representations in vector space[J].arXiv:1301.3781,2013.
    [30] PENNINGTON J,SOCHER R,MANNING C.Glove:Global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).Stroudsburg:ACL,2014:1532-1543.
    [31] MELAMUD O,GOLDBERGER J,DAGAN I.context2vec:Learning generic context embedding with bidirectional lstm[C]//Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning.Stroudsburg:ACL,2016:51-61.
    [32] MCCANN B,BRADBURY J,XIONG C,et al.Learned in translation:Contextualized word vectors[C]//Advances in Neural Information Processing Systems.Cambridge:MIT Press,2017:6294-6305.
    [33] PETERS M,NEUMANN M,IYYER M,et al.Deep Contextualized Word Representations[C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies (Volume 1:Long Papers).Stroudsburg:ACL,2018:2227-2237.
    [34] RENSINK R A.The dynamic representation of scenes[J].Visual cognition,2000,7(1/2/3):17-42.
    [35] MNIH V,HEESS N,GRAVES A.Recurrent models of visual attention[C]//Advances in Neural Information Processing Systems.Stroudsburg:ACL,2014:2204-2212.
    [36] BAHDANAU D,CHO K,BENGIO Y.Neural machine translation by jointly learning to align and translate [DB/OL].[2018-08-28].https://arxiv.org/abs/1409.0473.
    [37] CUI Y,LIU T,CHEN Z,et al.Consensus Attention-based Neural Networks for Chinese Reading Comprehension[C]//Proceedings of COLING 2016,the 26th International Conference on Computational Linguistics:Technical Papers.Pisa:ACM,2016:1777-1786.
    [38] CUI Y,CHEN Z,WEI S,et al.Attention-over-Attention Neural Networks for Reading Comprehension[C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1:Long Papers).Stroudsburg:ACL,2017:593-602.
    [39] WANG S,JIANG J.Machine comprehension using match-lstm and answer pointer [DB/OL].[2018-08-28].https://arxiv.org/abs/1608.07905.
