Reasoning over intra-document and jointly matching question and candidate answer to the passage based multiple-choice Reading Comprehension
  • English title: Reasoning over intra-document and jointly matching question and candidate answer to the passage based multiple-choice Reading Comprehension
  • Authors: WANG Xia; SUN Jie-Ping; JU Sheng-Gen; HU Si-Cai
  • Affiliations: College of Computer Science, Sichuan University; Troops 61920 of PLA
  • Keywords: joint match; multi-granularity; machine reading comprehension
  • Journal: Journal of Sichuan University (Natural Science Edition)
  • CNKI journal code: SCDX
  • Online publication date: 2019-05-13
  • Year: 2019
  • Volume: 56
  • Issue: 03
  • Pages: 53-60 (8 pages)
  • Funding: China Southern Power Grid science and technology project (GZKJXM20170162); 2018 Sichuan Province New-Generation Artificial Intelligence Major Project (18ZDZX0137)
  • Language: Chinese
  • CN: 51-1595/N
  • CNKI record ID: SCDX201903008
Abstract
Existing machine reading comprehension approaches either match only the question against the passage, which loses information from the passage, or concatenate the question and answer into a single sequence before matching it against the passage, which loses the interaction between question and answer; moreover, traditional recurrent networks parse text sequentially and thus overlook reasoning within the passage. To address these problems, a model is proposed that improves the passage encoding and jointly matches the passage with both the question and the candidate answers. First, the passage is segmented into blocks at multiple granularities, and the encoder represents each block with a neural bag-of-words, summing the embedding vectors of the words in the block. Next, the block sequences are expanded back to the original sequence length through a feed-forward fully-connected network. A gating function is then constructed by a two-layer feed-forward network that models the relationships among the different-granularity blocks containing each word, giving the model a wider view of the context while capturing intra-passage reasoning. Finally, an attention mechanism models the interaction of the passage representation with the question and the candidate answers to select an answer. Experimental results on SemEval-2018 Task 11 show that the model improves accuracy over baseline neural models such as Stanford AR and GA Reader by 9%–10%, outperforms the recent SurfaceLR model by at least 3%, and exceeds the single-model TriAN by about 1%. In addition, pre-training on the RACE dataset further improves performance.
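The multi-granularity block encoding described in the abstract can be sketched as follows. This is a hypothetical reconstruction from the abstract alone, not the authors' implementation: the class name, block sizes, and dimensions are our own choices, and where the paper expands block sequences with a fully-connected network, this sketch uses simple repetition for brevity. The neural bag-of-words (summing word embeddings per block) and the two-layer feed-forward gate over each word's multi-granularity context follow the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiGranularityBlockEncoder(nn.Module):
    """Sketch: encode a passage by chunking it into blocks at several
    granularities, summarizing each block with a neural bag-of-words,
    and gating each word on its multi-granularity block context."""

    def __init__(self, embed_dim, block_sizes=(2, 4, 8)):
        super().__init__()
        self.block_sizes = block_sizes
        # two-layer feed-forward network that turns the concatenated
        # block summaries for each word into a sigmoid gate
        self.gate = nn.Sequential(
            nn.Linear(embed_dim * len(block_sizes), embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
            nn.Sigmoid(),
        )

    def forward(self, emb):  # emb: (batch, seq_len, dim) word embeddings
        batch, seq_len, dim = emb.shape
        summaries = []
        for size in self.block_sizes:
            # pad so the sequence divides evenly into blocks of this size
            pad = (-seq_len) % size
            padded = F.pad(emb, (0, 0, 0, pad))
            # neural bag-of-words: sum the embeddings inside each block
            blocks = padded.view(batch, -1, size, dim).sum(dim=2)
            # expand the block sequence back to the original length
            # (the paper uses a fully-connected network; repetition here)
            expanded = blocks.repeat_interleave(size, dim=1)[:, :seq_len]
            summaries.append(expanded)
        context = torch.cat(summaries, dim=-1)  # (batch, seq_len, dim * k)
        g = self.gate(context)  # per-word gate from multi-granularity context
        return g * emb          # gated word representations
```

The gate lets each word's representation be modulated by every block it resides in, so distant words sharing a coarse block can influence each other without sequential recurrence.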
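The final step, jointly matching the passage against question and candidate answer, can likewise be illustrated with a minimal sketch. The function names and the mean-pooled summaries are our simplifications; the paper's attention mechanism is richer, but the key idea shown here is the same: the question and answer participate together in attending over the passage, rather than being matched to it separately.

```python
import torch
import torch.nn.functional as F

def score_candidate(passage, question, answer):
    """Score one candidate answer by attending over the passage with a
    joint question+answer cue (simplified dot-product attention)."""
    # passage: (Lp, d); question: (Lq, d); answer: (La, d) word vectors
    q_vec = question.mean(dim=0)  # crude question summary
    a_vec = answer.mean(dim=0)    # crude answer summary
    # attend over passage words using the combined cue, so question
    # and answer interact instead of being matched independently
    attn = F.softmax(passage @ (q_vec + a_vec), dim=0)  # (Lp,)
    p_vec = attn @ passage  # attended passage summary, shape (d,)
    return (torch.dot(p_vec, q_vec) + torch.dot(p_vec, a_vec)).item()

def select_answer(passage, question, candidates):
    """Pick the index of the highest-scoring candidate answer."""
    scores = [score_candidate(passage, question, a) for a in candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```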
References
[1]Kocisky T,Schwarz J,Blunsom P,et al.The narrativeqa reading comprehension challenge [J].Trans Assoc Comput Linguist,2018,6:317.
    [2] Wu H,Ping P,Sun L B,et al.Driving behavior image sequence recognition method based on an improved LRCN model [J].Journal of Jiangsu University(Natural Science Edition),2018,39:303.
    [3] Zhang Y J,Gao C Q,Li P,et al.Pedestrian flow statistics based on convolutional neural networks [J].Journal of Chongqing University of Posts and Telecommunications(Natural Science Edition),2017,29:265.
    [4] Cui Y M,Liu T,Wang S J.The brilliant Chinese leaderboard of the Stanford SQuAD challenge [J].Communications of the CCF,2017,13:9.
    [5] Gao Y L,Zuo W L,Wang Y,et al.Short text classification model based on ensemble neural networks [J].Journal of Jilin University(Science Edition),2018,56:933.
    [6] Wang S D,Ju S G,Zhou G,et al.User attribute prediction based on ensemble classifiers [J].Journal of Sichuan University(Natural Science Edition),2017,54:1195.
    [7] Yin W,Ebert S,Schütze H.Attention-based convolutional neural network for machine comprehension [C]//Proceedings of the Workshop on Human-Computer Question Answering.Stroudsburg:ACL,2016.
    [8] Zhu H,Wei F,Qin B,et al.Hierarchical attention flow for multiple-choice reading comprehension [C]//Proceedings of the 30th AAAI Conference on Artificial Intelligence.Menlo Park:AAAI,2018.
    [9] Tang D,Qin B,Liu T.Document modeling with gated recurrent neural network for sentiment classification [C]//Proceedings of the 2015 conference on empirical methods in natural language processing(EMNLP).Stroudsburg:ACL,2015.
    [10] Vaswani A,Shazeer N,Parmar N,et al.Attention is all you need [C]//Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS).London:MIT,2017.
    [11] Hermann K M,Kocisky T,Grefenstette E,et al.Teaching machines to read and comprehend [C]//Proceedings of the Annual Conference on Neural Information Processing Systems (NIPS).London:MIT,2015.
    [12] Hill F,Bordes A,Chopra S,et al.The goldilocks principle:Reading children's books with explicit memory representations [C]//International Conference on Learning Representations(ICLR).New York:arXiv.org,2016.
    [13] Rajpurkar P,Zhang J,Lopyrev K,et al.Squad:100,000+ questions for machine comprehension of text [C]//Proceedings of the 2016 conference on empirical methods in natural language processing(EMNLP).Stroudsburg:ACL,2016.
    [14] Joshi M,Choi E,Weld D S,et al.Triviaqa:A large scale distantly supervised challenge dataset for reading comprehension [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(ACL).Stroudsburg:ACL,2017.
    [15] Trischler A,Wang T,Yuan X,et al.NewsQA:a machine comprehension dataset [C]//Proceedings of the 1st Workshop on Representation Learning for NLP(RepL4NLP).Stroudsburg:ACL,2016.
    [16] Richardson M,Burges C J C,Renshaw E.Mctest:a challenge dataset for the open-domain machine comprehension of text [C]//Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing(EMNLP).Stroudsburg:ACL,2013.
    [17] Lai G,Xie Q,Liu H,et al.RACE:Large-scale reading comprehension dataset from examinations [C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing(EMNLP).Stroudsburg:ACL,2017.
    [18] Kadlec R,Schmid M,Bajgar O,et al.Text understanding with the attention sum reader network [C]//Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics(ACL).Stroudsburg:ACL,2016.
    [19] Tseng B H,Shen S S,Lee H Y,et al.Towards machine comprehension of spoken content:initial TOEFL listening comprehension test by machine [C]//The Annual Conference of the International Speech Communication Association.Dresden:ISCA,2016.
    [20] Cui Y,Chen Z,Wei S,et al.Attention-over-attention neural networks for reading comprehension [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(ACL).Stroudsburg:ACL,2017.
    [21] Wang S,Jiang J.Machine comprehension using match-LSTM and answer pointer [C]// Proceedings of the International Conference on Learning Representations(ICLR).New York:arXiv.org,2017.
    [22] Clark P,Etzioni O,Khot T,et al.Combining retrieval,statistics,and inference to answer elementary science questions [C]//Proceedings of the 28th AAAI Conference on Artificial Intelligence.Menlo Park:AAAI,2016.
    [23] Liu F,Perez J.Gated end-to-end memory networks [C]//Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics(EACL).Stroudsburg:ACL,2017.
    [24] Wang W,Yang N,Wei F,et al.Gated self-matching networks for reading comprehension and question answering [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(ACL).Stroudsburg:ACL,2017.
    [25] Shen Y,Huang P S,Gao J,et al.ReasoNet:learning to stop reading in machine comprehension [C]// Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM,2017.
    [26] Dhingra B,Jin Q,Yang Z,et al.Neural models for reasoning over multiple mentions using coreference [C]//Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies(NAACL).Stroudsburg:ACL,2018.
    [27] Palangi H,Smolensky P,He X,et al.Question-answering with grammatically-interpretable representations [C]//Proceedings of the 30th AAAI Conference on Artificial Intelligence.Menlo Park:AAAI,2018.
    [28] Pennington J,Socher R,Manning C.Glove:Global vectors for word representation [C]//Proceedings of the Conference on Empirical Methods in Natural Language Processing(EMNLP).Stroudsburg:ACL,2014.
    [29] Chen D,Fisch A,Weston J,et al.Reading wikipedia to answer open-domain questions [C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(ACL).Stroudsburg:ACL,2017.
    [30] Chen Q,Zhu X,Ling Z,et al.Enhanced LSTM for natural language inference [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics.Stroudsburg:ACL,2017.
    [31] Ostermann S,Modi A,Roth M,et al.MCScript:A novel dataset for assessing machine comprehension using script knowledge [C]//Proceedings of the 11th International Conference on Language Resources and Evaluation(LREC).Stroudsburg:ACL,2018.
    [32] Merkhofer E,Henderson J,Bloom D,et al.MITRE at SemEval-2018 Task 11:Commonsense Reasoning without Commonsense Knowledge [C]//Proceedings of the 12th International Workshop on Semantic Evaluations.Stroudsburg:ACL,2018.
    [33] Chen D,Bolton J,Manning C D.A thorough examination of the cnn/daily mail reading comprehension task [C]// Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics.Stroudsburg:ACL,2016.
    [34] Dhingra B,Liu H,Yang Z,et al.Gated-attention readers for text comprehension [C]// Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(ACL).Stroudsburg:ACL,2017.
    [35] Wang L.Yuanfudao at SemEval-2018 Task 11:three-way attention and relational knowledge for commonsense machine comprehension [C]//Proceedings of the 12th International Workshop on Semantic Evaluations.Stroudsburg:ACL,2018.

Cite this article: Wang X,Sun J P,Ju S G,et al.Reasoning over intra-document and jointly matching question and candidate answer to the passage based multiple-choice Reading Comprehension [J].J Sichuan Univ:Nat Sci Ed,2019,56:423.