A Hybrid Neural Network Model on Constituent Tree Structure (一种针对成分树的混合神经网络模型)

English title: A Hybrid Neural Network Model on Constituent Tree Structure
Authors: HUO Huan (霍欢); XUE Yaohuan (薛瑶环); HUANG Junyang (黄君扬); JIN Xuancheng (金轩城); ZOU Yiting (邹依婷)
Keywords: constituent tree; C-TreeLSTM; phrase semantic vector; hybrid model
Journal title (Chinese): MESS
Journal title (English): Journal of Chinese Information Processing
Affiliations: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology; Shanghai Key Laboratory of Data Science, Fudan University
Publication date: 2019-03-15
Publisher: 中文信息学报 (Journal of Chinese Information Processing)
Year: 2019
Issue: v.33
Funding: National Natural Science Foundation of China (61003031); Shanghai Key Science and Technology Project (14511107902); Shanghai Engineering Center Construction Project (GCZX14014); Shanghai First-Class Discipline Construction Project (XTKX2012); Open Project of the Shanghai Key Laboratory of Data Science (201609060003); Hujiang Foundation Research Base Special Project (C14001)
Language: Chinese
Page: MESS201903002
Page count: 9
CN: 03
ISSN: 11-2325/N
Classification number: 13-21

Abstract

To improve accuracy in natural language processing, many studies combine syntactic constituent trees with LSTM, yielding a family of LSTM models on constituent trees (collectively called C-TreeLSTM in this paper). When computing the hidden states of internal nodes, however, C-TreeLSTM models lack an important source of information, namely the words themselves, which lowers the accuracy of text modeling. This paper proposes a hybrid neural network model on the constituent tree structure. During node encoding, the model injects the semantic vector of the phrase covered by each node, strengthening the node's memory of the text semantics; the model is therefore named SC-TreeLSTM. Experimental results show that SC-TreeLSTM achieves excellent performance on two tasks: sentiment classification and machine reading comprehension.
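The injection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the phrase semantic vector is computed or how it enters the gates, so the sketch assumes a binarized tree, a mean of word embeddings as a stand-in phrase vector, and gates that read the concatenation of both child hidden states and that phrase vector. All names (`Node`, `SCTreeLSTMSketch`, `phrase_vector`) are hypothetical, the weights are random, and biases are omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class Node:
    """Binary constituent-tree node: a leaf holds a word, an internal node two children."""

    def __init__(self, word=None, left=None, right=None):
        self.word, self.left, self.right = word, left, right

    def words(self):
        """Words covered by this node, left to right."""
        if self.word is not None:
            return [self.word]
        return self.left.words() + self.right.words()


class SCTreeLSTMSketch:
    """Hypothetical tree-LSTM cell whose internal nodes also read a phrase vector."""

    GATES = ("i", "f_left", "f_right", "o", "u")

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Internal-node gates read [h_left; h_right; phrase_vector] (3 * dim inputs).
        self.W_node = {g: rand(dim, 3 * dim) for g in self.GATES}
        # Leaf gates read only the word embedding; leaves have no forget gates.
        self.W_leaf = {g: rand(dim, dim) for g in ("i", "o", "u")}

    @staticmethod
    def phrase_vector(node, emb):
        # ASSUMPTION: mean of the covered word embeddings stands in for the
        # phrase semantic vector; the abstract does not specify the encoder.
        vecs = [emb[w] for w in node.words()]
        return [sum(col) / len(vecs) for col in zip(*vecs)]

    def encode(self, node, emb):
        """Return (h, c) for the subtree rooted at node."""
        if node.word is not None:
            x, W = emb[node.word], self.W_leaf
            carried = [0.0] * len(x)  # a leaf has no child memory to carry over
        else:
            h_l, c_l = self.encode(node.left, emb)
            h_r, c_r = self.encode(node.right, emb)
            # Injection step: the phrase vector is appended to the child states.
            x, W = h_l + h_r + self.phrase_vector(node, emb), self.W_node
            f_l = [sigmoid(z) for z in matvec(W["f_left"], x)]
            f_r = [sigmoid(z) for z in matvec(W["f_right"], x)]
            carried = [a * b + p * q for a, b, p, q in zip(f_l, c_l, f_r, c_r)]
        i = [sigmoid(z) for z in matvec(W["i"], x)]
        o = [sigmoid(z) for z in matvec(W["o"], x)]
        u = [math.tanh(z) for z in matvec(W["u"], x)]
        c = [a * b + k for a, b, k in zip(i, u, carried)]
        h = [a * math.tanh(b) for a, b in zip(o, c)]
        return h, c


# Toy usage: the tree (not (very good)) with 4-dimensional one-hot-like embeddings.
emb = {
    "not": [1.0, 0.0, 0.0, 0.0],
    "very": [0.0, 1.0, 0.0, 0.0],
    "good": [0.0, 0.0, 1.0, 0.0],
}
tree = Node(left=Node("not"), right=Node(left=Node("very"), right=Node("good")))
model = SCTreeLSTMSketch(dim=4)
h_root, c_root = model.encode(tree, emb)
```

In this sketch a plain tree-LSTM would correspond to dropping the `phrase_vector` term from `x`, so internal nodes would see the words only indirectly through their children's hidden states. A trained phrase encoder and learned weights would replace the stand-ins in practice.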

