Abstract
To exploit lexical features (n-gram and part-of-speech information) for identifying more of the key vocabulary during summary generation, and thereby further improve summary quality, an abstractive summarization algorithm is proposed that is based on the sequence-to-sequence (Seq2Seq) structure with an attention mechanism and incorporates lexical features. The input layer of the algorithm concatenates the part-of-speech vector with the word vector and feeds the result to the encoder layer. The encoder layer consists of a bi-directional LSTM, and the context vector is composed of the encoder output and the lexical feature vector extracted by a convolutional neural network. In the model, the convolutional neural network layer controls lexical information while the bi-directional LSTM controls sentence information; the decoder layer uses a unidirectional LSTM to decode the context vector and generate the summary. Experiments on a public dataset and a self-collected dataset show that the summarization model incorporating lexical features outperforms the baseline models, improving the ROUGE-1, ROUGE-2 and ROUGE-L scores on the public dataset by 0.024, 0.033 and 0.030, respectively. Summary generation therefore depends not only on semantic and topical features of the article but also on its lexical features, and the proposed model provides a useful reference for research on abstractive summarization that integrates key information.
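The abstract gives no implementation details beyond this description. As a purely illustrative aid, a minimal PyTorch sketch of the encoder described above might look as follows; all class names, layer dimensions and hyperparameters here are assumptions for illustration, not the authors' settings:

```python
import torch
import torch.nn as nn

class LexicalFeatureEncoder(nn.Module):
    """Sketch of the described encoder: word and POS embeddings are
    concatenated and fed to a BiLSTM (sentence-level information), while a
    1-D CNN over the same input extracts n-gram lexical features; the two
    outputs together form the memory the decoder attends over."""

    def __init__(self, vocab_size, pos_size, emb_dim=256, pos_dim=64,
                 hidden=256, n_filters=256, kernel_size=3):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(pos_size, pos_dim)
        # BiLSTM over the concatenated word/POS vectors (sentence information)
        self.bilstm = nn.LSTM(emb_dim + pos_dim, hidden,
                              bidirectional=True, batch_first=True)
        # Convolution over the same input captures local n-gram (lexical) features
        self.conv = nn.Conv1d(emb_dim + pos_dim, n_filters,
                              kernel_size, padding=kernel_size // 2)

    def forward(self, words, pos_tags):
        # words, pos_tags: (batch, seq_len) index tensors
        x = torch.cat([self.word_emb(words), self.pos_emb(pos_tags)], dim=-1)
        sent_states, _ = self.bilstm(x)  # (batch, seq_len, 2 * hidden)
        # Conv1d expects (batch, channels, seq_len), so transpose in and out
        lex_feats = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)
        # Per-position concatenation of sentence-level and lexical features
        return torch.cat([sent_states, lex_feats], dim=-1)
```

The attention mechanism and the unidirectional LSTM decoder are omitted from this sketch; per the abstract, the decoder would attend over the returned combined representation at each step to generate the summary.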