Abstractive summarization model considering hybrid lexical features
  • English title: Abstractive summarization model considering hybrid lexical features
  • Authors: JIANG Yuehua (江跃华); DING Lei (丁磊); LI Jiaoe (李娇娥); DU Haoxuan (杜皓晅); GAO Kai (高凯)
  • Keywords: natural language processing; text summarization; attention mechanism; LSTM; CNN
  • Journal abbreviation: HBQJ
  • Journal: Journal of Hebei University of Science and Technology
  • Affiliations: School of Information Science and Engineering, Hebei University of Science and Technology; Information Center of Shijiazhuang Public Security Bureau; School of Telecommunications Engineering, Xidian University
  • Publication date: 2019-04-24 11:10
  • Publisher: Journal of Hebei University of Science and Technology
  • Year: 2019
  • Volume/Issue: Vol. 40; No. 147
  • Funding: National Natural Science Foundation of China (61772075); Natural Science Foundation of Hebei Province (F2017208012); Special Task Project of Humanities and Social Sciences Research of the Ministry of Education (Research on Engineering Science and Technology Talent Training) (17JDGC022)
  • Language: Chinese
  • Record number: HBQJ201902010
  • Number of pages: 7
  • Issue number: 02
  • CN: 13-1225/TS
  • Pages: 58-64
Abstract
To use lexical features (including n-gram and part-of-speech information) to identify more key content during summary generation and thereby further improve summary quality, an abstractive summarization algorithm is proposed that is based on a sequence-to-sequence (Seq2Seq) structure with an attention mechanism and incorporates lexical features. The input layer concatenates the part-of-speech vector with the word vector and feeds the result to the encoder layer, which consists of a bi-directional LSTM; the context vector is composed of the encoder output and the lexical feature vector extracted by a convolutional neural network. In the model, the convolutional neural network layer controls lexical information, the bi-directional LSTM controls sentence information, and the decoder layer uses a unidirectional LSTM to decode the context vector and generate the summary. Experiments on a public dataset and a self-collected dataset show that the model incorporating lexical features outperforms the comparison models; on the public dataset, the ROUGE-1, ROUGE-2, and ROUGE-L scores improve by 0.024, 0.033, and 0.030, respectively. Therefore, summary generation is related not only to the semantics and topics of an article but also to its lexical features, and the proposed model offers a useful reference for research on abstractive summarization that incorporates key information.
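The abstract describes the architecture only at a high level. The following PyTorch sketch illustrates one plausible reading of it: word and part-of-speech embeddings are concatenated at the input layer, a bi-directional LSTM encodes sentence information, multi-width convolutions over the same embedded sequence extract n-gram lexical features, and a unidirectional LSTM decoder consumes a context vector built from the attended encoder output and the CNN feature vector. All layer sizes, kernel widths, the attention variant, and the class name LexicalFeatureSummarizer are illustrative assumptions, not details taken from the paper.

    # Minimal sketch (assumed hyper-parameters); requires PyTorch.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LexicalFeatureSummarizer(nn.Module):
        def __init__(self, vocab_size, pos_size, emb_dim=256, pos_dim=32,
                     hid_dim=256, n_kernels=128, kernel_sizes=(2, 3, 4)):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, emb_dim)
            self.pos_emb = nn.Embedding(pos_size, pos_dim)   # part-of-speech tags
            in_dim = emb_dim + pos_dim
            # Bi-directional LSTM encoder captures sentence-level information.
            self.encoder = nn.LSTM(in_dim, hid_dim, bidirectional=True, batch_first=True)
            # CNNs over the same embedded sequence extract n-gram lexical features.
            self.convs = nn.ModuleList(
                [nn.Conv1d(in_dim, n_kernels, k, padding=k // 2) for k in kernel_sizes])
            self.decoder = nn.LSTMCell(
                emb_dim + 2 * hid_dim + n_kernels * len(kernel_sizes), 2 * hid_dim)
            self.attn = nn.Linear(2 * hid_dim, 2 * hid_dim)
            self.out = nn.Linear(2 * hid_dim, vocab_size)

        def forward(self, src, src_pos, tgt):
            # Input layer: concatenate word vectors with part-of-speech vectors.
            x = torch.cat([self.word_emb(src), self.pos_emb(src_pos)], dim=-1)
            enc_out, _ = self.encoder(x)                     # (B, T, 2*hid_dim)
            # Lexical feature vector: max-pooled multi-width convolutions.
            conv_in = x.transpose(1, 2)                      # (B, in_dim, T)
            lex = torch.cat(
                [F.relu(c(conv_in)).max(dim=2).values for c in self.convs], dim=-1)
            h = enc_out.new_zeros(src.size(0), self.decoder.hidden_size)
            c = torch.zeros_like(h)
            logits = []
            for t in range(tgt.size(1)):                     # teacher forcing
                # Attention over encoder states (simplified bilinear score).
                score = torch.bmm(enc_out, self.attn(h).unsqueeze(2)).squeeze(2)
                attended = torch.bmm(
                    F.softmax(score, dim=1).unsqueeze(1), enc_out).squeeze(1)
                # Context vector = attended encoder output + CNN lexical features.
                dec_in = torch.cat([self.word_emb(tgt[:, t]), attended, lex], dim=-1)
                h, c = self.decoder(dec_in, (h, c))
                logits.append(self.out(h))
            return torch.stack(logits, dim=1)                # (B, T_tgt, vocab)

Training such a model would typically use teacher forcing with a cross-entropy loss; the ROUGE-1, ROUGE-2, and ROUGE-L comparison reported above could then be reproduced at evaluation time with a standard ROUGE implementation.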
