Research on Automatic Summarization Coherence Based on Discourse Rhetoric Structure
  • English title: Research on Automatic Summarization Coherence Based on Discourse Rhetoric Structure
  • Authors: 刘凯; 王红玲
  • Authors (English): LIU Kai; WANG Hongling; School of Computer Science and Technology, Soochow University
  • Keywords: discourse rhetoric structure; Chinese automatic summarization; coherence; readability; entity-grid model; LSTM
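  • Chinese journal title: MESS
  • English journal title: Journal of Chinese Information Processing
  • Institution: School of Computer Science and Technology, Soochow University
  • Publication date: 2019-01-15
  • Publisher: 中文信息学报 (Journal of Chinese Information Processing)
  • Year: 2019
  • Issue: v.33
  • Funding: National Natural Science Foundation of China (61402314)
  • Language: Chinese
  • Page: MESS201901013
  • Page count: 8
  • CN: 01
  • ISSN: 11-2325/N
  • Classification number: 82-89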
Abstract
Extractive methods are currently the mainstream approach to automatic summarization and have made considerable progress. However, because the extracted sentences lack proper coreference and discourse structure linking them, the resulting summaries lack coherence, which hurts readability. To improve the readability of automatic summaries, this paper applies discourse rhetorical structure information to Chinese automatic summarization. First, summaries are extracted based on the rhetorical structure of Chinese discourse. Then an LSTM-based method is used to model text coherence, and the resulting model is used to evaluate the coherence of the summaries. The experimental results show that, for summary extraction, summarization based on discourse rhetorical structure achieves better ROUGE scores than traditional extractive methods. In the evaluation with the LSTM-based coherence model, discourse structure information helps the system capture the main idea of the article during extraction and also yields better coherence results for the summaries.
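The abstract reports ROUGE scores without restating the metric. As a rough illustration only, ROUGE-N recall can be sketched as the clipped n-gram overlap between a candidate summary and a reference summary, divided by the number of n-grams in the reference. The function names and toy sentences below are hypothetical, whitespace tokenization is an assumption, and this is not the evaluation toolkit or data used in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap / number of reference n-grams."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(ref.values())
    if total == 0:
        # No n-grams in the reference (empty or too short): define recall as 0.
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


# Hypothetical toy summaries, tokenized on whitespace (an assumption).
reference = "the discourse structure improves summary coherence".split()
candidate = "the discourse structure improves summary readability".split()

score_1 = rouge_n_recall(candidate, reference, n=1)  # unigram recall
score_2 = rouge_n_recall(candidate, reference, n=2)  # bigram recall
```

For Chinese text the token lists would typically come from a word segmenter or from individual characters; the choice affects the scores and is left open here.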
