Abstract
Most existing text summarization methods only mine shallow semantic relationships between words and do not make good use of the complete semantic information between words and sentences. To address this, an improved algorithm that produces summaries by semantic subgraph prediction is proposed. The algorithm transforms the original text into the corresponding Abstract Meaning Representation (AMR) graphs, merges them into a single overall AMR graph, and filters redundant information from it using the WordNet semantic dictionary. On this basis, comprehensive statistical features are used to assign weights to the AMR graph nodes, which carry no weights of their own, and the parts with high importance are selected to form the semantic summary subgraph. The quality of the generated summaries is measured with both the ROUGE metric and the Smatch metric. Experimental results show that, compared with baseline text summarization algorithms that mine only shallow semantic relations, the proposed algorithm achieves clearly higher ROUGE and Smatch scores.
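The pipeline described above (merge the sentence-level graphs, filter redundancy, weight the nodes, keep the important part) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each AMR graph is reduced to a set of (source, relation, target) triples with nodes identified by concept label, uses a hand-written synonym table `CANON` as a stand-in for a WordNet lookup, and uses sentence frequency as the only statistical feature. The function names and the parameter `k` are hypothetical.

```python
from collections import Counter

# Assumption: each sentence-level AMR graph is reduced to a set of
# (source_concept, relation, target_concept) triples, with nodes
# identified by their concept label.

def normalise(triples, canon):
    """Map redundant concepts onto one representative (stand-in for a WordNet lookup)."""
    return {(canon.get(s, s), r, canon.get(t, t)) for s, r, t in triples}

def node_weights(graphs):
    """Weight a concept by the number of sentence graphs that contain it."""
    counts = Counter()
    for triples in graphs:
        concepts = {s for s, _, _ in triples} | {t for _, _, t in triples}
        counts.update(concepts)
    return counts

def summary_subgraph(graphs, canon, k):
    """Merge the graphs, then keep edges whose endpoints are both among the k heaviest nodes."""
    cleaned = [normalise(g, canon) for g in graphs]
    merged = set().union(*cleaned)
    weights = node_weights(cleaned)
    # Sort by descending weight, then by label so ties break deterministically.
    ranked = sorted(weights, key=lambda c: (-weights[c], c))
    keep = set(ranked[:k])
    return {(s, r, t) for s, r, t in merged if s in keep and t in keep}

GRAPHS = [
    {("chase-01", "ARG0", "dog"), ("chase-01", "ARG1", "cat")},
    {("bark-01", "ARG0", "hound")},
    {("chase-01", "ARG0", "dog"), ("chase-01", "location", "garden")},
]
CANON = {"hound": "dog"}  # hypothetical synonym table, not real WordNet data

print(summary_subgraph(GRAPHS, CANON, 2))
# prints {('chase-01', 'ARG0', 'dog')}
```

In this toy input, "dog" appears in all three sentence graphs once "hound" is folded into it, and "chase-01" appears in two, so with `k = 2` only the edge between those two concepts survives. A real system would obtain the graphs from an AMR parser and would combine several statistical features rather than frequency alone.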