Korean Automatic Summarization Based on Key-noun Extraction

Chinese title: 基于关键体词抽取的韩国语自动文摘
Authors: WANG Lin (王琳); LIU Wuying (刘伍颖)
Keywords: automatic summarization; Korean; noun; predicate; ROUGE
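Journal code: MESS
Journal title (English): Journal of Chinese Information Processing
Institutions: Xianda College of Economics and Humanities, Shanghai International Studies University; Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies; Engineering Research Center for Cyberspace Content Security, Guangdong University of Foreign Studies
Publication date: 2019-06-15
Publisher: Journal of Chinese Information Processing (中文信息学报)
Year: 2019
Volume: 33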
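Funding: Key Project of the State Language Commission (ZDI135-26); Natural Science Foundation of Guangdong Province (2018A030313672); Featured Innovation Project of Guangdong Universities (2015KTSCX035); Bidding Project of the Guangdong Provincial Key Laboratory of Philosophy and Social Sciences (LEC2017WTKT002); Key Project of the Guangzhou Key Research Base of Humanities and Social Sciences (Guangzhou Center for Innovative Communication in International Cities) (2017-IC-02)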
Language: Chinese
File name: MESS201906007
Page count: 7
Issue: 06
CN: 11-2325/N
Pages: 55-61

Abstract

The explosion of information in less commonly used languages leaves readers with scarcer time and more scattered attention. This paper addresses automatic summarization of Korean texts and presents a novel Korean summarization algorithm (KKS) based on key-noun extraction. The authors argue that Korean nouns mainly carry semantic information, whereas Korean predicates mainly provide the syntactic frame. The experimental results show that summarization based on key-noun extraction outperforms summarization based on predicates or on all words, and that the KKS algorithm achieves the best performance in the Korean summarization task, which supports the claim that Korean nouns mainly carry semantic information.
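
The record does not give the details of the algorithm, so the following is only a minimal sketch of the general idea of key-noun-based extractive summarization, not the authors' KKS method. It assumes the text has already been split into sentences and tagged by some Korean morphological analyzer, that nouns carry the hypothetical tag `"N"`, and that key nouns are simply the most frequent nouns. The function names `key_nouns`, `summarize`, and `rouge_1_recall` are illustrative. The last function sketches unigram-overlap recall in the spirit of ROUGE-1.

```python
from collections import Counter

# Assumed input format (hypothetical): a document is a list of sentences,
# and each sentence is a list of (token, pos_tag) pairs. "N" marks a noun.

def key_nouns(sentences, top_k=5, noun_tag="N"):
    """Return the top_k most frequent nouns in the document."""
    counts = Counter(
        tok for sent in sentences for tok, pos in sent if pos == noun_tag
    )
    return {tok for tok, _ in counts.most_common(top_k)}

def summarize(sentences, top_k=5, n_sentences=1, noun_tag="N"):
    """Pick the n_sentences that contain the most key-noun occurrences.

    Returns the indices of the selected sentences in document order.
    """
    keys = key_nouns(sentences, top_k, noun_tag)
    scored = []
    for idx, sent in enumerate(sentences):
        score = sum(1 for tok, pos in sent if pos == noun_tag and tok in keys)
        scored.append((score, idx))
    # Highest score first; ties are broken in favour of earlier sentences.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:n_sentences]
    return sorted(idx for _, idx in best)

def rouge_1_recall(candidate_tokens, reference_tokens):
    """Unigram recall: clipped overlap divided by reference length."""
    cand = Counter(candidate_tokens)
    ref = Counter(reference_tokens)
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    total = sum(ref.values())
    return overlap / total if total else 0.0

# Toy document with made-up tags, for illustration only.
doc = [
    [("경제", "N"), ("성장", "N"), ("하다", "V")],
    [("날씨", "N"), ("좋다", "V")],
    [("경제", "N"), ("정책", "N"), ("성장", "N"), ("돕다", "V")],
]
selected = summarize(doc, top_k=2, n_sentences=2)
```

Scoring only on nouns mirrors the abstract's claim that nouns carry most of the semantic content; swapping `noun_tag` for a predicate tag would give the predicate-based baseline the abstract compares against, under the same assumptions.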

References

[1] Horacio Saggion, Thierry Poibeau. Automatic text summarization: Past, present and future[M]. Multi-source, Multilingual Information Extraction and Summarization, Springer, 2013: 3-21.
[2] H P Luhn. The automatic creation of literature abstracts[J]. IBM Journal of Research and Development, 1958, 2(2): 159-165.
[3] K S Jones, E-N Brigitte. Introduction: Automatic summarizing[J]. Information Processing & Management, 1995, 31(5): 625-630.
[4] Yu Lei, Ren Fuji. A study on cross-language text summarization using supervised methods[C]//Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering, 2009.
[5] Amini Massih-Reza, Gallinari Patrick. The use of unlabeled data to improve supervised learning for text summarization[C]//Proceedings of SIGIR Forum, 2002: 105-112.
[6] Nomoto Tadashi, Matsumoto Yuji. An experimental comparison of supervised and unsupervised approaches to text summarization[C]//Proceedings of the IEEE International Conference on Data Mining, 2001: 630-632.
[7] C Y Lin, E Hovy. From single to multi-document summarization: A prototype system and its evaluation[C]//Proceedings of the ACL, 2002: 457-464.
[8] Wen-tau Yih, Joshua Goodman, Lucy Vanderwende, et al. Multi-document summarization by maximizing informative content-words[C]//Proceedings of the International Joint Conference on Artificial Intelligence, 2007: 1776-1782.
[9] Belkebir Riadh, Guessoum Ahmed. A supervised approach to Arabic text summarization using Adaboost[J]. Advances in Intelligent Systems and Computing, 2015, 353: 227-236.
[10] Gupta Vishal, Kaur Narvinder. A novel hybrid text summarization system for Punjabi text[J]. Cognitive Computation, 2016, 8(2): 261-277.
[11] Jae-Hoon Kim, Joon-Hong Kim, Dosam Hwang. Korean text summarization using an aggregate similarity[C]//Proceedings of the International Workshop on Information Retrieval with Asian Languages, 2000: 111-118.
[12] Nenkova A, Vanderwende L. The impact of frequency on summarization[R]. Technical Report, MSR-TR-2005-101, 2005.
[13] Sangwon Park, DongHyun Choi, Eun-kyung Kim, et al. A plug-in component-based Korean morphological analyzer[C]//Proceedings of HCLT, 2010: 197-201.
[14] Hyoungil Jeong, Youngjoong Ko, Jungyun Seo. Efficient keyword extraction and text summarization for reading articles on smart phone[J]. Computing and Informatics, 2015, 34(4): 779-794.
[15] Jayashree R, Srikanta Murthy K, Sunny K. Keyword extraction based summarization of categorized Kannada text documents[J]. International Journal on Soft Computing, 2011, 2(4): 81-93.
[16] Kamal Sarkar. Automatic single document text summarization using key concepts in documents[J]. Journal of Information Processing Systems, 2013, 9(4): 602-620.
[17] Lin Chin-Yew. ROUGE: A package for automatic evaluation of summaries[C]//Proceedings of the Workshop on Text Summarization, 2004.
