Designing Figure Summaries for Chinese Scientific Papers: The Case of Library and Information Science
  • English title: Summarizing Figures of Chinese Scholarly Articles of Library and Information Science
  • Authors: Bao Chuhan (包楚晗); Jia Danping (贾丹萍); He Lin (何琳); Ma Xiaowen (马晓雯); Ai Yuxi (艾毓茜)
  • Keywords: Figure Indexing; Abstract in Chinese; Likert Scale
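  • Journal code: XDTQ
  • Journal: Data Analysis and Knowledge Discovery (数据分析与知识发现)
  • Affiliations: College of Information Science and Technology, Nanjing Agricultural University; Research Center for Correlation of Domain Knowledge, Nanjing Agricultural University
  • Publication date: 2017-10-17 15:01
  • Publisher: Data Analysis and Knowledge Discovery
  • Year: 2017
  • Issue: v.1; No.10
  • Funding: One of the research outcomes of the Nanjing Agricultural University SRT Program project "Automatic Indexing of Figures in Scientific Papers Based on Natural Language Understanding: The Case of Disease Research in Biomedicine" (project No. 201610307061)
  • Language: Chinese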
  • Page: XDTQ201710003
  • Number of pages: 11
  • CN: 10
  • ISSN: 10-1478/G2
  • Classification number: 25-35
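Abstract
[Objective] To explore and design a structure for figure summaries of Chinese scientific papers in library and information science (LIS), and to establish rules for constructing them. [Methods] Drawing on a survey, manual annotation results, and the characteristics of Chinese LIS papers and their figures, we designed a summary framework and specified its construction rules. We then designed an evaluation system and analyzed SPSS statistical results to assess how the summaries performed. [Results] The figure summaries constructed in this study outperformed the existing figure-text combination model in comprehension of figure information, efficiency, and confidence. [Limitations] Coverage of figure information needs improvement, the differences introduced by figure type were not fully considered, and automatic indexing is not yet fully implemented. [Conclusions] Figure summaries built with the proposed structure and rules effectively improve how accurately users understand the main content of a paper.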
        [Objective] This paper studies the figures of Chinese articles in the field of library and information science (LIS), aiming to establish new principles to summarize them. [Methods] We proposed the framework and rules for figure summarization based on manual indexing and features of LIS papers. Then, we evaluated the performance of the new system with the help of SPSS. [Results] Compared with the existing figure-text model, our method could more effectively process information from the figures. [Limitations] We need to extract more information from the figures, analyze the influences of different charts, and add automatic indexing functions to the new system. [Conclusions] The proposed method could effectively summarize figures from the scholarly articles.
References
[1]Kim D,Yu H.Figure Text Extraction in Biomedical Literature[J].PLoS One,2011,6(1):e15338.
    [2]Yu H,Lee M.Accessing Bioscience Images from Abstract Sentences[J].Bioinformatics,2006,22(14):547-556.
    [3]Agarwal S,Yu H.Figure Summarizer Browser Extensions for PubMed Central[J].Bioinformatics,2011,27(12):1723-1724.
    [4]Futrelle R P.Handling Figures in Document Summarization Abstract[C]//Proceedings of Meeting of the Association for Computational Linguistics.2004.
    [5]Luhn H P.The Automatic Creation of Literature Abstracts[J].IBM Journal of Research and Development,1958,2(2):159-165.
    [6]Nakov P I,Schwartz A S,Hearst M A.Citances:Citation Sentences for Semantic Analysis of Bioscience Text[C]//Proceedings of the SIGIR’04 Workshop on Search and Discovery in Bioinformatics.2004.
    [7]周浪,张亮,冯冲,等.基于词频分布变化统计的术语抽取方法[J].计算机科学,2009,36(5):177-180.(Zhou Lang,Zhang Liang,Feng Chong,et al.Terminology Extraction Based on Statistical Word Frequency Distribution Variety[J].Computer Science,2009,36(5):177-180.)
    [8]Hirao T,Isozaki H,Maeda E,et al.Extracting Important Sentences with Support Vector Machines[C]//Proceedings of the 19th International Conference on Computational Linguistics.2002:1-7.
    [9]张帆,乐小虬.面向领域科技文献的句子级创新点抽取研究[J].现代图书情报技术,2014(9):15-21.(Zhang Fan,Le Xiaoqiu.Research on Innovation Points Extraction from Scientific Research Paper Based on Field Thesaurus[J].New Technology of Library and Information Service,2014(9):15-21.)
    [10]Brunn M,Chali Y,Pinchak C.Text Summarization Using Lexical Chains[C]//Proceedings of the Document Understanding Conference,2001:135-140.
    [11]王芳,史海燕,纪雪梅.我国情报学研究中理论的应用:基于《情报学报》的内容分析[J].情报学报,2015,34(6):581-591.(Wang Fang,Shi Haiyan,Ji Xuemei.The Use of Theory in Chinese Information Science Research Based on the Content Analysis of the Journal of the China Society for Scientific and Technical Information[J].Journal of the China Society for Scientific and Technical Information,2015,34(6):581-591.)
    [12]Dahl T.Contributing to the Academic Conversation:A Study of New Knowledge Claims in Economics and Linguistics[J].Journal of Pragmatics,2008,40(7):1184-1201.
    [13]Parkinson J.The Discussion Section as Argument:The Language Used to Prove Knowledge Claims[J].English for Specific Purposes,2011,30(3):164-175.
    [14]Ramesh B P,Sethi R J,Yu H.Figure-Associated Text Summarization and Evaluation[J].PLoS One,2015,10(2):e0115671.
    [15]Herbrich R,Graepel T,Obermayer K.Support Vector Learning for Ordinal Regression[C]//Proceedings of the 9th International Conference on Artificial Neural Networks.IET,DOI:10.1049/cp:19991091.
    [16]关鹏,王曰芬,傅柱.不同语料下基于LDA主题模型的科学文献主题抽取效果分析[J].图书情报工作,2016,60(2):112-121.(Guan Peng,Wang Yuefen,Fu Zhu.Effect Analysis of Scientific Literature Topic Extraction Based on LDA Topic Model with Different Corpus[J].Library and Information Service,2016,60(2):112-121.)
    [17]Radev D R,Jing H,Styś M,et al.Centroid-based Summarization of Multiple Documents[J].Information Processing & Management,2004,40(6):919-938.
    [18]Agarwal S,Yu H.FigSum:Automatically Generating Structured Text Summaries for Figures in Biomedical Literature[C]//Proceedings of AMIA Annual Symposium.2009.
    [19]朱丽萍,李洪奇,杨中国,等.一种面向科技文献引言的信息抽取方法[J].山东大学学报:理学版,2015,50(7):23-30,37.(Zhu Liping,Li Hongqi,Yang Zhongguo,et al.An Information Extraction Method for Scientific Literature Introduction[J].Journal of Shandong University:Natural Science,2015,50(7):23-30,37.)
    [20]杜威,邹先霞.基于数据流的滑动窗口机制的研究[J].计算机工程与设计,2005,26(11):2922-2944.(Du Wei,Zou Xianxia.Research of Sliding Windows Scheme Based on Data Stream[J].Computer Engineering and Design,2005,26(11):2922-2944.)
    [21]Yu H,Agarwal S,Johnston M,et al.Are Figure Legends Sufficient?Evaluating the Contribution of Associated Text to Biomedical Figure Comprehension[J].Journal of Biomedical Discovery and Collaboration,2009,4(1).DOI:10.1186/1747-5333-4-1.
    [22]方宝.Likert等级量表调查结果有效性的影响因素探析[J].十堰职业技术学院学报,2009,22(2):25-28.(Fang Bao.An Analysis of the Factors Influencing the Effectiveness of Likert Rating Scale’s Investigation Result[J].Journal of Shiyan Technical Institute,2009,22(2):25-28.)
    [23]Lin C Y,Hovy E.Automatic Evaluation of Summaries Using N-gram Co-occurrence Statistics[C]//Proceedings of the 2003 Conference of North American Chapter of the Association for Computational Linguistics on Human Language.2003:71-78.
    [24]傅间莲,陈群秀.一种新的自动文摘系统评价方法[J].计算机工程与应用,2006(18):176-177.(Fu Jianlian,Chen Qunxiu.A New Evaluation Method for Automatic Text Summarization[J].Computer Engineering and Applications,2006(18):176-177.)
    [25]Lin C Y.ROUGE:A Package for Automatic Evaluation of Summaries[C]//Proceedings of the Workshop on Text Summarization Branches out.2004:74-81.
