Constructing Big Data Platform for Sci-Tech Knowledge Discovery with Knowledge Graph (基于知识图谱的科技大数据知识发现平台建设)
  • English title: Constructing Big Data Platform for Sci-Tech Knowledge Discovery with Knowledge Graph
  • Authors: Hu Jiying (胡吉颖); Xie Jing (谢靖); Qian Li (钱力); Fu Changlei (付常雷)
  • English authors and affiliations: Hu Jiying; Xie Jing; Qian Li; Fu Changlei (National Science Library, Chinese Academy of Sciences; Department of Library, Information and Archives Management, University of Chinese Academy of Sciences)
  • Keywords: knowledge discovery (知识发现); sci-tech big data (科技大数据); knowledge graph (知识图谱); precision service (精准服务); user portrait (用户画像)
  • English keywords: Knowledge Discovery; S&T Big Data; Knowledge Graph; Precision Service; User Portrait
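  • Journal title (Chinese): XDTQ
  • Journal title (English): Data Analysis and Knowledge Discovery
  • Institutions: National Science Library, Chinese Academy of Sciences; Department of Library, Information and Archives Management, University of Chinese Academy of Sciences
  • Publication date: 2019-01-25
  • Publisher: Data Analysis and Knowledge Discovery (数据分析与知识发现)
  • Year: 2019
  • Issue: v.3; No.25
  • Funding: One of the research outcomes of the Chinese Academy of Sciences special project for literature and information capacity building, "Construction of a Knowledge Discovery Service Platform Based on Big Data Computing" (project no. 院1853)
  • Language: Chinese
  • Page: XDTQ201901008
  • Number of pages: 8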
  • CN:01
  • ISSN:10-1478/G2
  • Classification number: 59-66
Abstract
[Objective] This paper builds a big data platform for sci-tech knowledge discovery that moves beyond traditional keyword-based literature retrieval, shifting the service from literature retrieval to knowledge retrieval. [Methods] We used data mining techniques to extract and annotate scientific research entities and to compute the relationships among them. We then built distributed indexes on the entity knowledge graph to support multi-dimensional knowledge retrieval and presentation, together with associative navigation. [Results] The platform provides intelligent semantic search and multi-dimensional, integrated knowledge retrieval and discovery over a knowledge graph built from 10 types of research entities, including papers, projects, scholars, and institutions. [Limitations] The platform currently operates mainly at the entity level, and semantic retrieval needs further research. [Conclusions] The knowledge-graph-based platform organizes and indexes data at the knowledge level, meets users' needs for precise knowledge retrieval, and improves the user experience.
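The retrieval idea described in the abstract (entities linked in a graph, indexed for search, then browsed through their relations) can be illustrated with a minimal sketch. This is not the authors' platform: the class `KnowledgeGraph`, its method names, the entity identifiers, and the relation labels are all hypothetical, and a plain in-memory inverted index stands in for the distributed index the paper describes.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy in-memory entity graph with an inverted index (illustrative only)."""

    # entity id -> (entity type, display name)
    entities: dict = field(default_factory=dict)
    # entity id -> set of (relation label, neighbouring entity id)
    edges: dict = field(default_factory=lambda: defaultdict(set))
    # lowercase token -> set of entity ids whose name contains that token
    index: dict = field(default_factory=lambda: defaultdict(set))

    def add_entity(self, entity_id, entity_type, name):
        self.entities[entity_id] = (entity_type, name)
        for token in name.lower().split():
            self.index[token].add(entity_id)

    def add_relation(self, source, relation, target):
        # Store the edge in both directions (same label) so that
        # navigation can start from either entity.
        self.edges[source].add((relation, target))
        self.edges[target].add((relation, source))

    def search(self, query, entity_type=None):
        """Return ids of entities whose name contains every query token."""
        tokens = query.lower().split()
        if not tokens:
            return []
        hits = set.intersection(*(self.index.get(t, set()) for t in tokens))
        return sorted(
            e for e in hits
            if entity_type is None or self.entities[e][0] == entity_type
        )

    def related(self, entity_id, entity_type=None):
        """Return (relation, neighbour id) pairs, optionally filtered by type."""
        return sorted(
            (relation, other)
            for relation, other in self.edges.get(entity_id, ())
            if entity_type is None or self.entities[other][0] == entity_type
        )


# Hypothetical sample data: one paper, one scholar, one institution.
kg = KnowledgeGraph()
kg.add_entity("paper:1", "paper", "Knowledge graph platform for knowledge discovery")
kg.add_entity("scholar:1", "scholar", "Example Scholar")
kg.add_entity("org:1", "institution", "Example Institute")
kg.add_relation("paper:1", "authored_by", "scholar:1")
kg.add_relation("scholar:1", "affiliated_with", "org:1")

papers = kg.search("knowledge graph", entity_type="paper")
authors = kg.related("paper:1", entity_type="scholar")
```

Here `search` plays the role of entity-level retrieval and `related` plays the role of associative navigation from one entity to its neighbours. A production system would presumably replace the token index with a real search engine and add ranking, neither of which this sketch attempts.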
References
[1] Google Inside Search[EB/OL]. [2016-02-10]. https://www.google.com/intl/es419/insidesearch/features/search/knowledge.html.
[2] Wolfram Alpha. Computational Knowledge Engine[EB/OL]. [2015-03-10]. https://www.wolframalpha.com/.
[3] Springer Nature. SN SciGraph[EB/OL]. [2018-08-18]. https://www.springernature.com/gp/researchers/scigraph.
[4] Taylor & Francis. Wizdom.ai[EB/OL]. [2018-05-05]. https://www.wizdom.ai/#about.
[5] Tang J, Zhang J, Yao L M, et al. AMiner: Extraction and Mining of Academic Social Networks[C]//Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (SIGKDD'2008). 2008: 990-998.
[6] Kuc R, Rogozinski M. Elasticsearch Server[M]. Birmingham: Packt Publishing Ltd., 2013.
[7] Wang Ying, Zhang Zhixiong, Li Chuanxi, et al. The Design and Implementation of Open Engine System for Scientific & Technological Knowledge Organization Systems[J]. New Technology of Library and Information Service, 2015(10): 95-101. (in Chinese)
[8] Sun Tan, Liu Zheng. Methodology Framework of Knowledge Organization System for Scientific & Technological Literature[J]. Library & Information, 2013(1): 2-7. (in Chinese)
[9] Li Yuepeng, Jin Cui, Ji Junchuan. A Keyword Extraction Algorithm Based on Word2vec[J]. E-science Technology & Application, 2015(4): 54-59. (in Chinese)
[10] Yu Shanshan, Su Jinxi, Li Pengfei. Improved TextRank-based Method for Automatic Summarization[J]. Computer Science, 2016, 43(6): 240-247. (in Chinese)
[11] Gu Yijun, Xia Tian. Study on Keyword Extraction with LDA and TextRank Combination[J]. New Technology of Library and Information Service, 2014(7-8): 41-47. (in Chinese)
