Abstract
[Objective] This paper aims to move beyond traditional keyword-based literature retrieval by building a knowledge discovery platform for sci-tech big data, shifting the service from literature retrieval to knowledge retrieval. [Methods] We used data mining techniques to extract and annotate scientific research entities and to calculate the relationships among them. We then built distributed indexes on the resulting entity knowledge graph to support multi-dimensional retrieval and presentation of knowledge, together with associated navigation. [Results] The platform supports intelligent semantic search and multi-dimensional integrated knowledge retrieval and discovery over a knowledge graph built from 10 types of research entities, including papers, projects, scholars, and institutions. [Limitations] The platform currently works mainly at the entity level, and semantic retrieval needs further research. [Conclusions] The knowledge-graph-based platform organizes and indexes data at the knowledge level, which meets users' needs for precise knowledge retrieval and improves the user experience.