Status and Characteristics of Scientific Data Based on DataCite (基于DataCite的科学数据现状特征研究)
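  • English title: Status and Characteristics of Scientific Data Based on DataCite
  • Authors: 罗鹏程; 崔海媛; 赵静茹
  • Authors (English): Luo Pengcheng
  • Keywords (Chinese): 科学数据; 现状特征; 科学数据管理; DataCite
  • Keywords (English): scientific data; status and characteristics; scientific data management; DataCite
  • Journal (Chinese): 图书情报知识
  • Journal (English): Documentation, Information & Knowledge
  • Institution: Peking University Library (北京大学图书馆)
  • Publication date: 2019-05-10
  • Publisher: Documentation, Information & Knowledge (图书情报知识)
  • Year: 2019
  • Issue: 03
  • Language: Chinese
  • Pages: 82+103-114
  • Page count: 13
  • CN: 42-1085/G2
  • ISSN: 1003-2797
  • Classification number: G353.1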
Abstract
[Purpose/Significance] This paper analyzes the characteristics of massive scientific data worldwide to provide a reference for the effective use and management of scientific data. [Design/Methodology] A total of 14,835,029 scientific data metadata records were collected from DataCite. Using statistical analysis, social network analysis, and text analysis, the status and characteristics of these data were examined along six dimensions: time, space, topic, author, version, and usage. [Findings/Conclusion] Scientific data is growing exponentially. Science and engineering data account for the majority, while humanities and social science data, though a relatively small share, are rising rapidly. Data centers are severely polarized. European and American countries hold the advantage in open data. Data center construction in China lags behind scholars' needs. Author collaboration differs markedly across disciplines. The number of versions per dataset follows a power-law distribution. Opening and sharing data helps raise scholars' impact. [Originality/Value] This study examines the overall characteristics of existing massive scientific data from multiple perspectives, summarizes the practical experience of leading data centers, and discusses a development path for scientific data management in China.
