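Matching Multi-Source Linked Data for Cross-Lingual Genealogical Services: Learn Chinese Surnames, an Entry in the Shanghai Library Open Data Application Contest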
  • Published English title: Matching Linked Data for Cross-Lingual Genealogical Services: Learn Chinese Surnames in Shanghai Library Open Data Application Contest
  • Author: Dong Hang (董行)
  • Keywords: Linked Data; Cross-lingual Ontology Matching; Chinese Genealogies; Open Data Applications; Information Services
  • Journal (Chinese abbreviation): DXTS
  • Journal (English name): Journal of Academic Libraries
  • Affiliations: Department of Computer Science, University of Liverpool; Department of Computer Science and Software Engineering, Xi'an Jiaotong-Liverpool University
  • Publication date: 2018-07-21
  • Publisher: Journal of Academic Libraries (大学图书馆学报)
  • Year: 2018
  • Issue: v.36; No.222
  • Language: Chinese
  • Record ID: DXTS201804009
  • Page count: 9
  • Issue number: 04
  • CN: 11-2952/G2
  • Pages: 51-58+104
Abstract
As more non-English linked datasets are published, language differences have become a barrier to linking resources across the Web of Data. For cultural heritage resources, including Chinese genealogies, cross-lingual information services built on linked data technologies help ensure access to these resources and promote international exchange. This article introduces the concept of cross-lingual ontology matching in linked data, representative projects, and related practice in the cultural heritage domain. Learn Chinese Surnames, a winning entry in the 2016 Shanghai Library Open Data Application Contest, is then presented as a case study: it uses linked data consumption techniques to match Chinese genealogical data cross-lingually against DBpedia, GeoNames, and Wiktionary, enriching the Chinese genealogy records with related English descriptions. The article also summarizes the experience gained in developing linked data applications. The work helps to remove language barriers between linked datasets and to deliver cross-lingual genealogical information services.
Note: the app demo can be downloaded from the official contest website: http://pcrc.library.sh.cn/zt/opendata/apk/Learn%20Chinese%20Surnames%20.apk
