Knowledge Base Creation, Enrichment and Repair


Authors: Sebastian Hellmann (18)
Volha Bryl (19)
Lorenz Bühmann (18)
Milan Dojchinovski (21)
Dimitris Kontokostas (18)
Jens Lehmann (18)
Uroš Milošević (20)
Petar Petrovski (19)
Vojtěch Svátek (21)
Mladen Stanojević (20)
Ondřej Zamazal (21)
Journal: Lecture Notes in Computer Science
Publication year: 2014
Volume: 1
Issue: 1
Pages: 45-69
Full-text size: 8,671 KB
Author affiliations:

18. University of Leipzig, Leipzig, Germany
19. University of Mannheim, Mannheim, Germany
20. Institute Mihajlo Pupin, Belgrade, Serbia
21. University of Economics, Prague, Czech Republic
ISSN: 1611-3349

Abstract

This chapter focuses on transforming data into RDF and Linked Data, and on improving existing or extracted data, especially through schema enrichment and ontology repair. The triplification tasks build mainly on existing, well-proven techniques that were refined during the lifetime of the LOD2 project and integrated into the LOD2 Stack. Triplification of legacy data, i.e. data not yet in RDF, is the entry point for legacy systems to participate in the LOD cloud. While existing systems are often very useful and successful, there are notable differences between the ways knowledge bases and wikis or databases are created and used. One key difference in content is the importance and use of schematic information in knowledge bases. This information is usually absent in the source system and therefore also in many LOD knowledge bases. However, schema information is needed for consistency checking and for finding modelling problems. We present a combination of enrichment and repair steps to tackle this problem, based on previous research in machine learning and knowledge representation. Overall, the chapter describes how to enable tool-supported creation and publishing of RDF as Linked Data (Sect. 1) and how to increase the quality and value of such large knowledge bases when published on the Web (Sect. 2).
