On Learnability of Constraints from RDF Data
  • Keywords: RDF constraints; Linked data mining; Data quality; Data semantics
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9678
  • Issue: 1
  • Pages: 834-844
  • Full-text size: 448 KB
  • Author affiliations: Emir Muñoz (19) (20)

    19. Fujitsu Ireland Limited, Dublin, Ireland
    20. Insight Centre for Data Analytics, National University of Ireland, Galway, Ireland
  • Book title: The Semantic Web. Latest Advances and New Domains
  • ISBN:978-3-319-34129-3
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN:1611-3349
  • Volume sort order: 9678
Abstract
RDF is structured, dynamic, and schemaless data, which allows a great deal of flexibility for Linked Data to be made available in an open environment such as the Web. For RDF data, however, this flexibility is also the source of many data quality and knowledge representation issues. Tasks such as assessing data quality in RDF require a different set of techniques and tools than those used for other data models. Furthermore, since the use of existing schema, ontology, and constraint languages is not mandatory, there is always room for misunderstanding the structure of the data. Neglecting this problem can threaten the widespread use and adoption of RDF and Linked Data. Users should be able to learn the characteristics of RDF data in order to determine, for example, its fitness for a given use case. For that purpose, in this doctoral research, we propose the use of constraints to inform users about characteristics that RDF data naturally exhibits, in cases where ontologies (or any other form of explicitly given constraints or schemata) are not present or not expressive enough. We aim to address the problems of defining and discovering classes of constraints to help users in data analysis and in the assessment of RDF and Linked Data quality.
