Automatic Creation and Analysis of a Linked Data Cloud Diagram
  • Keywords: Linked data cloud analysis; Automatic clustering; Domain identification; Community detection algorithms
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 10041
  • Issue: 1
  • Pages: 417-432
  • Full text size: 519 KB
  • Author affiliations: Alexander Arturo Mera Caraballo (19)
    Bernardo Pereira Nunes (19) (22)
    Giseli Rabello Lopes (20)
    Luiz André Portes Paes Leme (21)
    Marco Antonio Casanova (19)

    19. Department of Informatics, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
    22. Federal University of the State of Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
    20. Federal University of Rio de Janeiro, Rio de Janeiro, Rio de Janeiro, Brazil
    21. Fluminense Federal University, Niterói, Rio de Janeiro, Brazil
  • Book series: Web Information Systems Engineering – WISE 2016
  • ISBN:978-3-319-48740-3
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN:1611-3349
  • Volume order: 10041
Abstract
Datasets published on the Web and following the Linked Open Data (LOD) practices have the potential to enrich other LOD datasets in multiple domains. However, the lack of descriptive information, combined with the large number of available LOD datasets, inhibits their interlinking and consumption. To facilitate such tasks, this paper proposes an automated clustering process for LOD datasets that provides an up-to-date description of the LOD cloud. The process combines metadata inspection and extraction strategies, community detection methods, and dataset profiling techniques. The clustering process is evaluated using the LOD diagram as ground truth. The results show that the proposed process can replicate the LOD diagram and identify new LOD dataset clusters. Finally, experiments conducted by LOD experts indicate that the clustering process generates dataset clusters that tend to be more descriptive than those manually defined in the LOD diagram.
