知识图谱的Top-k摘要模式挖掘方法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:Mining Top-k summarization patterns for knowledge graphs
  • 作者:罗之皓 ; 李劲 ; 岳昆 ; 毛钰源 ; 刘琰
  • 英文作者:LUO Zhihao;LI Jin;YUE Kun;MAO Yuyuan;LIU Yan;School of Software,Yunnan University;School of Information,Yunnan University;
  • 关键词:知识图谱 ; 摘要模式挖掘 ; 次模函数 ; 图匹配
  • 英文关键词:knowledge graph;;summarization pattern mining;;submodular function;;graph matching
  • 中文刊名:QHXB
  • 英文刊名:Journal of Tsinghua University(Science and Technology)
  • 机构:云南大学软件学院;云南大学信息学院;
  • 出版日期:2018-09-21 14:03
  • 出版单位:清华大学学报(自然科学版)
  • 年:2019
  • 期:v.59
  • 基金:国家自然科学基金资助项目(61562091,61472345);; 第二批“云岭学者”培养项目(C6153001);; 云南省应用基础研究计划面上项目(2016FB110);; 云南大学中青年骨干教师培养计划项目;云南大学青年英才培育计划项目(WX173602);云南大学数据驱动的软件工程科技创新团队项目(2017HC012)
  • 语种:中文;
  • 页:QHXB201903004
  • 页数:9
  • CN:03
  • ISSN:11-2223/N
  • 分类号:29-37
摘要
知识图谱数据具有体量大、内容丰富、类型多样、缺乏统一模式描述等特点。提取知识图谱模式信息并形成摘要模式,对于提升知识检索、挖掘质量具有重要研究意义。该文首先给出了摘要模式的判定准则以及摘要模式质量的度量标准,提出了面向知识图谱的Top-k摘要模式挖掘问题,并将该问题建模为一个次模函数优化问题;其次,为高效判定摘要模式及度量模式的覆盖质量,提出了基于Pregel编程模型的并行化摘要模式判定和质量度量算法;然后,给出了高效求解Top-k摘要模式挖掘问题的贪心算法;最后,在真实知识图谱数据上对本文方法进行了验证。实验结果表明:该方法在摘要模式的覆盖度和算法执行效率方面优于已有方法。
        Knowledge graph data has large volumes,rich content,diverse types,and lacks a unified model description.Pattern information needs to be extracted from knowledge graphs to improve the quality of knowledge graph retrieval and mining.This paper presents a knowledge graph summarization pattern and quality metrics.This method is used in an algorithm for mining Top-k summarization patterns(Top-k SPM)formulated as a submodular function optimization problem. Then,a Pregel based parallel algorithm is used to validate the algorithm and measure the qualities of summarization patterns.Two efficient greedy algorithms are also presented to solve the Top-k SPM.The efficiency and effectiveness of the method is then verified on real knowledge graph datasets.The tests show that the method outperforms the existing methods in terms of coverage and algorithm execution time.
引文
[1]QIAN J W,LI X Y,ZHANG C H,et al.Social network de-anonymization and privacy inference with knowledge graph model[J].IEEE Transactions on Dependable and Secure Computing,2017.DOI:10.1109/TDSC.2017.2697854.
    [2]SHI B X,WENINGER T.Discriminative predicate path mining for fact checking in knowledge graphs[J].Knowledge-Based Systems,2016,104:123-133.
    [3]SHI L X,LI S J,YANG X R,et al.Semantic health knowledge graph:Semantic integration of heterogeneous medical knowledge and services[J].BioMed Research International,2017,2858423.
    [4]王萍.网络环境下的领域知识挖掘[D].上海:华东师范大学,2010.WANG P.Domain knowledge mining in network environments[D].Shanghai:East China Normal University,2010.(in Chinese)
    [5]陈池,王宇鹏,李超,等.面向在线教育领域的大数据研究及应用[J].计算机研究与发展,2014,51(S1):67-74.CHEN C,WANG Y P,LI C,et al.The research and application of big data in the field of online education[J].Journal of Computer Research and Development,2014,51(S1):67-74.(in Chinese)
    [6]SANG S T,YANG Z Z,WANG L,et al.SemaTyP:Aknowledge graph based literature mining method for drug discovery[J].BMC Bioinformatics,2018,19(1):193-193.
    [7]KEMMAR A,LEBBAH Y,LOUDNI S.Interval graph mining[J].International Journal of Data Mining,Modelling and Management,2018,10(1):1-22.
    [8]高俊平,张晖,赵旭剑,等.面向维基百科的领域知识演化关系抽取[J].计算机学报,2016,39(10):2088-2101.GAO J P,ZHANG H,ZHAO X J,et al.Evolutionary relation extraction for domain knowledge in Wikipedia[J].Chinese Journal of Computers,2016,39(10):2088-2101.(in Chinese)
    [9]SONG Q,WU Y H,DONG X L.Mining summaries for knowledge graph search[C]//Proceedings of 2016 IEEEInternational Conference on Data Mining.Barcelona,Spain:IEEE,2016:1215-1220.
    [10]BABAI L.Graph isomorphism in quasipolynomial time[extended abstract][C]//Proceedings of the 48th Annual ACM Symposium on Theory of Computing.Cambridge,USA:ACM,2016:684-697.
    [11]SAMSI S,GADEPALLY V,HURLEY M,et al.Static graph challenge:Subgraph isomorphism[C]//Proceedings of2017 IEEE High Performance Extreme Computing Conference.Waltham,USA:IEEE,2017:1-6.
    [12]KRAUSE A,GOLOVIN D.Submodular function maximization[M]//BORDEAUX L,HAMADI Y,KOHLIP.Tractability.Cambridge:Cambridge University Press,2014:71-104.
    [13]DVORˇK W,HENZINGER M,WILLIAMSON D P.Maximizing a Submodular function with viability constraints[J].Algorithmica,2017,77(1):152-172.
    [14]MA S,CAO Y,FAN W F,et al.Capturing topology in graph pattern matching[J].Proceedings of the VLDBEndowment,2011,5(4):310-321.
    [15]MALEWICZ G,AUSTERN M H,BIK A J C,et al.Pregel:A system for large-scale graph processing[C]//Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data.Indianapolis,USA:ACM,2010:135-146.
    [16]ELSEIDY M E,ABDELHAMID P,SKIADOPOULOS S,et al.GraMi:Frequent subgraph and pattern mining in a single large graph[J].Proceedings of the VLDB Endowment,2014,7(7):517-528.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700