Research on Large-Scale Ontology Partitioning and Mapping Based on Graph Partitioning
Abstract
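Ontology mapping is a key technology for overcoming the bottleneck in the development of the Semantic Web. As the Semantic Web has grown, however, large-scale ontologies have emerged that contain a very large number of concepts with complex relationships among them. Because large-scale ontologies differ from ordinary ontologies in both the number of entities they contain and the difficulty of mapping them, they call for different mapping methods. This thesis focuses on the partitioning and mapping of large-scale ontologies.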
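     First, the research background of the topic is briefly introduced, the current state of research on ontology mapping is summarized, and directions for future development are given.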
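     Second, to address the shortcoming that traditional semantic similarity measures within a single ontology do not make full use of the ontology's semantic information, a semantic similarity measure based on concept features is proposed. The method first determines each concept's feature set from its position in the ontology hierarchy and introduces a concept-width factor to assign a different weight to each feature. It then computes concept similarity as a similarity between sets. Finally, it introduces a depth influence factor and revises the similarity formula into a more intuitive form. Theoretical analysis and experimental results show that the method is simple to compute and produces accurate results.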
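The abstract does not give the similarity formula, so the following is only a minimal sketch of the general idea (feature sets from the hierarchy, width-based weights, a set similarity, and a depth factor). The toy taxonomy, the 1/(1 + width) weighting, the weighted-Jaccard set similarity, and the depth factor are all assumptions made for illustration, not the thesis's actual definitions.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); None marks the root.
PARENT: Dict[str, Optional[str]] = {
    "Thing": None,
    "Animal": "Thing",
    "Plant": "Thing",
    "Dog": "Animal",
    "Cat": "Animal",
    "Tree": "Plant",
}


def features(concept: str) -> List[str]:
    """Assumed feature set: the concept itself plus all of its ancestors."""
    path: List[str] = []
    node: Optional[str] = concept
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def width(concept: str) -> int:
    """Number of direct children of a concept in the toy taxonomy."""
    return sum(1 for parent in PARENT.values() if parent == concept)


def weight(feature: str) -> float:
    """Assumed width weighting: a node with more children counts for less."""
    return 1.0 / (1 + width(feature))


def similarity(a: str, b: str) -> float:
    """Weighted set similarity scaled by an assumed depth factor."""
    fa, fb = set(features(a)), set(features(b))
    shared = sum(weight(f) for f in sorted(fa & fb))
    total = sum(weight(f) for f in sorted(fa | fb))
    set_similarity = shared / total  # weighted Jaccard over feature sets
    # Assumed depth factor: depth of the deepest shared ancestor
    # divided by the depth of the deeper of the two concepts.
    depth_factor = len(fa & fb) / max(len(fa), len(fb))
    return set_similarity * depth_factor


if __name__ == "__main__":
    for pair in [("Dog", "Dog"), ("Dog", "Cat"), ("Dog", "Tree")]:
        print(pair, round(similarity(*pair), 4))
```

In this sketch, siblings such as Dog and Cat score higher than concepts that share only the root, and a concept compared with itself scores 1.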
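     Third, to address the low degree of automation and the uneven block sizes of current large-scale ontology mapping methods, a partitioning and mapping method for large-scale ontologies based on graph partitioning is proposed. The method first preprocesses the ontologies, converting the large-scale ontologies to be matched into directed acyclic graphs, which turns the ontology partitioning problem into a graph partitioning problem. It then partitions each of the two ontology graphs with the GPO algorithm, which is based on a genetic algorithm, dividing each ontology into a set of ontology blocks. Finally, it identifies the correct block mappings by combining a reference-point (anchor) strategy with a block-structure strategy.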
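The details of the GPO algorithm are not given in the abstract. As a rough illustration of what partitioning a graph with a genetic algorithm can look like, the sketch below evolves block assignments that minimize the number of cut edges plus a size-imbalance penalty. The cost function, the operators (truncation selection, one-point crossover, random mutation), the parameter values, and the toy graph are assumptions; edge direction is ignored when counting cuts.

```python
import random
from typing import Dict, List, Tuple

Edge = Tuple[str, str]

# Hypothetical toy DAG: two dense groups joined by a single edge.
NODES: List[str] = ["A1", "A2", "A3", "B1", "B2", "B3"]
EDGES: List[Edge] = [
    ("A1", "A2"), ("A1", "A3"), ("A2", "A3"),
    ("B1", "B2"), ("B1", "B3"), ("B2", "B3"),
    ("A3", "B1"),
]


def cost(assign: Dict[str, int], edges: List[Edge], k: int) -> int:
    """Cut edges plus the size gap between the largest and smallest block."""
    cut = sum(1 for u, v in edges if assign[u] != assign[v])
    sizes = [0] * k
    for block in assign.values():
        sizes[block] += 1
    return cut + (max(sizes) - min(sizes))


def ga_partition(nodes: List[str], edges: List[Edge], k: int = 2,
                 pop_size: int = 30, generations: int = 60,
                 mutation_rate: float = 0.1, seed: int = 0) -> Dict[str, int]:
    """Return a node -> block assignment found by a small genetic algorithm."""
    rng = random.Random(seed)

    def score(individual: List[int]) -> int:
        return cost(dict(zip(nodes, individual)), edges, k)

    # Each individual is a list of block ids, one per node.
    population = [[rng.randrange(k) for _ in nodes] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score)
        survivors = population[: pop_size // 2]  # keep the better half
        children: List[List[int]] = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            point = rng.randrange(1, len(nodes))  # one-point crossover
            child = p1[:point] + p2[point:]
            # Mutation: occasionally reassign a node to a random block.
            child = [rng.randrange(k) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return dict(zip(nodes, min(population, key=score)))


if __name__ == "__main__":
    blocks = ga_partition(NODES, EDGES)
    print(blocks, "cost =", cost(blocks, EDGES, 2))
```

The imbalance term is one simple way to discourage the uneven block sizes that the method aims to avoid; a real system would likely tune or replace it.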
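     Finally, based on the above research, the large-scale ontology partitioning and mapping system LSOPM is designed and implemented and compared with current large-scale ontology mapping systems. Experimental results show that the system partitions well, maps blocks accurately, and clearly improves both recall and precision.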
Ontology mapping is a key technology for overcoming the bottlenecks in Semantic Web development. However, as the Semantic Web develops, large-scale ontologies with a very large number of concepts and complex relationships between them have appeared. Since large-scale ontologies differ from general ontologies in entity count and mapping difficulty, different mapping methods should be used for them. This thesis focuses on the mapping of large-scale ontologies.
     First, the research background of the thesis is briefly introduced, followed by the state of the art in ontology mapping technology and its development trends.
     Second, because current semantic similarity metrics for a single ontology do not make full use of the ontology's semantic information, a new semantic similarity metric based on the feature sets of concepts is proposed. It first expresses each concept as a set of features according to the concept's position in the ontology hierarchy and introduces a width influence factor as the coefficient of each feature. It then obtains concept similarity by calculating the similarity between two feature sets. Finally, a depth influence factor is introduced and the metric is rewritten in a more understandable form. Theoretical analysis and experimental results show that the metric is simple and that its results are close to human judgment.
     Third, to address the low degree of automation and the non-uniform block sizes of current large-scale ontology mapping methods, a new partitioning and mapping method for large-scale ontologies based on graph partitioning is proposed. It first converts the two ontologies to be matched into DAG structures by preprocessing, which turns the ontology partitioning problem into a graph partitioning problem, and then partitions each of the two ontology graphs into a set of blocks using the GPO algorithm, which is based on a genetic algorithm. Finally, blocks from the two ontologies are matched by combining two strategies, one based on predefined anchors and one based on ontology block structure.
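The anchor-based part of the block matching step can be pictured as counting how many anchors (concept pairs already known to correspond) fall between each pair of blocks. The sketch below is one possible reading, assuming a Dice-style proximity score and a fixed threshold; the block assignments, anchors, and the threshold value are hypothetical, and the block-structure strategy is not reproduced because the abstract does not describe it.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical block assignments (concept -> block id) for two ontologies.
BLOCKS_A: Dict[str, int] = {"Dog": 0, "Cat": 0, "Tree": 1}
BLOCKS_B: Dict[str, int] = {"Hund": 0, "Katze": 0, "Baum": 1}

# Hypothetical anchors; the last one is deliberately noisy.
ANCHORS: List[Tuple[str, str]] = [
    ("Dog", "Hund"), ("Cat", "Katze"), ("Tree", "Baum"), ("Cat", "Baum"),
]


def match_blocks(blocks_a: Dict[str, int], blocks_b: Dict[str, int],
                 anchors: List[Tuple[str, str]],
                 threshold: float = 0.5) -> List[Tuple[int, int, float]]:
    """Return (block_a, block_b, proximity) for pairs at or above threshold."""
    shared: Counter = Counter()  # anchors linking a given block pair
    per_a: Counter = Counter()   # anchors touching each block of ontology A
    per_b: Counter = Counter()   # anchors touching each block of ontology B
    for concept_a, concept_b in anchors:
        if concept_a in blocks_a and concept_b in blocks_b:
            block_a, block_b = blocks_a[concept_a], blocks_b[concept_b]
            shared[(block_a, block_b)] += 1
            per_a[block_a] += 1
            per_b[block_b] += 1
    matches = []
    for (block_a, block_b), count in shared.items():
        # Dice-style proximity: shared anchors relative to all anchors
        # that touch either block.
        proximity = 2 * count / (per_a[block_a] + per_b[block_b])
        if proximity >= threshold:
            matches.append((block_a, block_b, proximity))
    return sorted(matches)


if __name__ == "__main__":
    for block_a, block_b, proximity in match_blocks(BLOCKS_A, BLOCKS_B, ANCHORS):
        print(f"block {block_a} <-> block {block_b}: {proximity:.2f}")
```

With these toy inputs the noisy anchor lowers the scores of the correct pairs but does not create a spurious block mapping, which is the kind of robustness a threshold is meant to provide.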
     Finally, the LSOPM system has been designed and implemented based on the work above and compared with current large-scale ontology mapping systems. Experimental results show that the system produces good partitions and block mappings and clearly improves both precision and recall.
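For readers unfamiliar with the two measures, precision and recall for an alignment are usually computed by comparing the mappings a system finds against a reference alignment. The sketch below uses the standard definitions; the example mappings are invented for illustration and are not results from the thesis.

```python
from typing import Set, Tuple

Mapping = Tuple[str, str]


def evaluate(found: Set[Mapping], reference: Set[Mapping]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of found mappings against a reference."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 3 mappings found, 4 in the reference, 2 in common.
FOUND: Set[Mapping] = {("a", "x"), ("b", "y"), ("c", "z")}
REFERENCE: Set[Mapping] = {("a", "x"), ("b", "y"), ("d", "w"), ("e", "v")}

if __name__ == "__main__":
    p, r, f = evaluate(FOUND, REFERENCE)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Here precision is 2/3 (two of three found mappings are correct) and recall is 1/2 (two of four reference mappings are found).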
