Research on Key Problems in Building the Web of Linked Data
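Abstract
Linked data is a method for publishing and sharing data on the Internet based on semantic technologies. The Semantic Web does more than express data on the Internet in a machine-understandable form; it also requires the data to be linked, building a huge, richly linked Web of Data, so that the computer-assisted process of obtaining information and knowledge becomes more intelligent and fine-grained. What most distinguishes the Web of Data from the traditional Web is the objects being linked and the types of links. In the Web of Data, the linked object changes from an HTML document to a URI that refers to a specific thing, and as the granularity of linked objects decreases, the links between objects change from traditional hypertext links to RDF links carrying explicit semantic information. Linked data can address the overly coarse granularity and missing semantics of information sharing on the current Internet, and promotes the evolution of the traditional web of linked documents into a web of data.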
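As a rough illustration of the contrast drawn above, the following sketch models RDF links as (subject, predicate, object) triples whose members are URIs. It is only a minimal sketch: the `example.org` URIs, the `Triple` type, and the helper `links_from` are hypothetical names introduced here, while the `owl:sameAs` and `foaf:knows` predicate URIs are the standard ones.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement: every member is a URI naming a thing or a relation."""
    subject: str
    predicate: str
    obj: str


# Standard predicate URIs; unlike an untyped hyperlink, the predicate states
# what the link means.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical URIs, used only for illustration.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
ALICE_ELSEWHERE = "http://other.example.org/id/a-smith"

triples = [
    Triple(ALICE, FOAF_KNOWS, BOB),               # a typed link between two people
    Triple(ALICE, OWL_SAME_AS, ALICE_ELSEWHERE),  # a link across two datasets
]


def links_from(subject: str, predicate: str, data: List[Triple]) -> List[str]:
    """Return the objects linked from `subject` by links of type `predicate`."""
    return [t.obj for t in data if t.subject == subject and t.predicate == predicate]


same_as_targets = links_from(ALICE, OWL_SAME_AS, triples)
```

Here `same_as_targets` holds the URIs that another dataset uses for the same thing, which is the kind of link that coreference resolution aims to produce automatically.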
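     Although webs of linked data have been built in several domains using linked data technology, the further development and application of linked data still faces many problems and bottlenecks. First, the scarcity of data sources for the Web of Data leads to slow growth in its scale. Second, the object coreference that is widespread across heterogeneous semantic datasets hinders the automated construction of rich links between datasets. This thesis studies these problems from two angles, the application model and coreference resolution technology, and proposes corresponding solutions. Finally, it studies and implements a coreference resolution system for cloud computing environments.
     The main work and research results of the thesis are as follows: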
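     (1) Cloud computing is introduced into the construction of the Web of Data. A cloud-based application model for linked data is proposed, and the architecture of a linked data cloud platform supporting this new model is designed. The platform provides the services needed for sharing linked data, effectively lowers the technical threshold for ordinary data users to take part in building the Web of Data, encourages data owners to integrate their data into the Web of Data, and supports the building of Internet-wide linked data sharing communities.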
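     (2) Coreference resolution methods based on similarity models are studied. To address the shortcomings of traditional methods in computing property weights and handling multi-valued properties, Rényi entropy is used to describe the distribution of property values, a corresponding weight calculation model is designed, and the calculation of value similarity for multi-valued properties is improved. Experiments on open-source semantic datasets show that the proposed similarity-model-based method achieves higher accuracy.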
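     (3) A coreference resolution method based on Markov logic networks is proposed, which solves the problem of combining property-value similarity information with semantic constraint information in linked data coreference resolution. A model for converting semantic data schemas into Markov logic networks is designed, together with a method for constructing the corresponding ground Markov logic network. In addition, because large datasets are too big to be used directly to construct a ground Markov logic network, an optimized pre-matching method is designed. Pre-matching greatly reduces the size of the ground Markov logic network and speeds up coreference resolution. Experiments show that, on datasets containing rich semantic constraints, the method based on Markov logic networks discovers the coreference relations in the datasets more completely.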
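     (4) The elastic resource scaling mechanism in cloud computing environments is studied, and, based on the two coreference resolution methods above, an elastic coreference resolution engine for cloud computing environments is designed and implemented as the core functional component of the linked data cloud platform. The system automatically selects a suitable coreference resolution method according to whether the dataset contains semantic constraint information, and parallelizes coreference resolution jobs. A dynamic resource scheduling model is designed based on dynamic clusters and a buffer pool, along with the corresponding resource scaling strategy and job scheduling algorithm, which ensures the elasticity of the system. Finally, the engine is deployed on OpenStack, an open-source cloud platform management software, and its performance is tested and verified.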
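To make the idea in item (2) more concrete, the sketch below computes the Rényi entropy of a property's value distribution, using the standard definition H_α = log(Σ p_i^α) / (1 − α), and turns the entropies into weights. The abstract does not give the thesis's actual weight model, so the normalization in `property_weights` (weight proportional to entropy, so that evenly spread, more discriminative properties count more) is purely an assumption, as are the function names and the sample data.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence


def renyi_entropy(values: Sequence[Hashable], alpha: float = 2.0) -> float:
    """Rényi entropy (in nats) of the empirical distribution of `values`."""
    if alpha < 0 or alpha == 1:
        raise ValueError("alpha must be >= 0 and different from 1")
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    total = sum(counts.values())
    power_sum = sum((c / total) ** alpha for c in counts.values())
    return math.log(power_sum) / (1.0 - alpha)


def property_weights(columns: Dict[str, Sequence[Hashable]],
                     alpha: float = 2.0) -> Dict[str, float]:
    """ASSUMED weighting: each property's weight is its share of total entropy."""
    entropies = {name: renyi_entropy(vals, alpha) for name, vals in columns.items()}
    total = sum(entropies.values())
    if total == 0:
        # No property carries information; fall back to uniform weights.
        return {name: 1.0 / len(entropies) for name in entropies}
    return {name: h / total for name, h in entropies.items()}


# Hypothetical sample: `name` has four distinct values, `country` only one.
sample = {
    "name": ["a", "b", "c", "d"],
    "country": ["x", "x", "x", "x"],
}
weights = property_weights(sample)
```

With this sample, `name` receives all of the weight, because a property whose value is identical for every entity cannot help distinguish one entity from another.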
Linked data is a method for publishing and sharing data on the Internet based on semantic technology. The Semantic Web not only expresses the data on the Internet in a machine-understandable way, but also links the data to construct a huge web of data with rich links. The web of data enables people to obtain knowledge and information from the Internet in a more intelligent and fine-grained way. The biggest differences between the web of data and the traditional Internet are the objects being linked and the types of links. In the web of data, the object being linked changes from an HTML document to a URI referring to a specific thing, and hypertext links turn into typed RDF links carrying explicit semantics. Information sharing on the traditional Internet suffers from overly coarse granularity and missing data semantics. Linked data offers a good solution to these problems and will promote the evolution of the traditional Internet of linked documents into the web of data.
     Although webs of data have already been constructed in multiple domains based on linked data technology, the in-depth development and application of linked data still faces many problems and bottlenecks. First, the lack of data sources leads to slow growth in data scale. Second, the object coreference that is widespread in heterogeneous semantic datasets hinders the automated building of rich links between datasets. This thesis focuses on the application model of linked data and the technology of coreference resolution.
     The main work and research results include:
     (1) By introducing cloud computing into the building of the web of data, an application model of linked data based on cloud computing is proposed, and the architecture of a cloud-based linked data platform is designed to support this model. The platform supplies the services needed for sharing linked data, which effectively lowers the technical threshold for ordinary data owners to share data as linked data and supports the building of linked data sharing communities across the Internet.
     (2) A study of coreference resolution methods based on similarity models shows that traditional methods have deficiencies in computing property weights and processing multi-valued properties. A new weight calculation method based on the distribution characteristics of property values, described by Rényi entropy, is proposed, and the similarity calculation between the values of multi-valued properties is improved. Experiments on open-source RDF datasets demonstrate the advantage of the proposed method.
     (3) A coreference resolution method based on Markov logic networks is proposed. A conversion model from the schema of semantic data to a Markov logic network and the corresponding grounding method are designed. In addition, some datasets cannot be used directly to construct the ground Markov logic network because of their large scale, so an optimized pre-matching method is presented to narrow the matching range. Experiments show that the method based on Markov logic networks performs better when processing datasets containing rich semantic constraints.
     (4) By studying the elastic resource scaling mechanism in cloud computing environments, an elastic coreference resolution system for cloud computing environments is designed based on the two methods above. The system automatically selects the appropriate coreference resolution method according to the characteristics of the dataset. Jobs in the system are parallelized to make full use of computing resources. A dynamic resource scheduling model is designed based on dynamic clusters and a buffer pool, and the corresponding elastic resource scaling strategy and job scheduling algorithm are also presented. Finally, the system is deployed on OpenStack, an open-source management software for cloud computing, and its performance is validated through tests.
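The pre-matching step in item (3) is meant to narrow the set of entity pairs that has to be considered before the ground Markov logic network is built. The abstract does not say how the thesis does this, so the sketch below substitutes a common, generic technique, token-based blocking: only entities whose labels share at least one token become candidate pairs. The function name, the `ex:` identifiers, and the labels are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Set, Tuple


def candidate_pairs(labels: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Return pairs of entity ids whose labels share at least one token.

    This is generic token blocking, NOT the thesis's pre-matching method.
    """
    # Inverted index: token -> ids of the entities whose label contains it.
    index = defaultdict(set)
    for entity_id, label in labels.items():
        for token in label.lower().split():
            index[token].add(entity_id)

    # Only entities that fall into the same token block are paired.
    pairs = set()
    for ids in index.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


# Hypothetical labels drawn from two datasets.
labels = {
    "ex:a1": "Tim Berners-Lee",
    "ex:b7": "T. Berners-Lee",
    "ex:c3": "Christian Bizer",
}
pairs = candidate_pairs(labels)
```

In this toy input only one of the three possible pairs survives, so a later and more expensive matching stage would have one pair to examine instead of three.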