Causal Inference in Bayesian Networks
Abstract
Bayesian networks are probabilistic graphical models that organically combine probability theory and graph theory, and have been widely applied in artificial intelligence, statistical learning, and causal identification. How to effectively analyze the causal relationships in data and perform causal inference is a central research problem in data analysis. This thesis studies causal inference in Bayesian networks; the main contributions are as follows:
     First, we analyze the intrinsic relationship between d-separation and ud-separation, the two graphical criteria that characterize independence relations in Bayesian networks, and show that d-separation is a sufficient but not necessary condition for ud-separation. We give a condition under which d-separation and ud-separation hold simultaneously. A topological order of a Bayesian network can be obtained by layer-wise sorting of its nodes, and for certain special Bayesian networks the d-separation and ud-separation sets used to identify causal effects can then be found conveniently.
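As an illustration (not taken from the thesis itself), d-separation can be tested with the classical moralization criterion: restrict the DAG to the ancestral set of X ∪ Y ∪ Z, moralize it (undirect all edges and "marry" co-parents), delete Z, and check whether X and Y are still connected. A minimal, self-contained Python sketch:

```python
from collections import defaultdict, deque

def ancestors(parents, nodes):
    """All nodes with a directed path into `nodes`, plus `nodes` themselves."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        v = stack.pop()
        for p in parents.get(v, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def d_separated(parents, xs, ys, zs):
    """Moralization test for X ⟂ Y | Z in a DAG given as {child: parent-list}."""
    keep = ancestors(parents, set(xs) | set(ys) | set(zs))
    # Build the moral graph of the ancestral subgraph: undirect every edge
    # and add an edge between every pair of co-parents of a retained child.
    adj = defaultdict(set)
    for child in keep:
        ps = [p for p in parents.get(child, ()) if p in keep]
        for p in ps:
            adj[p].add(child); adj[child].add(p)
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                adj[ps[i]].add(ps[j]); adj[ps[j]].add(ps[i])
    # Delete Z, then search for any remaining path from X to Y.
    blocked = set(zs)
    frontier = deque(x for x in xs if x not in blocked)
    seen = set(frontier)
    while frontier:
        v = frontier.popleft()
        if v in ys:
            return False
        for w in adj[v]:
            if w not in seen and w not in blocked:
                seen.add(w); frontier.append(w)
    return True

# Chain X -> Z -> Y: conditioning on Z blocks the path; the collider
# X -> W <- Y is blocked only when W is NOT conditioned on.
chain = {"Z": ["X"], "Y": ["Z"]}
collider = {"W": ["X", "Y"]}
print(d_separated(chain, {"X"}, {"Y"}, {"Z"}))    # True
print(d_separated(chain, {"X"}, {"Y"}, set()))    # False
print(d_separated(collider, {"X"}, {"Y"}, set())) # True
print(d_separated(collider, {"X"}, {"Y"}, {"W"})) # False
```

The co-parent "marrying" step is what makes the collider case come out right: conditioning on W links X and Y in the moral graph even though the directed path through W is blocked.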
     Second, we study the identification and estimation of causal effects between variables in causal graphical models, and analyze the relationship between the front-door criterion and the back-door criterion.
     Finally, we describe the SGS and PC structure-learning algorithms and analyze their stability and complexity. Building on an analysis of the CI and FCI structure-learning algorithms, we use a specially constructed network to show that the FCI algorithm has a defect, and we propose an improvement to FCI.
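The core of the PC algorithm is its adjacency (skeleton) phase: start from the complete undirected graph and delete an edge X–Y whenever some subset S of a current neighborhood renders X and Y conditionally independent. A minimal sketch (the names `pc_skeleton` and `oracle` are illustrative, not from the thesis), using an exact d-separation oracle for the chain X → Z → Y in place of statistical CI tests:

```python
from itertools import combinations

def pc_skeleton(nodes, indep):
    """PC adjacency phase. `indep(x, y, S)` is a CI oracle (True iff X ⟂ Y | S).
    Returns the estimated skeleton as a set of frozenset edges, plus the
    separating sets found for the removed edges."""
    adj = {v: set(nodes) - {v} for v in nodes}
    sepset = {}
    k = 0  # size of the conditioning sets tried in the current pass
    while any(len(adj[x] - {y}) >= k for x in nodes for y in adj[x]):
        for x, y in combinations(nodes, 2):
            if y not in adj[x]:
                continue
            for S in combinations(sorted(adj[x] - {y}), k):
                if indep(x, y, set(S)):
                    adj[x].discard(y); adj[y].discard(x)
                    sepset[frozenset((x, y))] = set(S)
                    break
        k += 1
    edges = {frozenset((x, y)) for x in nodes for y in adj[x]}
    return edges, sepset

def oracle(x, y, S):
    """Exact CI oracle for the chain X -> Z -> Y: only X ⟂ Y | {Z} holds."""
    return {x, y} == {"X", "Y"} and S == {"Z"}

edges, sepset = pc_skeleton(["X", "Y", "Z"], oracle)
print(sorted(sorted(e) for e in edges))  # [['X', 'Z'], ['Y', 'Z']]
```

With the oracle, only the spurious edge X–Y is removed (with separating set {Z}), leaving the true skeleton X–Z–Y; in practice the oracle is replaced by finite-sample CI tests, whose errors are the source of the stability issues analyzed in the thesis.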
