K-dependence Bayesian classifiers for strengthening attribute dependencies
  • Original title (Chinese): 强化属性依赖关系的K阶贝叶斯分类模型
  • Authors: WANG Li-min; JIANG Han-min (School of Computer Science and Technology, Jilin University)
  • Keywords: intelligent reasoning; Bayesian networks; attribute-dependence reinforcement; attribute-order optimization; attribute reduction; redundant attributes
  • Journal: Control and Decision (控制与决策); journal code: KZYC
  • Online publication date: 2018-04-16
  • Year: 2019; Volume: 34; Issue: 06
  • Pages: 116-122 (7 pages)
  • CN: 21-1124/TP
  • Record number: KZYC201906014
  • Funding: National Natural Science Foundation of China (61272209); Natural Science Foundation of Jilin Province (20150101014JC)
  • Language: Chinese
Abstract
When ordering attributes, the classical K-dependence Bayesian (KDB) classifier considers only the direct dependence between the class variable and each decision attribute, and neglects the conditional dependence between them given a decision attribute. To address this problem, the proposed model builds on the KDB structure and, following the principle of fully expressing the dependency information among attributes, strengthens the dependence relationships among attributes and improves the expressiveness of decision attributes for classification. The conditional mutual information between the class variable and the decision attributes is used to optimize the attribute order; an attribute-reduction strategy is incorporated to remove redundant attributes, reducing the over-fitting risk brought by a complex model structure; and a greedy search strategy selects the optimal attributes and constructs the model structure. Experimental results on datasets from the UCI machine learning repository show that, compared with KDB, the proposed model achieves better classification accuracy and outstanding robustness.
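
The abstract builds on the classical KDB algorithm of Sahami [12], which ranks attributes by mutual information with the class and then gives each attribute up to k attribute parents chosen by conditional mutual information given the class. As a minimal illustrative sketch of that baseline (not the authors' implementation; all function and variable names below are our own), the following Python code learns a KDB parent structure from a discrete data matrix. The paper's variant would replace the plain mutual-information ordering with one driven by conditional mutual information and add an attribute-reduction step; those changes are only noted in comments.

import numpy as np
from itertools import product

def mutual_information(x, y):
    # Empirical mutual information I(X;Y) for discrete 1-D arrays.
    mi = 0.0
    for xv, yv in product(np.unique(x), np.unique(y)):
        pxy = np.mean((x == xv) & (y == yv))
        if pxy > 0:
            mi += pxy * np.log(pxy / (np.mean(x == xv) * np.mean(y == yv)))
    return mi

def conditional_mutual_information(x, y, z):
    # Empirical I(X;Y|Z) = sum over z of P(Z=z) * I(X;Y | Z=z).
    return sum(np.mean(z == zv) * mutual_information(x[z == zv], y[z == zv])
               for zv in np.unique(z))

def kdb_structure(X, c, k=2):
    # X: (n_samples, n_attributes) discrete matrix; c: class labels.
    # Returns parents[i], the attribute parents of attribute i
    # (the class is implicitly a parent of every attribute).
    n_attributes = X.shape[1]
    # Classical KDB ranks attributes by I(X_i; C); the paper's variant
    # instead optimizes this order via conditional mutual information
    # and prunes redundant attributes before structure construction.
    order = sorted(range(n_attributes),
                   key=lambda i: mutual_information(X[:, i], c),
                   reverse=True)
    parents, placed = {}, []
    for i in order:
        # Greedily pick up to k parents among already-placed attributes,
        # scored by I(X_i; X_j | C).
        scored = sorted(placed,
                        key=lambda j: conditional_mutual_information(
                            X[:, i], X[:, j], c),
                        reverse=True)
        parents[i] = scored[:k]
        placed.append(i)
    return parents

# Usage example on synthetic discrete data (4 attributes, binary class).
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(200, 4))
c = rng.integers(0, 2, size=200)
print(kdb_structure(X, c, k=2))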
References
[1]Zhao Y,Chen Y,Tu K,et al.Learning Bayesian network structures under incremental construction curricula[J].Neurocomputing,2017,258(1):30-40.
[2]Fan X B,Li X.Network tomography via sparse Bayesian learning[J].IEEE Communications Letters,2017,21(4):781-784.
[3]肖蒙,张友鹏.基于因果影响独立模型的贝叶斯网络参数学习[J].控制与决策,2015,30(6):1007-1013.(Xiao M,Zhang Y P.Parameters learning of Bayesian networks based on independence of causal influence model[J].Control and Decision,2015,30(6):1007-1013.)
[4]Zhao Y,Chen Y,Tu K,et al.Curriculum learning of Bayesian network structures[C].Asian Conf on Machine Learning.Hong Kong:ACML,2015:269-284.
[5]Hagmayer Y.Causal Bayes nets as psychological theories of causal reasoning:Evidence from psychological research[J].Synthese,2016,193(4):1107-1126.
[6]Pearl J.Probabilistic reasoning in intelligent systems:Networks of plausible inference[M].San Francisco:Morgan Kaufmann,1988:383-408.
[7]Zhang H.The optimality of naive Bayes[C].Proc of the 17th Int Florida Artificial Intelligence Research Society Conf.Miami Beach:AAAI Press,2004:562-567.
[8]Jiang L,Li C,Wang S,et al.Deep feature weighting for naive Bayes and its application to text classification[J].Engineering Applications of Artificial Intelligence,2016,52(6):26-39.
[9]Jiang L,Cai Z,Wang D,et al.Improving tree augmented naive Bayes for class probability estimation[J].Knowledge-Based Systems,2012,26(2):239-245.
[10]de Campos C P,Corani G,Scanagatta M,et al.Learning extended tree augmented naive structures[J].Int J of Approximate Reasoning,2016,68(1):153-163.
[11]Rodriguez J D,Perez A,Lozano J A.Sensitivity analysis of k-fold cross validation in prediction error estimation[J].IEEE Trans on Pattern Analysis and Machine Intelligence,2010,32(3):569-575.
[12]Sahami M.Learning limited dependence Bayesian classifiers[C].Proc of the 2nd Int Conf on Knowledge Discovery and Data Mining.Portland:ACM,1996:335-338.
[13]Martínez A M,Webb G I,Chen S,et al.Scalable learning of Bayesian network classifiers[J].J of Machine Learning Research,2016,17(44):1-35.
[14]Naghibi T,Hoffmann S,Pfister B.A semidefinite programming based search strategy for feature selection with mutual information measure[J].IEEE Trans on Pattern Analysis and Machine Intelligence,2015,37(8):1529-1541.
[15]Tabar V R,Mahdavi M,Heidari S,et al.Learning Bayesian network structure using genetic algorithm with consideration of the node ordering via principal component analysis[J].J of the Iranian Statistical Society,2016,15(2):45-62.
[16]Zeng Z,Zhang H,Zhang R,et al.A hybrid feature selection method based on rough conditional mutual information and naive Bayesian classifier[J].ISRN Applied Mathematics,2014(1/2/3/4):36-46.
[17]Wang L,Zhao H.Learning a flexible K-dependence Bayesian classifier from the chain rule of joint probability distribution[J].Entropy,2015,17(6):3766-3786.
[18]Vinh N X,Zhou S,Chan J,et al.Can high-order dependencies improve mutual information based feature selection?[J].Pattern Recognition,2016,53(C):46-58.
[19]Ziarko W.Attribute reduction in the Bayesian version of variable precision rough set model[J].Electronic Notes in Theoretical Computer Science,2003,82(4):263-273.
[20]Shannon C E,Weaver W.The mathematical theory of communication[M].Urbana:University of Illinois Press,1949:81-108.
[21]Chickering D M,Heckerman D,Meek C.Large-sample learning of Bayesian networks is NP-hard[J].J of Machine Learning Research,2004,5(10):1287-1330.
[22]Zhang S,Mccullagh P,Callaghan V.An efficient feature selection method for activity classification[C].Proc of the 10th Int Conf on Intelligent Environments.Shanghai:IEEE,2014:16-22.
[23]Kriegel H P,Kunath P,Renz M.Probabilistic nearest-neighbor query on uncertain objects[C].Proc of the Int Conf on Database Systems for Advanced Applications.Bangkok:Springer,2007:337-348.
[24]Murphy P M,Aha D W.UCI repository of machine learning databases[EB/OL].[2017-10-10].http://archive.ics.uci.edu/ml/datasets.html.
[25]Fayyad U M,Irani K B.Multi-interval discretization of continuous-valued attributes for classification learning[C].Proc of the 13th Int Joint Conf on Artificial Intelligence.Chambéry:Morgan Kaufmann,1993:1022-1029.
[26]Hu B,Rakthanmanon T,Hao Y,et al.Towards discovering the intrinsic cardinality and dimensionality of time series using MDL[M].Berlin Heidelberg:Springer,2013:184-197.
[27]Duan Z,Wang L.K-dependence Bayesian classifier ensemble[J].Entropy,2017,19(12):651.
[28]Li L,Zhang L,Li G,et al.Probabilistic classifier chain inference via Gibbs sampling[C].Proc of the 23rd ACM Int Conf on Information and Knowledge Management.Shanghai:ACM,2014:1855-1858.
