Research on a Two-Stage Classification Model Based on Three-Way Decisions (基于三支决策的二阶段分类模型研究)
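  • Title (English): Research on two-stage classification model based on three-way decisions
  • Authors: 徐久成; 徐战威; 李梦凡; 王楠
  • Authors (English): Xu Jiucheng; Xu Zhanwei; Li Mengfan; Wang Nan (College of Computer and Information Engineering; Henan Technology Research Center for Computational Intelligence and Data Mining, Henan Normal University)
  • Keywords (Chinese): 三支决策; 二阶段; 增量信息; 边界域
  • Keywords (English): three-way decisions; two-stage; incremental information; boundary domain
  • Journal title (Chinese): HNSX
  • Journal title (English): Journal of Henan Normal University (Natural Science Edition)
  • Institutions: College of Computer and Information Engineering, Henan Normal University; Engineering Technology Research Center for Computational Intelligence and Data Mining of Henan Province Universities
  • Publication date: 2019-04-10 13:14
  • Publisher: Journal of Henan Normal University (Natural Science Edition)
  • Year: 2019
  • Issue: v.47; No.206
  • Funding: National Natural Science Foundation of China (61370169; 61402153; 60873104); China Postdoctoral Science Foundation (2016M602247); Key Science and Technology Project of Henan Province (162102210261)
  • Language: Chinese
  • Pages: HNSX201903005
  • Number of pages: 8
  • CN: 03
  • ISSN: 41-1109/N
  • Classification code: 34-40+130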
Abstract
When the boundary domain produced by three-way decisions is divided further, the knowledge available in that domain is insufficient for the division, which leads to low classification accuracy. To address this problem, this paper proposes a new two-stage classification model based on three-way decisions (TWD-TP). In the first stage, the conditional probability of each sample is constructed using Bayes' rule, the required thresholds are obtained by optimizing the loss function, and the data set is then partitioned according to the three-way decision rules. Because three-way decisions partition the data under minimum-risk Bayesian decision theory, the positive and negative domains still contain some misclassified samples. In the second stage, the misclassified samples in the positive and negative domains are identified by class-label index and introduced into the deferred-decision domain as incremental information, forming a reconstructed boundary domain. Finally, a classifier is applied to the objects in the reconstructed boundary domain. The experimental results show that the proposed TWD-TP model not only filters out samples that are prone to misclassification during the three-way partition, but also correctly classifies samples in the reconstructed boundary domain that previously could not be assigned, which further improves classification accuracy.
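The two stages summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the loss function or the optimization procedure, so the thresholds here are assumed to follow the standard closed-form expressions of decision-theoretic rough sets, and the function names, loss values, probabilities, and labels are all hypothetical.

```python
from typing import List, Sequence, Tuple


def thresholds(l_pp: float, l_bp: float, l_np: float,
               l_pn: float, l_bn: float, l_nn: float) -> Tuple[float, float]:
    """Closed-form (alpha, beta) from six loss values.

    Assumption: standard decision-theoretic rough set formulas; the paper
    may obtain its thresholds by a different optimization.
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


def three_way_partition(probs: Sequence[float], alpha: float,
                        beta: float) -> Tuple[List[int], List[int], List[int]]:
    """Stage 1: split sample indices into positive, boundary, negative."""
    pos, bnd, neg = [], [], []
    for i, p in enumerate(probs):
        if p >= alpha:
            pos.append(i)      # accept
        elif p <= beta:
            neg.append(i)      # reject
        else:
            bnd.append(i)      # defer
    return pos, bnd, neg


def reconstruct_boundary(pos: List[int], bnd: List[int], neg: List[int],
                         labels: Sequence[int]
                         ) -> Tuple[List[int], List[int], List[int]]:
    """Stage 2: move misclassified samples (found via their class labels,
    1 = positive, 0 = negative) into the boundary domain."""
    wrong_pos = [i for i in pos if labels[i] != 1]
    wrong_neg = [i for i in neg if labels[i] != 0]
    new_pos = [i for i in pos if labels[i] == 1]
    new_neg = [i for i in neg if labels[i] == 0]
    new_bnd = sorted(bnd + wrong_pos + wrong_neg)
    return new_pos, new_bnd, new_neg


# Hypothetical losses, conditional probabilities, and training labels.
alpha, beta = thresholds(l_pp=0, l_bp=2, l_np=8, l_pn=8, l_bn=2, l_nn=0)
probs = [0.9, 0.8, 0.5, 0.4, 0.2, 0.1]
labels = [1, 0, 1, 0, 0, 1]
pos, bnd, neg = three_way_partition(probs, alpha, beta)
new_pos, new_bnd, new_neg = reconstruct_boundary(pos, bnd, neg, labels)
```

In this sketch the reconstructed boundary domain (`new_bnd`) is what a second-stage classifier would then be trained or evaluated on; the choice of that classifier is left open here because the abstract does not name it.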
References
[1]Yao Y Y.An outline of a theory of three-way decisions[C]//Rough Sets and Current Trends in Computing 8th International Conference.Berlin:Springer,2012:1-17.
[2]王思华,杨桐,段启凡,等.Safety state assessment of grounding grids based on the DT method and rough set theory[J].Power System Protection and Control,2017,45(2):48-54.
[3]Jia X Y,Liao W H,Tang Z M,et al.Minimum cost attribute reduction in decision-theoretic rough set models[J].Information Sciences,2013,219(1):151-167.
[4]Li W,Miao D Q,Wang W L,et al.Hierarchical rough decision theoretic framework for text classification[C]//Proceedings of the 9th International Conference on Cognitive Informatics.Piscataway:IEEE Press,2010:484-489.
[5]Li H X,Zhou X Z,Zhao J B,et al.Cost-sensitive classification based on decision-theoretic rough set model[C]//Proceedings of the 7th International Conference on Rough Sets and Knowledge Technology.Berlin:Springer,2012:379-388.
[6]Yao Y Y.Three-way decisions with probabilistic rough sets[J].Information Sciences,2010,180(3):341-353.
[7]Yao Y Y.The superiority of three-way decisions in probabilistic rough set models[J].Information Sciences,2011,181(6):1080-1096.
[8]Huang J J,Wang J,Yao Y Y,et al.The cost sensitive three-way recommendations by learning pair-wise preferences[J].International Journal of Approximate Reasoning,2017,86:28-40.
[9]仇国芳,王小宁.Research on hierarchical diagnosis and treatment decisions in hospitals based on three-way decisions[J].Journal of Henan Normal University(Natural Science Edition),2018,46(3):106-111.
[10]陈夏艳,陈洁.Community detection algorithm based on cost-sensitive boundary domain processing[J].数码设计,2017,6(3):1672-9129.
[11]Li Y F,Zhang L B,Xu Y,et al.Enhancing binary classification by modeling uncertain boundary in three-way decisions[J].IEEE Transactions on Knowledge and Data Engineering,2017,29(7):1438-1451.
[12]徐久成,刘洋洋.Incremental learning method for support vector machines based on three-way decisions[J].Computer Science,2015,42(6):82-87.
[13]徐存东,张锐,王荣荣,等.Accurate extraction of saline-alkali land information based on an improved support vector machine[J].Journal of Irrigation and Drainage,2018,37(9):62-68.
[14]Li W W,Huang Z Q,Jia X Y.Two-phase classification based on three-way decisions[C]//International Conference on Rough Sets and Knowledge Technology.Berlin:Springer,2013:338-345.
[15]Li W W,Huang Z Q,Li Q.Three-way decisions based software defect prediction[J].Knowledge-Based Systems,2016,91:263-274.
[16]刘盾,姚一豫,李天瑞.Three-way decision-theoretic rough sets[J].Computer Science,2011,38(1):246-250.
[17]Liu D,Li T R,Liang D C.Incorporating logistic regression to decision-theoretic rough sets for classification[J].International Journal of Approximate Reasoning,2013,55(1):197-210.
[18]Yao Y Y.Three-way decisions with probabilistic rough sets[J].Information Sciences,2010,180(3):341-353.
[19]Jia X Y,Shang L.Three-way decisions versus two-way decisions on filtering spam email[M].London:Transactions on Rough Sets,2014:69-91.
