Cost-Sensitive Feature Selection Based on Dynamic Misclassification Costs
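  • English title: Cost-Sensitive Feature Selection Based on Dynamic Misclassification Costs
  • Authors: 牛军霞; 王敬前
  • Authors (English): NIU Junxia; WANG Jingqian; Shaanxi Fashion Engineering University
  • Keywords: cost-sensitive learning; feature selection; dynamic misclassification cost; simulated annealing algorithm
  • Keywords (English): cost-sensitive learning; feature selection; dynamic misclassification cost; simulated annealing algorithm
  • Journal code (Chinese): SMSE
  • Journal title (English): Peak Data Science
  • Institution: Shaanxi Fashion Engineering University (陕西服装工程学院)
  • Publication date: 2016-12-15
  • Publisher: 数码设计
  • Year: 2016
  • Issue: v.5
  • Funding: General Program of the National Natural Science Foundation of China (61379049, 61379089); Natural Science Special Project of the Shaanxi Provincial Department of Education (16JK2015); Special Research Project of Shaanxi Fashion Engineering University (2016KY019)
  • Language: Chinese
  • Pages: SMSE201603005
  • Page count: 6
  • CN: 03
  • ISSN: 11-5292/TP
  • Classification number: 26-31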
Abstract
Current cost-sensitive algorithms generally use static misclassification costs, which have serious limitations: they tend to cause overfitting and cannot reflect the true class distribution of a dataset. To address these shortcomings, this paper first proposes a dynamic misclassification cost mechanism, which adaptively generates four different dynamic misclassification cost functions according to different test costs, with minimal total cost as the objective. Second, the minimal-total-cost feature selection problem is redefined under dynamic misclassification costs. Finally, a simulated annealing algorithm is designed to solve this problem. Experimental results show that the approach effectively selects the optimal misclassification cost, so that the selected feature set has the lowest average total cost.
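The abstract names the ingredients (test costs, a misclassification cost, an average total cost to minimize, and simulated annealing) but not their exact definitions. The following Python sketch shows one way these pieces could fit together. Everything specific in it is an assumption for illustration and is not taken from the paper: the function names `average_total_cost` and `anneal`, the majority-vote classifier over groups of objects with equal feature values, the rule `mis_cost = 10 * sum(test_costs)` standing in for a misclassification cost derived from test costs, the toy data, and the cooling parameters.

```python
import math
import random
from collections import Counter, defaultdict


def average_total_cost(rows, labels, subset, test_costs, mis_cost):
    """Test cost of the selected features plus the average misclassification cost."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in sorted(subset))
        groups[key][label] += 1
    # Objects with identical values on the subset are indistinguishable;
    # predict the majority class in each group and count the rest as errors.
    errors = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    test_cost = sum(test_costs[i] for i in subset)
    return test_cost + mis_cost * errors / len(rows)


def anneal(rows, labels, test_costs, mis_cost,
           steps=2000, t0=10.0, alpha=0.995, seed=0):
    """Simulated annealing over feature subsets, starting from all features."""
    rng = random.Random(seed)
    n = len(test_costs)
    current = frozenset(range(n))
    cur_cost = average_total_cost(rows, labels, current, test_costs, mis_cost)
    best, best_cost = current, cur_cost
    temp = t0
    for _ in range(steps):
        # Neighbor move: add or remove one randomly chosen feature.
        candidate = current ^ frozenset([rng.randrange(n)])
        cand_cost = average_total_cost(rows, labels, candidate, test_costs, mis_cost)
        delta = cand_cost - cur_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, cur_cost = candidate, cand_cost
            if cur_cost < best_cost:
                best, best_cost = current, cur_cost
        temp *= alpha  # geometric cooling schedule (an assumed choice)
    return best, best_cost


# Toy decision system (hypothetical): the class label equals feature 0.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (0, 0, 0), (1, 1, 1)]
labels = [0, 0, 1, 1, 0, 1]
test_costs = [2, 5, 3]
# Assumed stand-in for a misclassification cost that depends on the test costs.
mis_cost = 10 * sum(test_costs)

best_subset, best_cost = anneal(rows, labels, test_costs, mis_cost)
```

Because the search keeps the best subset seen so far and starts from the full feature set, the returned cost can never exceed the cost of using every feature. A different misclassification cost rule only changes the `mis_cost` argument, so several candidate rules could be compared by running `anneal` once per rule.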
