Multi-label classification algorithm based on gravitational model
  • Title: Multi-label classification algorithm based on gravitational model
  • Authors: LI Zhaoyu; WANG Jichao; LEI Man; GONG Qin
  • Affiliation: College of Communication and Information Engineering, Chongqing University of Posts and Telecommunications
  • Keywords: multi-label classification; label correlation; gravitational model; neighbor density; neighbor weight
  • Journal code: JSJY
  • Journal: Journal of Computer Applications (计算机应用)
  • Publication date: 2018-07-25 08:26
  • Publisher: Journal of Computer Applications
  • Year: 2018
  • Issue: v.38; No.338
  • Funding: Program for Changjiang Scholars and Innovative Research Team in University (IRT_16R72)
  • Language: Chinese
  • Page: JSJY201810011
  • Page count: 6
  • CN:10
  • ISSN:51-1307/TP
  • Classification number: 61-65+75
Abstract
To address the problem that multi-label classification algorithms cannot fully exploit the correlations between labels, a Multi-Label classification algorithm Based on Gravitational Model (MLBGM) was proposed, which builds positive and negative label correlation matrices to mine the different correlations among labels. Firstly, all samples in the training set were traversed and the k nearest neighbors of each training sample were obtained to form that sample's neighbor set. Secondly, according to the label distribution of all neighbors in each sample's neighbor set, positive and negative correlation matrices were established for each training sample to capture the correlations between labels. Then, the neighbor density and neighbor weights were calculated for the neighbor set of each training sample. Finally, a multi-label classification model was constructed by calculating the interaction forces between data particles. The experimental results show that, compared with 5 comparison algorithms that do not consider negative correlation between labels, MLBGM reduces Hamming Loss by 15.62% on average, increases MicroF1 by 7.12% on average, and increases Subset Accuracy by 14.88% on average. By making full use of the different correlations between labels, MLBGM obtains effective experimental results and outperforms the comparison algorithms that do not consider negative label correlation.
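The three evaluation metrics named in the abstract have standard definitions in the multi-label literature. The following is a minimal Python sketch of those standard definitions, not code from the paper. The function names, the representation of each sample's labels as a Python set, and the toy data are all assumptions made for illustration.

```python
from typing import List, Set

# Assumption: each sample's labels are represented as a set of label names.
LabelSet = Set[str]


def hamming_loss(y_true: List[LabelSet], y_pred: List[LabelSet], n_labels: int) -> float:
    """Fraction of (sample, label) pairs that are predicted wrongly; lower is better."""
    # The symmetric difference holds labels that are missed or wrongly added.
    wrong = sum(len(t ^ p) for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)


def micro_f1(y_true: List[LabelSet], y_pred: List[LabelSet]) -> float:
    """F1 computed from counts pooled over all samples and labels; higher is better."""
    tp = sum(len(t & p) for t, p in zip(y_true, y_pred))  # correctly predicted labels
    fp = sum(len(p - t) for t, p in zip(y_true, y_pred))  # predicted but not relevant
    fn = sum(len(t - p) for t, p in zip(y_true, y_pred))  # relevant but not predicted
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def subset_accuracy(y_true: List[LabelSet], y_pred: List[LabelSet]) -> float:
    """Fraction of samples whose predicted label set matches exactly; higher is better."""
    exact = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return exact / len(y_true)


# Hypothetical toy data: 3 samples over the label space {"a", "b", "c"}.
toy_true = [{"a", "b"}, {"c"}, {"a", "c"}]
toy_pred = [{"a", "b"}, {"b", "c"}, {"a"}]

toy_hamming = hamming_loss(toy_true, toy_pred, n_labels=3)
toy_micro_f1 = micro_f1(toy_true, toy_pred)
toy_subset_acc = subset_accuracy(toy_true, toy_pred)
```

On the toy data, two of the nine sample-label pairs are wrong, so the Hamming loss is 2/9. The pooled counts are 4 true positives, 1 false positive, and 1 false negative, which gives a micro-averaged F1 of 0.8. Only the first sample is matched exactly, so the subset accuracy is 1/3. Subset accuracy is the strictest of the three because a single wrong label makes the whole sample count as an error.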
