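An Extreme Learning Machine of Multi-label Threshold Adaptation Based on Particle Swarm Optimization (基于粒子群的多标记阈值自适应极限学习机)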

English title: An Extreme Learning Machine of Multi-label Threshold Adaptation Based on Particle Swarm Optimization
Authors: XU Er-qiang (许二戗); YU Hua-long (于化龙)
Keywords: multi-label classification; class imbalance; particle swarm optimization; extreme learning machine; threshold technique
Journal code: WJFZ
Journal: Computer Technology and Development
Institution: School of Computer, Jiangsu University of Science and Technology (江苏科技大学计算机学院)
Publication date: 2018-12-20 15:21
Publisher: Computer Technology and Development (计算机技术与发展)
Year: 2019
Volume/issue: v.29; No.264
Funding: National Natural Science Foundation of China (61305058, 61572242); China Postdoctoral Science Foundation Special Funding Program (2015T80481); China Postdoctoral Science Foundation (2013M540404); Natural Science Foundation of Jiangsu Province (BK20130471); Jiangsu Postdoctoral Foundation (1401037B)
Language: Chinese
File name: WJFZ201904010
Page count: 6
Issue: 04
CN: 61-1450/TP
Pages: 53-58

Abstract

Multi-label learning addresses the case in which a single instance is associated with multiple class labels, while class-imbalance learning studies how unevenly distributed samples affect learning algorithms; both are active topics in machine learning research. Class imbalance is common in multi-label datasets, yet although many multi-label learning algorithms have been proposed, the intrinsic characteristics of the datasets have received little attention. To address this problem, we propose a particle swarm optimization (PSO)-based multi-label threshold adaptation extreme learning machine (MLTA-ELM). The algorithm combines the fast learning speed and good generalization of the extreme learning machine with the adaptive threshold selection strategy used in class-imbalance learning. First, an extreme learning machine is used to build a single-hidden-layer feedforward neural network; next, this model produces preliminary multi-label predictions; then PSO serves as the adaptive threshold selection strategy to obtain the optimal combination of thresholds for deciding label membership. Finally, experiments on 12 benchmark multi-label datasets verify the feasibility and effectiveness of MLTA-ELM. The results show that the algorithm achieves better predictive performance than several other popular methods.
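The three steps named in the abstract (fit an ELM, produce preliminary label scores, search per-label thresholds with PSO) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the fitness measure, activation function, or PSO settings, so the macro-F1 fitness, sigmoid hidden layer, ridge term `C`, 0/1 label encoding, and all function names and hyperparameter values below are assumptions made for the example.

```python
import numpy as np

def _hidden(X, W, b):
    # Sigmoid hidden-layer activations (assumed activation function).
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def train_elm(X, Y, n_hidden=50, C=1.0, rng=None):
    """Fit a single-hidden-layer ELM: random hidden layer, least-squares output weights."""
    rng = np.random.default_rng(rng)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = _hidden(X, W, b)
    # Regularised least squares: beta = (H^T H + I/C)^{-1} H^T Y
    beta = np.linalg.solve(H.T @ H + np.eye(n_hidden) / C, H.T @ Y)
    return W, b, beta

def elm_scores(model, X):
    """Preliminary real-valued outputs, one column per label."""
    W, b, beta = model
    return _hidden(X, W, b) @ beta

def macro_f1(Y_true, Y_pred):
    """Mean of per-label F1 (assumed fitness; labels encoded as 0/1)."""
    tp = np.sum((Y_true == 1) & (Y_pred == 1), axis=0)
    fp = np.sum((Y_true == 0) & (Y_pred == 1), axis=0)
    fn = np.sum((Y_true == 1) & (Y_pred == 0), axis=0)
    denom = 2 * tp + fp + fn
    f1 = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    return float(f1.mean())

def pso_thresholds(scores, Y, n_particles=20, n_iter=50,
                   w=0.7, c1=1.5, c2=1.5, rng=None):
    """Search one threshold per label with a basic global-best PSO."""
    rng = np.random.default_rng(rng)
    lo, hi = scores.min(), scores.max()

    def fitness(t):
        return macro_f1(Y, (scores >= t).astype(int))

    pos = rng.uniform(lo, hi, size=(n_particles, Y.shape[1]))
    pos[0] = 0.5  # seed one particle with the conventional fixed threshold
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p) for p in pos])
    g = pbest[pbest_fit.argmax()].copy()
    g_fit = pbest_fit.max()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = np.clip(pos + vel, lo, hi)
        fit = np.array([fitness(p) for p in pos])
        improved = fit > pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        if pbest_fit.max() > g_fit:
            g_fit = pbest_fit.max()
            g = pbest[pbest_fit.argmax()].copy()
    return g, g_fit
```

In this sketch the thresholds would normally be tuned on training or validation scores and then applied to unseen data as `(elm_scores(model, X_new) >= thresholds).astype(int)`. Seeding one particle at 0.5 means the tuned thresholds never score worse than the fixed-threshold baseline on the data used for the search; whether that gain carries over to held-out data depends on the dataset.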

References

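[1] WU Lei, ZHANG Minling. Multi-label learning algorithm with label-specific features[J]. Journal of Software, 2014, 25(9): 1992-2001.
[2] FU Bo, LIU Ting. Implicit user consumption intent recognition in social media[J]. Journal of Software, 2016, 27(11): 2843-2854.
[3] JI Yaliang, ZHENG Yang. Design and implementation of a medication recommendation system for chronic hepatitis B[J]. Chinese Medical Equipment Journal, 2017, 38(7): 48-51.
[4] HU Haifeng, ZHENG Mao, WU Weijian, et al. Protein function prediction based on multi-instance multi-label transfer learning[J]. Scientia Sinica Informationis, 2017, 47(11): 1538-1550.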
[5] CHARTE F, RIVERA A J, JESUS M J D, et al. MLSMOTE: approaching imbalanced multilabel learning through synthetic instance generation[J]. Knowledge-Based Systems, 2015, 89: 385-397.
[6] CHARTE F, RIVERA A J, JESUS M J D, et al. Addressing imbalance in multilabel classification: measures and random resampling algorithms[J]. Neurocomputing, 2015, 163: 3-16.
[7] ZHANG Minling, LI Yukun, LIU Xuying. Towards class-imbalance aware multi-label learning[C]//International conference on artificial intelligence. Buenos Aires, Argentina: AAAI Press, 2014: 4041-4047.
[8] HUANG Guangbin, ZHU Qinyu, SIEW C K. Extreme learning machine: theory and applications[J]. Neurocomputing, 2006, 70(1-3): 489-501.
[9] HUANG Guangbin, WANG Dianhui, LAN Yuan. Extreme learning machines: a survey[J]. International Journal of Machine Learning & Cybernetics, 2011, 2(2): 107-122.
[10] ZHANG Minling, ZHOU Zhihua. ML-KNN: a lazy learning approach to multi-label learning[J]. Pattern Recognition, 2007, 40(7): 2038-2048.
[11] ZHANG Minling. A novel multi-label lazy learning algorithm[J]. Journal of Computer Research and Development, 2012, 49(11): 2271-2282.
[12] ZHANG Minling, ZHOU Zhihua. Multilabel neural networks with applications to functional genomics and text categorization[J]. IEEE Transactions on Knowledge & Data Engineering, 2006, 18(10): 1338-1351.
[13] TSOUMAKAS G, VLAHAVAS I. Random k-labelsets: an ensemble method for multilabel classification[C]//European conference on machine learning. [s.l.]: [s.n.], 2007: 406-417.
[14] YU Hualong. Class imbalance learning: theory and algorithms[M]. Beijing: Tsinghua University Press, 2017.
[15] ZHOU Zhihua, LIU Xuying. Training cost-sensitive neural networks with methods addressing the class imbalance problem[J]. IEEE Transactions on Knowledge & Data Engineering, 2006, 18(1): 63-77.
[16] YU Hualong, SUN Changyin, YANG Xibei, et al. ODOC-ELM: optimal decision outputs compensation-based extreme learning machine for classifying imbalanced data[J]. Knowledge-Based Systems, 2016, 92: 55-70.
[17] KENNEDY J, EBERHART R. Particle swarm optimization[C]//IEEE international conference on neural networks. [s.l.]: IEEE, 1995: 1942-1948.
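[18] JI Zhen. Particle swarm optimization algorithms and applications (Fundamentals and Applications of Computer Theory series)[M]. Beijing: Science Press, 2009.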