Research on Selective Ensemble Learning Methods Based on Pairwise Diversity Measures
Abstract
Ensemble learning is a machine learning paradigm that uses multiple learners to solve the same problem and can thereby effectively improve the generalization ability of a learning system; it has become a hot research topic in the international machine learning community. Ensemble learning has been applied in many fields, such as planetary exploration, seismic wave analysis, text categorization, biometric recognition, remote sensing information processing, and computer-aided medical diagnosis. However, the technique is still immature: many theoretical problems remain unsolved, and its applications need to be further expanded and deepened.
     It is generally recognized that the key to ensemble learning is to effectively generate individual learners that both generalize well and differ greatly from one another. However, how to measure this diversity effectively, and beyond that how to acquire and exploit it, remains an open problem. Selective ensemble methods choose only a subset of the individual learners produced by an ensemble algorithm, and research results show that such a subset can outperform an ensemble built from all of the individual learners. Selective ensemble has therefore become an important research direction in ensemble learning, and better selection strategies as well as faster algorithms call for further study.
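     The pairwise measures referred to here are of the kind surveyed by Kuncheva and Whitaker; the abstract does not say which one PDMSEN adopts. As a minimal sketch in Python (NumPy assumed), the common disagreement measure, i.e. the fraction of samples on which exactly one of two classifiers is correct, can be computed as follows; the function name and the toy arrays are illustrative only.

import numpy as np

def disagreement(pred_i, pred_j, y_true):
    # Fraction of samples on which exactly one of the two classifiers
    # is correct (the disagreement measure from the diversity literature).
    correct_i = pred_i == y_true
    correct_j = pred_j == y_true
    return float(np.mean(correct_i != correct_j))

y = np.array([0, 1, 1, 0, 1])   # true labels
a = np.array([0, 1, 0, 0, 1])   # predictions of classifier A
b = np.array([0, 0, 1, 0, 1])   # predictions of classifier B
print(disagreement(a, b, y))    # 0.4: A and B err on different samples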
     This thesis takes ensemble learning as its subject. It introduces the related concepts, theoretical basis, and structure of ensemble learning, together with two classical algorithms (Boosting and Bagging). Ensemble learning is then applied to face recognition and compared experimentally with several classifiers commonly used in that field. The thesis then studies selective ensemble learning in depth: it first introduces the basic idea and theoretical basis of selective ensembles, next presents the Genetic Algorithm based Selective ENsemble (GASEN) algorithm and the development of selective ensemble methods, and finally proposes a new selective ensemble algorithm based on pairwise diversity measures (Pairwise Diversity Measures based Selective ENsemble, PDMSEN) together with an improved variant (PDMSEN-b). The main contributions of this thesis are as follows:
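     For orientation, the GASEN idea of Zhou, Wu and Tang can be rendered as the following hedged sketch, not the paper's exact procedure: a genetic algorithm evolves one real-valued weight per trained learner, and learners whose best evolved weight exceeds a threshold lam are kept. The mutation-only evolution and the majority-vote validation fitness below are simplifications; the published algorithm evolves the weights against the estimated generalization error of the weighted ensemble.

import numpy as np

rng = np.random.default_rng(0)

def majority_vote(preds):
    # preds: (n_learners, n_samples) array of 0/1 predictions.
    return (preds.mean(axis=0) >= 0.5).astype(int)

def gasen_select(preds_val, y_val, lam=0.2, pop_size=20, generations=50):
    # Evolve one weight per learner; keep learners whose weight in the
    # fittest individual exceeds the threshold lam.
    n_learners = preds_val.shape[0]
    population = rng.random((pop_size, n_learners))

    def fitness(w):
        mask = w > lam
        if not mask.any():
            return 0.0
        return float(np.mean(majority_vote(preds_val[mask]) == y_val))

    for _ in range(generations):
        scores = np.array([fitness(w) for w in population])
        parents = population[np.argsort(scores)[-pop_size // 2:]]  # best half
        children = np.clip(parents + rng.normal(0.0, 0.1, parents.shape), 0, 1)
        population = np.vstack([parents, children])

    best = max(population, key=fitness)
    return best > lam    # boolean mask of the selected learners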
     (1) Ensemble learning (Boosting over RBF neural networks) is applied to face recognition and compared with several classifiers widely used in the field. Experimental results show that the ensemble learner and the SVM achieved the best performance in these experiments and are the most suitable feature classifiers for face recognition, providing a reference for choosing an appropriate classifier in future face recognition work.
     (2) To improve both the diversity and the accuracy of the learners, a Pairwise Diversity Measures based Selective Ensemble (PDMSEN) algorithm is proposed. An improved algorithm (PDMSEN-b) is also studied that further raises the training speed and supports parallel computation. Finally, using BP neural networks as base learners, experiments on UCI data sets compare the proposed methods with the Bagging and GASEN algorithms. The results show that PDMSEN-b matches GASEN's performance while training much faster.
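     The abstract states PDMSEN's properties (pairwise diversity measures, higher speed, parallelizable scoring) but not its procedure, so the following is only a hypothetical sketch in the same spirit: a greedy selection that seeds the subset with the most accurate learner and repeatedly adds the candidate with the largest mean pairwise disagreement to the learners already chosen. The name greedy_diverse_subset and the fixed subset size k are illustrative assumptions, not the thesis's algorithm.

import numpy as np

def greedy_diverse_subset(preds_val, y_val, k):
    # correct[i, s] is True where learner i classifies sample s correctly.
    correct = preds_val == y_val
    # Seed with the single most accurate learner on the validation set.
    selected = [int(np.argmax(correct.mean(axis=1)))]
    while len(selected) < k:
        rest = [i for i in range(len(preds_val)) if i not in selected]
        # Mean pairwise disagreement of each candidate with the subset;
        # correct[i] != correct[j] marks samples where exactly one is right.
        score = [np.mean([np.mean(correct[i] != correct[j])
                          for j in selected]) for i in rest]
        selected.append(rest[int(np.argmax(score))])
    return selected

     Because each candidate's score depends only on pairwise statistics, the scores within one step can be computed independently of each other, which is consistent with the parallel-computation claim made for PDMSEN-b.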
