Multi-View Construction and Its Application in Single-Task and Multi-Task Learning
Abstract
Depending on how it is acquired or where it comes from, a pattern can be composed of several distinct sets of information, i.e., several mutually different attribute sets. In a given application domain, observing from different visual angles often yields different visual effects; that is, the same pattern can give rise to multiple groups of information from different orientations or at different levels. Data under different views may carry strongly complementary information, and exploiting them fully allows the problem at hand to be characterized from multiple angles.
     This idea is fully embodied in semi-supervised learning. In semi-supervised learning there are a large number of unlabeled samples and only a few labeled samples, and the unlabeled data must be used to assist the labeled data in obtaining a learner with good performance. Co-training is a simple and effective semi-supervised learning method based on multiple views: the different views yield multiple learners, and during learning each learner selects the unlabeled samples it labels with the highest confidence and passes them to the other learner, which uses them to update its model. The algorithm has since been extended in both breadth and depth, and many new multi-view semi-supervised learning algorithms have been proposed under the co-training setting. Their successful applications demonstrate the feasibility of multi-view learning. In real applications, however, many data sets contain only a single-view description, which greatly limits the applicability of multi-view methods. Studying ways to generate multiple view descriptions from the original data is therefore very worthwhile.
     First, this thesis uses a genetic algorithm for feature selection: guided by the target of the final model, feature subsets with clearly discriminative characteristics are selected from the known feature set and used to form multiple views. Because these views consist of features selected from the original data that are relevant or important to the output, they provide a guarantee for constructing sufficient learners. The experimental results, and a comparison with forming views by arbitrarily splitting the feature set, show that forming multiple views via feature selection is effective and yields fairly sufficient views.
     Second, to obtain better generalization, this thesis replaces traditional single-task learning with multi-task learning. Multi-task learning is applied to traffic flow forecasting and face recognition, and the experimental results confirm its ability to improve the learner's generalization.
     Finally, multi-task learning is applied to multi-view semi-supervised learning to form multi-view multi-task learning. The hope is that, while several tasks are learned in parallel, the domain information contained in the tasks' training signals can serve as an inductive bias and thus improve generalization. Experiments show that introducing multi-task learning does improve the learner's generalization.
Depending on how they are obtained, patterns can be described by several different attribute sets, each of which can be taken as one view. In a specific application area, looking at a given pattern from different viewpoints yields different effects, providing information from multiple directions or levels. The information carried by different views of the data may be complementary; making full use of it allows the problem to be characterized from several angles at once.
     This idea is exploited thoroughly in semi-supervised learning, which lies between supervised and unsupervised learning. In semi-supervised learning there is a large pool of unlabeled data and comparatively little labeled data, and the unlabeled data are used to help the labeled data improve the learner's performance. Co-training, a simple and effective semi-supervised method, works under a two-view setting: two separate classifiers are first trained on the labeled data, one per view, and then the unlabeled examples each classifier labels most confidently are added to the labeled set, so that the labeled training data grow. Many other novel multi-view semi-supervised learning algorithms have since been proposed under the co-training setting, and their successful applications demonstrate the feasibility of multi-view learning. In real-world applications, however, data sets often have only one view, which limits the approach. It is therefore meaningful to propose an effective method that generates multiple views from the original data.
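The co-training loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the nearest-centroid base classifier and the distance-based confidence score are stand-ins for whatever base learners are actually used.

```python
# Minimal co-training sketch: two views of the same examples, two
# classifiers, each adding its most confident unlabeled examples to the
# shared labeled pool. Centroid classifier and confidence are assumed.
import math

def train_centroid(X, y):
    """Fit one centroid per class; return {label: centroid}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict_with_confidence(model, x):
    """Return (label, confidence); confidence = -distance to centroid."""
    label, centroid = min(model.items(), key=lambda kv: math.dist(kv[1], x))
    return label, -math.dist(centroid, x)

def co_train(L1, L2, y, U1, U2, rounds=5, per_round=1):
    """L1/L2: labeled data in views 1/2; U1/U2: the unlabeled pool."""
    L1, L2, y = list(L1), list(L2), list(y)
    U = list(range(len(U1)))          # indices still unlabeled
    for _ in range(rounds):
        if not U:
            break
        m1 = train_centroid(L1, y)
        m2 = train_centroid(L2, y)
        for model, view in ((m1, U1), (m2, U2)):
            if not U:
                break
            scored = [(predict_with_confidence(model, view[i]), i) for i in U]
            scored.sort(key=lambda t: t[0][1], reverse=True)
            # Most confident predictions join the shared labeled set.
            for (label, _), i in scored[:per_round]:
                L1.append(U1[i]); L2.append(U2[i]); y.append(label)
                U.remove(i)
    return train_centroid(L1, y), train_centroid(L2, y)
```

Each round retrains both classifiers on the enlarged labeled set, which is how the two views bootstrap each other.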
     Firstly, this thesis proposes a feature selection approach based on a genetic algorithm to generate multiple views artificially. Feature subsets that clearly discriminate the target of the designed model are selected to serve as the views. The effectiveness of this approach is evaluated on several different classification problems and compared against a random feature split method.
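As a rough illustration of the idea, the sketch below evolves a binary feature mask with a simple genetic algorithm (truncation selection, one-point crossover, point mutation) and scores a mask by the mean class separation of its chosen features. Both the fitness function and the GA parameters are assumptions for illustration, not the ones used in the thesis.

```python
# GA feature selection sketch: individuals are binary masks over the
# feature indices; the surviving mask defines one view.
import random

def fitness(mask, X, y):
    """Mean class separation of the chosen features (assumes 0/1 labels,
    both classes present). Higher means more discriminative features."""
    chosen = [j for j, bit in enumerate(mask) if bit]
    if not chosen:
        return 0.0
    score = 0.0
    for j in chosen:
        a = [x[j] for x, label in zip(X, y) if label == 0]
        b = [x[j] for x, label in zip(X, y) if label == 1]
        score += abs(sum(a) / len(a) - sum(b) / len(b))
    return score / len(chosen)

def select_view(X, y, pop_size=20, generations=30, seed=0):
    """Evolve a feature mask; return the selected feature indices."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        parents = pop[:pop_size // 2]          # elitist truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child[rng.randrange(n)] ^= 1       # point mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, X, y))
    return [j for j, bit in enumerate(best) if bit]
```

Running `select_view` repeatedly with different targets or seeds yields several feature subsets, each usable as one view for the co-training setting.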
     Secondly, in order to gain better generalization, multi-task learning is used in place of the traditional single-task method. In this thesis, applications to traffic flow forecasting and face recognition illustrate that multi-task learning has the ability to improve generalization.
     Finally, the proposed method is combined with multi-task learning to form multi-view multi-task learning. In multi-task learning, the domain-specific information contained in the training signals of other tasks is used to improve generalization; in effect, the training signals of the extra tasks serve as an inductive bias. The sufficiency of the multi-view multi-task learning method is also shown on several different classification problems, where encouraging experimental results are observed.
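One common way to realize this coupling, sketched below under stated assumptions, is regularized multi-task learning: each task's weight is a shared component plus a penalized task-specific offset, so the other tasks' training signals pull every task toward a common solution. The 1-D linear model, the data, and the hyper-parameters here are illustrative, not the thesis's setup.

```python
# Regularized multi-task learning sketch: w_t = w0 + v_t, with the
# task-specific offsets v_t penalized so tasks share information via w0.

def multitask_fit(tasks, lam=1.0, lr=0.01, steps=2000):
    """tasks: list of (xs, ys) pairs, one pair per task (1-D inputs).
    Minimizes sum of squared errors plus (lam/2)*sum_t v_t**2 by
    gradient descent; returns (w0, [v_t]) with task weight w0 + v_t."""
    T = len(tasks)
    w0, v = 0.0, [0.0] * T
    for _ in range(steps):
        g0, gv = 0.0, [0.0] * T
        for t, (xs, ys) in enumerate(tasks):
            for x, y in zip(xs, ys):
                err = (w0 + v[t]) * x - y      # squared-error gradient
                g0 += err * x
                gv[t] += err * x
            gv[t] += lam * v[t]                # pull offsets toward zero
        w0 -= lr * g0 / T
        for t in range(T):
            v[t] -= lr * gv[t]
    return w0, v
```

Because the offsets are penalized, each task's weight is shrunk toward the shared component `w0`; that shrinkage is exactly the inductive bias supplied by the other tasks' training signals.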
