Research on Supervised and Semi-supervised Canonical Correlation Analysis and Its Applications
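Abstract
Canonical correlation analysis (CCA) is a classical multivariate data analysis method that extracts features by studying the correlation between two sets of variables, and in recent years it has come into wide use in pattern recognition, machine learning, and other fields. However, on the one hand, CCA is a global linear feature extraction method: it cannot describe nonlinear problems well and lacks robustness to local variations. On the other hand, with the rise of semi-supervised learning, semi-supervised techniques can be introduced into CCA to make better use of prior information. This thesis extends CCA in these two directions, with the aim of using the extended CCA models to solve classification problems in machine learning and pattern recognition. The novel contributions of this thesis are summarized as follows:
     (1) To address CCA's inability to describe nonlinear problems well, a new supervised learning method, Local Discrimination CCA (LDCCA), is proposed. The method introduces the class information of the samples and considers both the local correlations between samples of the same class and the local correlations between samples of different classes, together with their effect on classification. Using the kernel trick, a kernelized version (KLDCCA) is further proposed to handle more complex, linearly inseparable problems. The features extracted by LDCCA and KLDCCA maximize the correlation between same-class samples while minimizing the correlation between different-class samples, which benefits pattern classification. Experimental results on an artificial dataset, a multiple-feature handwriting dataset, and face datasets (Yale and AR) show that both methods achieve high recognition performance.
     (2) By introducing supervision information given in the form of pairwise constraints, a semi-supervised canonical correlation analysis algorithm (Semi-CCA) is proposed. In addition to a large number of unlabeled samples, the algorithm takes into account pairwise constraints, that is, knowledge that two samples belong to the same class (positive constraints) or do not belong to the same class (negative constraints); the relative importance of the two kinds of constraints is also verified. To handle the widespread nonlinear problems, Semi-CCA is also kernelized, giving KSemi-CCA. Experimental results on multiple datasets show that Semi-CCA and KSemi-CCA can effectively use a small amount of supervision information to improve classification performance.
     (3) Building on an in-depth study of activity recognition, the proposed semi-supervised CCA and Local Discrimination CCA, together with their kernelized versions, are applied to simple activity recognition. The experimental results show that the features extracted by these algorithms contribute substantially to the final activity classification.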
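For orientation, the classical CCA criterion that the extensions in this abstract start from is commonly written as below. The notation is an assumption made here for illustration and is not taken from the thesis: $C_{xx}$ and $C_{yy}$ denote the within-set covariance matrices of the two views, $C_{xy}$ their cross-covariance, and $w_x$, $w_y$ the projection directions.

```latex
\rho \;=\; \max_{w_x,\, w_y}\;
\frac{w_x^{\top} C_{xy}\, w_y}
     {\sqrt{w_x^{\top} C_{xx}\, w_x}\;\sqrt{w_y^{\top} C_{yy}\, w_y}}
\quad\Longleftrightarrow\quad
\max_{w_x,\, w_y}\; w_x^{\top} C_{xy}\, w_y
\;\;\text{s.t.}\;\;
w_x^{\top} C_{xx}\, w_x = 1,\;\;
w_y^{\top} C_{yy}\, w_y = 1 .
```

Solving this constrained problem leads to the generalized eigenvalue problem $C_{xy} C_{yy}^{-1} C_{yx}\, w_x = \rho^{2}\, C_{xx}\, w_x$. Because the criterion is built only from global second-order statistics, it is linear and global, which is the limitation that the local and kernelized variants described above are meant to address.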
As a classic multivariate data analysis method, canonical correlation analysis (CCA) extracts features by studying the correlation between two groups of variables and has recently been widely applied in pattern recognition and machine learning. However, on the one hand, as a global linear method for feature extraction, CCA cannot handle nonlinear problems well and lacks robustness to local variations. On the other hand, with the rise of semi-supervised learning, the original CCA model can be extended to the semi-supervised case to make better use of prior information. This thesis extends the original CCA model in these two directions for classification problems in machine learning and pattern recognition. The main contributions of this thesis are summarized as follows:
     (1) To overcome CCA's inability to describe nonlinear problems, we propose a new supervised learning method called Local Discrimination CCA (LDCCA), which combines local properties with discrimination between classes. Besides the correlations between sample pairs considered in the original CCA, LDCCA also uses correlations between samples and their local neighborhoods. A kernelized LDCCA (KLDCCA) is also proposed to extract nonlinear features from datasets. In both LDCCA and KLDCCA, effective discrimination is achieved by maximizing local within-class correlations while minimizing local between-class correlations. Experimental results on an artificial dataset, multiple feature databases, and facial databases including ORL, Yale, and AR validate the effectiveness of the proposed methods.
     (2) A semi-supervised canonical correlation analysis algorithm called Semi-CCA is developed, which incorporates supervision information in the form of pairwise constraints into CCA. In this setting, besides abundant unlabeled examples, domain knowledge is available in the form of pairwise constraints that specify whether a pair of examples belongs to the same class (must-link constraints) or not (cannot-link constraints). The relative importance of must-link and cannot-link constraints is also validated. To solve nonlinear problems, we propose a kernelized version, KSemi-CCA. Experimental results on multiple datasets show that the proposed Semi-CCA can effectively improve classifier performance using only a small amount of supervision information.
     (3) Based on a study of activity recognition, we use the four proposed methods (Semi-CCA, KSemi-CCA, LDCCA, KLDCCA) as feature extraction methods for activity recognition. The experimental results indicate that the features extracted by our methods perform well on the final activity recognition problem.
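The must-link and cannot-link constraints mentioned in item (2) can be made concrete with a small sketch. It assumes, purely for illustration, that the constraints are derived from a handful of labeled samples; the abstract does not say how the constraints were obtained, and the function name `pairwise_constraints`, the sample indices, and the labels `"walk"` and `"run"` are all hypothetical.

```python
from itertools import combinations


def pairwise_constraints(labels):
    """Derive pairwise constraints from a few labeled samples.

    labels: dict mapping a sample index to its class label
            (hypothetical input format, not from the thesis).
    Returns (must_link, cannot_link), two lists of index pairs.
    """
    must_link, cannot_link = [], []
    # Visit every unordered pair of labeled samples exactly once.
    for (i, yi), (j, yj) in combinations(sorted(labels.items()), 2):
        if yi == yj:
            must_link.append((i, j))    # same class: must-link
        else:
            cannot_link.append((i, j))  # different classes: cannot-link
    return must_link, cannot_link


# Four labeled samples out of a larger, otherwise unlabeled set.
known = {0: "walk", 3: "run", 7: "walk", 9: "run"}
must_link, cannot_link = pairwise_constraints(known)
print(must_link)    # [(0, 7), (3, 9)]
print(cannot_link)  # [(0, 3), (0, 9), (3, 7), (7, 9)]
```

In a semi-supervised setting of this kind, only such index pairs are passed to the learner; the class labels themselves are never needed, which is why pairwise constraints count as a weaker form of supervision than full labels.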
