Research and Applications of Enhanced Canonical Correlation Analysis
Abstract

Machine learning builds models of a problem domain from finite observational data, and it relies on data-analysis tools to uncover the relationships hidden in the observations. Canonical correlation analysis (CCA) is a powerful tool for studying the correlations between two sets of variables. Since its introduction as a multivariate analysis method in 1936, CCA has been widely applied in regression modeling, image analysis and processing, computer vision, pattern recognition, bioinformatics, and related fields, and has attracted growing attention from researchers in these areas; the rise of multimodal recognition techniques, in turn, offers new opportunities for CCA-based recognition methods.
     This dissertation takes the CCA model as its object of study and develops enhanced CCA models for the two principal learning problems in machine learning: pattern recognition and regression modeling. The main contributions are summarized as follows:
     (1) Locality preserving CCA (LPCCA) is proposed as a nonlinear extension of CCA that decomposes a globally nonlinear problem into a series of locally linear sub-problems, so as to handle the nonlinear correlations that arise widely in practice. The method is validated through experiments on data visualization and pose estimation.
     (2) A unified framework for unimodal recognition with CCA is constructed, revealing the mechanism behind the equivalence between class-label-based CCA and linear discriminant analysis (LDA). On this basis, a soft-label CCA based on the sample distribution is proposed, which assigns each sample its own label rather than a shared class label; this breaks the equivalence and improves recognition performance.
     (3) A novel supervised learning method, discriminative CCA (DCCA), is proposed. It incorporates class information and accounts for both within-class and between-class correlations among samples and their impact on classification. Using the kernel trick, a kernelized discriminative CCA (KDCCA) is further proposed to handle linearly inseparable cases. Experiments show that both methods outperform related methods in recognition performance.
     (4) Building on DCCA, a discriminative CCA with missing samples (DCCAM) is proposed to cope with samples lost for various practical reasons. Besides inheriting the advantages of DCCA, DCCAM offers good recognition performance, savings in time and memory, and relative insensitivity to the number of missing samples.
     (5) CCA treats correlation as a similarity measure between samples. Extending this idea to principal component analysis (PCA) yields a correlation-based pseudo-PCA (p-PCA); the idea is further generalized to the recently developed family of matrix-pattern-based PCA algorithms, turning them into supervised methods. In addition, a class-information-incorporated PCA is proposed that leaves the original PCA framework unchanged. Experiments show that both supervised PCA variants achieve good classification results.
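The enhanced models above all build on the classical linear CCA of Hotelling (1936). As a minimal sketch of that base model only (not of the dissertation's extensions; the function names and the ridge term `reg` are illustrative assumptions), the canonical directions can be obtained from an SVD of the whitened cross-covariance of the two views:

```python
import numpy as np

def _inv_sqrt(C):
    """Inverse matrix square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T

def cca(X, Y, n_components=1, reg=1e-8):
    """Classical linear CCA: directions Wx, Wy maximizing corr(X Wx, Y Wy).

    X: (n, p) and Y: (n, q) hold paired observations in their rows;
    reg is a small ridge term keeping the covariance blocks invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whiten each view; the singular values of the whitened
    # cross-covariance are the canonical correlations.
    Rx, Ry = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Rx @ Cxy @ Ry)
    Wx = Rx @ U[:, :n_components]
    Wy = Ry @ Vt.T[:, :n_components]
    return Wx, Wy, s[:n_components]

# Two views sharing one latent variable z: the first canonical
# correlation should be close to 1, the remaining ones near 0.
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 2))])
Y = np.hstack([z + 0.1 * rng.normal(size=(500, 1)), rng.normal(size=(500, 3))])
Wx, Wy, corrs = cca(X, Y, n_components=2)
```

Replacing one view with one-hot class-label vectors in such a sketch yields LDA-like discriminant directions, which is the equivalence the dissertation's second contribution analyzes and then deliberately breaks with soft labels.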
