Kernel discriminant analysis and clustering with parsimonious Gaussian process models
  • Authors: C. Bouveyron; M. Fauvel; S. Girard
  • Keywords: Model-based classification; Kernel methods; Gaussian process; Parsimonious models; Mixed data
  • Journal: Statistics and Computing
  • Publication date: November 2015
  • Volume: 25
  • Issue: 6
  • Pages: 1143-1162
  • Full-text size: 2,827 KB
  • Author affiliations: C. Bouveyron (1)
    M. Fauvel (2)
    S. Girard (3)

    1. Laboratoire MAP5, UMR 8145, Université Paris Descartes & Sorbonne Paris Cité, Paris, France
    2. Laboratoire DYNAFOR, UMR 1201, INRA & Université de Toulouse, Toulouse, France
    3. Equipe MISTIS, INRIA Grenoble Rhône-Alpes & LJK, Grenoble Cedex, France
  • Journal category: Mathematics and Statistics
  • Journal subjects: Statistics
    Statistics Computing and Software
    Numeric Computing
    Mathematics
    Artificial Intelligence and Robotics
  • Publisher: Springer Netherlands
  • ISSN: 1573-1375
Abstract
This work presents a family of parsimonious Gaussian process models that make it possible to build, from a finite sample, a model-based classifier in an infinite-dimensional space. The proposed parsimonious models are obtained by constraining the eigen-decomposition of the Gaussian processes modeling each class. In particular, this allows the use of non-linear mapping functions that project the observations into infinite-dimensional spaces. It is also demonstrated that the classifier can be built directly from the observation space through a kernel function. The proposed classification method is thus able to classify data of various types, such as categorical data, functional data, or networks. Furthermore, mixed data can be classified by combining different kernels. The methodology is also extended to the unsupervised classification case, and an EM algorithm is derived for the inference. Experimental results on various data sets demonstrate the effectiveness of the proposed method. A Matlab toolbox implementing the proposed classification methods is provided as supplementary material.
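
The idea of handling mixed data by combining kernels can be illustrated with a minimal Python sketch. It is not taken from the paper or its Matlab toolbox: the function names (`rbf_kernel`, `overlap_kernel`, `combined_kernel`), the weight `alpha`, and the bandwidth `gamma` are all hypothetical choices made for illustration. The sketch relies only on the standard fact that a convex combination of positive semi-definite kernels is itself a positive semi-definite kernel.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) Gram matrix for continuous features, shape (n, n)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))

def overlap_kernel(C):
    """Fraction of categorical features on which two observations agree."""
    return (C[:, None, :] == C[None, :, :]).mean(axis=2)

def combined_kernel(X, C, alpha=0.5, gamma=1.0):
    """Convex combination of a continuous and a categorical kernel.

    alpha in [0, 1] is an illustrative weight, assumed here rather than
    taken from the paper.
    """
    return alpha * rbf_kernel(X, gamma) + (1.0 - alpha) * overlap_kernel(C)

# Toy mixed data: two continuous and two categorical features per observation.
X = np.array([[0.0, 1.0], [0.1, 0.9], [3.0, -2.0]])
C = np.array([["a", "x"], ["a", "y"], ["b", "y"]])
K = combined_kernel(X, C)
```

The resulting Gram matrix `K` could then be passed to any kernel-based classifier or clustering routine in place of a kernel computed from a single data type.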
