A graph regularized dimension reduction method for out-of-sample data


Authors: Mengfan Tang^a ; Feiping Nie^b (feipingnie@gmail.com) ; Ramesh Jain^a
Keywords: Dimension reduction ; Out-of-sample data ; Graph regularized PCA ; Manifold learning ; Clustering
Journal: Neurocomputing
Publication year: 2017
Publication date: 15 February 2017
Volume: 225
Issue: Complete
Pages: 58-63

Abstract

Among the various dimension reduction techniques, Principal Component Analysis (PCA) is designed for vector data, whereas Laplacian embedding is often employed for embedding graph data. Graph regularized PCA, a combination of both techniques, has also been developed to help learn a low dimensional representation of vector data by incorporating graph data. However, these approaches face the out-of-sample problem: each time new data is added, it must be combined with the old data and fed back into the algorithm to recompute the eigenvectors, at enormous computational cost. To address this problem, we extend graph regularized PCA to graph regularized linear regression PCA (grlrPCA). grlrPCA eliminates the redundant computation on the old data by first learning a linear function and then applying it directly to the new data to reduce its dimension. Furthermore, we derive an efficient iterative algorithm to solve the grlrPCA optimization problem and show that grlrPCA is closely related to unsupervised Linear Discriminant Analysis in the limit of an infinite regularization parameter. Evaluations with multiple metrics on seven realistic datasets demonstrate that grlrPCA outperforms established unsupervised dimension reduction algorithms.
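
The out-of-sample idea described in the abstract (learn a linear function once, then apply it to new data without recomputing eigenvectors) can be illustrated with a minimal sketch. This is not the grlrPCA algorithm itself, whose objective and iterative solver are not given in this record. It swaps in a simpler two-step stand-in: a Laplacian embedding of the old data, followed by ridge regression from the old vectors to that embedding. The function names `fit_linear_embedding` and `embed`, the `ridge` parameter, the Gaussian-kernel graph, and the random data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_embedding(X_old, A, k, ridge=1e-3):
    """Hypothetical helper: learn a linear map from d-dim vectors to k dims.

    X_old: (n, d) data matrix; A: (n, n) symmetric, nonnegative adjacency.
    Stand-in technique (NOT the paper's grlrPCA): Laplacian embedding
    followed by ridge regression onto that embedding.
    """
    # Unnormalized graph Laplacian L = D - A.
    L = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues in ascending order; skip the first
    # eigenvector, which is constant when the graph is connected.
    _, vecs = np.linalg.eigh(L)
    Y = vecs[:, 1:k + 1]
    # Center the data, then solve (Xc^T Xc + ridge * I) W = Xc^T Y.
    mean = X_old.mean(axis=0)
    Xc = X_old - mean
    d = Xc.shape[1]
    W = np.linalg.solve(Xc.T @ Xc + ridge * np.eye(d), Xc.T @ Y)
    return mean, W

def embed(X, mean, W):
    """Apply the learned linear function; no eigen-decomposition needed."""
    return (X - mean) @ W

# Illustrative random data (assumption): 30 old points, 4 new points, d = 5.
rng = np.random.default_rng(0)
X_old = rng.normal(size=(30, 5))
X_new = rng.normal(size=(4, 5))

# Gaussian-kernel graph on the old data (assumption), with no self-loops.
D2 = ((X_old[:, None, :] - X_old[None, :, :]) ** 2).sum(axis=-1)
A = np.exp(-D2 / D2.mean())
np.fill_diagonal(A, 0.0)

mean, W = fit_linear_embedding(X_old, A, k=2)
Z_new = embed(X_new, mean, W)  # new data reduced to 2 dimensions
```

The eigen-decomposition runs once, on the old data only. Embedding each new batch is then a single matrix product, which is the cost saving the abstract attributes to learning a linear function.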
