A comparative investigation on subspace dimension determination
Abstract
It is well known that constrained Hebbian self-organization on multiple linear neural units leads to the same k-dimensional subspace as the one spanned by the first k principal components. Not only has the batch PCA algorithm been widely applied in various fields since the 1930s, but a variety of adaptive algorithms have also been proposed in the past two decades. However, most studies assume a known dimension k or determine it heuristically, even though a number of model selection criteria exist in the statistics literature. Recently, criteria have also been obtained under the framework of Bayesian Ying–Yang (BYY) harmony learning. This paper further investigates the BYY criteria in comparison with typical existing criteria, including Akaike's information criterion (AIC), the consistent Akaike's information criterion (CAIC), the Bayesian inference criterion (BIC), and the cross-validation (CV) criterion. The comparison is made via experiments on simulated data sets of different sample sizes, noise variances, data space dimensions, and subspace dimensions, and on two real data sets, one from an air pollution problem and one from sport track records. The experiments show that BIC outperforms AIC, CAIC, and CV, while the BYY criteria are either comparable with or better than BIC. BYY harmony learning is therefore the preferred tool for subspace dimension determination, especially considering that it can determine the appropriate subspace dimension k automatically while learning the principal subspace. In contrast, selection of k by BIC, AIC, CAIC, or CV has to be made in a second stage, based on a set of candidate subspaces of different dimensions that must first be obtained in a first stage of learning.
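The two-stage selection described in the abstract can be illustrated with a minimal sketch. It assumes the probabilistic PCA likelihood (isotropic Gaussian noise) as the model behind each candidate subspace, which is one common choice and not necessarily the formulation used in the paper. The function names `ppca_log_likelihood` and `select_dimension` and the parameter count are illustrative assumptions. The BYY criteria are not reproduced here.

```python
import numpy as np

def ppca_log_likelihood(eigvals, n_samples, k):
    """Maximized log-likelihood of a k-dimensional probabilistic PCA model.

    Assumption: isotropic noise whose variance is the mean of the
    discarded eigenvalues of the sample covariance matrix.
    """
    d = eigvals.size
    sigma2 = eigvals[k:].mean()
    return -0.5 * n_samples * (
        d * np.log(2.0 * np.pi)
        + np.log(eigvals[:k]).sum()
        + (d - k) * np.log(sigma2)
        + d
    )

def select_dimension(X, criterion="bic"):
    """Score every candidate k in 1..d-1 and return (best_k, scores).

    Stage 1: one eigendecomposition yields all candidate subspaces.
    Stage 2: each candidate is scored by AIC or BIC; lower is better.
    """
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / n                          # maximum likelihood covariance
    eigvals = np.linalg.eigvalsh(cov)[::-1]      # sort in descending order
    eigvals = np.clip(eigvals, 1e-12, None)      # guard against log(0)
    scores = {}
    for k in range(1, d):
        loglik = ppca_log_likelihood(eigvals, n, k)
        # Assumed free-parameter count: loading matrix up to rotation,
        # plus the noise variance, plus the mean vector.
        m = d * k - k * (k - 1) // 2 + 1 + d
        penalty = m * np.log(n) if criterion == "bic" else 2 * m
        scores[k] = -2.0 * loglik + penalty
    best_k = min(scores, key=scores.get)
    return best_k, scores

if __name__ == "__main__":
    # Synthetic data: a 3-dimensional subspace embedded in 10 dimensions.
    rng = np.random.default_rng(0)
    Q, _ = np.linalg.qr(rng.standard_normal((10, 3)))
    Z = rng.standard_normal((500, 3)) * np.array([3.0, 2.5, 2.0])
    X = Z @ Q.T + 0.1 * rng.standard_normal((500, 10))
    print("BIC choice:", select_dimension(X, "bic")[0])
    print("AIC choice:", select_dimension(X, "aic")[0])
```

Here the first stage is cheap because a single eigendecomposition covers every candidate k. With adaptive or iterative learning, each candidate subspace would have to be learned separately before scoring, which is the cost the abstract attributes to the two-stage approach.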
