Research on Online Modeling Based on Gaussian Processes
Abstract
As a non-parametric probabilistic model, the Gaussian process (GP) has become one of the important learning methods in machine learning. Compared with neural networks and support vector machines, its advantage is that it provides not only the model's predicted output but also the uncertainty of that output, i.e., the predictive variance. GP modeling is grounded in statistical learning theory, is used for both classification and regression, and has unique advantages for small-sample, nonlinear problems.
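For readers unfamiliar with this property, the following minimal NumPy sketch (illustrative only, not code from the thesis) shows how standard GP regression returns both a predictive mean and a predictive variance; the squared-exponential kernel and the hyperparameter values are assumptions chosen for the example.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance k(x, x') = sigma_f^2 * exp(-|x - x'|^2 / (2 l^2))."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * d2 / length_scale**2)

def gp_predict(X, y, X_star, noise_var=1e-2):
    """Return the GP posterior mean and variance at the test inputs X_star."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))        # K(X, X) + sigma_n^2 I
    L = np.linalg.cholesky(K)                                # Cholesky factor for stable solves
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))      # (K + sigma_n^2 I)^{-1} y
    K_s = rbf_kernel(X, X_star)
    mean = K_s.T @ alpha                                     # predictive mean
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(X_star, X_star).diagonal() - np.sum(v**2, axis=0)  # predictive variance
    return mean, var

# toy usage: noisy sine data
X = np.linspace(0, 2 * np.pi, 20)[:, None]
y = np.sin(X).ravel() + 0.1 * np.random.randn(20)
mu, var = gp_predict(X, y, np.array([[1.0], [4.0]]))
```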
     Compared with other modeling methods, a GP has relatively few parameters, yet hyperparameter optimization remains the most time-consuming part of model identification. The most common training method for optimizing the hyperparameters is the conjugate gradient method, which requires computing the Hessian matrix; this consumes a large amount of computation time in each iteration and therefore cannot meet the real-time requirement of an online algorithm. Moreover, the time complexity of a GP is closely tied to the sample size: when the number of training samples is large, computing the covariance matrix and its inverse wastes considerable computing resources and time. For an online training algorithm, performing these computations directly would therefore compromise real-time performance.
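To make the complexity argument concrete, here is a small sketch (an illustrative assumption, not the thesis's code) of the negative log marginal likelihood that hyperparameter training evaluates repeatedly; the Cholesky factorization of the n-by-n matrix K + sigma_n^2 I is the O(n^3) step that dominates as the training set grows.

```python
import numpy as np

def gp_neg_log_marginal_likelihood(K, y, noise_var):
    """-log p(y | X, theta) for a zero-mean GP; the Cholesky step is the O(n^3)
    bottleneck that makes naive re-training from scratch unsuitable for online use."""
    n = len(y)
    Ky = K + noise_var * np.eye(n)
    L = np.linalg.cholesky(Ky)                        # O(n^3) factorization of K + sigma_n^2 I
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (0.5 * y @ alpha                           # data-fit term  y^T Ky^{-1} y / 2
            + np.sum(np.log(np.diag(L)))              # complexity term  log|Ky| / 2
            + 0.5 * n * np.log(2 * np.pi))
```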
     The adaptive natural gradient (ANG) method is an optimization method defined on a Riemannian space; compared with Newton's gradient method, it is simple to compute and approaches the efficiency of the Fisher method. To address the problems mentioned above, this thesis proposes an online GP algorithm based on the adaptive natural gradient, applying ANG to the optimization of the GP model. Unlike batch learning, incremental learning can learn from samples added during the iterative process, reusing the results of the previous iteration to reduce the time complexity, so that new samples are learned at a small time cost. The online GP algorithm continually adds new data to the training set and adjusts the model parameters online, achieving real-time optimization of the GP model. This not only speeds up training but also improves the adaptability of the model.
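The update below is a schematic sketch of an adaptive natural-gradient step in the spirit of Amari's method: an estimate of the inverse Fisher information matrix is maintained recursively from per-sample gradients rather than recomputed, and the ordinary gradient is preconditioned by this estimate. The recursion form, step sizes, and parameter values are illustrative assumptions, not the exact formulation used in the thesis.

```python
import numpy as np

def ang_step(theta, grad, G_inv, lr=0.01, eps=0.005):
    """One schematic adaptive-natural-gradient update.
    G_inv is a running estimate of the inverse Fisher information matrix; updating it
    recursively from the current gradient keeps the per-iteration cost low online."""
    g = grad.reshape(-1, 1)
    # recursive re-estimate of the inverse Fisher matrix (rank-one correction)
    G_inv = (1.0 + eps) * G_inv - eps * (G_inv @ g) @ (g.T @ G_inv)
    # natural-gradient step: precondition the ordinary gradient by the inverse Fisher matrix
    theta = theta - lr * (G_inv @ g).ravel()
    return theta, G_inv

# usage sketch: start from the identity as the inverse-Fisher estimate
theta = np.array([0.5, 1.0, 0.1])          # e.g. GP hyperparameters (illustrative values)
G_inv = np.eye(len(theta))
grad = np.array([0.2, -0.1, 0.05])         # gradient of the per-sample loss (placeholder)
theta, G_inv = ang_step(theta, grad, G_inv)
```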
     The proposed algorithm is applied to modeling the Mackey-Glass system and a continuous stirred tank reactor (CSTR). Simulation results show that the algorithm satisfies the real-time and accuracy requirements of nonlinear system modeling.
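For reference, the Mackey-Glass benchmark is the chaotic time series generated by the delay differential equation dx/dt = a*x(t - tau)/(1 + x(t - tau)^c) - b*x(t); the sketch below uses a commonly cited parameter setting (a = 0.2, b = 0.1, c = 10, tau = 17), which may differ from the configuration used in the thesis's simulations.

```python
import numpy as np

def mackey_glass(n_steps=1000, tau=17, a=0.2, b=0.1, c=10, dt=1.0, x0=1.2):
    """Generate a Mackey-Glass time series by Euler integration of
    dx/dt = a*x(t-tau)/(1 + x(t-tau)**c) - b*x(t)."""
    history = int(tau / dt)
    x = np.full(history + n_steps, x0)
    for t in range(history, history + n_steps - 1):
        x_tau = x[t - history]
        x[t + 1] = x[t] + dt * (a * x_tau / (1.0 + x_tau**c) - b * x[t])
    return x[history:]

series = mackey_glass()   # chaotic benchmark series often used to test online regression
```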
     Finally, to address the data redundancy that arises in online GP modeling, the data are preprocessed by correlation analysis, data normalization, and adjustment of a probability density threshold. This reduces redundancy and further improves the speed of model training.
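The abstract names three preprocessing steps; the sketch below illustrates only the first two (normalization and correlation-based redundancy filtering), since the probability-density threshold rule is not specified here. The 0.95 correlation threshold is an assumed value for illustration.

```python
import numpy as np

def preprocess(X, corr_threshold=0.95):
    """Normalize features to zero mean / unit variance, then drop features whose
    pairwise correlation exceeds the threshold, as a simple redundancy filter."""
    Xn = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)        # z-score normalization
    corr = np.corrcoef(Xn, rowvar=False)
    keep = []
    for j in range(Xn.shape[1]):
        # keep a feature only if it is not highly correlated with one already kept
        if all(abs(corr[j, k]) < corr_threshold for k in keep):
            keep.append(j)
    return Xn[:, keep], keep
```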
