The Partial Least-Squares Algorithm and Its Applications in Machine Learning Based on Structural Risk Minimization
Abstract
Machine learning is one of the main subjects of nonlinear science. Most machine learning methods for building nonlinear system models take minimizing the training error as their optimization objective, i.e., they follow the empirical risk minimization principle. In recent years, grounded in statistical learning theory, the structural risk minimization (SRM) principle, which balances a model's empirical risk against its confidence interval, has become a focus of machine learning research. The partial least-squares (PLS) algorithm, which originated in process control, extracts the most explanatory latent information from the data, thereby reducing the dimensionality of high-dimensional data and overcoming multicollinearity among variables. Kernel PLS, support vector machines (SVMs), and fuzzy system modeling are all effective machine learning methods, yet each still has shortcomings when building nonlinear models.
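To make the latent-component idea concrete, the following minimal single-output PLS (NIPALS) sketch shows how a few extracted components can replace many collinear predictors. It assumes centered data and is an illustrative reconstruction, not the thesis's implementation; the name `pls1_nipals` is ours.

```python
import numpy as np

def pls1_nipals(X, y, n_components):
    """Minimal single-output PLS via NIPALS; X and y assumed centered."""
    X, y = X.astype(float).copy(), y.astype(float).copy()
    m = X.shape[1]
    W = np.zeros((m, n_components))    # X weights
    P = np.zeros((m, n_components))    # X loadings
    q = np.zeros(n_components)         # y loadings
    for k in range(n_components):
        w = X.T @ y                    # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                      # latent score vector
        tt = t @ t
        W[:, k], P[:, k], q[k] = w, X.T @ t / tt, (y @ t) / tt
        X -= np.outer(t, P[:, k])      # deflate X and y before the next component
        y -= q[k] * t
    # regression coefficients in the original variable space: B = W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)
```

Predictions on new centered data are then simply `X_new @ B`.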
This thesis combines kernel PLS, support vector machines, and fuzzy system modeling with the PLS algorithm, taking the realization of the structural risk minimization principle in the learning process as its goal.
Based on Mercer's theorem, a simplified kernel PLS algorithm is proposed, together with a risk index that satisfies the SRM principle; simulations demonstrate the effectiveness of this index. To address the problem that the dimension of the kernel matrix grows with the number of identification samples, a block-wise kernel PLS algorithm is proposed: by partitioning the kernel matrix, it reduces the computational burden of kernel PLS.
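The iteration below is a hedged sketch of single-output kernel PLS in the spirit of Rosipal and Trejo's formulation: the centered kernel matrix K takes the place of the data matrix, and score vectors are extracted and deflated directly in feature space. The repeated n×n deflation products are precisely the cost that grows with the identification sample, which the proposed block-wise variant attacks by partitioning K; the function names here are ours.

```python
import numpy as np

def center_kernel(K):
    """Center a kernel matrix in feature space."""
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

def kpls_scores(K, y, n_components):
    """Extract kernel-PLS score vectors for a single centered output y."""
    n = K.shape[0]
    T = np.zeros((n, n_components))
    Kd, yd = K.copy(), y.astype(float).copy()
    for k in range(n_components):
        t = Kd @ yd                      # K plays the role of X X^T
        t /= np.linalg.norm(t)
        T[:, k] = t
        D = np.eye(n) - np.outer(t, t)   # deflation projector
        Kd = D @ Kd @ D                  # n x n products: the burden that
        yd -= t * (t @ yd)               # grows with the sample size
    return T                             # fitted values: T @ (T.T @ y)
```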
To attack the rule-explosion problem of fuzzy system models, a fuzzy system model based on subspace partitioning is proposed, together with an adaptive identification method based on genetic algorithms. The method partitions the universe of discourse according to the principles of consistency and completeness, partially alleviating rule explosion. In the improved algorithm, PLS preprocesses the data and builds an initial model, and the subspace-partitioned fuzzy model is then fitted to the residuals. The ε-insensitive loss function and the subspace partition together trade the model's confidence interval off against its empirical risk, realizing structural risk minimization.
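As a small illustration of the loss that drives this trade-off, the sketch below applies the ε-insensitive loss to the residuals of the initial PLS model; the tube width eps = 0.1 is an arbitrary assumption.

```python
import numpy as np

def eps_insensitive(residual, eps=0.1):
    """Zero loss inside the eps-tube, linear loss outside it."""
    return np.maximum(np.abs(residual) - eps, 0.0)

# Two-stage idea of the chapter: fit a linear PLS model first, then fit the
# subspace-partitioned fuzzy model to the residuals r = y - y_pls. Widening
# the tube (larger eps) tolerates more empirical risk but keeps the model
# simpler, i.e. a tighter confidence range; shrinking it does the opposite.
```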
Because plain PLS generalizes poorly, the SVM and PLS algorithms are combined into a weighted PLS algorithm based on structural risk minimization: the SVM training algorithm computes the linear regression of the outer model in weighted PLS, so the SRM principle is realized. The SVM training algorithm is then applied to the identification of T-S fuzzy models, yielding an SVM-based T-S modeling method. The algorithm performs fuzzy clustering in the universe of discourse with the support vectors as centers and forms one fuzzy rule per cluster: the antecedent is built from the cluster center, and the consequent is a linear PLS regression model for that cluster. The method not only builds T-S fuzzy models adaptively but also realizes the SRM principle.
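A hedged sketch of the combination idea, built from off-the-shelf components rather than the thesis's own training algorithm: PLS supplies latent scores, and a linear ε-SVR, an SRM-based learner, replaces ordinary least squares for the score-to-output regression. The scikit-learn classes, the synthetic data, and all parameter values are our assumptions.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import LinearSVR

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))                 # synthetic demo data
y = X @ rng.standard_normal(10) + 0.1 * rng.standard_normal(200)

pls = PLSRegression(n_components=3).fit(X, y)
T = pls.transform(X)                               # latent score vectors
svr = LinearSVR(epsilon=0.1, C=10.0, max_iter=10_000).fit(T, y)
y_hat = svr.predict(T)                             # SRM-based outer regression
```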
To make outlier detection feasible when modeling time-varying systems or large data sets, a robust recursive PLS algorithm is proposed that removes the usually heavy computational burden. By combining recursive PLS with robust principal component regression, it both cuts the computation and avoids the masking and swamping effects that arise when several outliers are present. Monitoring the overhead contact system is vital to the safe operation of high-speed electrified railways, and a nonlinear model of the pantograph-catenary interaction is studied with the proposed robust algorithm: after the data are standardized and outliers removed, PLS constructs effective input-output data, and an SVM then builds the nonlinear model. Simulation results show that the model's accuracy meets practical requirements.
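The following sketch follows the loading-stacking idea behind recursive PLS (Qin, 1998): the old model is compressed into its loading matrices and refitted together with the new data block, so past samples never need to be revisited. The forgetting factor `lam`, the scikit-learn classes, and the function name are our assumptions; the robust variant would additionally screen the new block for outliers before the update.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def recursive_pls_update(pls, X_new, y_new, n_components, lam=0.98):
    """Update a fitted PLSRegression with a new data block."""
    P = pls.x_loadings_                      # (n_features, k): summary of old X
    Q = pls.y_loadings_                      # (1, k) for a scalar output
    X_stack = np.vstack([lam * P.T, X_new])  # lam < 1 down-weights history
    y_stack = np.concatenate([lam * Q.ravel(), np.ravel(y_new)])
    return PLSRegression(n_components=n_components).fit(X_stack, y_stack)
```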
