Research and Application of a Partial Least Squares Modeling Method Based on Historical Data
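Abstract
The rapid development of power plant information technology has provided a good platform for research on data-driven operation optimization. In particular, modeling complex thermal systems from the massive data held in plant real-time/historical databases has gradually become a hot research topic. Operating data, however, differ from test data and have many features that are unfavorable for modeling, for example multicollinearity among variables, unevenly distributed combinations of operating conditions, and process nonlinearity. These problems seriously hinder the development and application of historical-data modeling methods. To address them, this dissertation studies a thermal process modeling method based on partial least squares theory (Partial Least Squares projection to latent structures, PLS), which resolves the above problems reasonably well. The main content and results of the dissertation are as follows:
     1. The characteristics of power plant historical data are analyzed, and the three stages of historical-data modeling are summarized: data preparation, data modeling, and model validation. Common preprocessing methods for historical data, common modeling theories, and model checking methods are introduced, and the difference between fitting precision and prediction precision is explained.
     2. The development history and current research status of PLS are reviewed. The way PLS uses feature extraction to handle multicollinearity among variables is described, together with the use of cross-validation to determine the number of extracted components and the common auxiliary analysis methods for PLS models. Finally, nonlinear PLS modeling methods are summarized.
     3. To deal with the uneven distribution of power plant historical data, three principles for selecting modeling samples are proposed. Borrowing the idea that experimental modeling methods rely on high-quality modeling samples, a method that selects modeling samples according to experimental design principles is proposed. After several common experimental design methods are analyzed and compared, uniform design is adopted as the principle for selecting modeling samples, and the detailed implementation of the method is given. Finally, simulation analysis verifies that the uniformity of modeling samples matters for improving model precision.
     4. When the uniform-design-based sample selection method is applied to real thermal processes, multicollinearity among variables makes it impossible to obtain the required data. To solve this problem, a method is proposed that first applies PLS feature extraction to the raw data and then performs the uniform selection, and its effectiveness is analyzed. Building on this method, an improved method that applies orthogonal signal correction (Orthogonal Signal Correction, OSC) to the raw data is proposed, which further ensures that uniform modeling samples can be obtained.
     5. Taking the reheat steam temperature system of the thermal process as an example, the application of the historical-data-based PLS modeling method is presented. Starting from the energy balance principle, a reheat steam temperature modeling method is proposed that uses the expected reheat enthalpy rise (that is, the heat absorption capacity per unit flow of steam) as the dependent variable. Its influencing factors are analyzed qualitatively and comprehensively, and intermediate variables that are not measured on site but play a key role, such as flame center height and as-fired coal quality, are constructed. Comparison of several groups of models verifies that: 1) modeling with the expected reheat enthalpy rise as the dependent variable effectively reduces the nonlinear component of the reheat steam temperature model and better reflects the nature of steam temperature changes; 2) the uniform-design-based sample selection method effectively improves the predictive ability of the model. Finally, operation optimization guidance for reheat steam temperature based on the resulting model is given.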
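Item 2 above refers to choosing the number of extracted PLS components by cross-validation. The following is a minimal sketch of that idea, not the dissertation's implementation: it fits a single-response PLS model with the NIPALS iteration and picks the component count with the smallest leave-one-out PRESS. The function names (`pls1_fit`, `pls1_predict`, `loo_press`), the synthetic collinear data, and the "minimum PRESS" rule are all assumptions made for illustration.

```python
import random


def pls1_fit(X, y, n_comp):
    """Fit single-response PLS by NIPALS (illustrative helper, not from the source)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_comp):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps


def pls1_predict(model, x, n_comp=None):
    """Predict y for one sample using the first n_comp components (all if None)."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps[:n_comp]:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


def loo_press(X, y, max_comp):
    """Leave-one-out prediction error sum of squares for 1..max_comp components."""
    press = [0.0] * max_comp
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], max_comp)
        for a in range(1, max_comp + 1):
            err = y[i] - pls1_predict(model, X[i], a)
            press[a - 1] += err * err
    return press


# Synthetic, strongly collinear data: four inputs driven by two hidden factors.
random.seed(0)
X, y = [], []
for _ in range(30):
    u, v = random.uniform(-1, 1), random.uniform(-1, 1)
    X.append([u, u + 0.01 * random.gauss(0, 1), v, u + v + 0.01 * random.gauss(0, 1)])
    y.append(2.0 * u - v + 0.05 * random.gauss(0, 1))

press = loo_press(X, y, max_comp=4)
best = press.index(min(press)) + 1
print("PRESS per component count:", press)
print("selected number of components:", best)
```

Predicting by sequential deflation avoids forming a matrix inverse, which keeps the sketch dependency-free. A stricter stopping rule than "minimum PRESS" could be substituted without changing the rest of the code.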
The rapid development of information technology in power plants has provided a convenient platform for the study of data-based operation optimization. In this context, modeling complex thermodynamic systems from the huge amounts of data in power plant real-time/historical databases is gradually becoming one of the hottest research topics. However, drawbacks of historical data such as variable multicollinearity, nonlinearity, and non-uniform distribution of working conditions seriously hinder the development of modeling methods. To address these issues, the dissertation studies thermal process modeling methods based on partial least squares projection to latent structures (PLS), which handle the above problems reasonably well. The main contributions of this dissertation can be summarized as follows:
     1. The characteristics of power plant historical data are analyzed, and the three stages of historical-data modeling are summarized, namely data preparation, modeling, and model validation. Common preprocessing methods, modeling theories, and model validation methods are introduced, and the difference between fitting precision and prediction precision is explained.
     2. The history of PLS and the current state of research are reviewed. The extraction process by which PLS handles multicollinearity is described, and the method for determining the number of extracted PLS components by cross-validation is introduced along with several auxiliary analysis methods. Finally, nonlinear PLS modeling methods are summarized.
     3. To deal with the uneven distribution of the data, three principles of sample selection for historical-data modeling are put forward, and a method for selecting modeling samples is proposed. After several common experimental design methods are analyzed, uniform design is adopted as the principle of modeling sample selection. Finally, the significance of sample uniformity for improving prediction precision is verified through simulation.
     4. To solve the problem that the required data cannot be obtained because of multicollinearity, a method based on the PLS transform and an improvement of it based on orthogonal signal correction are proposed, and the validity of both is analyzed through simulation.
     5. Taking the reheat steam temperature system as an example of a thermal process, the application of the historical-data-based PLS modeling method is presented. Starting from the energy balance principle, a method that uses the expected reheat enthalpy rise (which represents the heat-absorbing capacity per unit flow of steam) as the dependent variable is put forward, and its influencing factors are analyzed. Variables that are not measured on site but play a key role are also constructed. The results show that: first, the model established with the expected reheat enthalpy rise effectively reduces the nonlinear component of reheat steam temperature; second, modeling sample selection based on the uniform design principle effectively improves the predictive precision of the model.
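Items 3 and 4 describe selecting modeling samples so that they cover the operating region evenly, with the selection carried out on PLS scores rather than on the raw collinear variables. The sketch below illustrates only the selection step, under loud assumptions: it uses a plain equal-width grid as a stand-in for a uniform design table, the function name `select_uniform` is hypothetical, and the two-dimensional "scores" are synthetic numbers rather than scores from a fitted PLS model.

```python
import random


def select_uniform(points, levels):
    """Keep at most levels**d records spread evenly over the occupied region.

    points: list of equal-length tuples (for example, PLS scores per record).
    Each axis is cut into `levels` equal-width bins; in every occupied cell
    the record closest to the cell centre is kept. This grid is an assumed
    simplification, not a uniform design table.
    """
    d = len(points[0])
    lo = [min(p[k] for p in points) for k in range(d)]
    hi = [max(p[k] for p in points) for k in range(d)]
    # Fall back to width 1.0 when an axis has zero range.
    width = [(hi[k] - lo[k]) / levels or 1.0 for k in range(d)]
    best = {}  # cell index -> (squared distance to centre, record index)
    for idx, p in enumerate(points):
        cell = tuple(min(int((p[k] - lo[k]) / width[k]), levels - 1) for k in range(d))
        centre = [lo[k] + (cell[k] + 0.5) * width[k] for k in range(d)]
        dist = sum((p[k] - centre[k]) ** 2 for k in range(d))
        if cell not in best or dist < best[cell][0]:
            best[cell] = (dist, idx)
    return sorted(idx for _, idx in best.values())


# Synthetic "historical" records: most sit in one dense operating condition,
# a few are scattered over the rest of the region.
random.seed(1)
dense = [(random.gauss(0, 0.1), random.gauss(0, 0.1)) for _ in range(500)]
sparse = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
scores = dense + sparse

chosen = select_uniform(scores, levels=5)
print(len(scores), "records ->", len(chosen), "modeling samples")
```

Because each grid cell contributes at most one record, the dense operating condition can no longer dominate the modeling set. The price is that empty cells stay empty, which is the situation item 4 attributes to multicollinearity in the raw variable space.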
