Research and Application of Two Types of Combination Forecasting Methods
Abstract
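Time series forecasting is a widely discussed problem, and many models now exist to solve time series forecasting problems of all kinds. They fall mainly into linear and nonlinear models. Linear models such as ARMA, ARIMA, and SARIMA can be adjusted in real time as new data are obtained, which improves forecasting accuracy, but they can only analyze and forecast the overall trend of the data and cannot fully analyze the individual factors that drive its changes. ARCH and GARCH models can forecast the conditional variance and also provide an iterative method for computing predicted values, from which risk forecasts can be calculated. To guarantee non-negativity, however, all coefficients in these models are usually assumed to be greater than 0; this constraint implies that an increase in any lagged term rules out random oscillatory behavior in the series, which may cause oscillation during modeling. Nonlinear models such as neural networks are widely used in regression forecasting because of their fast computation, their ability to approximate arbitrary continuous mappings, their learning ability, and their dynamic analysis capability, but they are prone to local minima and overfitting. The support vector machine algorithm maps a practical problem into a high-dimensional feature space through a nonlinear transformation and constructs a linear discriminant function in that space to realize a nonlinear discriminant function in the original space. This property gives it good generalization ability and neatly handles the dimensionality problem, since the algorithm's complexity is independent of the sample dimension, but the model is very sensitive to parameter selection.
     Considering the strengths and weaknesses of these forecasting models, Bates and Granger proposed the idea of combination forecasting in 1969. This idea and its extensions have provided more and more options for overcoming the defects of a single forecasting model. Unlike the widely used idea of combining forecasting results, this paper focuses on combining forecasting methods: different parts of the original data, each with its own characteristics, are forecast with different methods, so that the strengths of each method are fully exploited. Based on this idea, this paper develops two combination methods: 1) the trend and the random fluctuations of the original time series are forecast with different suitable models, and their predicted values are combined into the final forecast; 2) a nonlinear signal processing technique first decomposes the data series into sub-series of different frequencies, which are then grouped into a high-frequency series and a low-frequency series; these two series are forecast with the same or different models, and the combination of their predicted values is the final forecast.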
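     In addition, forecasting electricity load and electricity prices is of great practical importance for maximizing social benefit, yet forecasting them accurately has long been difficult, so this paper uses these two tasks as the application background for the new combination idea. Based on the two new combination methods, this paper builds three models: 1) Following the first method and considering the multiple seasonality of electricity load data, a seasonal autoregressive integrated moving average model (SARIMA) is first used to fit and forecast the time series, which yields the fitted residual series (the random fluctuations); because this residual series is highly nonlinear, a nonlinear BP neural network is introduced to forecast it, and the forecasts of the original series and the residual series are then combined into the final forecast. 2) Following the first method and considering the strong volatility of electricity price data, the generalized autoregressive conditional heteroskedasticity model (GARCH), the mainstream algorithm for series with variance clustering, is first used to forecast the time series; the support vector regression machine (SVM), known for its generalization ability, is then used to forecast the residual series, giving the combined forecast. 3) Following the second method, empirical mode decomposition (EMD) is first applied to decompose the original series into sub-series of different frequencies; the higher-frequency and lower-frequency sub-series are then summed into a high-frequency series and a low-frequency series, respectively. Because these two series contain far fewer fluctuation factors than the original series, they are easier to model; this paper uses BP neural networks to forecast each of them, and the forecasts are combined into the final result. This paper also constructs for each model an effectiveness criterion that measures the model's validity. Forecasts of electricity load and electricity prices in Australia show that these models are effective in improving forecasting accuracy and effectiveness.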
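     The main research results and contributions of this paper are as follows:
     1) It develops the idea of combination forecasting by introducing two new combination forecasting methods, derives three effective forecasting models from them, and constructs for each model an effectiveness criterion that measures the model's validity;
     2) Electricity load and electricity price forecasting are both popular and difficult problems in the electricity market. This paper seeks more accurate forecasting models through theoretical analysis and experimental comparison and ultimately improves forecasting accuracy, which makes it a meaningful attempt.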
Time series forecasting is a widely discussed problem, and a large number of models already exist to address it. These models can be roughly divided into two categories: linear and nonlinear. Linear models such as ARMA, ARIMA, and SARIMA can adjust themselves as new data arrive and so improve forecasting accuracy; however, they capture only the overall trend of the data and cannot account for all of the factors that affect its changes. ARCH and GARCH models yield not only the conditional variance but also an iterative method for computing predicted values, from which risk forecasts can be calculated. To ensure non-negativity, however, all coefficients in these models are usually assumed to be greater than 0; under this constraint an increase in any lagged term rules out random oscillatory behavior in the series, which may cause oscillation when the models are estimated. Nonlinear models such as neural networks offer fast computation, the ability to approximate arbitrary continuous mappings, strong learning ability, and dynamic analysis capability, and they have been widely used in regression forecasting; however, they are prone to local minima and overfitting. The support vector machine algorithm maps a practical problem into a high-dimensional feature space through a nonlinear transformation and constructs a linear discriminant function in that space, which realizes a nonlinear discriminant function in the original space. This property gives the algorithm good generalization ability and also handles the dimensionality problem, because the algorithm's complexity is independent of the sample dimension; however, the model is very sensitive to parameter selection.
     Taking into account the advantages and disadvantages of these prediction models, Bates and Granger proposed the idea of combination forecasting in 1969. This idea and its extensions offer more and more options for overcoming the defects of any single forecasting model. Unlike the widely used approach of combining forecasting results, this paper focuses on combining forecasting methods: different parts of the original data, each with its own characteristics, are forecast with different methods, so that the strengths of each method are fully exploited. Based on this idea, this paper develops two types of combination methods: 1) the trend and the random fluctuations of the original time series are each predicted by a suitable model, and the predicted values are then combined to form the final forecast; 2) the original time series is first decomposed into sub-series of different frequencies by a nonlinear signal processing technique, the sub-series are grouped into a high-frequency series and a low-frequency series, these two series are predicted by the same or different models, and the combination of their forecasts is the final prediction.
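The first combination method can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the least-squares AR(1) fit stands in for the linear trend model, and a nearest-neighbor regression on lagged residuals stands in for the nonlinear residual model. All function names and parameters (`fit_ar1`, `knn_next`, `hybrid_forecast`, `lags`, `k`) are hypothetical and chosen only for illustration.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]; returns (a, b)."""
    x = series[:-1]
    y = series[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx if sxx else 0.0
    a = mean_y - b * mean_x
    return a, b


def knn_next(history, lags=2, k=3):
    """Stand-in nonlinear model: average the values that followed the
    k past lag-windows most similar to the latest window."""
    if len(history) <= lags:
        return 0.0
    query = history[-lags:]
    candidates = []
    for i in range(lags, len(history)):
        window = history[i - lags:i]
        dist = sum((w - q) ** 2 for w, q in zip(window, query))
        candidates.append((dist, history[i]))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


def hybrid_forecast(series):
    """One-step forecast = linear forecast + forecast of the residuals."""
    a, b = fit_ar1(series)
    fitted = [a + b * prev for prev in series[:-1]]
    residuals = [obs - fit for obs, fit in zip(series[1:], fitted)]
    linear_part = a + b * series[-1]      # trend component
    residual_part = knn_next(residuals)   # random-fluctuation component
    return linear_part, residual_part, linear_part + residual_part


if __name__ == "__main__":
    load = [10.0, 12.0, 11.0, 13.0, 12.5, 14.0, 13.0, 15.0]
    print(hybrid_forecast(load))
```

The point of the sketch is the structure: the second model is trained on what the first model failed to explain, and the final forecast is the sum of the two parts.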
     In addition, electricity load and electricity price forecasting are of great practical significance for maximizing social benefit, but forecasting them accurately has always been difficult. This paper therefore uses these two tasks as the application background for the new combination forecasting idea. Following the two methods above, this paper establishes three forecasting models: 1) Under the first approach, taking into account the multiple seasonality of electricity load data, the seasonal ARIMA model (SARIMA) first models and predicts the original time series, which yields the residual series (i.e. the random fluctuations); because this residual series is highly nonlinear, a nonlinear BP neural network is introduced to predict it, and the predicted values of the original series and of the residuals are finally combined to give the final prediction. 2) Also under the first approach, taking into account the strong volatility of electricity price data, the original time series is first predicted by the generalized autoregressive conditional heteroscedasticity model (GARCH), the mainstream algorithm for series that exhibit variance clustering; its residual series is then modeled by a support vector machine (SVM), which has outstanding generalization ability, and the combined prediction is obtained in the same way as in the first model. 3) Under the second approach, the original series is decomposed into sub-series of different frequencies by empirical mode decomposition (EMD), and the higher-frequency and lower-frequency sub-series are summed into a high-frequency series and a low-frequency series, respectively. Because these two series contain far fewer fluctuation factors than the original series, they are easier for forecasting models to fit; this paper uses BP neural networks to forecast both series and then combines the results. In addition, this paper establishes an effectiveness measure for each model, and experiments on electricity load and electricity price forecasting for Australia show that these models are effective in improving prediction accuracy.
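The second combination method (decompose, forecast each part, recombine) can be sketched in the same spirit. This is an illustrative stand-in only: a centered moving average replaces EMD as the decomposition step, and a drift extrapolation and a seasonal-naive rule replace the BP neural networks. The names `split_frequencies`, `drift_forecast`, `seasonal_naive_forecast`, `decomposition_forecast`, `window`, and `period` are hypothetical.

```python
def moving_average(series, window=3):
    """Centered moving average; near the edges a shorter window is used."""
    half = window // 2
    smooth = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smooth.append(sum(series[lo:hi]) / (hi - lo))
    return smooth


def split_frequencies(series, window=3):
    """Split into a smooth low-frequency part and a high-frequency
    remainder, so that low[i] + high[i] reproduces series[i]."""
    low = moving_average(series, window)
    high = [s - l for s, l in zip(series, low)]
    return low, high


def drift_forecast(series):
    """Last value plus the average step over the whole series."""
    step = (series[-1] - series[0]) / (len(series) - 1)
    return series[-1] + step


def seasonal_naive_forecast(series, period):
    """Repeat the value observed one period before the forecast point."""
    return series[-period]


def decomposition_forecast(series, window=3, period=2):
    """Forecast each component separately, then add the forecasts."""
    low, high = split_frequencies(series, window)
    return drift_forecast(low) + seasonal_naive_forecast(high, period)


if __name__ == "__main__":
    price = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
    print(decomposition_forecast(price))
```

Because the two components add back to the original series, summing their forecasts gives a forecast of the original series; each component is smoother or more regular than the whole, which is what makes it easier to model.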
     The main research achievements and contributions are as follows:
     1) This paper develops the idea of combination forecasting by introducing two new combination forecasting methods and three prediction models derived from them; it also establishes an effectiveness measure for each model;
     2) Electricity load and electricity price forecasting are both popular and difficult problems in the electricity market. This paper seeks more accurate forecasting models for them through theoretical analysis and experiments and ultimately improves forecasting accuracy, which makes it a meaningful attempt.
