Prediction of Nonstationary Time Series Based on the Support Vector Machine Method
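Abstract
This thesis examines how well a recent machine learning method, the support vector machine (SVM), predicts nonstationary time series, and tests its application in meteorology. The thesis has three parts:
     In the first part, we systematically introduce the basic ideas and characteristics of the SVM method, the regression method, the approach to building prediction models, and the SVM learning and modeling software platform.
     In the second part, the 33-mode Lorenz system and the logistic (insect population) map serve as generators of "ideal" spatio-temporal series. Prediction models are built with SVM regression and compared with artificial neural networks (ANN). The results show that: (1) the SVM method predicts nonstationary as well as stationary time series well, with correlation coefficients between predicted and true values all above 0.99; (2) SVM regression models are markedly more accurate and efficient than the ANN, with an average relative error 0.3% to 0.5% lower than that of the ANN model. One interpretation is that the SVM, through a nonlinear mapping, carries a nonstationary process in a low-dimensional space into a high-dimensional space, which reduces the nonstationarity of the system to some extent.
     In the third part, we apply the method to observed data, making trial forecasts of temperature in Miyun County, Beijing, and of ozone concentration in the New Delhi area, India. The preliminary conclusions are: (1) predicted temperatures agree well with the true values, with a correlation coefficient above 0.98; predicted ozone concentrations lead the true values, with a correlation coefficient of 0.63, which indicates that the SVM has some predictive skill for real series; (2) as the number of predictors increases, the correlation between predicted and true values rises slightly and the prediction error falls markedly, which suggests that the more information the training samples contain, the more stable the SVM prediction model; (3) with the same temperature data, the ANN prediction error exceeds that of the SVM, so SVM regression also outperforms the ANN in a practical application; (4) in both cases SVM regression shows large errors at some turning points, possibly because the available data limited the modeling: the predictors used in training lacked physical fields closely related to the predictand.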
A new machine learning method, the support vector machine (SVM), is used in this paper to build forecast models for nonstationary time series, and its application to weather prediction is tested and analyzed. The paper has three parts:
     In the first part, the principle and basic ideas of SVM regression, which rest on statistical learning theory, the main idea of the forecasting model, and the CMSVM software are introduced systematically.
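For orientation, the standard ε-insensitive support vector regression problem from statistical learning theory can be written as below. This is the textbook formulation, given here as an assumption about the general setting; the abstract does not state which variant or kernel the CMSVM software actually uses. The symbols $C$, $\varepsilon$, $\xi_i$, $\xi_i^*$, $\phi$ and $K$ are the usual ones and do not appear in the source.

```latex
\min_{w,\,b,\,\xi,\,\xi^*}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right)
\quad \text{s.t.} \quad
\begin{cases}
y_i - \langle w, \phi(x_i)\rangle - b \le \varepsilon + \xi_i, \\
\langle w, \phi(x_i)\rangle + b - y_i \le \varepsilon + \xi_i^*, \\
\xi_i,\ \xi_i^* \ge 0,
\end{cases}
\qquad
f(x) = \sum_{i=1}^{n}\left(\alpha_i - \alpha_i^*\right) K(x_i, x) + b,
\quad K(x_i, x) = \langle \phi(x_i), \phi(x) \rangle .
```

The nonlinear map $\phi$ never has to be computed explicitly, since only the kernel $K$ enters the prediction function $f$.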
     In the second part, the 33-mode Lorenz system and the logistic map are used as generators of chaotic spatio-temporal series. We build SVM regression forecast models and compare them with the artificial neural network (ANN) method. The preliminary results are: (1) The SVM method is applicable to both stationary and nonstationary time series, and the correlation coefficient between the predicted and actual values exceeds 0.99. (2) The SVM regression models outperform the ANN method in both forecasting accuracy and computing speed, with an average relative error about 0.3% to 0.5% lower than that of the ANN method. One interpretation is that when a nonstationary process in the low-dimensional sample space is mapped by a nonlinear mapping into the high-dimensional (infinite-dimensional) feature space, the nonstationarity of the system is reduced.
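The "generator" idea can be illustrated with a minimal Python sketch. It is not the thesis's code: the function names `logistic_series` and `delay_embed`, the linearly drifting parameter used here to make the series nonstationary, and the embedding dimension of 3 are all hypothetical choices made for illustration.

```python
def logistic_series(n, x0=0.3, mu_start=3.7, mu_end=3.7):
    """Iterate x[t+1] = mu_t * x[t] * (1 - x[t]).

    mu_t moves linearly from mu_start to mu_end. Equal values give a
    fixed-parameter (stationary) map; different values give a drifting,
    nonstationary one. The linear drift is an assumption of this sketch.
    """
    xs = [x0]
    for t in range(n - 1):
        mu = mu_start + (mu_end - mu_start) * t / max(n - 2, 1)
        xs.append(mu * xs[-1] * (1.0 - xs[-1]))
    return xs


def delay_embed(series, dim=3, horizon=1):
    """Build (inputs, targets) pairs: dim past values predict a future value."""
    inputs, targets = [], []
    for i in range(len(series) - dim - horizon + 1):
        inputs.append(series[i:i + dim])
        targets.append(series[i + dim + horizon - 1])
    return inputs, targets


stationary = logistic_series(500)                               # fixed mu
nonstationary = logistic_series(500, mu_start=3.6, mu_end=3.9)  # drifting mu
X, y = delay_embed(nonstationary, dim=3)
```

The pairs `(X, y)` are the kind of training set that a regression model, SVM or ANN, would be fitted to before its forecasts are compared with the true continuation of the series.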
     In the third part, we apply the method to predict the temperature in Miyun County, Beijing, and the ozone concentration in New Delhi. The main results are: (1) The predicted temperatures match the actual values well, with a correlation coefficient above 0.98. The predicted ozone concentrations run slightly ahead of the actual ones, with a correlation coefficient of 0.63. This shows that the SVM method can be used for real-data prediction. (2) As the number of samples increases, the correlation coefficient between the predicted and actual values rises slightly and the prediction errors fall markedly. This suggests that the more information the training sample contains, the more stable the model built by the SVM method. (3) With the same temperature data, the ANN method produces larger errors than the SVM method, so SVM regression also has an advantage in real-data prediction. (4) Both examples show large errors when SVM regression predicts some inflection points. This may be due to limitations of the real data: the prediction factors used in training include few physical fields closely correlated with the prediction targets.
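The two scores quoted throughout, the correlation coefficient and the average relative error, can be computed as in the sketch below. The abstract does not define either quantity precisely, so the Pearson correlation and the mean of |predicted - actual| / |actual| are assumptions here, as are the function names and the sample numbers, which are made up and are not data from the thesis.

```python
import math


def pearson_r(pred, actual):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(pred, actual))
    norm_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    norm_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual))
    return cov / (norm_p * norm_a)


def mean_relative_error(pred, actual):
    """Average of |pred - actual| / |actual|; assumes no actual value is zero."""
    return sum(abs(p - a) / abs(a) for p, a in zip(pred, actual)) / len(actual)


# Made-up illustrative values, not results from the thesis.
actual = [10.0, 12.0, 15.0, 11.0]
pred = [10.5, 11.4, 15.0, 11.55]
r = pearson_r(pred, actual)
mre = mean_relative_error(pred, actual)
```

A high correlation does not by itself rule out a timing offset between forecast and observation, which is why the error measure is reported alongside it.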
