An Ensemble Model for Guangxi Precipitation Forecasting Based on a Multi-Kernel Radial Basis Function Neural Network
Abstract
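Improving the accuracy of precipitation forecasts is an important problem in meteorological prediction. Precipitation is influenced by many factors, including the land surface, terrain, air molecules, humidity, and cloud cover, so precipitation over many years is hard to predict and its day-to-day variation is a complex nonlinear process. A number of precipitation prediction models have been proposed, among them the T213 model of the China Meteorological Administration. These models all try to find nonlinear relationships between precipitation and forecast factors derived from measured physical quantities, in the hope of building, from historical data, a case-based reasoning model that is stable, accurate, generalizes well, and is widely applicable. Since the 1990s, neural networks have been widely used in atmospheric science, and their adaptive learning and nonlinear processing abilities have made them increasingly effective in weather forecasting. However, because neural network techniques lack a complete guiding theory, the user's experience plays a decisive role in how well they work. In practice, researchers who lack the relevant prior knowledge often cannot avoid extensive, time-consuming trial-and-error experiments to find suitable network parameters. In many cases, different operators applying the same method to the same problem obtain very different results. Parameters tuned on one batch of data may also perform poorly, or not apply at all, on another batch, so the model parameters have to be determined again by experiment. All of this greatly limits the practical use of neural network models.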
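     A radial basis function neural network (RBF network) is a three-layer feedforward neural network. Its performance depends directly on whether the centers, the number of hidden-layer neurons, and the widths are chosen appropriately. With too few hidden nodes, the network cannot produce enough combinations of connection weights to learn the samples, which hurts approximation accuracy. With too many, the trained network generalizes poorly, may overfit, and responds slowly. In practice, the centers can be chosen by randomly selecting a given number of input patterns, or, for a specified number of centers, by K-means clustering. The gradient-descent learning algorithm for RBF networks adjusts the centers, widths, and weights according to the network error to improve performance, but the number of centers and the widths must be determined before learning starts. The orthogonal least squares (OLS) learning algorithm adjusts the hidden nodes during training according to each node's contribution to the network error and arrives at a suitable number of hidden nodes. Compared with fixed-center learning algorithms, OLS does not need the number of hidden nodes in advance, which solves the problem of choosing that number, but it still requires the spread constant (the width) of the hidden nodes to be set beforehand. These algorithms are general and already in use. Their shortcoming is that parameters such as the number of hidden nodes or the width must be fixed in advance, and because there is no complete and effective method for doing so, finding suitable values often takes a great deal of time and effort.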
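To make the roles of the centers, widths, and output weights concrete, the following is a minimal sketch of the forward pass of a three-layer RBF network. It assumes a Gaussian basis function and a single linear output neuron; the function names and the toy parameter values are illustrative and do not come from the thesis.

```python
import math

def gaussian_rbf(x, center, width):
    """Gaussian radial basis activation exp(-||x - c||^2 / (2 * width^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_forward(x, centers, widths, weights, bias=0.0):
    """Output of a three-layer RBF network with one linear output neuron."""
    # Hidden layer: one activation per (center, width) pair
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, widths)]
    # Output layer: weighted sum of the hidden activations plus a bias
    return bias + sum(w * h for w, h in zip(weights, hidden))

# Toy network with two hidden nodes (all values are illustrative)
centers = [[0.0, 0.0], [1.0, 1.0]]
widths = [1.0, 1.0]
weights = [2.0, -1.0]
y = rbf_forward([0.0, 0.0], centers, widths, weights)
```

An input that coincides with a center activates that node fully (activation 1), and the activation decays with distance at a rate set by the width, which is why the choice of centers and widths matters so much for the fit.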
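     Precipitation has many influencing factors, and a large number of input factors increases the scale of model training and prediction, lengthens training time, slows convergence, and degrades predictive performance. It is therefore necessary to reduce the dimensionality of the dozens or even hundreds of precipitation forecast factors. In addition, different kernel functions respond differently to the individual input factors, so choosing kernel functions that weigh these factors well helps improve the accuracy of a precipitation prediction model.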
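     This thesis addresses the difficulty of determining the number of hidden nodes, the centers, and the widths of a radial basis function neural network (RBF network), together with the differing effects of different kernel functions on different input variables and the large number of input factors and volume of data. The aim is to build a broadly applicable and effective precipitation forecast model. The object of study is an RBF-network-based precipitation forecast model, and the main work is as follows: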
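     (1) To deal with the large number of precipitation influencing factors, kernel principal component analysis (KPCA) is first used to select the main factors. Experimental results show that the factors extracted by KPCA dimensionality reduction are feasible and effective.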
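     (2) To address the difficulty of determining the number of hidden nodes, the centers, and the widths of an RBF network, and to improve network performance, the shortcomings of common learning algorithms are analyzed and an RBF optimization algorithm is proposed that determines the number of hidden nodes by fuzzy clustering. First, a fuzzy clustering algorithm determines the number of hidden nodes from the sample data. Next, K-means clustering partitions the samples and gives the center of each hidden node. The minimum distance between hidden nodes is then taken as the spread constant, that is, the width. Finally, the network is trained by least squares to determine the output-layer weights, realizing nonlinear multidimensional interpolation while minimizing the network performance measure, so that network performance is optimal in a certain sense. The algorithm does not require the number of hidden nodes, the centers, or the widths to be specified before learning, which resolves the problem of choosing these parameters.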
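     (3) Six radial basis kernel functions are combined to build a multi-kernel RBF neural network ensemble model. The six resulting sub-forecast models are combined by simple averaging and by multiple regression, which takes into account the effect of different kernel functions on different dimensions of the input variables.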
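     (4) Forecast experiments are carried out on real daily precipitation data for three regions of Guangxi in May. The results show that the model generalizes well and forecasts more accurately than the T213 precipitation forecast model, so it has some practical value.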
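The KPCA step in item (1) can be illustrated with a short NumPy sketch. The abstract does not state which kernel or how many components were used, so the Gaussian kernel, the value of `gamma`, the number of components, and the synthetic data below are all assumptions made for illustration only.

```python
import numpy as np

def kpca(X, n_components, gamma):
    """Project the rows of X onto the leading kernel principal components."""
    # Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    K = np.exp(-gamma * np.maximum(sq_dists, 0.0))
    # Center the kernel matrix in feature space
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigh returns eigenvalues in ascending order; reverse to get largest first
    eigvals, eigvecs = np.linalg.eigh(Kc)
    eigvals = eigvals[::-1][:n_components]
    eigvecs = eigvecs[:, ::-1][:, :n_components]
    # Training-sample scores: unit eigenvector scaled by sqrt(eigenvalue)
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return scores, eigvals

# Synthetic stand-in: 40 samples of 10 candidate forecast factors
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
Z, lam = kpca(X, n_components=3, gamma=0.05)
```

Here `Z` holds the reduced inputs that would be passed on to the network in place of the original factors. In practice the number of components is usually chosen from the share of total eigenvalue mass that the leading components retain.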
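The training procedure in item (2) can be sketched as follows, under several assumptions. The abstract does not describe the fuzzy clustering step that selects the number of hidden nodes, so `n_hidden` is simply passed in by hand here. The Gaussian basis function, the single width shared by all nodes, the bias column, and all function names are likewise illustrative choices rather than the thesis's implementation.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-means; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers

def min_center_distance(centers):
    """Smallest distance between two distinct centers, used as the width."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)
    return d.min()

def design_matrix(X, centers, width):
    """Gaussian hidden-layer activations with an appended bias column."""
    sq = np.sum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    Phi = np.exp(-sq / (2.0 * width ** 2))
    return np.hstack([Phi, np.ones((len(X), 1))])

def fit_rbf(X, y, n_hidden):
    centers = kmeans(X, n_hidden)                        # centers from K-means
    width = min_center_distance(centers)                 # width from center spacing
    Phi = design_matrix(X, centers, width)
    weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)    # output weights
    return centers, width, weights

def predict_rbf(X, centers, width, weights):
    return design_matrix(X, centers, width) @ weights

# Toy regression problem standing in for the forecast data
X = np.linspace(0.0, 2.0 * np.pi, 60).reshape(-1, 1)
y = np.sin(X[:, 0])
centers, width, weights = fit_rbf(X, y, n_hidden=8)
rmse = np.sqrt(np.mean((predict_rbf(X, centers, width, weights) - y) ** 2))
```

Once the centers and width are fixed, the output weights enter the model linearly, so the last step is an ordinary linear least-squares problem with a closed-form solution and no iterative tuning.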
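The two combination schemes named in item (3), simple averaging and multiple regression, can be sketched as below. The abstract does not list the six kernel functions, so the sub-model forecasts are replaced by synthetic numbers; the data, noise levels, and function names are assumptions for illustration only.

```python
import numpy as np

def simple_average(preds):
    """Equal-weight ensemble: mean of the sub-model forecasts per sample."""
    return preds.mean(axis=1)

def fit_regression_ensemble(preds, y):
    """Least-squares weights plus intercept mapping sub-model forecasts to y."""
    A = np.hstack([preds, np.ones((len(preds), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def apply_regression_ensemble(preds, coef):
    return preds @ coef[:-1] + coef[-1]

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Synthetic stand-in for six sub-model forecasts of daily precipitation
rng = np.random.default_rng(1)
y = rng.gamma(shape=2.0, scale=5.0, size=50)
preds = y[:, None] + rng.normal(scale=[1, 2, 3, 4, 5, 6], size=(50, 6))

avg = simple_average(preds)
coef = fit_regression_ensemble(preds, y)
reg = apply_regression_ensemble(preds, coef)
```

On the data used to fit it, the regression ensemble can never have a larger squared error than the simple average, because equal weights with zero intercept are one of the solutions it may choose. That guarantee does not carry over to new data, so the two schemes still need to be compared on an independent sample.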
