Study on a BMA ensemble forecasting model based on k-nearest neighbor selection (基于k-最近邻筛选的BMA集合预报模型研究)
  • English title: Study on the Bayesian model averaging coupling with the k-nearest neighbor selection
  • Authors: LIU Kailei (刘开磊); LI Zhijia (李致家); YAO Cheng (姚成); HAN Tong (韩通); ZHONG Li (钟栗); SUN Rufei (孙如飞)
  • Author affiliations (English): The Huaihe River Commission of the Ministry of Water Resources, P.R.C.; College of Hydrology and Water Resources, Hohai University
  • Keywords: ensemble forecast; sample selection method; k-nearest neighbor; Bayesian model averaging; Gaussian mixture model
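  • Journal title (Chinese): SLXB
  • Journal title (English): Journal of Hydraulic Engineering
  • Institutions: Hydrology Bureau (Information Center) of the Huaihe River Commission; College of Hydrology and Water Resources, Hohai University
  • Publication date: 2017-04-12 10:09
  • Publisher: 水利学报 (Journal of Hydraulic Engineering)
  • Year: 2017
  • Issue: v.48; No.487
  • Funding: National Key Research and Development Program of China (2016YFC0400909); National Natural Science Foundation of China (41130639, 51179045, 41101017, 41201028)
  • Language: Chinese
  • Page: SLXB201704002
  • Number of pages: 9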
  • CN:04
  • ISSN:11-1882/TV
  • Classification number: 16-23+33
Abstract
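Redundant training samples reduce both the efficiency and the accuracy of BMA parameter estimation. To address this, this paper proposes an improved model in which the k-nearest neighbor algorithm selects informative training samples before the BMA computation, and only those samples are used to estimate the BMA parameters. The simulation experiment was carried out at Wangjiaba station on the Huai River. Training samples were supplied to BMA under two schemes, with and without k-nearest neighbor selection, and the simulated discharge at Wangjiaba station under the two schemes was analyzed statistically to evaluate the improved BMA method. The results show that, with k-nearest neighbor sample selection, the accuracy of BMA forecasts of both the flood process and the flood peak improves markedly; the spread of the probabilistic forecasts decreases while their reliability increases. Introducing k-nearest neighbor sample selection effectively removes redundant data from the BMA training samples, yields more reliable model parameters from a small number of samples, and improves ensemble forecast performance.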
        Bayesian model averaging (BMA) is a multi-model ensemble forecasting algorithm that uses the Bayesian formula to estimate the posterior probability distribution of forecast variables. The performance of BMA depends largely on the quality of its training datasets. However, these datasets contain many redundant samples that are inconsistent with the current flow state and degrade the accuracy and reliability of BMA forecasts. In this study, the k-nearest neighbor (KNN) method is applied to measure the similarity between historical samples and the most recent flood process, reducing the influence of redundant samples on BMA parameter estimation. Two cases of BMA, i.e. with KNN sample selection (KBMA) and the original one, are investigated and compared at the Wangjiaba catchment, located in the upper region of the Huai River basin. The ensemble means of the two cases were examined against the observations and against the forecasts of their ensemble members to test the efficiency of their deterministic forecasts. Additionally, the probabilistic forecasts of the two cases were intercompared using two assessment criteria, Coverage Rate and Ranked Probability Score. The results indicate that KBMA produces improved deterministic and probabilistic forecasts compared with the original BMA. By employing KNN sample selection, KBMA adjusts its parameters according to the real-time state of the flood process and ensemble members, rather than using all samples. Our analysis demonstrates that the KNN sample selection method has the potential to substantially improve BMA ensemble forecasts.
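The sample-selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure or the features used, so the Euclidean distance, the per-feature standardization, the function name `knn_select`, and the discharge values below are all assumptions made for illustration only.

```python
import numpy as np

def knn_select(history, current, k):
    """Return indices of the k historical samples closest to `current`.

    history: (n_samples, n_features) array; each row is one candidate
             training sample (here, hypothetical member forecasts).
    current: (n_features,) array describing the most recent flow state.
    Assumption: similarity is Euclidean distance on standardized features.
    """
    history = np.asarray(history, dtype=float)
    current = np.asarray(current, dtype=float)
    # Standardize each feature so no single member dominates the distance.
    scale = history.std(axis=0)
    scale[scale == 0] = 1.0
    dist = np.linalg.norm((history - current) / scale, axis=1)
    # Stable sort: on ties, the earlier sample is kept.
    return np.argsort(dist, kind="stable")[:k]

# Hypothetical discharges (m^3/s): two ensemble members per row.
history = np.array([
    [100.0, 110.0],
    [900.0, 950.0],
    [120.0, 105.0],
    [880.0, 1000.0],
    [500.0, 520.0],
])
current = np.array([910.0, 960.0])

selected = knn_select(history, current, k=2)
training_subset = history[selected]  # would be passed on to BMA parameter estimation
```

In this toy case the two high-flow rows are retained and the low-flow rows, which are inconsistent with the current state, are dropped before any BMA weights or variances would be estimated.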
