A Nested KNN Financial Time Series Prediction Model Based on PCA and AP
  • English title: A Financial Time Series Prediction Model Integrating Principal Component Analysis, Affinity Propagation and Nested K-Nearest Neighbor Regression
  • Authors: TANG Li (唐黎); PAN He-ping (潘和平); YAO Yi-yong (姚一永)
  • Affiliations: School of Intelligent Finance, Tianfu College of Southwestern University of Finance and Economics; Business School, Chengdu University; Intelligent Finance Research Center, Chongqing Institute of Finance
  • Keywords: financial time series; principal component analysis; affinity propagation; k-nearest neighbor; financial market prediction
  • Chinese journal title: YUCE
  • English journal title: Forecasting
  • Publication date: 2019-01-27
  • Publisher: 预测 (Forecasting)
  • Year: 2019
  • Issue: v.38; No.226
  • Funding: National Social Science Fund of China (17BGL231)
  • Language: Chinese
  • Page: YUCE201901013
  • Page count: 6
  • CN:01
  • ISSN:34-1013/N
  • Classification number: 94-99
Abstract
This paper proposes PANK (PCA-AP-Nested KNN), a computational intelligence model for financial time series prediction that combines dimensionality reduction with information fusion. The model has three components: (1) Principal Component Analysis (PCA), which reduces redundant information; (2) Affinity Propagation (AP) clustering, which finds cluster centers and their corresponding clusters as a feature-extraction step; and (3) Nested k-Nearest Neighbor (Nested KNN) regression, which produces the prediction. PANK first uses a sliding window to take the most recent segment of the financial time series as input, then applies PCA to reduce redundancy and extract information-rich principal components, passes these components to AP for clustering, and finally predicts with a two-layer Nested KNN. The Nested KNN proposed here is a new reformulation of the original KNN that addresses its two main problems: heavy computation and imbalanced samples. Two PANK models were trained and tested on the EUR/USD exchange rate and the Chinese HS300 (CSI 300) stock index using 15 years of historical data. The results show a best hit rate of 80%, higher than the other KNN-based reference models.
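The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the two KNN layers are nested or how the hit rate is computed, so the sketch assumes that the first layer selects the clusters whose centers lie nearest to the query, that the second layer runs ordinary KNN regression inside those clusters only, and that a "hit" means the predicted direction of change matches the actual one. The function names (`sliding_windows`, `nested_knn_predict`, `hit_rate`) and parameters (`n_clusters`, `k`) are hypothetical. The PCA and AP steps are omitted; the cluster centers and members are taken as given.

```python
import math


def sliding_windows(series, width):
    """Pair each window of `width` consecutive points with the value that follows it."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nested_knn_predict(query, clusters, n_clusters=1, k=3):
    """Two-layer KNN regression (an assumed reading of 'nested').

    clusters: list of (center, members); members are (features, target) pairs.
    Layer 1 keeps the `n_clusters` clusters whose centers are nearest to `query`.
    Layer 2 averages the targets of the `k` nearest members of those clusters.
    """
    nearest = sorted(clusters, key=lambda c: euclidean(query, c[0]))[:n_clusters]
    candidates = [member for _, members in nearest for member in members]
    neighbours = sorted(candidates, key=lambda m: euclidean(query, m[0]))[:k]
    return sum(target for _, target in neighbours) / len(neighbours)


def hit_rate(previous, actual, predicted):
    """Share of steps where predicted and actual moves from `previous` have the same sign.

    Assumed definition of directional accuracy; a step with no change counts as a miss.
    """
    hits = sum(1 for p, a, f in zip(previous, actual, predicted)
               if (a - p) * (f - p) > 0)
    return hits / len(actual)
```

Under these assumptions, restricting the second layer to a few nearby clusters is what cuts the number of distance computations relative to a plain KNN search over every training sample. In a fuller pipeline the windows from `sliding_windows` would first be projected with PCA and grouped with AP, for example using a library such as scikit-learn, before being passed in as `clusters`.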
