Abstract
Given the massive volume of sea surface temperature (SST) data, how to use big data processing platforms and new processing techniques to process, analyze, and predict SST data in real time is a problem that urgently needs to be solved. Building on current time series methods and expert opinion, this paper first introduces the analog complexing method into SST prediction. Second, it proposes SparkDTW, an improved fast dynamic time warping (DTW) algorithm built on the Spark platform. Finally, to make full use of the information obtained through time series mining, it combines SparkDTW with a support vector machine (SVM) to form the SparkDTW+SVM hybrid model, which provides a sound theoretical basis and technical support for applied research on SST prediction. Experimental results show that SparkDTW predicts more accurately than SVM and improves the efficiency of SST prediction, which verifies the feasibility of applying the analog complexing method to SST prediction. SparkDTW+SVM is more accurate than either SparkDTW or SVM alone, which indicates that the SVM model can make full use of the information obtained by time series mining and verifies the effectiveness of SparkDTW+SVM for SST prediction.
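To make the idea of similarity-based prediction concrete, the following is a minimal sketch of the classic DTW distance combined with a simple analog-style forecast: find the historical windows most similar to the most recent one and average the values that followed them. It is not the SparkDTW algorithm and does not reproduce the paper's implementation. The function names `dtw_distance` and `analog_forecast`, the absolute-difference cost, the inverse-distance weighting, and the parameters `window` and `k` are all assumptions made for illustration, and the Spark parallelization and the SVM stage are not shown.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance, O(len(a) * len(b)).

    Uses absolute difference as the local cost (an illustrative choice).
    """
    n, m = len(a), len(b)
    # prev[j] holds the accumulated cost for the previous row of the DTW table.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping steps.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def analog_forecast(series, window, k=3):
    """Predict the next value from the k historical windows closest to the latest one.

    Hypothetical helper: an inverse-distance weighted average of the values
    that followed the k most similar windows.
    """
    query = series[-window:]
    candidates = []
    # Every candidate window must be followed by a known value.
    for start in range(len(series) - window):
        segment = series[start:start + window]
        following = series[start + window]
        candidates.append((dtw_distance(query, segment), following))
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    # The small epsilon avoids division by zero for exact matches.
    weights = [1.0 / (dist + 1e-9) for dist, _ in nearest]
    weighted_sum = sum(w * nxt for w, (_, nxt) in zip(weights, nearest))
    return weighted_sum / sum(weights)
```

In a hybrid setup of the kind the abstract describes, quantities such as these nearest-window distances or continuations could serve as inputs to an SVM regressor, although the exact features the paper uses are not stated here.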