Missing Data Prediction Based on Compressive Sensing in Time Series
  • Original title (Chinese): 基于压缩感知的时间序列缺失数据预测算法
  • Authors: SONG Xiao-xiang (宋晓祥); GUO Yan (郭艳); LI Ning (李宁); WANG Meng (王萌)
  • Affiliation: College of Communications Engineering, Army Engineering University (陆军工程大学通信工程学院)
  • Keywords: time series; missing data; compressive sensing
  • Journal: Computer Science (计算机科学)
  • Journal code: JSJA
  • Publication date: 2018-07-17 15:09
  • Publisher: Computer Science (计算机科学)
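  • Year: 2019
  • Volume: 46
  • Funding: National Natural Science Foundation of China (61571463, 61371124, 61472445); Natural Science Foundation of Jiangsu Province (BK20171401)
  • Language: Chinese
  • Article ID: JSJA201906004
  • Page count: 6
  • Issue: 06
  • CN: 50-1075/TP
  • Pages: 41-46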
Abstract
Data loss occurs frequently during time series acquisition and seriously hinders accurate data analysis. Most existing missing data prediction algorithms look for a pattern in the collected data and use it to predict the missing values, so they are suitable only when a small fraction of the data is missing. To address this, the paper proposes a missing data prediction algorithm based on compressive sensing, in which missing data prediction is formulated as a multiple sparse vector recovery problem. First, a sparse representation basis is designed by exploiting the temporal smoothness of time series, so that the missing values can be obtained by recovering sparse vectors. Second, an observation matrix that has low coherence with the sparse representation basis is designed from the positions of the non-missing data, which ensures the reconstruction performance of the algorithm. Simulation results show that the proposed method predicts the missing data effectively even when the data loss ratio is as high as 90%.
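As a rough illustration of the pipeline the abstract describes, the sketch below recovers a time series from a subset of its samples. It is not the authors' implementation, and the abstract does not say which basis or solver the paper uses. The choices here are assumptions made for illustration: an orthonormal DCT basis stands in for the smoothness-based sparse representation basis, the observation matrix simply selects the rows at the non-missing positions, and orthogonal matching pursuit (OMP) performs the sparse recovery. The function names `dct_basis`, `omp`, and `predict_missing` are hypothetical.

```python
import numpy as np


def dct_basis(n):
    """Orthonormal DCT basis Psi (assumed here); a series is x = Psi @ s."""
    t = np.arange(n)[:, None]   # time index, one row per sample
    k = np.arange(n)[None, :]   # frequency index, one column per atom
    psi = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    psi[:, 0] /= np.sqrt(2.0)   # rescale the constant atom to unit norm
    return psi


def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily select `sparsity` columns of A."""
    norms = np.linalg.norm(A, axis=0)
    norms[norms == 0] = 1.0     # guard against all-zero columns
    support = []
    coef = np.zeros(0)
    residual = y.astype(float).copy()
    for _ in range(sparsity):
        corr = np.abs(A.T @ residual) / norms
        if support:
            corr[support] = 0.0  # never reselect an atom
        support.append(int(np.argmax(corr)))
        # Least-squares fit of y on the atoms selected so far.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    s = np.zeros(A.shape[1])
    s[support] = coef
    return s


def predict_missing(observed_values, observed_idx, n, sparsity):
    """Estimate the full length-n series from the samples at observed_idx."""
    psi = dct_basis(n)
    # Observation matrix Phi keeps only the non-missing rows, so Phi @ Psi
    # is just a row subset of Psi.
    A = psi[observed_idx, :]
    s_hat = omp(A, observed_values, sparsity)
    return psi @ s_hat


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 500
    t = np.arange(n)
    # Smooth synthetic series (illustrative data, not from the paper).
    x = np.sin(2 * np.pi * t / n) + 0.5 * np.cos(6 * np.pi * t / n)
    keep = np.sort(rng.choice(n, size=n // 10, replace=False))  # 90% missing
    x_hat = predict_missing(x[keep], keep, n, sparsity=10)
    print("relative error:", np.linalg.norm(x_hat - x) / np.linalg.norm(x))
```

The sparsity level passed to `predict_missing` is a tuning parameter in this sketch. A smoother series concentrates its energy in fewer low-frequency atoms, so fewer observed samples are needed to recover it.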
