Model Detection and Estimation Theory Based on Functional Data
Abstract
Over the past twenty years, rapid advances in science and technology have produced ever more data with functional features (functional data for short), so developing functional data analysis suited to such data is necessary. Functional data analysis has several advantages of its own: it supports statistical analysis of curve data from infinite-dimensional spaces, it can extract additional information through methods specific to it, it allows different subjects to have different numbers of observations, and its methods remain applicable to some non-functional data. It also has broad application prospects in growth analysis, meteorology, biomechanics, economics, medicine, and many other fields. For these reasons functional data analysis has been an active area of statistical research for the past two decades, and researchers in China and abroad have obtained many theoretical and applied results. However, the intrinsically infinite-dimensional nature of functional data poses great challenges for statistical inference. Research on functional data is therefore still at an early stage, and many problems need further study.
     This thesis mainly considers regression models for functional data. In recent years statisticians have proposed a number of such models, but most work has concentrated on the functional linear model and the functional nonparametric model. This thesis therefore attempts to enrich and extend functional regression models and proposes the idea of model detection for some of them.
     First, we combine functional principal component analysis with the Group Lasso idea from multivariate analysis to detect the significant orders in the functional polynomial model. We apply the proposed method to spectral data, and the analysis of the real data leads to some interesting conclusions. Then, motivated by practical needs, we develop a functional polynomial model with autoregressive errors. This model raises two problems: detecting the significant orders of the polynomial and detecting the significant orders of the autoregressive error. We therefore develop a joint detection method that solves both problems simultaneously.
     Second, we develop a functional polynomial multiplicative model, a useful alternative to the functional polynomial model proposed by Yao and Muller (2010). We extend the least absolute relative errors (LARE) criterion to the estimation of this model and carry out model detection for it. Applying the proposed model and methods to the Canadian climate data gives good prediction results.
     Third, we consider the partially linear model for functional data. In longitudinal data analysis the number of observations per subject is sparse. In practice, however, some subjects may be observed sparsely and others densely, and there is no absolute standard separating sparse from dense, so it can be hard to decide which scenario applies to a given data set. We therefore establish a unified estimation method covering the partially linear model for sparse functional data (that is, longitudinal data) and for dense functional data, and we study the large-sample properties of the estimators.
     Finally, we establish a functional additive model, based on functional singular component analysis, for data in which both the covariate and the response are functional. Principal component analysis rests on the decomposition of auto-covariance functions, which reflects only the information in the covariate and in the response separately and ignores the dependence between them. Functional singular component analysis instead decomposes the cross-covariance function of the response and the covariate, so it better reflects their dependence. For the resulting model we study estimation and the asymptotic properties of the estimators.
References
[1]Abraham, C., Cornillon, P.A., Matzner-Lober, E., Molinari, N. Unsupervised curve clustering using B-splines [J]. Scandinavian Journal of Statistics,2003,30:581-595.
    [2]Ait-Saidi, A., Ferraty, F., Kassa, R., Vieu, P. Cross-validated estimations in the single-functional index model [J]. Statistics,2008,42:475-494.
    [3]Akaike, H. A Bayesian analysis of the minimum AIC procedure [J]. Annals of the Institute of Statistical Mathematics,1978,30A:9-14.
    [4]Albertos, J., Fraiman, R. Impartial Trimmed k-means for Functional Data [J]. Computational Statistics and Data Analysis,2007,51:4864-4877.
    [5]Amato, U., Antoniadis, A., De Feis, I. Dimension reduction in functional regression with applications [J]. Computational Statistics and Data Analysis,2006,50:2422-2446.
    [6]Aneiros Perez, G., Vieu, P. Semi-functional partial linear regression [J]. Statistics and Probability Letters,2006,76:1102-1110.
    [7]Aneiros Perez, G., Vieu, P. Nonparametric time series prediction:A semi-functional partial linear modeling [J]. Journal of Multivariate Analysis,2008,100:834-857.
    [8]Antoniadis, A. Wavelets in statistics:A review (with discussion) [J]. Journal of the Italian Statistical Society,1997,6:97-144.
    [9]Bhansali, R. J. Order selection for linear time series models:a review. In Developments in Time Series Analysis (ed. T. Subba Rao), London, UK:Chapman and Hall,1993:50-56.
    [10]Borggaard, C., Thodberg, H. Optimal minimal neural interpretation of spectra [J]. Analytical Chemistry,1992,64:545-551.
    [11]Bosq, D. Modelization, nonparametric estimation and prediction for continuous time processes [J]. NATO ASI Series C,2003,35:509-52.
    [12]Breiman, L. "Better Subset Regression Using the Nonnegative Garrote" [J]. Technometrics,1995,37:373-384.
    [13]Brockwell, P.J., Davis, R.A. Time Series:Theory and Methods [M]. Springer, New York,1991.
    [14]Brumback, B., Rice, J.A. Smoothing spline models for the analysis of nested and crossed samples of curves (with discussion) [J]. Journal of the American Statistical Association,1998,93:961-994.
    [15]Candes, E., Tao, T. "The Dantzig Selector:Statistical Estimation When p is Much Larger Than n" (with discussion) [J]. Annals of Statistics,2007,35:2313-2404.
    [16]Cai, T.T., Hall, P. Prediction in functional linear regression [J]. Annals of Statistics,2006,34:2159-2179.
    [17]Cardot, H., Ferraty, F., Sarda, P. Spline estimators for the functional linear model [J]. Statistica Sinica,2003,13:571-591.
    [18]Cardot, H. Conditional functional principal components analysis [J]. Scandinavian Journal of Statistics,2007,34:317-335.
    [19]Cardot, H., Crambes, C., Kneip, A., Sarda, P. Smoothing spline estimators in functional linear regression with errors in variables [J]. Computational Statistics and Data Analysis,2007,51:4832-4848.
    [20]Chen, D., Hall, P., Muller, H.G. Single and multiple index functional regression models with nonparametric link [J]. Annals of Statistics,2011,39:1720-1747.
    [21]Chen, K., Guo, S., Lin, Y., Ying, Z. Least absolute relative error estimation [J]. Journal of the American Statistical Association,2010,105:1104-1112.
    [22]Chiou, J.M., Muller, H.G., Wang, J.L. Functional response models [J]. Statistica Sinica,2004,14:659-677.
    [23]Cho, H., Goude, Y., Brossat, X., Yao, Q. Modelling and forecasting daily electricity load curves:a hybrid approach [J]. Journal of the American Statistical Association. (to appear)
    [24]Cook, R.D., Forzani, L., Yao, A.F. Necessary and sufficient conditions for smooth functional inverse regression [J]. Statistica Sinica,2010,20:235-238.
    [25]Crambes, C. Total least squares for functional data [J]. ASMDA Brest,2005, Proceedings, France:619-626.
    [26]Dabo-Niang, S., Ferraty, F., Vieu, P. On the using of modal curves for radar waveforms classification [J]. Computational Statistics and Data Analysis,2007,51:4832-4848.
    [27]Delaigle, A., Hall, P. Methodology and theory for partial least squares applied to functional data [J]. Annals of Statistics,2012,40:322-352.
    [28]Delaigle, A., Hall, P., Bathia, N. Componentwise classification and clustering of functional data [J]. Biometrika,2012,99:299-313.
    [29]Efron, B., Hastie, T., Johnstone, I., Tibshirani, R. Least Angle Regression [J]. Annals of Statistics,2004,32:407-489.
    [30]Fan, J. Comments on "Wavelets in statistics:A review," by A. Antoniadis [J]. Journal of the Italian Statistical Society,1997,6:131-138.
    [31]Fan, J., Li, R. Variable Selection via Non-concave Penalized Likelihood and Its Oracle Properties [J]. Journal of the American Statistical Association,2001,96:1348-1360.
    [32]Fan, J., Peng, H. Nonconcave penalized likelihood with a diverging number of parameters [J]. Annals of Statistics,2004,32:928-961.
    [33]Fan, J., Li, R. New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis [J]. Journal of the American Statistical Association,2004,99:710-723.
    [34]Fan, J., Lv, J. "Sure Independence Screening for Ultrahigh Dimensional Feature Space" (with discussion) [J]. Journal of the Royal Statistical Society, Series B,2008, 70:849-911.
    [35]Fan, J., Song, R. "Sure Independence Screening in Generalized Linear Models With NP-Dimensionality" [J]. The Annals of Statistics,2010,38:3567-3604.
    [36]Faraway, J.J. Regression analysis for a functional response [J]. Technometrics,1997,39:254-262.
    [37]Ferraty, F., Vieu, P. Curves discrimination:A nonparametric functional approach [J]. Computational Statistics and Data Analysis,2003,44:161-173.
    [38]Ferraty, F., Vieu, P. Nonparametric Functional Data Analysis. Theory and Practice [M]. Springer-Verlag, New York,2006.
    [39]Ferraty, F., Hall, P., Vieu, P. Most predictive design points for functional data predictor [J]. Biometrika,2010,97:807-824.
    [40]Ferraty, F., Romain, Y. The Oxford Handbook of Functional Data Analysis [M]. Oxford University Press,2011.
    [41]Ferraty, F., Van Keilegom, I., Vieu, P. Regression when both response and predictor are functions [J]. Journal of Multivariate Analysis,2012,109:10-28.
    [42]Ferraty, F., Goia, A., Salinelli, E., Vieu, P. Functional projection pursuit regression [J]. Test,2012.
    [43]Ferre, L., Yao, A. F. Functional sliced inverse regression analysis [J]. Statistics,2003, 37:475-488.
    [44]Ferre, L., Yao, A. F. Smoothed functional inverse regression [J]. Statistica Sinica, 2005,15:665-683.
    [45]Forzani, L., Cook, R. D. A note on smooth functional inverse regression [J]. Statistica Sinica,2007,17:1677-1681.
    [46]Frank, I.E., Friedman, J.H. A Statistical View of Some Chemometrics Regression Tools [J]. Technometrics,1993,35:109-148.
    [47]Gabrys, R., Kokoszka, P. Portmanteau test of independence for functional observations [J]. Journal of the American Statistical Association,2007,102:1338-1348.
    [48]Gabrys, R., Horvath, L., Kokoszka, P. Tests for error correlation in the functional linear model [J]. Journal of the American Statistical Association,2010,105:1113-1125.
    [49]Hall, P., Heckman, N. Estimating and depicting the structure of a distribution of random functions [J]. Biometrika,2002,89:145-158.
    [50]Hall, P., Horowitz, J.L. Methodology and convergence rates for functional linear regression [J]. Annals of Statistics,2007,35:70-91.
    [51]Hall, P., Hosseini-Nasab, M. On properties of functional principal components analysis [J]. Journal of the Royal Statistical Society, Series B,2006,68:109-126.
    [52]Hannan, E.J., Quinn, B.G. The determination of the order of an autoregression [J]. Journal of the Royal Statistical Society, Series B,1979,41:190-195.
    [53]Hannan, E.J. The estimation of the order of an ARMA process [J]. The Annals of Statistics,1980,8:1071-1081.
    [54]Hannan, E.J., Rissanen, J. Recursive estimation of mixed autoregressive-moving average order [J]. Biometrika,1982,69:81-94.
    [55]Harezlak, J., Coull, B., Laird, N., Magari, Christiani, D. Penalized solutions to functional regression problems [J]. Computational Statistics and Data Analysis,2007,15:4911-4925.
    [56]Hastie, T., Buja, A., Tibshirani, R. Penalized discriminant analysis [J]. Annals of Statistics,1995,23:73-102.
    [57]He, G., Muller, H., Wang, J.L. Extending correlation and regression from multivariate to functional data [J]. Asymptotics in Statistics and Probability,2000,52:197-210.
    [58]Horvath, L., Kokoszka, P., Reimherr, M. Two sample inference in functional linear models [J]. Canadian Journal of Statistics,2009,37:571-591.
    [59]Horvath, L., Huskova, M., Kokoszka, P. Testing the stability of the functional autoregressive process [J]. Journal of Multivariate Analysis,2010,101:352-367.
    [60]Horvath, L., Kokoszka, P., Reeder, R. Estimation of the mean of functional time series and a two sample problem [J]. Journal of the Royal Statistical Society, Series B,2012,74:1-20.
    [61]Horvath, L., Reeder, R. A test of significance in functional quadratic regression. Technical Report. University of Utah, preprint available at http://arxiv.org/abs/1105.0014,2011.
    [62]Horvath, L., Kokoszka, P. Inference for Functional Data with Applications [M]. Springer, New York,2012.
    [63]James, G.M., Sugar, C.A. Clustering sparsely sampled functional data [J]. Journal of the American Statistical Association,2003,98:397-40.
    [64]James, G.M., Silverman, B.W. Functional adaptive model estimation [J]. Journal of the American Statistical Association,2005,100:565-576.
    [65]Khoshgoftaar, T.M., Bhattacharyya, B. B., Richardson, G.D. Predicting Software Errors, During Development, Using Nonlinear Regression Models:A Comparative Study [J]. IEEE Transactions on Reliability,1992,41:390-395.
    [66]Kokoszka, P., Reimherr, M. Determining the order of the functional autoregressive model [J]. Journal of Time Series Analysis,2012,34:116-129.
    [67]Lee, E. R., Park, B. U. Sparse estimation in functional linear regression [J]. Journal of Multivariate Analysis,2012,105:1-17.
    [68]Li, Y.H., Wang, N., Carroll, R.J. Generalized functional linear models with semiparametric single-index interactions [J]. Journal of the American Statistical Association,2010,105:621-633.
    [69]Li, R., Zhong, W., Zhu, L. Feature screening via distance correlation learning [J]. Journal of the American Statistical Association,2012,107:1129-1139.
    [70]Lin, X.H., Carroll, R.J. Semiparametric regression for clustered data:using generalized estimating equations [J]. Journal of the American Statistical Association,2001,96:1045-1056.
    [71]Lian, H. Functional Partial Linear Regression [J]. Journal of Nonparametric Statistics,2009,139:3405-3418.
    [72]Lian, H. Shrinkage estimation and selection for multiple functional regression [J]. Statistica Sinica,2013,23:51-74.
    [73]Liang, H., Li, R. Variable selection for partially linear models with measurement errors [J]. Journal of the American Statistical Association,2009,104:234-248.
    [74]Marx, B.D., Eilers, P.H. Generalized linear regression on sampled signals with penalized likelihood. In A. Forcina, G. M. Marchetti, R. Hatzinger, and G. Galmacci (Eds.):Statistical Modelling. Proceedings of the 11th International Workshop on Statistical Modelling, Orvieto,1996.
    [75]Muller, H.G., Stadtmuller, U. Generalized functional linear models [J]. Annals of Statistics,2005,33:774-805.
    [76]Muller, H.G. Functional modelling and classification of longitudinal data [J]. Scandinavian Journal of Statistics,2006,32:223-246.
    [77]Muller, H.-G., Yao, F. Functional additive models [J]. Journal of the American Statistical Association,2008,103:1534-1544.
    [78]Muller, H.-G., Zhang, Y. Time-varying functional regression for predicting remaining lifetime distributions from longitudinal trajectories [J]. Biometrics,2005,61:1064-1075.
    [79]Narula, S.C., Wellington, J.F. Prediction, linear regression and the minimum sum of relative errors [J]. Technometrics,1977,19:185-190.
    [80]Park, H., Stefanski, L.A. Relative-Error Prediction [J]. Statistics and Probability Letters,1998,40:227-236.
    [81]Preda, C., Saporta, G. PLS regression on a stochastic process [J]. Computational Statistics and Data Analysis,2005,48:149-158.
    [82]Ramsay, J., Dalzell, C. Some Tools for Functional Data Analysis [J]. Journal of the Royal Statistical Society:Series B,1991,53:539-572.
    [83]Ramsay, J., Silverman, B. Functional Data Analysis [M]. Springer, New York,1997.
    [84]Ramsay, J., Silverman, B. Functional Data Analysis, second ed [M]. Springer, Berlin, 2005.
    [85]Rice, J., Silverman, B. Estimating the mean and covariance structure nonparametrically when the data are curves [J]. Journal of the Royal Statistical Society:Series B,1991,53:233-243.
    [86]Rice, J. Functional and longitudinal data analysis:Perspectives on smoothing [J]. Statistica Sinica,2004,14:631-647.
    [87]Shen, Q., Faraway, J. An F test for linear models with functional responses [J]. Statistica Sinica,2004,14:1239-1257.
    [88]Shibata, R. Asymptotically efficient selection of the order of the model for estimating parameters of a linear process [J]. The Annals of Statistics,1980,8:147-164.
    [89]Shin, H. Partial functional linear regression [J]. Journal of Statistical Planning and Inference,2009,139:3405-3418.
    [91]Tarpey, T., Kinateder, K.K.J. Clustering functional data [J]. Journal of Classification,2003,20:93-114.
    [92]Tibshirani, R. Regression Shrinkage and Selection via the LASSO [J]. Journal of the Royal Statistical Society:Series B,1996,58:267-288.
    [93]Wang, H., Xia, Y. Shrinkage Estimation of the Varying Coefficient Model [J]. Journal of the American Statistical Association,2009,104:747-757.
    [94]Wang, H., Li, G., Tsai, C.-L. Regression Coefficient and Autoregressive Order Shrinkage and Selection via Lasso [J]. Journal of the Royal Statistical Society:Series B,2007,69:63-78.
    [95]Wang, G., Lin, N., Zhang, B. Functional contour regression [J]. Journal of Multivariate Analysis,2013,116:1-13.
    [96]Wu, C.F. Asymptotic theory of nonlinear least squares estimation [J]. Annals of Statistics,1981,9:501-513.
    [97]Wu, Y., Fan, J., Muller, H.G. Varying-coefficient functional linear regression [J]. Bernoulli,2010,16:730-758.
    [98]Yang, W., Muller, H.G., Stadtmuller, U. Functional singular component analysis [J]. Journal of the Royal Statistical Society:Series B,2011,73:303-324.
    [99]Yao, F., Muller, H.G., Wang, J.L. Functional data analysis for sparse longitudinal data [J]. Journal of the American Statistical Association,2005,100:577-590.
    [100]Yao, F., Muller, H.G. Functional quadratic regression [J]. Biometrika,2010,97:49-64.
    [101]Yuan, M., Lin, Y. Model Selection and Estimation in Regression with Grouped Variables [J]. Journal of the Royal Statistical Society:Series B,2006,68:49-67.
    [102]Yuan, M., Cai, T. A Reproducing Kernel Hilbert Space Approach to Functional Linear Regression [J]. Annals of Statistics,2010,38:3412-3444.
    [103]Zeger, S.L., Diggle, P.J. Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters [J]. Biometrics,1994,50:689-699.
    [104]Zhang, H.H., Lu, W. Adaptive lasso for Cox's proportional hazard model [J]. Biometrika,2007,94:691-703.
    [105]Zhang, Q.Z., Wang, Q. H. Local least absolute relative error estimating approach for partially linear multiplicative model [J]. Statistica Sinica,2013,23:1091-1116.
    [106]Zhou, J., Wang, N., Wang, N. Functional linear model with zero-value coefficient function at sub-region [J]. Statistica Sinica,2013,23:25-50.
    [107]Zhao, X., Marron, J.S., Wells, M.T. The functional data analysis view of longitudinal data [J]. Statistica Sinica,2004,14:789-808.
    [108]Zou, H., Hastie, T. "Regularization and Variable Selection via the Elastic Net" [J]. Journal of the Royal Statistical Society, Series B,2005,67:301-320.
    [109]Zou, H. The Adaptive LASSO and Its Oracle Properties [J]. Journal of the American Statistical Association,2006,101:1418-1429.
