Abstract
Feature screening is an important step in dimension reduction when modeling ultrahigh-dimensional data. Improved screening methods based on the conditional quantile can select an unstable set of variables when the given quantile level is slightly perturbed. To address this problem, the idea of global conditional quantiles is introduced, and a feature screening method for ultrahigh-dimensional data based on the conditional interval quantile is proposed. Theoretical analysis and numerical simulations show that the proposed procedure possesses the sure screening property and illustrate its finite-sample properties.