A powerful procedure for multiple outcomes comparison with covariate adjustment and its application to genomic data
  • English title: A powerful procedure for multiple outcomes comparison with covariate adjustment and its application to genomic data
  • Authors: 张胜虎 (ZHANG Shenghu); 朱家砚 (ZHU Jiayan); 张三国 (ZHANG Sanguo)
  • Affiliations: School of Mathematical Sciences, University of Chinese Academy of Sciences; Key Laboratory of Big Data Mining and Knowledge Management of Chinese Academy of Sciences; School of Information and Communication, Wuhan College
  • Keywords: multiple outcomes comparison; covariate adjustment; pseudo F test; power
  • Journal abbreviation (Chinese): ZKYB
  • Journal (English): Journal of University of Chinese Academy of Sciences
  • Publication date: 2019-03-14
  • Publisher: Journal of University of Chinese Academy of Sciences
  • Year: 2019
  • Issue: v.36
  • Funding: Special Fund of University of Chinese Academy of Sciences for Scientific Research Cooperation (Y652022Y00)
  • Language: English
  • Page: ZKYB201902001
  • Page count: 7
  • CN:02
  • ISSN:10-1131/N
  • Classification number: 14-20
Abstract
Although many procedures have been developed for multiple outcomes comparison, nonparametric methodology for group comparison with covariate adjustment is still in its infancy. An intuitive approach is to project the data onto the space orthogonal to the one spanned by the covariates and then apply the rank-sum test, the adjusted rank-sum test, or a max-type test to the processed data. However, the power of these tests is generally unsatisfactory. In this work, we combine the adjusted rank-sum test and the pseudo F test to construct a MIN2 test. Extensive computer simulations and a real data example show that MIN2 outperforms existing nonparametric tests.
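The baseline approach mentioned in the abstract (remove the covariate effect by projection, then run a rank-sum test on the processed data) can be sketched as follows. This is only an illustrative sketch, not the paper's procedure: the function names `residualize` and `obrien_rank_sum`, the simulated data, and the choice of an O'Brien-type rank-sum statistic compared by a two-sample t-test are all assumptions. The adjusted rank-sum test, the pseudo F test, and the MIN2 combination are not reproduced here because the abstract does not define them.

```python
import numpy as np
from scipy import stats


def residualize(Y, X):
    """Project each outcome column of Y onto the space orthogonal to
    the span of the covariates X (plus an intercept), via least squares."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return Y - design @ coef


def obrien_rank_sum(Y, group):
    """O'Brien-type rank-sum test (assumed form): rank every outcome over the
    pooled sample, sum the ranks per subject, then compare the two groups
    with a two-sample t-test on those rank sums."""
    ranks = stats.rankdata(Y, axis=0)
    score = ranks.sum(axis=1)
    return stats.ttest_ind(score[group == 1], score[group == 0])


# Hypothetical simulated data: 60 subjects, 2 covariates, 5 outcomes.
rng = np.random.default_rng(0)
n, p = 60, 5
group = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 2))
Y = X @ rng.normal(size=(2, p)) + 0.8 * group[:, None] + rng.normal(size=(n, p))

R = residualize(Y, X)                    # covariate-adjusted outcomes
stat, pvalue = obrien_rank_sum(R, group)
print(f"t = {stat:.3f}, p = {pvalue:.4g}")
```

In this sketch the group indicator is deliberately left out of the projection, so any group effect stays in the residuals and can be picked up by the rank-based comparison.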