Preferred analysis methods for Affymetrix GeneChips. II. An expanded, balanced, wholly-defined spike-in dataset
  • Authors: Qianqian Zhu (1) (2) (7)
    Jeffrey C Miecznikowski (2) (5)
    Marc S Halfon (1) (3) (4) (6)
  • Journal: BMC Bioinformatics
  • Publication year: 2010
  • Publication date: December 2010
  • Volume: 11
  • Issue: 1
  • Full text size: 5716 KB
  • Author affiliations:
    1. Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY, 14214, USA
    2. Department of Biostatistics, State University of New York at Buffalo, Buffalo, NY, 14214, USA
    3. Department of Biology, State University of New York at Buffalo, Buffalo, NY, 14260, USA
    4. New York State Center of Excellence in Bioinformatics and the Life Sciences, Buffalo, NY, 14203, USA
    5. Department of Biostatistics, Roswell Park Cancer Institute, Buffalo, NY, 14263, USA
    6. Department of Molecular and Cellular Biology, Roswell Park Cancer Institute, Buffalo, NY, 14263, USA
    7. Current Address: Center for Human Genome Variation, Duke University, Durham, NC, 27708, USA
  • ISSN:1471-2105
Abstract
Background: Concomitant with the rise in the popularity of DNA microarrays has been a surge of proposed methods for the analysis of microarray data. Fully controlled "spike-in" datasets are an invaluable but rare tool for assessing the performance of various methods.

Results: We generated a new wholly defined Affymetrix spike-in dataset consisting of 18 microarrays. Over 5700 RNAs are spiked in at relative concentrations ranging from 1- to 4-fold, and the arrays from each condition are balanced with respect to both total RNA amount and degree of positive versus negative fold change. We use this new "Platinum Spike" dataset to evaluate microarray analysis routes and contrast the results to those achieved using our earlier Golden Spike dataset.

Conclusions: We present updated best-route methods for Affymetrix GeneChip analysis and demonstrate that the degree of "imbalance" in gene expression has a significant effect on the performance of these methods.
