caCORRECT2: Improving the accuracy and reliability of microarray data in the presence of artifacts
  • Authors: Richard A Moffitt (1)
    Qiqin Yin-Goen (2)
    Todd H Stokes (1)
    R Mitchell Parry (1)
    James H Torrance (1)
    John H Phan (1)
    Andrew N Young (2)
    May D Wang (1) (3) (4)
  • Journal: BMC Bioinformatics
  • Year: 2011
  • Publication date: December 2011
  • Volume: 12
  • Issue: 1
  • Full-text size: 1791KB
  • Affiliations:
    1. The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, Atlanta, GA, 30332, USA
    2. Department of Pathology and Laboratory Medicine, Emory University School of Medicine, Grady Health System, Grady Memorial Hospital, Atlanta, GA, 30303, USA
    3. Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30332, USA
    4. Winship Cancer Institute, Emory University, Atlanta, GA, 30322, USA
  • ISSN:1471-2105
Abstract
Background: In previous work, we reported the development of caCORRECT, a novel microarray quality control system built to identify and correct spatial artifacts commonly found on Affymetrix arrays. We have since improved caCORRECT by developing a model-based data-replacement strategy and by integrating it with typical microarray workflows through caCORRECT's web portal and caBIG grid services. In this report, we demonstrate that caCORRECT improves the reproducibility and reliability of experimental results across several common Affymetrix microarray platforms. caCORRECT represents an advance over state-of-the-art quality control methods such as Harshlighting, and it improves gene expression calculation techniques such as PLIER, RMA and MAS5.0, because it incorporates spatial information into outlier detection as well as outlier information into probe normalization. The ability of caCORRECT to recover accurate gene expression values from low-quality probe intensity data is assessed using a combination of real and synthetic artifacts, with PCR follow-up confirmation and the affycomp spike-in data. The caCORRECT tool can be accessed at http://cacorrect.bme.gatech.edu.

Results: We demonstrate that (1) caCORRECT's artifact-aware normalization avoids the undesirable global data warping that occurs when damaged chips are processed without caCORRECT; (2) when used upstream of RMA, PLIER, or MAS5.0, the data imputation of caCORRECT generally improves the accuracy of microarray gene expression in the presence of artifacts more than using Harshlighting or using no quality control; and (3) biomarkers selected from artifactual microarray data that have undergone the quality control procedures of caCORRECT are more likely to be reliable, as shown by both spike-in and PCR validation experiments. Finally, we present a case study in which caCORRECT is used to reliably identify biomarkers for renal cell carcinoma, yielding two diagnostic biomarkers with potential clinical utility, PRKAB1 and NNMT.

Conclusions: caCORRECT is shown to improve the accuracy of gene expression and the reproducibility of experimental results in clinical application. This study suggests that caCORRECT will be useful for cleaning up possible artifacts in new as well as archived microarray data.
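
The idea of feeding outlier information into probe normalization can be illustrated with a small sketch. The code below is not caCORRECT's algorithm, which the abstract does not specify; it is a generic quantile normalization in which probes flagged as artifacts are excluded from the reference distribution and left as NaN for a later imputation step. The function name `masked_quantile_normalize`, its parameters, and the simulated data are all hypothetical.

```python
import numpy as np

def masked_quantile_normalize(x, artifact_mask, n_quantiles=101):
    """Quantile-normalize columns (chips) using only unflagged probes.

    Illustrative assumption, not the published caCORRECT procedure.
    x: (probes, chips) intensities; artifact_mask: same shape, True = flagged.
    """
    x = np.asarray(x, dtype=float)
    mask = np.asarray(artifact_mask, dtype=bool)
    grid = np.linspace(0.0, 1.0, n_quantiles)
    # Reference distribution: mean of per-chip quantiles over clean probes only,
    # so a damaged region cannot warp the target shared by all chips.
    ref = np.mean(
        [np.quantile(x[~mask[:, j], j], grid) for j in range(x.shape[1])],
        axis=0,
    )
    out = np.full_like(x, np.nan)  # flagged probes stay NaN
    for j in range(x.shape[1]):
        clean = ~mask[:, j]
        vals = x[clean, j]
        # Empirical quantile position of each clean probe within its own chip.
        ranks = vals.argsort().argsort()
        pos = ranks / max(len(vals) - 1, 1)
        out[clean, j] = np.interp(pos, grid, ref)
    return out

# Simulated example: three chips, one with a bright artifact over 40 probes.
rng = np.random.default_rng(0)
chips = rng.lognormal(mean=7.0, sigma=1.0, size=(200, 3))
flags = np.zeros_like(chips, dtype=bool)
flags[:40, 1] = True
chips[:40, 1] *= 20.0
normalized = masked_quantile_normalize(chips, flags)
```

In this sketch the inflated intensities on the second chip do not enter the shared reference distribution, so they cannot shift the normalized values of the undamaged chips; the flagged positions would still need to be filled by a separate replacement step.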
