Prediction of breast cancer prognosis using gene set statistics provides signature stability and biological context
  • Authors: Gad Abraham (1) (2)
    Adam Kowalczyk (2)
    Sherene Loi (3) (4)
    Izhak Haviv (5) (6) (7)
    Justin Zobel (1) (2)
  • Journal: BMC Bioinformatics
  • Publication year: 2010
  • Publication date: December 2010
  • Volume: 11
  • Issue: 1
  • Full-text size: 1118KB
  • Author affiliations:

    1. Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, 3010, VIC, Australia
    2. NICTA Victoria Laboratory, The University of Melbourne, Parkville, 3010, VIC, Australia
    3. Department of Translational Research and Functional Genomics Unit, Jules Bordet Institute, Brussels, Belgium
    4. Department of Medical Oncology, Peter MacCallum Cancer Centre, East Melbourne, VIC, 3002, Australia
    5. Metastasis Research Laboratory, Peter MacCallum Cancer Centre, East Melbourne, VIC, 3002, Australia
    6. The Blood and DNA Profiling Facility, Baker IDI Institute, Prahran, VIC, 3004, Australia
    7. Department of Biochemistry, School of Medicine, University of Melbourne, VIC, 3010, Australia
  • ISSN:1471-2105
Abstract
Background: Different microarray studies have compiled gene lists for predicting outcomes of a range of treatments and diseases. These lists have little overlap, indicating that the results from any one study are unstable. It has been suggested that the underlying pathways are essentially identical, and that the expression of gene sets, rather than that of individual genes, may be more informative with respect to prognosis and understanding of the underlying biological process.

Results: We sought to examine the stability of prognostic signatures based on gene sets rather than individual genes. We classified breast cancer cases from five microarray studies according to the risk of metastasis, using features derived from predefined gene sets. The expression levels of the genes in each set were aggregated using what we call a set statistic. The resulting prognostic gene sets were as predictive as the lists of individual genes, but displayed more consistent rankings across bootstrap replications within datasets, produced more stable classifiers across different datasets, and are potentially more interpretable in the biological context, since they examine gene expression in the context of neighbouring genes in the pathway. In addition, we performed this analysis in each breast cancer molecular subtype, based on ER/HER2 status. The prognostic gene sets found in each subtype were consistent with the biology based on previous analysis of individual genes.

Conclusions: To date, most analyses of gene expression data have focused on the level of individual genes. We show that a complementary approach of examining the data using predefined gene sets can reduce the noise and could provide increased insight into the underlying biological pathways.
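
The idea of a set statistic can be illustrated with a minimal sketch. The abstract does not say which statistic the authors used, so the per-sample mean below is an assumption made purely for illustration; the function name set_statistic, the gene and set names, and the expression values are all hypothetical.

```python
from statistics import mean


def set_statistic(expression, gene_sets, aggregate=mean):
    """Collapse a genes-by-samples table into a gene-sets-by-samples table.

    expression: dict mapping gene name -> list of values, one per sample.
    gene_sets:  dict mapping set name -> list of gene names.
    aggregate:  function reducing a list of numbers to one number.
                The mean is an assumed choice, not taken from the paper.
    """
    features = {}
    for set_name, genes in gene_sets.items():
        # Use only the genes of this set that were actually measured.
        present = [g for g in genes if g in expression]
        if not present:
            continue  # skip sets with no measured genes
        n_samples = len(expression[present[0]])
        # One aggregated value per sample for this gene set.
        features[set_name] = [
            aggregate([expression[g][s] for g in present])
            for s in range(n_samples)
        ]
    return features


# Toy data: three genes measured in two samples (values are made up).
expression = {
    "GENE_A": [1.0, 3.0],
    "GENE_B": [3.0, 5.0],
    "GENE_C": [10.0, 20.0],
}
gene_sets = {
    "SET_1": ["GENE_A", "GENE_B"],
    "SET_2": ["GENE_C", "GENE_MISSING"],
}
features = set_statistic(expression, gene_sets)
```

Each gene set then contributes one feature per sample, and a classifier can be trained on these set-level features instead of on individual genes. Other aggregates (for example a median) can be tried by passing a different aggregate function.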
