Atlas2 Cloud: a framework for personal genome analysis in the cloud
  • Authors: Uday S Evani (1)
    Danny Challis (1)
    Jin Yu (1)
    Andrew R Jackson (2) (3)
    Sameer Paithankar (2) (3)
    Matthew N Bainbridge (1)
    Adinarayana Jakkamsetti (1)
    Peter Pham (1)
    Cristian Coarfa (2) (3)
    Aleksandar Milosavljevic (2) (3)
    Fuli Yu (1) (3)
  • Journal: BMC Genomics
  • Publication year: 2012
  • Publication date: October 2012
  • Volume: 13
  • Issue: Suppl 6
  • Full-text size: 498 KB
  • Author affiliations:
    1. The Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
    2. Bioinformatics Research Laboratory, Epigenome Center, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
    3. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
Abstract

Background: Until recently, sequencing was carried out primarily in large genome centers, which have invested heavily in the computational infrastructure that enables genomic sequence analysis. Recent advances in next-generation sequencing (NGS) have led to wide dissemination of sequencing technologies and data to highly diverse research groups, and clinical sequencing is expected to become part of diagnostic routines shortly. However, limited access to computational infrastructure and high-quality bioinformatic tools, together with the demand for personnel skilled in data analysis and interpretation, remains a serious bottleneck. Cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues.

Results: We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via Amazon Web Services using a Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute, and I/O when running Atlas2 Amazon on a large data set.

Conclusions: We find that providing a web interface and an optimized pipeline clearly facilitates the use of cloud computing for personal genome analysis; for it to be used routinely in large-scale projects, however, there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.
