Pacific biosciences sequencing technology for genotyping and variation discovery in human data
  • Authors: Mauricio O Carneiro (1)
    Carsten Russ (2)
    Michael G Ross (2)
    Stacey B Gabriel (1)
    Chad Nusbaum (2)
    Mark A DePristo (1)
  • Journal: BMC Genomics
  • Publication date: December 2012
  • Volume: 13
  • Issue: 1
  • Author affiliations:

    1. Broad Institute of MIT and Harvard, Medical and Population Genetics Program, 301 Binney St, Cambridge, MA, 02141, USA
    2. Broad Institute of MIT and Harvard, Genome Sequencing and Analysis Program, 320 Charles St, Cambridge, MA, 02141, USA
Abstract
Background: Pacific Biosciences technology provides a fundamentally new data type with the potential to overcome some limitations of current next-generation sequencing platforms: significantly longer reads, single-molecule sequencing, low composition bias, and an error profile that is orthogonal to other platforms. With these potential advantages in mind, we evaluate the utility of the Pacific Biosciences RS platform for human medical amplicon resequencing projects.

Results: We evaluated the Pacific Biosciences technology for SNP discovery in medical resequencing projects using the Genome Analysis Toolkit, observing high sensitivity and specificity for calling differences in amplicons containing known true or false SNPs. We assessed data quality: most errors were indels (~14%), with few apparent miscalls (~1%). We also define a custom processing pipeline for analyzing Pacific Biosciences human data.

Conclusion: Critically, the error properties were largely free of the context-specific effects that affect other sequencing technologies. These data show excellent utility for follow-up validation and extension studies in human data and medical genetics projects, and the approach can be extended to other organisms with a reference genome.
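
The sensitivity and specificity reported in the Results can be made concrete with a minimal sketch. It assumes the usual confusion-matrix definitions applied to amplicon sites with known true or false SNPs. The function name and the counts are hypothetical, chosen only for illustration. They are not taken from the paper and do not reproduce its pipeline.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: known true SNPs that were called
    fn: known true SNPs that were missed
    tn: known false SNPs that were correctly not called
    fp: known false SNPs that were called anyway
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one known-true and one known-false site")
    sensitivity = tp / (tp + fn)  # fraction of known true SNPs recovered
    specificity = tn / (tn + fp)  # fraction of known false SNPs rejected
    return sensitivity, specificity


# Hypothetical counts for illustration only (not results from the paper).
sens, spec = sensitivity_specificity(tp=45, fn=5, tn=38, fp=2)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under these definitions, a caller that reports every site as a SNP would reach perfect sensitivity but zero specificity, which may be why the abstract reports both measures together.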
