DGraph: Algorithms for Shortgun Reads Assembly Using De Bruijn Graph
  • Authors: Jintao Meng (20) (21) (23)
    Jianrui Yuan (21) (22)
    Jiefeng Cheng (21)
    Yanjie Wei (21)
    Shengzhong Feng (21)
  • Keywords: De Bruijn graph; graph algorithm; short read assembler
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2012
  • Volume: 7513
  • Issue: 1
  • Pages: 22-32
  • Full-text size: 546KB
  • References: 1. Metzker, M.L., Lu, J., Gibbs, R.A.: Electrophoretically Uniform Fluorescent Dyes for Automated DNA Sequencing. Science 5254(271), 1420-422 (2009)
    2. Margulies, M., et al.: Genome sequencing in microfabricated high-density picolitre reactors. Nature 441(4) (2006)
    3. Bentley, D.R.: Whole-genome re-sequencing. Current Opinion in Genetics & Development 6(16), 545-52 (2006)
    4. Sutton, G.G., White, O., Adams, M.D., Kerlavage, A.R.: TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing Projects. Genome Science and Technology 1(1), 9-9 (1995)
    5. Green, P.: http://bozeman.mbt.washington.edu/phrap.docs/phrap.html
    6. Huang, X., Madan, A.: CAP3: A DNA Sequence Assembly Program. Genome Research (9), 868-77 (1990)
    7. Kreuze, J.F., Perez, A., Untiveros, M., Quispe, D., Fuentes, S., Barker, I., Simon, R.: Complete viral genome sequence and discovery of novel viruses by deep sequencing of small RNAs: A generic method for diagnosis, discovery and sequencing of viruses. Virology 1(388), 1- (2009)
    8. Warren, R.L., Sutton, G.G., Jones, S.J.M., et al.: Assembling millions of short DNA sequences using SSAKE. Bioinformatics 4(23), 500-01 (2007)
    9. Zerbino, D.R., Birney, E.: Velvet: algorithms for de novo short read assembly using De Bruijn graphs. Genome Res. 5(18), 821-29 (2008)
    10. Idury, R.M., Waterman, M.S.: A New Algorithm for DNA Sequence Assembly. Journal of Computational Biology (1995)
    11. BLAST, http://www.ncbi.nlm.nih.gov/blast/producttable.shtml#mega
  • Affiliations:
    20. Institute of Computing Technology, CAS, Beijing, 100190, P.R. China
    21. Shenzhen Institutes of Advanced Technology, CAS, Shenzhen, 518055, P.R. China
    22. Central South University, Changsha, 410083, P.R. China
    23. Graduate University of Chinese Academy of Sciences, Beijing, 100049, China
  • ISSN: 1611-3349
Abstract
Massively parallel DNA sequencing platforms have become widely available, reducing the cost of DNA sequencing by over two orders of magnitude and democratizing the field by putting the sequencing capacity of a major genome center in the hands of individual investigators. New challenges include developing robust protocols for generating sequencing libraries and building effective new approaches to resequencing and data analysis. In this paper we present a new assembly algorithm, named DGraph, which has two modules: one constructs the De Bruijn graph by cutting reads into k-mers, and the other simplifies this graph and collects all long contigs. We did not adopt the sequence graph reduction operations proposed by Ramana M. Idury, the Eulerian superpath search proposed by Pavel A. Pevzner, or the bubble removal steps suggested by Daniel Zerbino: the first is computationally expensive, the second is impractical, and the last benefits neither the quality of the contigs nor the efficiency of the assembler. Our assembler focuses only on efficient and effective error removal and path reduction operations. Applied to simulated data for chromosome X of the fruit fly Drosophila melanogaster, DGraph (3 min) is about six times faster than Velvet 0.3 (19 min), and its coverage (92.5%) is also better than Velvet's (78.2%) when k = 21. Compared with Velvet, the results show that DGraph is a faster program that produces high-quality results.
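
The two modules described in the abstract can be illustrated with a minimal Python sketch. It is not the DGraph implementation: the function names build_de_bruijn and collect_contigs and the min_count parameter are hypothetical, and the k-mer count threshold is only a simple stand-in for error removal, whose actual rules the abstract does not give. Path reduction is shown as merging maximal non-branching paths; reverse complements and coverage statistics are ignored.

```python
from collections import Counter, defaultdict


def build_de_bruijn(reads, k, min_count=1):
    """Cut reads into k-mers; nodes are (k-1)-mers, each k-mer is an edge.

    min_count is a hypothetical stand-in for error removal: k-mers seen
    fewer than min_count times are dropped before the graph is built.
    """
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    graph = defaultdict(set)
    for kmer, seen in counts.items():
        if seen >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def collect_contigs(graph):
    """Merge every maximal non-branching path into one contig string.

    Isolated cycles (every node has one predecessor and one successor)
    are not reported by this sketch.
    """
    indeg = defaultdict(int)
    for dsts in graph.values():
        for dst in dsts:
            indeg[dst] += 1

    def is_inner(node):
        # An inner node has exactly one predecessor and one successor.
        return indeg[node] == 1 and len(graph.get(node, ())) == 1

    contigs = []
    for src in list(graph):
        if is_inner(src):
            continue  # paths start only at branching or source nodes
        for dst in sorted(graph[src]):
            contig = src + dst[-1]
            while is_inner(dst):
                (dst,) = graph[dst]  # the single successor
                contig += dst[-1]
            contigs.append(contig)
    return sorted(contigs)
```

For example, under these assumptions the overlapping reads ATGGCGT and GGCGTGC with k = 4 form a single unbranched path, so collect_contigs(build_de_bruijn(["ATGGCGT", "GGCGTGC"], 4)) returns the one contig ATGGCGTGC.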
