Linear-time superbubble identification algorithm for genome assembly
详细信息    查看全文
文摘
DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilise de Bruijn graphs constructed from reads for this purpose. A critical step of these algorithms is to detect typical motif structures in the graph caused by sequencing errors and genome repeats, and filter them out; one such complex subgraph class is a so-called superbubble  . In this paper, we propose an class="mathmlsrc">class="formulatext stixSupport mathImg" data-mathURL="/science?_ob=MathURL&_method=retrieve&_eid=1-s2.0-S0304397515009147&_mathId=si1.gif&_user=111111111&_pii=S0304397515009147&_rdoc=1&_issn=03043975&md5=d7419cdf8d583cc23353fabcfba59f4c" title="Click to view the MathML source">O(n+m)class="mathContainer hidden">class="mathCode">O(n+m)-time algorithm to detect all superbubbles in a directed acyclic graph with n vertices and m   (directed) edges, improving the best-known class="mathmlsrc">class="formulatext stixSupport mathImg" data-mathURL="/science?_ob=MathURL&_method=retrieve&_eid=1-s2.0-S0304397515009147&_mathId=si2.gif&_user=111111111&_pii=S0304397515009147&_rdoc=1&_issn=03043975&md5=25ad98842bd69916f8718fb36e5bc853" title="Click to view the MathML source">O(mlog⁡m)class="mathContainer hidden">class="mathCode">O(mlogm)-time algorithm by Sung et al.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700