A novel locally guided genome reassembling technique using an artificial ant system
详细信息    查看全文
  • 作者:Susobhan Baidya ; Rajat Kumar De
  • 关键词:ACO ; NP hard problem ; DNA ; Short reads ; Hierarchical BAC ; by ; BAC sequencing
  • 刊名:Applied Intelligence
  • 出版年:2015
  • 出版时间:September 2015
  • 年:2015
  • 卷:43
  • 期:2
  • 页码:397-411
  • 全文大小:1,113 KB
  • 参考文献:1.Garca S, Molina D, Lozano M, Herrera F (2009) A study on the use of non-parametric tests for analyzing the evolutionary algorithms behaviour: a case study on the CEC2005 Special Session on Real Parameter Optimization. Journal of Heuristcs 15:617-644View Article
    2.Indumathy R, Uma Maheswari S (2014) Solving DNA Sequence Assembly Using Particle Swarm Optimization With Inertia Weight and Constriction Factor. International Journal of Soft Computing and Artificial Intelligence 2(1):90-4
    3.Verma RS, Singh V, Kumar S (2011) DNA Sequence Assembly using Particle Swarm Optimization. Int J Comput Appl 28(10):34-8
    4.Fang S-C, Wang Y, Zhong J (2005) A Genetic Algorithm Approach to Solving DNA Fragment Assembly Problem. J Comput Theor Nanosci 2:1-View Article
    5.Parsons RJ, Forrest S, Burks C (1995) Genetic algorithms, operators, and DNA fragment assembly. Mach Learn 21(1-2):11-3View Article
    6.Nebro AJ, Luque G, Luna F, Alba E (2008) DNA fragment assembly using a grid-based genetic algorithm. Comput Oper Res 35(9):2776-790View Article MATH
    7.Luque G, Alba E, Khuri S (2005) Parallel Computing for Bioinformatics and Computational Biology, WILEY, Chapter-12: Assembling DNA Fragments with a Distributed Genetic Algorithm
    8.Karaboga D, Akay B (2009) A comparative study of Artificial Bee Colony algorithm. Appl Math Comput 25:108-32MathSciNet View Article
    9.Karaboga D, Ozturk C, Karaboga N, Gorkemli B (2012) Artificial bee colony programming for symbolic regression. Inf Sci 209:01-5View Article
    10.Firoz JS, Sohel Rahman M, Saha TK (2012) Bee Algorithms for Solving DNA Fragment Assembly Problem with Noisy and Noiseless data. GECCO -2 Proceedings 14th Annual Conference on Genetic and Evolutionary Computation. ACM, NY, pp 201-08
    11.Ansorge WJ (2009) Next generation DNA sequencing techniques. New Biotechnol 25(4):167-60View Article
    12.Blazewicz J, Bryjaa M, Figlerowicz M, Gawrona P, Kasprzak M, Kirton E, Platt D, Przybytek J, Swiercz A, Szajkowski L (2009) Whole genome assembly from 454 sequencing output via modified DNA graph concept. Comput Biol Chem 33:224-30View Article
    13.Blum C, Valles MY, Blesa MJ (2008) An ant colony optimization algorithm for DNA sequencing by hybridization. Comput Oper Res 35:362-635View Article
    14.Brun Y (2008) Solving NP-complete problems in the tile assembly model. Theor Comput Sci 395:31-6MathSciNet View Article MATH
    15.Butler J, MacCallum I, Kleber M, Shlyakhter IA, Belmonte MK, Lander ES, Nusbaum C, Jaffe DB (2008) Allpaths: De novo assembly of whole-genome shotgun micro reads. Genome Res 18:810-20View Article
    16.Dorigo M, Maniezzo V, Colorni A (1996) Ant system: optimization by a colony of cooperating agents. IEEE Trans Systems Man Cybern Part B 26:29-1
    17.Dorigo M, Stutzle T (2004) Ant Colony Optimization. MIT Press, LondonView Article MATH
    18.Isakov O, Shomron N Deep sequencing data analysis: Challenges and solutions. Bioinformatics Trends and Methodologies, Intech, November 2011, ch-29:Deep Sequencing Data Analysis
    19.Joshi N, Srivastava S, Kumar M, Kavalan J, Karandikar SK, Saraph A (2011) Parallelization of velvet, a de-novo genome sequence assembler. IEEE International Conference on High Performance Computing
    20.Kurniawan TB, Ibrahim Z, Saaid MFM, Yahya A (2008) Implementation of ant system for DNA sequence optimization. NANO-SciTech, Shah Alam
    21.Ma X, Lombardi F (2008) Combinatorial optimization problem in designing DNA self-assembly tile sets. 2008 IEEE International Workshop on Design and Test of Nano Devices, Circuits and Systems, pp 73-6
    22.Medvedev P, Georgiou K, Myers G, Brudno M (2007) Computability models of sequence assembly. Workshop on Algorithms in Bioinformatics, Philadelphia, 289-01
    23.Meksangsouy P, Chaiyaratana N (2003) DNA fragment assembly using an ant colony system algorithm. Proceedings Evolutionary Computation. CEC -3 3:1756-763
    24.Miller JR, Koren S, Sutton G (2010) Assembly algorithms for next generation sequencing data. Genomics 95:315-27View Article
    25.Myers G (1999) Whole-genome dna sequencing. Comput Sci Eng 1:33-3View Article
    26.Myllykangas S, Buenrostro J, Ji HP (2012) Overview of sequencing technology platforms. Bioinformatics for High Throughput Sequencing, 11-5
    27.Quail MA, Smith M, Coupland P, Otto TD, Harris SR, Connor TR, Bertoni A, Swerdlow HP, Yong G (2012) A tale of three next generation sequencing platforms: comparison of ion torrent, pacific biosciences and illumina Miseq sequencers. BMC Genom 13:1471-164View Article
    28.Scheibye-Alsing K, Hoffmann S, Frankel AM, Jensen P, Stadler PF (2009) Sequence assembly. Comput Biol Che:33
    29.Stupar M, Vidovi V, Luka D (2011) Functions of human non-coding DNA sequences. Arch Oncol 19:3-View Article
    30.Treangen TJ, Salzberg SL (2011) Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet 113(1):36-6
    31.Wei L-T, Yang C-B, Ann H-Y, Peng Y-H (
  • 作者单位:Susobhan Baidya (1)
    Rajat Kumar De (2)

    1. Department of Information Technology, Heritage Institute of Technology, Kolkata, India
    2. Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India
  • 刊物类别:Computer Science
  • 刊物主题:Artificial Intelligence and Robotics
    Mechanical Engineering
    Manufacturing, Machines and Tools
  • 出版者:Springer Netherlands
  • ISSN:1573-7497
文摘
DNA reassembling is an NP-hard problem (Brun, Theor Comput Sci 395:31-6, 2008; Medvedev et al 2007; Ma and Lombardi 2008). The present article presents a locally guided global learning system to solve the problem of genome reassembling. We have used a reference DNA sequence which is 99 % similar to an unknown DNA sequence. Two different sequences from the same organism generally have around 99 % similarity (Wei et al 2007). We have considered different DNA sequences from NCBI website (http://?www.?ncbi.?nlm.?nih.?gov). Then we have simulated the tasks of cloning the sequence, followed by shearing the clones to a number of short reads. In our algorithm, we have introduced a new concept in the task of DNA reassembling using Ant Colony Optimization, where pheromone concentration is proportional to the score of assembled DNA fragments with some known reference sequences within the same organism. Unlike local overlapping, we have used here local alignment score of short reads with some known local reference region as the heuristic information. The result shows that our algorithm is capable of reassembling at par with the state-of-the-art. DNA reassembling techniques may need a massive parallel computation and huge memory space (Kurniawan et al 2008) because of size ~109bp of DNA sequences of mammals (Miller et al, Genomics 95:315-27, 2010; Blazewicz et al, Comput Biol Chem 33:224-30, 2009; Butler et al, Genome Res 18:810-20, 2008; Joshi et al 2011; Stupar et al, Arch Oncol 19:3-, 2011; Quail et al, BMC Genomics 13:1471-164, 2012), and ACO is inherently concurrent in nature (Dorigo and Stutzle 2004). Due to lack of appropriate computational resources, we had to confine ourselves to deal with the sequences of length up to ?05 b p. We have considered 22 sequences of different organism, including Homo sapiens BRCA1 (127429bp) gene. For large sequences, we have applied hierarchical BAC-by-BAC sequencing (Fig. 2) (Myers, Comput Sci Eng 1:33-3, 1999), to stitch the individual segments to retrieve the original DNA sequence.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700