遗传模拟退火算法在系统发育树构建中的应用研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
系统发生是指生物形成或进化的历史。系统发生学研究物种之间的进化关系,其结果往往是以系统发生树表示。系统发生树是描述物种进化顺序和进化关系的一种拓扑结构。一个可靠的系统发生的推断,将揭示出有关生物进化过程的顺序,有助于我们了解生物进化的历史和进化机制。
     发生树的构建问题是一个NP完全问题,因此,研究构造发生树的近似最优算法有着重要意义。目前常用的构建发生树的方法有三种,即距离法、最大简约法和最大似然法。
     本文针对最大简约法,提出了一种新的搜索方法即遗传算法与模拟退火算法相结合的启发示搜索。随机产生初始群体,然后通过遗传退火算子对初始群体进行优化,从中寻找更优树,不断地更新当前最优树,直到无法找到更优树或者达到了搜索次数的上限,算法停止。
     对改进算法采用了评价建树算法中最常用的计算机模拟法来测试其性能,从实验结果来看,改进算法的准确性都有较大提高。
Phylogenetic is history of biology’s evolution.phylogenetic study the relationship of the different species and the result was express by the phylogenetic tree. Phylogenetic tree is a kind of typological structure for describing the sequence and relationship of species revolution. A reliable phylogenetic inference will manifest the sequence of evolution, and facilitate our understanding of the history and mechanism of species evolution.
     Constructing an evolutionary tree is a typical NP-complete problem,Therefore it is of great significance to construct an algorithm capable of getting optimal approximate solutions. There are three frequently used method: distance method, maximum parsimony method, and maximum likelihood method.
     A new heuristic search method is devised for maximum parsimony method, which first constructs a group of starting tree by random, and then through Genetic Algorithm and Simulated Annealing Algorithm for trees best resembling the present optimal tree and locates a better tree. Repeat this process, till no more better trees are generated or the upper iteration limit is arrived and the program ends.
     The adapted algorithms are tested with the most popular method of computer simulation to evaluate tree constructing algorithms. The results show that the accuracy of the improved algorithm is greatly improved.
引文
[1]吕宝忠, 钟扬等. 分子进化与系统发育[M].高等教育出版社, 2002:76~142.
    [2]张阳德.生物信息学[M].北京:科学出版社,2004:1-85、123-173.
    [3]钟扬,王莉,张亮.生物信息学[M].高等教育出版社, 2003:212~248.
    [4]孙啸,陆祖宏,谢建明.生物信息学基础[M].北京:清华大学出版社,2005:1-35、219-243.
    [5]J. Huelsenbeck, B. Rannala. Phylogenetic methods of testing hypotheses in an evolutionary context[J]. Science. 1997, 276(53):27~32 .
    [6]J.Felsenstein . Inferring phylogenies from protein sequences by parsimony,distance and likelihood methods[J]. Methods Enzyme.1996,2(66):418~427.
    [7]李建伏,郭茂祖.系统发生树构建技术综述[J].电子学报,第 11 期,2006 vol.34.
    [8]B. Chor, M. D. Hendy, B. R. Holland and D. Penny, Multiple maxima of likelihood in phylogenetic trees:Ananaly ticapproach[J].Mol.Biol.2000,17:1529~1541.
    [9]谭严芳,金人超.一种基于 NJ 的高效构建系统进化树算法[J].计算机工程与应用,2004,21.
    [10]唐志学.种系发生树构建算法的研究[D]:[硕士学位论文].哈尔滨:哈尔滨工业大学, 2006.
    [11]陆正福,徐丽华,姜成林.构建微生物分子分类系统进化树的快速运算法与数据结构[J].微生物学通报 1997,24(1).
    [12] J. Lockhart, A. Steel, M. Hendy and D. Penny. Recovering evolutionary trees under a more realistic model of sequence evolution [J]. Mol. Biol. 1999,11:605~612.
    [13]李涛,赖旭龙,钟扬.利用 DNA 序列构建系统树的方法[J].遗传,2004 ,26(2):205-210 .
    [14]潘星华,傅继粱. 基因的分子进化原理与方法[J].自然杂志,第 17 卷 4 期.
    [15]J. Felsenstein. Phylogenies from molecular sequences: Inference and reliability[J]. Rev. Genet. 1988,22:524~565.
    [16] Wim Hordijk,Olivier Gascuel.Improving the efficiency of SPR moves in phylogenetic tree search methods based on maximum likelihood[J]. Bioinformatics, 2005,21(24):4338-4337.
    [17] C. Cotta, P. Moscato Inferring phylogenetic trees using evolutionary algorithms[J].Lecture Notes in Computer Science, 2002,249:720~729.
    [18] C. B. Congdon. Phylogenetic trees using evolutionary search:Initial progress inextending gaphyl to work with genetic data[J].In Congress on Evolutionary Computation,Canberra, Australia. 2003:56~81.
    [19] L. Brocchieri. Phylogenetic inferences from molecular sequences review and critique[J].Theor. Popul. Biol. 2001,59(1):27~40.
    [20] N. Saitou, M.Nei. The neighbor joining method: a new method for reconstruction phylogenetic trees[J].Mol.Biol. 2001,4: 406~425.
    [21] J. Felsenstein. Inferring Phylogenies[J].Sinauer Associates,Inc,Sunderland, Mass. 2004:2~35 .
    [22] J. Sourdis, C. Krimbas. Accuracy of phylogenetic trees estimated from DNA sequence data [J].Mol. Biol. 1988,4:159~166.
    [23] S. Holmes. Statistics for phylogenetic trees[J] .Theor Pop. Biol,2003,63:17~32.
    [24]Sebastien Roch.A short proof that phylogenetic tree reconstruction by maximum likelihood is Hard[J] .IEEE/ACM Transactions on Computational Biology and Bioinformatics,2006,3(1):92-94.
    [25] S.Whelan, P. Li`o and N. Goldman. Molecular phylogenetics: State-of-the-artmethods for looking into the past[J] .Trends Genet, 2001,17:262~272.
    [26] P.Arndt,C.Burge. DNA sequence evolution with neighbor~dependent mutation[R]. Proceedings of the Sixth Annual International Conference on Computational Biology,2002:32~38.
    [27] D. L. Swofford,J. Olsen. Phylogeny reconstruction Molecular Systematics[J].Sun2 derland: Sinauer Associates Inc, 1990:411~501.
    [28] A. Moilanen. Searching for most parsimonious tree with simulated evolutionary optimisation [J] .Clad 1999,15:39~50.
    [29] Z.Yang,N. Goldman,A. Friday.Comparison of models for nucleotide substutionused in maximum likelihood phylogenetic estimation [J] . Biol, 1994,11(2):316~224.
    [30] 李敏强.遗传算法的基本理论与应用[M].北京:科学出版社,2002.17-59.
    [31] K Tieng,Jperter,F Ophir. Parallel computation in biological sequence analysis [J]. IEEE Transaction on parallel and Distributed Systems,1998,9(3): 21-25.
    [32] 王小平,曹立明.遗传算法—--理论、应用与软件实现[M].西安:西安交通大学出版社,2002.1-15.
    [33] Wright A H, Vose M D. Fitness of the fixed point set for the simple genetic algorithm [J]. Evolutionary Computation,1996,3: 299-309.
    [34] Sechen C. Placement and global routing of integrated circuits using the simulated annealing algorithm [J]. Ph D Dissertation,1986.
    [35] Vose M D. Modeling simple genetic algorithms. Foundations of Genetic Algorithms II [J]. Morgan Kaufmann Publishers, 1993:63-73.
    [36] 张文修,梁怡。遗传算法的数学基础[M].西安:西安交通大学出版社,2000.
    [37] 丁永生.计算智能—理论、技术与应用[M].北京:科学出版社,2004.163-188.
    [38] Chang C,C Wu.Optimal frame patter design of a TDMA mobile communication system using a simulated annealing algorithm [C].IEEE Trans,Vehicular Technology,1993,42(5):205-211.
    [39] Harold H Szu, Ralph L,Hartley.Nonconvex optimization bay fast simulated annealing [J]. Proceedings of the IEEE,1997:117-128.
    [40] Goffe W L,G D Ferrier,J Rogers. Simulated annealing: an initial application in econometrics [J].Computational Economics,1992,5(2):133-146.
    [41] Wong K P, Y W Wong. Genetic and genetic/simulated-annealing approaches to economic [J]. IEEE Proc-C. Generation,Transmission,and Distribution,1994,141(5):507-513.
    [42] Nilar S H. Applications of the simulated annealing method to intermolecular interactions [J].Journal of Computational Chemistry,1991,12(8):1008-1013.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700