基因组完美重组算法及一类QP-Free可行域方法的研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
本文研究了计算生物学中的基因组完美重组问题及QP-free可行域方法.全文共分为四章.
     基因组完美重组问题在生物种族进化研究、生物分类学研究和生物制药研究等领域中显示出重要的研究价值.在第一章里,我们将简单介绍基因组完美重组问题的提出、相关概念及研究现状.同时,在第一章,我们还简单介绍了QP-free可行域方法—一类求解非线性约束规划问题的方法.非线性约束规划问题是最优化领域中重要的研究课题之一,许多实际问题都可以归结为非线性约束规划问题,并且很多实际应用,比如工程设计、经济平衡等问题都要求数据仅仅在可行域内取值.本章最后,我们给出了本文的主要结果.
     第二章研究了含有n条染色体的基因组的完美重组问题.对于单条染色体的完美重组问题,Bérard等[4]利用strong interval树的结构,给出了O(2~pn(?))时间的算法.后来,Bérard等[5]改进了此算法,给出了O(2~dn(?))时间的算法,这里d≤p.同时,Bérard等在[5]提出了一个问题:再优化strong interval树,也很难找到一个标号基因组的完美翻转重组问题的多项式算法.本文把strong interval树和断点图结合起来,对含有n条染色体的基因组的完美重组问题,给出了一个O(n~2)时间的算法.部分地解决了Bérard等[5]中提出的问题.
     设源染色体和目标染色体分别为π和r,它们含有不同基因集合.在第三章里,我们研究了含有不同基因集合的染色体π和r的完美重组问题.很明显,在这种情况下,“插入”或“删除”将成为必需的操作.我们推广了Hannehali和Pevzner[26]提出的多项式算法(简称HP算法),使之能够处理含有不同基因集合π和r的完美重组问题.
     在最后一章,我们研究了一类求解非线性约束规划问题的方法—QP-free可行域方法.我们知道SQP方法,即序列二次规划方法,是解决非线性约束规划问题的一种有效方法,但是,这种方法在每次迭代点处都要解决二次规划子问题,计算量很大,并且在某些迭代点处不一定有解.而QP-free可行域方法,是把求解一个非线性约束规划问题等价于线性方程组的求解问题.在每次迭代中,只需解若干个具有相同系数矩阵的线性方程组,因此减少了计算量,并且QP-free可行域方法能保证在每一个迭代点处都有解.1988年,Partier等[43]在前人的基础上提出了一类有效的QP-free方法,并且证明了这类方法的收敛性.之后,有很多的研究者对其进行了改进.我们利用光滑化的Fischer-Burmeister(FB)非线性互补函数,修改了QP-free可行域方法线性方程组系数矩阵的近似Jacobian矩阵V_k,提出了一类新方法,并且在比较弱的条件下证明该方法的可行性及收敛性.通过理论证明和例子的数据结果分析可以看出,我们提出的QP-free可行域方法比较令人满意.
In this paper, we mainly study the problem of genome perfect rearrangements in bioinformatics and a kind of QP-free feasible method. This paper consists of four chapters.
     Recently, the studies of genome perfect rearrangements have displayed importance in many fields, such as evolutionary histories of species, taxonomy of biology, biomedicine, etc. Therefore, in Chapter 1, we first give a brief review of genome perfect rearrangements and introduce the corresponding conceptions. We also introduce a kind of QP-free feasible method-which is used to solve a nonlinear constrained optimization problem. The nonlinear constrained optimization problem is an important research topic in mathemtical programming fields. Many practical problems can be reduced to be the nonlinear constrained optimization problems. Moreover, some real-life applications such as in engineering design and economical equilibrium of supply and demand require the datas only defined in the feasible region. Finally, we introduce the main results obtained in our paper.
     The objective of Chapter 2 is to deal with the problem of genomes perfect rearrangements. For the problem of uni-chromosome (permutation) perfect rearrangements , Berard et al. [4] gave an O(2~pn(?)) algorithm with the strong interval tree framework. Later, Berard et al. [5] improved the algorithm. They can compute a parsimonious perfect scenario for the permutation in worst-case time O(2~dn(?)), where d≤p. In this paper, we connect a strong interval tree with a breakpoint graph. For the problem of TV-chromosomal genomes perfect rearrangements , we design an algorithm and get an optimal perfect sorting sequence for the genomes in worst-case time O(n~2). We solved this problem which was proposed in [5], i.e., it seems difficult to optimize the strong interval tree framework in order to compute perfect scenarios in polynomial time for larger classes of signed permutations.
     The problem of sorting signed permutation (chromosome) by reversals has been studied intensively. In 1995, Hannehali and Pevzner [26] developed the first polynomial algorithm, denoted by HP algorithm in our paper, and they gave the formula of the reversal distance. Letπand r be the source chromosome (permutation ) and target chromosome (permutatio), respectively. Moreover, they have the different gene content. In Chapter 3, we study the problem of perfectly sortingπand r. Obviously, in such case, some additional steps (insertions or deletions) are needed. We show how to extend the HP algorithm to a polynomial algorithm which can compute the perfect scenarios forπandγby reversals and deletions.
     In the last chapter, we concern a method of solving the nonlinear constrained optimization problem, namely QP-free feasible method. In 1988, Panier [43] brought forward a kind of available QP-free method on the basis of the former and proved convergence of the method. Subsequently, some people mended this method. We base on the results of the formers to modify the coefficient matrix V_k of linear systems of equations with smoothing Fischer-Burmeister nonlinear complementarity problem (NCP) function. Then we propose a new kind of QP-free feasible method. Under some weaker conditions, our method is implementable and globally convergent . Moverover, some numerical test results are given to indicate that the algorithm is quite promising.
引文
[1] D.A. Bader, B. M. E. Moret and M. Yan, A linear-time algorithm for computing inversion distance between two signed permutations with an experimental study,Journal of Computational Biology, 2001, 8(5): 483-491.
    
    [2] V. Bafna and P. Pevzner, Genome rearrangements and sorting by reversals, Proc.34th IEEE Symp. of the Foundations of Computer Science, 1994, 148-157.
    
    [3] S. Berard, A. Bergeron and C. Chauve, Conserved structures in evolution scenarios, In Comparative Genomics, RECOMB 2004 International Workshop,RCG2004, Volume 3388 of Lecture Notes in Bioinformations, Springer-Verlag,2005, 1-15.
    
    [4] S. Berard, A. Bergeron, C. Chauve and C. Paul, Perfect sorting by reversals is not always difficult, IEEE/ACM Transactions on Computational Biology Bioinformations, 2007, 4(1): 4-16.
    
    [5] S. Berard, C. Chauve and C. Paul, A more efficient algorithm for perfect sorting by reversals, Information Processing Letters, 2008, 106(3): 90-95.
    
    [6] A. Bergeron, A very elementary presentation of the Harrenhalli-Pevzner theory,Proc. 12th Annual Symp. on Combinatorial Pattern Matching, Jerusalem, Israel,2001, 106-117.
    
    [7] A. Bergeron, C. Chauve, F. de Montgolfiter and M. Raffinot, Computing common intervals of k permutations, with applications to modular de composition of graphs, In Algorithms-ESA 2005, 13th Annual European Symposium, Volume 3669 of Lecture Notes in Comput. Sci., Springer-Verlag, 2005, 779-790.
    
    [8] A. Bergeron, J. Mixtacki and J. Stoye, Reversal distance without hurdles and fortresses [A], Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM 2004)[C], Berlin: Springer-Verlag, 2004, 388-399.
    [9] A. Bergeron, J. Mixtacki and J. Stoye, The inversion distance problem, In Mathematics of Evolution and Phytogeny (0. Gascuel Ed.), Oxford University Press,2005.
    
    [10] P. Berman, S. Hannenhalli and M. Karpinski, 1.375-approximation algorithm for sorting by reversals, Proc. 10th European Symposium on Algorithms, Rome, Italy,September 2002, 200-210.
    
    [11] P. Berman and M. Karpinski, On some tighter inapproximability results, Lecture Notes in Computer Science 1999, 1644: 200-209.
    
    [12] G. Bourque, P. A. Pevzner and G. Tesher, Reconstructing the genomic architecture of ancestral mammals: lessons from human, mouse and rat genomues.Genome Research, 2004, 14(4): 507-516.
    
    [13] A. Caprara, Sorting by reversals is difficult, Proc. lth Annual International Conference on Computational Molecular Biology, Santa Fc, New Mexico, January 1997, 75-83.
    
    [14] B. Chen, X. Chen and C. Kanzow, A penalized Fischer-Burmeister NCP function,Mathematical Programming, 2000, 88: 211-216.
    
    [15] X. J. Chen, Smoothing methods for complementarity problems and their applications: a survey, Journal of the operations research society of Japan, 2000, 43:32-47.
    
    [16] D.A. Christic, A 3/2-approximation algorithm for sorting by reversals, Proc. 9th Annual ACM-SIAM Symp. on Discrete Algorithms, ACM Press, 1998, 244-252.
    
    [17] Y. Diekmann, M. Sagot and E. Tannier, Evolution under reversals: Parsimony and Conservation of Common inervals, IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2007, 4(2): 301-309.
    
    [18] T. Dobzhansky and A. H. Sturtevant, Inversions in the chromosomes of drosophila pscudoobscura, Genetics, 1938, 23: 28-64.
    [19] R.G. Downey and M. R. Fellows, Parameterized Complexity, Springer, 1999.
    
    [20] N. El-Mabrouk, Genome rearrangement by reversals and insertions/deletions of contiguous segments, In Proceeding 11th Annual symposium on Combinaotrial Pattern Matching (CPM'00), London: Springer, 2000, 222-234.
    
    [21] F. Facchinei and S. Lucidi, Quadratically and superlinearly convergent algorithms for the solution of inequality constrained minimization problems, Journal of Optimization Theory and Applications, 1995, 85: 265-289.
    
    [22] M. C. Ferris and J. S. Pang, Engineering and economic applications of complementarity problems, SIAM Review, 1997, 39: 669-713.
    
    [23] M. Figeac and J. -S. Varre, Sorting by reversals with common intervals. Proceedings of WABI 2004, Lecture Notes in Computer Sciences, Berlin: Springer, 2004,3240: 26-37.
    
    [24] Z. Gao, G. He and F. Wu, Sequential systems of linear equations algorithm for nonlinear optimization problems with general constraints, Journal of Optimization Theory and Applications, 1997, 95: 371-397.
    
    [25] R. A. Gibbs et al, Genome sequence of the brown norway rat yields insights into mammalian evolution, Nature, 2004, 428 (6982): 493-521.
    
    [26] S. Hannenhali and P. A. Pevzner, Transforming cabbage into turnip (Polynomial algorithm for sorting signed permutation by reversals), Proceedings of the 27th Annual ACM Symposium on the Theory of Computing. New York: ACM Press,1995, 178-189.
    
    [27] S. Hannenhalli and P. A. Pevzner, Transforming men into mice paly nomial algorithm for genomic distance problem. Annual IEEE Symposium on Foundations of Computer Science, Washington: IEE Computer Society, 1995, 581-592.
    [28] P. T. Harker and J. S. Pang, Finite-dimensional variational and nonlinear complementarity problems: a survey of theory, algorithm and applications, Mathematical programming, 1990, 48: 161-220.
    
    [29] S. Heber and J. Stoye, Finding all common intervals of k permutations, In Combinatorial Pattern Matching, 12th Annual Symposium, CPM 2001, Lecture Notes in Comput. Sci., Springer-Verlag, 2001, 2089: 207-218.
    
    [30] W. Hock and K. Schittkowski, Test examples for nolinear programming codes,Lecture Notes in Econom. and Math. Systems 187, Springer-Verlag, Berlin, 1981.
    
    [31] S. B. Hoot and J. D. Palmer, Structural rearrangements, including parallel inversions, within the chloroplast genome of Anemone and relaed genera, J. Molecular Evolution, 1994, 38: 274-281.
    
    [32] C. Kanzow and H. -D. Qi, A QP-free constrained Newton-type method for variational inequality problems, Mathematical Programming, 1999, 85: 81-106.
    
    [33] J. Kececioglu and D. Sankoff, Exact and approximation algorithms for the reversal distance between two permutations, Lecture Notes in Computer Science 684, 1993.
    
    [34] W.C. Lathe3rd, B. Snel and P. Bork, Gene context conservation of a highier order that operons, Trends Biochem. Sci., 2000, 25(10): 474.
    
    [35] G. Li, X. Qi, X. Wang and B. Zhu, A linear-time algorithm for computing translo-cation distance between signed genomces [A], Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM 04)[C]. Berlin: Springer-Verlag, 2004, 323-332.
    
    [36] T. Liu, B. M. E. Moret and D. A. Bader, An exact linear-time algorithm for computing genomic distances under inversions and deletions, U. New Mexico,TR-CS-2003-31.
    [37] T. De. Luca, F. Facchinei and C. Kanzow, A semismooth equation approach to the solution of nonlinear complementarity problems, Mathematical Programming,1996, 75: 407-439.
    
    [38] M. Marron, K. M. Swenson and B. M. E. Moret, Genomic distance under deletions and insertions, Theoretical Computer Science, 2004, 325(3): 347-60.
    
    [39] J.D. Palmer and L. A. Herbon, Tricircular mitochondrial genomes of Brassica and Raphanus: reversal of repeat configurations by inversion, Nucleic Acids Research,1986, 14: 9755-9764.
    
    [40] J.D. Palmer and L. A. Herbon, Unicirular structure of the Brassica hirta mitochondrial genome, Current Genetics, 1987, 11: 565-570.
    
    [41] J.D. Palmer and L. A. Herbon, Plant mitochondrial DNA evolves rapidly in structure, but slowly in sequence, J. Molecular Evolution, 1988, 14: 87-97.
    
    [42] J.D. Palmer, B. Osorio and W.R. Thompson, Evolutionary significance of inversions in legume chloroplast DNAs, Current Genetics, 1988, 14: 65-74.
    
    [43] E. R. Panier, A. L. Tits and J. N. Herskovits, A QP-free globally, locally superlin-ear convergent method for inequality constrained optimization problems, SIAM Journal on Control Optimization, 1988, 36: 788-811.
    
    [44] P. A. Pevzner, Computational molecular biology: an algorithmic approach, MIT Press, Cambridge, 2000.
    
    [45] D. Pu and W. Tian, A class of Broyden Algorithms with revised search direction,Asia-Pacific Journal of Operation Research, 1997, 14: 93-109.
    
    [46] X. Qi, Combinatorial algorithms of genome rearrangements in bioinformatics,Shandong University Doctoral Dissertation, 2006.
    
    [47] X. Qi, G. Li and S. Li, A fast algorithm for genomic sorting problem, Mathematica Applicata (Chinese), 2006, 19(1): 66-74.
    [48] H. Qi and L. Qi, A New QP-free, globally convergent, locally superlinear convergent algorithm for inequality constrained optimization, SIAM Journal on Optimization, 2000, 11: 113-132.
    
    [49] X. Qi, Y. Xu and G. Li, Sorting genomes by translocations and deletions, Computational Systems Bioinformatics, CSB 2006 Conference Proceedings, Stanford CA, 14-18 August 2006.
    
    [50] M.-F., Sagot, E. Tannier, Perfect sorting by reversals, Proceedings of COCOON 2005, Lecture Notes in Computer Science, Berlin: Springer, 2005, 3595: 42-51.
    
    [51] D. Sankoff and N. El-Mabrouk, Genome rearrangement, In current topics in computational molecular biology (eds T. Jiang, T. Smith, Y. Xu and M. Q. Zhang),MIT Press, 2002.
    
    [52] J. Sctubal and J. Meidanis, Introduction to computational molecular biology,PWS Publishing Company, 1997.
    
    [53] R. Shamir, Algorithms in molecular biology: Lecture notes, Available at http:www.math.tau.ac.il/ rshamir/algmb/01/algmb01.html, 2002.
    
    [54] E. Tannier, A. Bergeron and M.-F. Sagot, Advances on sorting by reversals, Discrete Applied Mathematics, 2007, 155: 881-888.
    
    [55] E. Tannier and M.-F.Sagot, Sorting by reversals in subquadratic time [A]. Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM 2004)[C]: 1-13, Berlin: Springer-Verlag, 2004.
    
    [56] T. Uno and M. Yagiura, Fast algorithms to enumerate all common intervals of two permutations, Algorithmica, 2000, 26(2): 290-309.
    
    [57] L. Wang, D. Zhu, X. Liu and S. Ma, An O(n~2) algorithm for signed translocation,Journal of Computer and System Sciences, 2007, 70: 284-299.
    [58] N. H. Xiu and J. Zhang, Some recent advances in projection-type methods for variational inequalities, Journal of Computing and Applied Mathematics, 2003,152: 559-585.
    
    [59] Z. Zhu, A globally and superlinearly convergent feasibel QP-free method for nonlinear programming, Applied Mathematics and Computation, 2005, 168: 519-539.
NGLC 2004-2010.National Geological Library of China All Rights Reserved.
Add:29 Xueyuan Rd,Haidian District,Beijing,PRC. Mail Add: 8324 mailbox 100083
For exchange or info please contact us via email.