基于堆积的RNA假结预测算法

英文题名：RNA Pseudoknots Prediction Algorithm Based on Stack
作者：李恒武
论文级别：博士
学科专业名称：计算机软件与理论
中文关键词：RNA结构预测 ; 假结 ; 计算生物学 ; 近似算法 ; 近似方案 ; 动态规划
英文关键词：RNA Structure Prediction ; Pseudoknots ; Computational Biology ; Approximation Algorithm ; Approximation Scheme ; Dynamic Programming
学位年度：2008
导师：朱大铭
学科代码：081202
学位授予单位：山东大学
论文提交日期：2008-04-05

摘要

RNA结构预测是计算生物学的基本课题之一。RNA二级结构预测是由RNA序列预测其三级结构的第一步。早期采用序列对比分析方法预测RNA二级结构,对于在不同有机体中起相同生物功能的一级结构进行比较得到RNA序列的二级结构。许多RNA分子的同源序列不易得到,需要耗费大量人力,因而序列对比分析方法的预测效率较低。目前人们广泛采用最小能量方法预测RNA二级结构。
     最小能量方法是基于热动力学模型寻找序列所能形成的各种构象中具有最小能量的结构。Zuker提出的MFOLD算法是目前应用较为广泛的二级结构预测算法,其预测正确率为73%左右。但该算法不能预测假结和更复杂的结构。
     假结(pseudoknot)是RNA中最广泛的三级结构单元,是较复杂但稳定的RNA结构。假结预测是目前RNA结构预测研究的关键点和研究热点。假结在不同的RNA分子中具有调节、催化、构造等重要功能。预测包含任意假结的RNA二级结构属于NP-hard类。
     预测简单假结的最小能量算法是人们讨论较多的含假结二级结构预测方法。Rivas和Eddy提出的PKNOTS算法使用O(n~6)时间和O(n~4)空间预测任意的平面假结和部分非平面假结,Jens和Robert提出的PknotsRG算法使用O(n~4)时间和O(n~2)空间预测简单的嵌套假结,Lyngse和Pedersen提出的LP算法使用O(n~5)时间和O(n~3)空间预测一个平面假结。
     组合优化算法只考虑RNA序列中的部分主要作用,如Cary和Storm提出的最大权匹配算法可以折叠RNA假结,但以预测正确率降低为代价。也有将遗传算法、模拟退火算法、神经网络算法应用到RNA假结预测中的计算尝试。
     相邻基对构成堆积(stack),碱基配对和堆积作用是稳定RNA结构的主要作用。堆积最大化问题也是近年来人们十分关注的含假结RNA二级结构预测问题。Ieong等人于2001年提出最大堆积基对数问题,并设计出该问题的近似性能比为3的近似算法。Lyngse于2004年给出了最大堆积基对数问题的精确算法,并且提出最大堆积数问题,证明最大堆积数问题属于NP-hard类,给出了其多项式时间近似方案。
     Lyngsφ将所有堆积等同看待,但RNA结构实验的结果表明,不同类型的堆积有不同的能量,堆积的能量由构成堆积的基对类型所决定。因此我们基于生物学的碱基配对和基对堆积作用,提出了最大权堆积问题,并讨论最大权堆积问题的求解算法和计算复杂性。
     我们将堆积划分为(AA,BB)和(AB,BA)两类,利用(AB,BA)类构成团的特性计算其最优结构,利用堆积的划分特性计算(AA,BB)类的3/2近似的结构,取权值最大者作为整个序列的近似结构,给出了预测任意假结的多项式时间近似算法,其近似性能比为3,时间复杂度为O(nlogn),空间复杂度为O(n)。
     我们将序列划分为长度小于K+1(2≤K)的子序列,使用动态规划计算和保存各类子序列的已配对权值和未配对子序列数,删除不可能结构和权值小的结构,计算长度小于K+1的子序列构成的最优结构作为整个序列的近似结构,给出预测任意假结的多项式时间近似方案。
     目前预测含假结RNA二级结构的多项式时间算法,难以计算大的RNA分子。连续的堆积构成茎(stem),茎的交叉构成假结。因此基于茎区组合来寻找RNA最优结构成为新的RNA结构预测方法,如Benedeti等人提出的茎区堆积算法和李伍举等人提出的茎区随机堆积算法,预测嵌套的RNA二级结构。最近Ruan等人提出基于茎区的启发式算法预测包含假结的二级结构,其时间复杂度为O(n~4),空间复杂度为O(n~2)。
     我们设计了一个预测任意假结的茎最大化的启发式算法。首先搜索序列可能构成的全部茎,找出具有最小能量的最大茎,然后在序列中标记最大茎对应的碱基,使其不再参与后面的配对,再在剩余的碱基中找出次最大茎,依次类推,直至无茎为止。该算法的时间复杂度为O(n~3),空间复杂度为O(n),可以预测长度达5000个碱基的RNA序列。
     我们的算法使用茎代替基对,使用标记代替删除,消除了大量冗余基对和无意义的堆积,提高了预测的正确率。相对O(n~3)时间和O(n~2)空间的最大权匹配算法,其空间复杂度由O(n~2)降至O(n);实验表明其敏感性由80%提高到87.8%,特异性由53.7%提高到75.9%。与83.3%的预测正确率的遗传算法和79.7%的预测正确率的模拟退火算法相比,我们的算法将预测正确率提高到87.5%。
     最小自由能量方法预测RNA结构的关键是结构的表示建模。基于RNA分子茎区相对稳定的结构特征,我们引入半扩展结构和k茎,使用k茎计算半扩展结构,使用半扩展结构的交叉计算嵌套和交叉假结,建立新的RNA假结表示模型。
     平面假结是最广泛的假结子类,PseudoBase数据库中仅有一个非平面假结。基于新的RNA假结表示模型,我们引入假结同轴堆积作用,设计和实现动态规划算法,预测包含任意平面假结和简单非平面假结的RNA二级结构。使用该算法折叠PseudoBase假结数据库中的全部245个序列,173个序列的假结预测正确,其正确率为70.6%,敏感性为83.6%,特异性为76.6%;其中对于3′-UTR其它病毒类的84个序列,71个序列的假结预测正确,其正确率为84.5%,敏感性为91.0%,特异性为91.2%。实验结果表明该算法具有较好的假结预测正确率。
     与目前最好的PKNOTS算法相比,我们的算法计算的序列长度由140个碱基提高到800个碱基,其时间复杂度由O(n~6)降低为O(n~4),空间复杂度由O(n~4)降低为O(n~3)。对于PKNOTS算法的测试序列,其敏感性由82.8%提高为84.8%,特异性由78.9%提高为80.7%。
     与计算简单的嵌套假结的PknotsRG算法相比,我们的算法可计算复杂的嵌套和交叉假结,而时间复杂度与PknotsRG算法相同。对PseudoBase数据库的测试表明,PknotsRG算法的假结预测正确率为68%,而我们的假结预测正确率为70.6%。与计算一个平面假结的LP算法相比,我们的算法将时间复杂度由O(n~5)降低为O(n~4),并且可以预测更复杂的多假结。
     本文包括五部分内容和结果。在第一章中概述RNA结构预测的研究意义、研究现状和基本概念。在第二章中回顾和总结了已有的RNA二级结构预测算法和假结预测算法。在第三章中给出了最大权堆积问题的多项式时间近似算法和近似方案。在第四章中给出预测任意假结的茎最大化的启发式算法和实验结果。在第五章中给出预测平面假结的动态规划算法和实验结果。最后进行简要总结。
     本文的主要创新工作为:
     (1)基于碱基配对和堆积作用,提出了最大权堆积问题,设计了最大权堆积问题的O(nlogn)时间和O(n)空间的近似算法,其近似性能比为3。给出了最大权堆积问题的的多项式时间近似方案。
     (2)设计了O(n~3)时间和O(n)空间的茎最大化的启发式算法,算法可以预测任意假结和长度达5000个碱基的RNA序列。相对O(n~3)时间和O(n~2)空间的最大权匹配算法,其空间复杂度由O(n~2)降至O(n),其敏感性由80%提高到87.8%,特异性由53.7%提高到75.9%。
     (3)建立了含有半扩展结构和k茎的RNA假结表示模型,设计了预测任意平面假结和简单非平面假结的动态规划算法,对PseudoBase数据库全部序列的预测敏感性为83.6%,特异性为76.6%。与目前最好的预测任意平面假结和部分非平面假结的PKNOTS算法相比,其时间复杂度由O(n~6)降低为O(n~4),空间复杂度由O(n~4)降低为O(n~3),计算的序列长度由140个碱基提高到800个碱基;对PKNOTS算法测试集合的预测敏感性由82.8%提高到84.8%,预测特异性由78.9%提高到80.7%。
RNA structure prediction is one of the basic issues in computational biology. RNA secondary structure prediction is the first step to predict RNA tertiary structure from RNA sequence.Comparative sequence analysis method was adopted in early period,which gets secondary structure by comparing RNA sequences of the same biological function in different organisms.Homologous sequences are hard to get for many RNA molecules,and a lot of cost of manpower is needed,so the predicted efficiency of comparative sequence analysis method is low.Minimum free energy method is adopted widely to predict RNA secondary structure now.
     Minimum free energy method searches the structure with minimum free energy from all conformations formed by the given sequence based on thermal dynamic model.MFOLD algorithm,presented by Zuker,has become the most widely used optimal methods for RNA secondary structure prediction,which has the predicted accuracy of about 73%.But MFOLD algorithm can not predict pseudoknots and more complex tertiary structure.
     Pseudoknot is the most extensive RNA tertiary structural element,very complicated and stable RNA structure.Pseudoknots prediction is the key and hot point to the research of RNA structure prediction now.Pseudoknots have regulative, catalytic and structural important function and diversity roles in different RNA molecules.It has been proved to be NP-hard class that predicting RNA secondary structure containing arbitrary pseudokonts.
     More discussions on RNA structure prediction containing pseudoknots are centered on minimum free energy algorithms to predict simple pseudokonts. PKNOTS algorithm,presented by Rivas and Eddy,takes O(n~6)time and O(n~4)space to predict arbitrary planar pseudoknots and partial nonplanar pseudoknots;PknotsRG algorithm,presented by Jens and Robert,takes O(n~4)time and O(n~2)space to predict simple nested pseudoknots;LP algorithm,presented by Lyngsφand Pedersen,takes O (n~5)time and O(n~3)space to predict one planar pseudoknot.
     Combinatorial optimization algorithms fold RNA pseudokonts through simplifing energy model to consider partial main acition,but it sacrifices predicted accuracy,such as maximum weighted matching algorithm presented by Cary and Storm.There are some computer attempts to predict pseudoknots with heuristic approaches include genetic algorithms,simulated annealing algorithms and neural network algorithms.
     Adjacent base pairs form stack,stacking and base pairing actions in RNA molecules are the most primary and stable actions.Maximum stacking problem has also attracted close attention in RNA secondary structure prediction containing pseudoknots.Ieong presented the problem of maximum stacked base pairing number in 2001,designed its approximate algorithm with the approximate performance ratio of 3,and Lyngsφgave its exact algorithm.Moreover Lyngsφpresented the problem of maximum stacking number,proved this problem belongs to NP-hard class and designed its polynomial time approximate scheme.
     Lyngsφtreat all stackes as the same,but RNA structural experimental results indicate that the different types of stackes have the different energy,and the energy of stack is determined by the type of its base pairs.So we present maximum weighted stacking problem based on biological stacking and base pairing actions,and discuss its solving algorithm and complexity.
     We divide stackes into(AA,BB)and(AB,BA)classes,compute the optimal structure of(AB,BA)class by the property of(AB,BA)class constructing clique, calculate the approximate structure of(AA,BB)class by the property of stack partion, get the approximate structure by selecting the one with the maximum weight from the weight of(AA,BB)and(AB,BA)classes,and give polynomial time approximation algorithm to predict arbitrary pseudoknots with O(nlogn)time and O(n)space.The approximate performance ratio of this algorithm is 3.
     We divide the given sequence into subsequences with the length less than K+ 1(2≤K),calculate and store the paired weight and unpaired number of various subsequences,delete impossible structure and the structure with smaller weight by daynamic programming,seach the optimal structure formed by subsequences with the length less than K+1 as the approximate structure of given sequence,and give a polynomial time approximate scheme to predict arbitrary pseudoknots.
     Now it is hard to fold large RNA molecules for existing polynomial time pseudoknots prediction algorithms.As continuous stackes construct stem and crossed stems construct pseudoknots,search the optimal structures based on combination with stem zones has become new method to RNA structure prediction,such as stem zone stacking algorithm presented by Benedeti and stem zone random stacking algorithm presented by Li WJ,predict nested secondary structures.Recently Ruan give a heuristic algorithm based on stem zone to predict secondary structure containing pseudoknots with O(n~4)time and O(n~2)space.
     We design a heuristic algorithm to maximize stem and predict arbitrary pseudoknots.First we search all stems of given sequence,select the largest stem with the minimal energy,and mark the bases in the selected stem to make them don't take part in back pairing.Then we select the largest stem from the remainder bases,and follow on until no stem.It takes O(n~3)time and O(n)space to predict RNA sequences with 5000 bases.
     Our algorithm replace base pair with stern,and replace delete with mark,which remove lots of redundancy base pair and nonsense stack,and increase predicted accuracy.Compared with maximum weighted matching algorithm with O(n~3)time and O(n~2)space,our algorithm reduce space complexity from O(n~2)to O(n);and the experimental results show that its sensitivity is improved form 80%to 87.8%,and specificity is increased from 53.7%to 75.9%.Compared with genetic algorithm with the accuracy of 83.3%and simulated annealing algorithm with the accuracy of 79.7%,our algorithm increases the predicted accuracy to 87.5%.
     The key to RNA structure prediction with minimal free energy method is modeling RNA structure.Based on the characteristic of relative stablisity of RNA stem zone,we introduce semi-extensible structures and k-stems,compute semi-extensible structures with k-stems,calculate nested and crossed pseudoknots with the crossing of semi-extensible structures,and establish a new model to express pseudoknots.
     Planar pseudoknot is the most extensive subclass of pseudoknots.In Pseudobase, there is only one nonplanar pseudoknot,and others are all planar pseudoknots.Based on new model,we introduce coaxial stacks actions,design and implement dynamic programming algorithm to predict arbitrary planar pseudoknots and simple nonplanar pseudoknots.All 254 pseudoknots in the PseudoBase are computed with this algorithm,the predicted structures of 173 sequences contain correct pseudoknots,the forecast accuracy is 70.6%;the average sensitivity is 83.6%and the average specificity is 76.6%.For 84 sequences of 3-UTR other virus,the forecast accuracy is 84.5%,the average sensitivity is 91.0%and the average specificity is 91.2%. Experimental results show that this algorithm has good accuracy.
     Compared with the best RE algorithm,the length of the given sequence computed by our algorithm is improved from 140 bp to 800 bp,the time complexity is reduced from O(n~6)to O(n~4)and the space complexity is droped from O(n~4)to O(n~3); for PKNOTS tested sequences,the sensitivity is increased from 82.8%to 84.8%and the specificity is raised from 78.9%to 80.7%.
     Compared with PknotsRG algorithm with O(n~4)time and O(n~2)space to compute simple nested pseudoknots,our algorithm can compute more complex nested and crossed pseudoknots with the same time complexity as PknotsRG algorithm.The test results for PseudoBase data base indidcate that PknotsRG algorithm has the predicted accuracy of 68%and our algorithm has the one of 70.6%for pseudoknots. Compared with LP algorithm to compute one planar pseudoknot with O(n~5)time and O(n~3)space,our algorithm can compute more complex multi-pseudoknots,and reduce the time complexity from O(n~5)to O(n~4).
     There are five parts of contents and results in this paper.The basic concepts, research significance and status are summarized in the first chapter.Existing RNA structure prediction algorithms for nested structure and that for pseudoknots are reviewed and summarized in the secondary chapter.Polynomial approximation algorithm and approximation scheme for maximum weighted stacking problem are designed in the third chapter.Heuristic algorithm to maximize stem and its experimental results are given in the fourth chapter.Dynamic programming algorithm and its experimental results are given to predict planar pseudoknots in the fifth chapter. The end is a brief summary.
     There are three innovation points in this paper as follows.
     (1)Based on stacking and base pairing actions,we present maximum weighted stacking problem and design its polynomial time approximation algorithm with O(nlogn)time and O(n)space and its polynomial time approximation scheme. The approximate performance ratio of this approximation algorithm is 3.
     (2)We design a heuristic algorithm to maximize stem with O(n~3)time and O(n) space to predict RNA sequences with 5000 bases.Compared with maximum weighted matching algorithm with O(n~3)time and O(n~2)space,our algorithm reduce space complexity from O(n~2)to O(n),and the experimental results show that its sensitivity is improved form 80%to 87.8%,and specificity is increased from 53.7%to 75.9%.
     (3)We establish a new model to express pseudoknots including semi-extensible structures and k-stems,design a dynamic programming algorithm to predict arbitrary planar pseudoknots and simple nonplanar pseudoknots.The predicted sensitivity is 83.6%and the predicted specificity is 76.6%for PseudoBase data base.Compared with the best RE algorithm to predict arbitrary planar pseudoknots and partial nonplanar pseudoknots,the length of the given sequence computed by our algorithm is improved from 140 bp to 800 bp,the time complexity is reduced from O(n~6)to O(n~4)and the space complexity is droped from O(n~4)to O(n~3);for PKNOTS tested sequences,the sensitivity is increased from 82.8%to 84.8%and the specificity is raised from 78.9%to 80.7%.

引文

[1]Jiang T,XuY,Zhang MQ.Current Topics in Computational Molecular Biology.北京:清华大学出版社,2002
    [2]Miyano S.Research in Computational Molecular Biology:9th Annual International 计算分子生物学研究.北京:燕山出版社,2005.
    [3]Stajich JE.An Introduction to BioPerl Methods.Mol.Biol.2007,406:535-48.
    [4]帕夫纳PA.计算分子生物学-算法逼近.王翼飞等译.北京:化学工业出版社,2004
    [5]赛图宝J,梅丹尼斯J.计算分子生物学导论.朱浩等译.北京:科学出版社,2003
    [6]Couzin J.Break throug of the Year:Small RNAs Make Big Splash.Science.2002,298:2296-2297
    [7]Hannon GJ.RNA interference.Nature.2002:418(6894):244-51
    [8]W.Gilbert.The RNA world.Nature.1986:319-618
    [9]Dennis,C.,The brave new world of RNA.Nature,2002,418:122-12
    [10]Steitz TA,Moore PB.RNA,the first macromolecular catalyst:the ribosome is a ribozyme.Trends Biochem Sci.2003,28(8):411-418
    [11]Ke A,Zhou K,Ding F,Cate JH,Doudna JA.Conformational switch controls hepatitis delta virus ribozyme catalysis.Nature.2004,429:201-205
    [12]Mathews D,Sabina J,Zuker M,Turner D.Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure.Journal of Molecular Biology.1999,288:911-940
    [13]徐琳、李晓民、谭光明、刘新春、卜东波、冯圣中、孙凝晖.面向FPGA 的RNA二级结构预测并行算法研究.计算机学报.2006,2(29):233-238
    [14]Rivas E,Eddy SR.Noncoding RNA gene detection using comparative sequence analysis.BMC Bioinformatics.2001,2:8
    [15]Juan V,Wilson C.RNA secondary structure prediction based on free energy and phylogenetic analysis.J Mol Biol.1999,289:935-947.
    [16]Rosalía AH,Holger HH,Anne C.Computational RNA secondary structure design:empirical complexity and improved methods.BMC Bioinformatics.2007, 8:34.
    [17] Mathews D, Turner D. Dynalign: An algorithm for finding the secondary structure common to two RNA sequences. Journal of Molecular Biology. 2002, 317(2):191-203.
    [18] Tinoco I Jr, Uhlenbeck OC, Levine MD. Estimation of secondary structure in ribonucleic acids. Nature. 1971, 5293 (230):362-367
    [19] Mathews DH., Turner DH. Prediction of RNA secondary structure by free energy minimization. Current Opinion in Structural Biology. 2006,16:270-278
    [20] Pipas JM, Mcmahon JE. Method for predicting RNA secondary structure. Proc.Natl. Acad.Sci. USA .1975,72:2017-2021
    [21] Studnicka GM, Rahn GM, Cummings IW, Salser WA. Computer method for predicting the secondary structure of single-stranded RNA. Nucleic Acids Res. 1978, 5:3365-3387
    [22] Gouy M. In Nucleic Acid and Protein Sequence Analysis: A practical Approach. Oxford: IRL Press, 1987, 259-84
    [23] Eddy SR. How do RNA folding algorithms work? Nature Biotechnology. 2004, 22:1457-8.

    [24] Zuker M. In mathmatical Methods for DNA Sequence. Fla: CRC Press, 1987
    [25] Turner DH, Sugimoto N, Freier SM, Ann. Rew. Improved parameters for prediction of RNA structure. Biophys Chem.1988,17: 167-1921
    [26] Nussinov R, Pieczenik G, Griggs JR, kleitman DJ. Algorithms for loop matchings. SIAMJ. Appl. Math. 1978,35:68-52.
    [27] Zuker M, Sitegler P. Optimal computer folding of large RNA sequences using thermodynamics and auxiliary information. Nucleic Acids Res. 1981,9:133-148
    [28] Sankoff D. Simultaneous solution of the RNA folding, alignment, and protosequence problems. SIAM J.Appl. Math. 1985, 45: 810-825
    [29] Zuker M. Computer prediction of RNA structure. Methods Enzymol. 1989, 180: 262-288
    [30] Schuster P, Fontana W, Stadler PF, Hofacker IL. From sequences to shapes and back: a case study in RNA secondary structure. Proc. Roy. Soc.ser. B, 1994, 255, 279-284.
    [31] Mathews DH, Disney MD, Childs JL, Schroeder SJ, Zuker M, Turner DH. Incorporating chemical modification constraintsinto a dynamic programming algorithm for prediction of RNA secondary structure.Proc.Natl.Acad.Sci.USA 2004,101:7287-7292.
    [32]F(?)urtig B,Richter C,W(?)ohnert J,Schwalbe H.NMR spectroscopy of RNA.Chembiochem.2003,4(10):936-962.
    [33]Pleij CWA,Rietveld K,Bosch L.A new principle of RNA folding based on pseudoknotting.Nucl Acids Res.1985,13:17217-1731
    [34]Kolk MH.,van der Graff M,Wijmenga SS,Pleij CWA.Heus HA,Hilbers CW.NMR structure of a classical pseudoknot:interplay of single- and double-stranded RNA.Science.1998,280:434-438
    [35]Staple DW,Butcher SE.Pseudoknots:RNA Structures with Diverse Functions.PLoS Biol.2005,3:e213
    [36]Adams PL,Stahley MR,Kosek AB,Wang J,Strobel SA.Crystal structure of a self-splicing group I intron with both exons.Nature,2004,430:45-50.
    [37]Theimer CA,Blois CA,Feigon J.Structure of the human telomerase RNA pseudoknot reveals conserved tertiary interactions essential for function.Mol Cell.2005,17:671-682.
    [38]Nixon PL,Rangan A,Kim YG,Rich A,Hoffman DW.Solution structure of a luteoviral P1-P2 frameshifting mRNA pseudoknot.J Mol Biol.2002,322:621-633.
    [39]PseudoBase homepage:http://wwwbio.LeidenUniv.nl/~Batenburg/～PKB.htm
    [40]Barette I,Poisson G,Gendron P.Pseudoknots in prion protein mRNAs confirmed by comparative sequence analysis and pattern searching.Nucleic Acids Research.2001,29(3):753-758
    [41]Lyngsφ RB,Christian CNS.Pseudoknots in RNA Secondary Structure.Proceedings of Recomb,Tokyo:2000.
    [42]Rivas E,Eddy SR.A dynamic programming algorithm for RNA structure prediction including pseudoknots.Journal of Molecular Biology.1999,285:2053-2068
    [43]Lyngsφ RB,Pedersen CNS.RNA pseudoknot prediction in energy based models.Journal of Computational Biology.2001,7:409-428
    [44]Jens R,Robert G.Design,implementation and evaluation of a practical pseudoknot folding algorithm based on thermodynamics.BMC Bioinformatics.2004,5(1):104
    [45]Cary R B,StormoGD.Graph-theoreticApproach to RNA Modeling Using Comparative Data,In Proceedings of the Third International Confeemce on Intelligent Systems for Molecular Biology.Menlo Park:AAAI Press,1995
    [46]Tabaska J-E,Cary RB,Gabow HN,Stormo GD.An RNA folding method capable of identifying pseudoknots and base triples.Bioinformatics.1998,14(8):691-699
    [47]Haijun Liu,Dong Xu,Jianlin Shao and Yifei Wang.An RNA folding algorithm including pseudoknots based on dynamic weighted matching.Computational Biology and Chemistry.2006,30(1):72-76
    [48]陈世杰、谭志杰、曹松、张文炳.RNA分子折叠的统计力学.物理.2006,3:218-229
    [49]Ieong S,Kao MY,Lam TW,Sung WK,Yiu SM.Predicting RNA secondary structures with arbitrary pseudoknots by maximizing the number of stacking pairs.In Proceedings of the 2nd Symposium on Bioinformatics and Bioengineering.2001,183-190
    [50]LyngsΦ RB.Complexity of Pseudoknot Prediction in Simple Models.ICALP,2004,LNCS3142,919-931.
    [51]Dowell RD,Eddy SR:Evaluation of several lightweight stochastic context-free grammars for RNA secondary structure prediction.BMC Bioinformatics 2004,5:71.
    [52]Van Bantenburg FHD,Gultyaev AT.An APL-programmed Genetic Algorithm for the Prediction of RNA Secondary Structure.J.theo.Bio.1995,174:269-280
    [53]Gultyaev AP,van Batenburg FH,Pleij CWA.The computer simulation of RNA folding pathways using a genetic algorithm.J.Mol.Biol.1995,250:37-51
    [54]Schmitz M,Steger G.Description of RNA folding by Simulated Annealing.J.Mol.Biol.1996,255(1):254-266
    [55]任清华,莫忠息,陶玉敏.预测RNA二级结构的一种遗传模拟退火算法.武汉大学学报(理学版).2004,50(1):23-28
    [56]Ren J,Rastegari B,Condon A,Hoos HH.HotKnots:heuristic prediction of RNA secondary structures including pseudoknots.RNA.2005,11:1494-1504.
    [57]Abrahams JP,van der Berg M.Prediction of RNA secondary structure, including pseudoknotting,by computer simulation.Nucl.Acids Res.1990,18:3035-3044
    [58]Benedeti G,Santis PD,Morseti SA.New method to find a set of energetically optimal RNA secondary structure.Nucl.Acids.Res.1989,17(13):5149-5161
    [59]Li WJ,Wu JJ.Prediction of RNA Secondary Structure Based on Helical Regions Distribution.Bioinformatics.1998,14:700-706
    [60]Ruan J,Stormo GD,Zhang W.An Iterated loop matching approach to the prediction of RNA secondary structures with pseudoknots.Bioinformatics.2004,20(1):58-66.
    [61]Hochbaum DS.Efficient bounds for the stable set,vertex cover,and set packing problems.Discrete Appl.Math.1983,6 243-254.
    [62]Papadimitriou CM.Computational Complexity.Addison-Wesley Publishing Company,Inc.1994
    [63]Barette I,Poisson G,Gendron P.Pseudoknots in prion protein mRNAs confirmed by comparative sequence analysis and pattern searching.Nucleic Acids Research.2001.29(3):753-758.
    [64]Onoa B,Tinoco I.RNA folding and unfolding.Curr Opin Struct Biol.2004,14(3):374-379.
    [65]Serra MJ,Turner DH.Predicting the thermodynamic properties of RNA.Methods Enzymol.1995.259:242-261.
    [66]Zhang K,Shasha D.SIAM Journal on Computing.1989,18:1245-1262
    [67]Matthews DH,Andre TC,Kim J,Turner DH,Zuker M.Molecular Modeling of Nucleic Acids.American Chemical Society.1998
    [68]Akutsu,T.Dynamic programming algorithms for RNA secondary structure prediction with pseudoknots.Discrete Applied Mathematics.2000,104:45-62
    [69]朱大铭,马绍汉,雷鹏.翻转距离星树问题的复杂度和近似算法.软件学报.2002,13(8)
    [70]朱大铭,马绍汉.基因组translocation排序问题的改进多项式算法.计算机学报.2002
    [71]朱大铭,栾俊峰,马绍汉.Hardness and methods to solve clique.JCST 2001
    [72]栾俊峰,朱大铭,马绍汉.目标序列部分确定的翻转距离星树问题.软件学报.2003
    [73]栾俊峰,朱大铭,马绍汉.计算生物学中有关基因组翻转距离的NPC问题.计算机科学.2002
    [74]栾俊峰,朱大铭,马绍汉.计算生物学中的9叶星树问题.计算机科学.2002
    [75]Li Hengwu,Zhu Daming,Liu Zhendong,Li Hong.Prediction for RNA planar pseudoknots.Progress in Nature Science.2007,17(6):717-724
    [76]HengWu Li,DaMing Zhu.A New Pseudoknots Folding Algorithm for RNA Structure Prediction.LNCS,springer-verlag.2005,3595:94-103.
    [77]HengWu Li,DaMing Zhu.Approximation Scheme for RNA Structure Prediction Based on Base Pair Stacking.SMO,WSEAS PRESS,2007,426-429
    [78]HengWu Li,DaMing Zhu.New algorithm for predicting ribonucleic acid secondary structure including pseudoknots.Journal of Tongii University.2004,10:176-178
    [79]HengWu Li.Approximation Algorithm for Pseudoknotted RNA Structure Prediction.CIS.2007,IEEECS,108-111.
    [80]GuohuiYao,Daming Zhu,Hengwu Li,Shaohan Ma.A polynomial algorithm to compute the minimum degree spanning trees of directed acyclic graphs with applications to the broadcast problem.Discrete Mathematics.2007,7
    [81]李恒武,朱大铭,纪秀花.RNA二级结构预测算法的设计与实现.计算机工程与科学.2006,28(7):82-84
    [82]刘振栋,李恒武,朱大铭.计算最大堆积的RNA二级结构预测算法.南京大学学报.2005,41(5):532-537
    [83]李恒武,朱大铭.RNA二级结构预测算法.计算机科学.2002,29(9):275-278
    [84]HengWu Li,DaMing Zhu.An Algorithm for Predicting RNA Secondary Structure Including Pseudoknots.Journal of communication and computer.2005,2(7):42-46
    [85]Hengwu Li,Daming Zhu.Algorithm for RNA pseudoknots prediction.Intematioal journal of compute science and network security.2006,11(6):265-272
    [86]李恒武,朱大铭.预测RNA二级结构的新算法.计算机科学.2004,31(10):142-144
    [87]Barette I,Poisson G,Gendron P,Major E Pseudoknots in prion protein mRNAs confirmed by comparative sequence analysis and pattem searching.Nucleic Acids Research.2001,29(3):753-758
    [88]Hofacker I,Fontana W,Stadler P,Bonhoeffer L,Tacker M,Schuster P.Fast folding and comparison of RNA secondary structures.Monatshefte Chemic.1994,125:167-188
    [89]Deogun J,Donis E,Komina O,Ma F.RNA secondary structure prediction with simple pseudoknots.In Proc Second Asia-Pacific Bioinformatics Conference.2004,239-246
    [90]Giegerich R,Meyer C.Algebraic Dynamic Programming.In Algebraic Methodology And Software Technology.9th International Conference,AMAST.France:Springer LNCS 2422,2002,349-364
    [91]Giegerich R.Explaining and controlling ambiguity in dynamic programming.In Proc Combinatorial Pattern Matching.Springer Verlag,2000,46-59
    [92]Wuchty S,Fontana W,Hofacker I,Schuster P Complete suboptimal folding of RNA and the stability of secondary structures.Biopolymers.1998,49:145-165
    [93]Tinoco I Jr,Bustamante C.How RNA folds.Journal of Molecular Biology.1999,293:271-281
    [94]Rivas E,Eddy SR.The language of RNA:a formal grammar that includes pseudoknots.Bioinformatics.2000,16(4):334-340
    [95]Cai L,Malmberg RL,Wu Y.Stochastic modeling of RNA pseudoknotted structures:a grammatical approach.Bioinformatics.2003,19:66-73
    [96]Giegerich R,Meyer C,Steffen P.A discipline of dynamic programming over sequence data.Science of Computer Programming.2004,51(3):215-263.
    [97]Macke T,Ecker D,Gutell R,Gautheret D,Case D,Sampath R.RNAMotif,an RNA secondary structure definition and search algorithm.Nucleid Acids Research.2001,29(22):4724-4735.
    [98]Giegerich R.A systematic approach to dynamic programming in bioinformatics.Bioinformatics.2000,16:665-677
    [99]Gultyaev AP,van Batenburg F,Pleij C.An approximation of loop free energy values of RNA H-pseudoknots.RNA.1999,5:609-617.
    [100]Dirks R,Pierce NA.A partition function algorithm for nucleic acid secondary structure including pseudoknots.Journal of Computational Chemistry.2003,24:1664-1677.
    [101]Evers D,Giegerich R.Reducing the conformation space in RNA structure prediction.In German Conference on Bioinformatics.2001,118-124.
    [102]Rijk PD,Wuyts J,Wachter RD.RnaViz2:an improved representation of RNA secondary structure.Bioinformatics.2003,19(2):299-300.
    [103]Reeder J,Hochsmann M,Rehmsmeier M,Voss B,Giegerich R Beyond Mfold:recent advances in RNA bioinformatics.J Biotechnol.2006,124:41-55.
    [104]Hendrix DK,Brenner SE,Holbrook SR.RNA structural motifs:building blocks of a modular biomolecule.Q Rev Biophys.2005,38:221-43.
    [105]Tumpey TM,Maines TR,Van Hoeven N,Glaser L,Sol6rzano A,et al.A two-amino acid change in the hemagglutinin of the 1918 influenza virus abolishes transmission.Science.2007,315:655-659.
    [106]Dawson W,Fujiwara K,Futamura Y,Yamamoto K,Kawai G A method for finding optimal RNA secondary structures using a new entropy model(vsfold).Nucleotides,Nucleosides,and Nucl Acids.2006,25:171-189.

地址：北京市海淀区学院路29号邮编：100083

电话：办公室：(+86 10)66554848；文献借阅、咨询服务、科技查新：66554700