RNA Secondary Structure Prediction and Assessment Based on Comparative Sequence Analysis
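Abstract
As more and more non-coding genes and their functions are identified and characterized, it has become clear that non-coding RNAs (ncRNAs) are as important as proteins and may even be the principal functional molecules. Secondary structure prediction is the fundamental route to, and the core basis of, identifying ncRNAs and studying their function, so research on methods for predicting RNA secondary structure is of considerable scientific importance.
     Among RNA secondary structure prediction methods, those based on comparative sequence analysis are the most accurate, the most effective and the most widely used. In this class of methods, the input to the algorithm is a set of homologous RNA sequences or a multiple sequence alignment built from them, and the goal is to find the optimal secondary structure shared by all of the sequences. Five problems remain open for these methods: (1) How can the computational complexity of secondary structure detection or prediction be reduced without sacrificing accuracy? (2) How can heuristic prediction algorithms grounded in biological knowledge be designed? (3) How can high-quality, high-precision RNA multiple sequence alignments be constructed so as to improve prediction accuracy? (4) How can more, and more detailed, reference information (such as phylogenetic information) be brought into the prediction algorithm to improve its accuracy? (5) How can the secondary structures obtained during prediction be assessed so that the final result is more accurate and more credible? This dissertation analyzes each of these problems in depth, proposes and implements a solution for each, and tests, validates and compares the solutions on corresponding data sets. The main work and contributions are summarized as follows: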
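     (1) The concepts and theory of the position matrix and the position vector.
     The position matrix proposed here is a special n×n matrix, where n is the length of the RNA sequence or of the RNA multiple sequence alignment. There are two kinds: the position matrix of a single RNA sequence and the position matrix of an RNA multiple sequence alignment. The elements of the single-sequence position matrix take one of three values: 0, 1 or -1. By detecting runs of consecutive non-zero entries in the rows of the matrix, regions of consecutive base pairs in the sequence (i.e. stems) can be identified conveniently and accurately. The elements of the alignment position matrix take one of two values: 0 or 1. By detecting runs of consecutive 1s in the rows of the matrix, conserved regions of consecutive base pairs in the alignment (i.e. conserved stems) can be identified conveniently and accurately. The position vector proposed here is a special n-dimensional vector, where n is again the length of the sequence or alignment, and it also comes in two kinds: the position vector of a single RNA sequence and that of an RNA multiple sequence alignment. The position matrix records all possible foldings of the sequence or alignment, whereas the position vector records the specific secondary structure under one particular folding. Theoretical analysis and numerical experiments show that this theory helps to solve several related problems in RNA secondary structure prediction.
     (2) A method for assessing RNA secondary structure based on a signal-to-noise measure.
     The stem is the most basic building block of RNA secondary structure. This dissertation models stems rather than whole secondary structures, proposes different assessment algorithms for different problems, and applies them in the corresponding solutions. The proposed assessment algorithms fall into two classes: algorithms for assessing stems in a single RNA sequence and algorithms for assessing conserved stems in an RNA multiple sequence alignment. For the former, the Signal-to-Noise value is computed from the number of base pairs in the stem; for the latter, it is computed from the number of so-called "column pairs" in the conserved stem. Numerical experiments show that both classes of algorithm improve the performance of the methods in which they are used.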
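     (3) A method for detecting and assessing RNA secondary structure based on multiple sequence alignment.
     Detecting RNA secondary structure is the key step in identifying ncRNAs. Taking an RNA multiple sequence alignment as input and following a comparative sequence analysis strategy, this dissertation uses the position matrix and position vector theory together with the signal-to-noise measure to build a detection and assessment algorithm based on the detection and assessment of conserved stems. Theoretical analysis and numerical experiments show that the method outperforms the mainstream methods QRNA and ddbRNA. Compared with QRNA, it has lower computational complexity, applies to alignments of more than two sequences, and has higher sensitivity; compared with ddbRNA, it has both higher sensitivity and higher specificity and applies to alignments that contain gaps.
     (4) Methods for predicting RNA secondary structure based on the position matrix and position vector.
     This is a direct application of the position matrix and position vector theory to RNA secondary structure prediction. First, a heuristic prediction method based on "seed-extension" is proposed; second, a hybrid prediction method based on the detection and assessment of conserved stems is proposed. For each method, a separate concrete algorithm is given for each kind of input (an RNA multiple sequence alignment or a set of unaligned homologous RNA sequences), and each algorithm is accompanied by numerical experiments and a performance analysis. The results show that both methods outperform the comparable method RNAalifold when the input is an RNA multiple sequence alignment, and both outperform the comparable method Mfold when the input is a set of unaligned homologous RNA sequences.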
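     (5) A method for constructing RNA multiple structural alignments based on the position matrix and position vector.
     Constructing a high-quality RNA multiple structural alignment is a key step in RNA secondary structure prediction based on comparative sequence analysis. Using the position matrix and position vector theory and the signal-to-noise measure as basic tools, "seed-extension" as the basic idea, and a set of unaligned homologous RNA sequences as input, this dissertation proposes a construction method based on the detection and assessment of conserved stems. The dissertation first states the structural alignment problem for RNA sequences, then describes the method in detail, and finally reports numerical experiments and a performance analysis. The results show that the method is clearly better than the current mainstream method Clustal W.
     (6) A method for predicting RNA secondary structure based on stochastic context-free grammars and phylogenetic analysis.
     Phylogenetic information is important reference information in biological sequence analysis. By integrating richer and more complex phylogenetic information about homologous RNA sequences into the prediction process, this dissertation proposes a new hybrid method for predicting RNA secondary structure. First, a new profile stochastic context-free grammar is defined to model an RNA multiple sequence alignment together with its consensus secondary structure. Second, two different hidden Markov models are defined to model the evolutionary process of the non-structural regions and of the structural regions of the RNA sequences, respectively. Finally, the two hidden Markov models are integrated into the new profile stochastic context-free grammar, giving a new fully probabilistic model for computing the optimal consensus secondary structure. Numerical experiments show that the method outperforms the current mainstream method Pfold, and that the advantage grows when the input alignment contains more sequences, more gaps and lower sequence conservation.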
As more and more ncRNAs are discovered and characterized, it has become clear that ncRNAs are as important as proteins and may even be the main functional molecules. Predicting RNA secondary structure is the essential route to, and the central foundation for, identifying and understanding ncRNAs. The study of methods for predicting RNA secondary structure is therefore of great scientific importance.
     The most accurate and most widely used methods for predicting RNA secondary structure are based on comparative sequence analysis. In these methods, the input to the algorithm is a set of homologous RNA sequences or an alignment of multiple RNA sequences, and the goal is to compute the optimal secondary structure common to all of the sequences. However, five difficult problems remain for these methods: (1) How can the computational complexity of the algorithm be reduced without reducing the accuracy of the prediction? (2) How can secondary structure prediction methods be designed using biological knowledge and heuristic procedures? (3) How can high-quality, high-precision structural alignments of multiple RNA sequences be constructed to improve the accuracy of the prediction algorithm? (4) How can more detailed reference information, such as evolutionary information, be introduced to improve the prediction of RNA secondary structure? (5) How can predicted secondary structures be assessed so that the final results are more precise and more credible? In this dissertation, we examine these problems in depth, design and implement a solution for each, and test and evaluate the proposed algorithms on corresponding data sets. The major contents and contributions of this work are:
     (1) The theory of the position matrix and position vector.
     The position matrix presented in this research is a special n×n matrix, where n is the length of the RNA sequence or RNA alignment. There are two kinds of position matrix: the position matrix for a single RNA sequence and the position matrix for an alignment of multiple RNA sequences. In the former, each element is 0, 1 or -1. Regions of consecutive base pairs in the RNA sequence (i.e. stems) can be identified conveniently and exactly by detecting runs of consecutive non-zero entries in the rows of the matrix. In the latter, each element is 0 or 1. Regions of conserved consecutive base pairs in the RNA alignment (i.e. conserved stems) can be identified conveniently and exactly by detecting runs of consecutive 1s in the rows of the matrix. The position vector presented here is a special n-dimensional vector, where n is the length of the RNA sequence or alignment. There are likewise two kinds of position vector: the position vector for a single RNA sequence and the position vector for an alignment of multiple RNA sequences. The position matrix records all possible foldings of the RNA sequence or alignment, whereas the position vector records the specific secondary structure of the sequence or alignment under one particular folding. Theoretical analysis and experimental results show that this theory can be applied effectively to several related problems in RNA secondary structure prediction.
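The abstract does not spell out how the position matrix is laid out, so the following is only a minimal sketch of one layout that has the stated property (stems show up as runs of consecutive non-zero entries within a row). It rests on two assumptions that are not taken from the dissertation: row r, column i holds the pairing code of bases i and j = (r - i) mod n, and the value 1 marks a Watson-Crick pair while -1 marks a G-U wobble pair. The names `pair_code`, `position_matrix`, `find_stems`, `min_len` and `min_loop` are hypothetical.

```python
# Hypothetical single-sequence "position matrix" (layout and value coding are assumptions).
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def pair_code(a, b):
    """Return 1 for a Watson-Crick pair, -1 for a G-U wobble pair, 0 otherwise (assumed coding)."""
    if (a, b) in WATSON_CRICK:
        return 1
    if (a, b) in WOBBLE:
        return -1
    return 0


def position_matrix(seq, min_loop=3):
    """Build an n x n matrix in which row r, column i describes the pair (i, (r - i) mod n).

    Stacked pairs (i, j), (i+1, j-1), ... share the same value of i + j,
    so they land in one row at consecutive columns.
    """
    n = len(seq)
    matrix = [[0] * n for _ in range(n)]
    for r in range(n):
        for i in range(n):
            j = (r - i) % n
            # Keep only pairs with i < j that enclose at least min_loop unpaired bases.
            if j - i > min_loop:
                matrix[r][i] = pair_code(seq[i], seq[j])
    return matrix


def find_stems(seq, min_len=3, min_loop=3):
    """Return stems as (i, j, length): the outermost pair (i, j) and the number of stacked pairs."""
    n = len(seq)
    stems = []
    for r, row in enumerate(position_matrix(seq, min_loop)):
        i = 0
        while i < n:
            if row[i] == 0:
                i += 1
                continue
            start = i
            while i < n and row[i] != 0:
                i += 1
            if i - start >= min_len:
                stems.append((start, (r - start) % n, i - start))
    return stems
```

Under these assumptions, `find_stems("GGGGAAAACCCC", min_len=4)` returns `[(0, 11, 4)]`, the four stacked G-C pairs that close the AAAA loop.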
     (2) A method for assessing RNA secondary structure using Signal-to-Noise.
     Stems are the basic building blocks of RNA secondary structure. In this dissertation, stems rather than whole secondary structures are taken as the objects to be modeled, and different assessment algorithms are proposed for different problems. The proposed assessment algorithms fall into two classes: algorithms for assessing stems in a single RNA sequence and algorithms for assessing conserved stems in an RNA alignment. In the former, the Signal-to-Noise is computed from the number of base pairs in the stem. In the latter, it is computed from the number of so-called column pairs in the conserved stem. Experimental results show that both classes improve the methods in which they are used.
     (3) A method for detecting and assessing RNA secondary structure using multiple sequence alignment.
     The key to identifying an ncRNA is to detect its secondary structure. Here we take an RNA alignment as input and combine comparative sequence analysis, the theory of the position matrix and position vector, and the Signal-to-Noise measure into an algorithm that detects and assesses RNA secondary structure by detecting and assessing conserved stems. Theoretical analysis and experimental results show that our method outperforms both QRNA and ddbRNA, two currently popular methods. Compared with QRNA, our method has lower computational complexity and higher sensitivity, and it can be applied to RNA alignments of more than two sequences. Compared with ddbRNA, our method has both higher sensitivity and higher specificity, and it can be applied to gapped RNA alignments.
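As an illustration of detecting conserved stems in an alignment, here is a minimal sketch of a 0/1 alignment matrix. Several things in it are assumptions rather than the dissertation's definitions: the row layout (row r, column i describes the column pair (i, (r - i) mod n)), the rule that a column pair counts as conserved when at least `min_frac` of the sequences can form a Watson-Crick or G-U pair there, and all function and parameter names. Gap characters never pair, so gapped alignments are accepted as input.

```python
# Hypothetical 0/1 "position matrix" for an alignment (layout and conservation rule are assumptions).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def alignment_position_matrix(aln, min_frac=1.0, min_loop=3):
    """aln is a list of equal-length aligned sequences; '-' marks a gap."""
    n = len(aln[0])
    matrix = [[0] * n for _ in range(n)]
    for r in range(n):
        for i in range(n):
            j = (r - i) % n
            if j - i > min_loop:
                # Count the sequences whose bases in columns i and j can pair.
                pairing = sum((s[i], s[j]) in PAIRS for s in aln)
                if pairing >= min_frac * len(aln):
                    matrix[r][i] = 1
    return matrix


def conserved_stems(aln, min_len=3, min_frac=1.0, min_loop=3):
    """Return conserved stems as (i, j, length), where length counts column pairs."""
    n = len(aln[0])
    stems = []
    for r, row in enumerate(alignment_position_matrix(aln, min_frac, min_loop)):
        i = 0
        while i < n:
            if row[i] == 0:
                i += 1
                continue
            start = i
            while i < n and row[i] == 1:
                i += 1
            if i - start >= min_len:
                stems.append((start, (r - start) % n, i - start))
    return stems


# Toy alignment with compensatory mutations in the stem columns.
toy_alignment = [
    "GGGGAAAACCCC",
    "GCGGAAAACCGC",
    "AGGGAUAACCCU",
]
```

With the toy alignment above, `conserved_stems(toy_alignment, min_len=4)` returns `[(0, 11, 4)]`: the four column pairs stay pairable in every sequence even though the bases themselves change.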
     (4) Methods for RNA secondary structure prediction using the position matrix and position vector.
     This is a direct application of the theory of the position matrix and position vector to RNA secondary structure prediction. First, a heuristic prediction method based on the "seed-extension" idea is proposed. Second, a hybrid prediction method based on the detection and assessment of conserved stems is proposed. Each method is implemented as two algorithms, one for each kind of input (an alignment of multiple RNA sequences or a set of unaligned homologous RNA sequences), and each algorithm is tested and its performance analyzed. The experimental results suggest that both methods outperform RNAalifold when the input is an RNA alignment, and both outperform Mfold when the input is a set of unaligned RNA sequences.
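The seed idea named above can be illustrated generically: start from one pair that is known to form and grow the stem in both directions while the neighbouring bases still pair. The sketch below is a plain seed-and-extend step on a single sequence, not the dissertation's algorithm; `extend_seed`, `can_pair` and `min_loop` are hypothetical names, and allowing G-U pairs is an assumption.

```python
# Generic seed-and-extend step for one stem (illustrative only).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    return (a, b) in PAIRS


def extend_seed(seq, i, j, min_loop=3):
    """Grow a stem from the seed pair (i, j).

    Returns (outer_i, outer_j, length), or None if the seed itself is not a valid pair.
    """
    if not (0 <= i < j < len(seq)):
        return None
    if j - i - 1 < min_loop or not can_pair(seq[i], seq[j]):
        return None
    # Extend outward, toward the ends of the sequence.
    lo, hi = i, j
    while lo > 0 and hi < len(seq) - 1 and can_pair(seq[lo - 1], seq[hi + 1]):
        lo, hi = lo - 1, hi + 1
    # Extend inward, keeping at least min_loop unpaired bases in the loop.
    a, b = i, j
    while b - a - 3 >= min_loop and can_pair(seq[a + 1], seq[b - 1]):
        a, b = a + 1, b - 1
    return (lo, hi, a - lo + 1)
```

For example, `extend_seed("GGGGAAAACCCC", 1, 10)` returns `(0, 11, 4)`: the seed pair grows by one pair outward and two pairs inward before the loop becomes too short.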
     (5) A method for constructing structural alignments of RNA sequences using the position matrix and position vector.
     Constructing a high-quality structural alignment of RNA sequences is a key step in RNA secondary structure prediction by comparative sequence analysis. In this research, a new method for building structural alignments of RNA sequences is proposed. It is based on the detection and assessment of conserved stems, uses the theory of the position matrix and position vector and the Signal-to-Noise measure as its basic tools, follows the "seed-extension" idea as its basic strategy, and takes a set of unaligned homologous RNA sequences as input. The thesis first introduces the problem of structural alignment of RNA sequences, then describes the new method in detail, and finally reports tests and a performance analysis. The experimental results show that our method is clearly better than Clustal W, a currently popular method for multiple sequence alignment.
     (6) A method for predicting RNA secondary structure using profile stochastic context-free grammars and phylogenetic analysis.
     Evolutionary information is an important reference in the analysis of biological sequences. In this research, a new method for predicting RNA secondary structure based on a Profile SCFG and phylogenetic analysis is presented; it integrates richer evolutionary information about the homologous sequences into the prediction of secondary structure. First, a new Profile SCFG is defined to model an RNA alignment and its consensus secondary structure. Then, two different HMMs are defined to model the evolution of the non-structural regions and of the structural regions of the RNA sequences, respectively. Finally, a new probabilistic model for computing the optimal consensus secondary structure is obtained by integrating the two HMMs into the Profile SCFG. Both the proposed method and Pfold are tested on data sets built from the Rfam database. Experimental results show that our method outperforms Pfold, especially when the input alignment contains more sequences, more gaps and lower sequence conservation.
