Research on Prediction Algorithms for MicroRNA Identification and MicroRNA-Disease Association
Abstract
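MicroRNAs (miRNAs) are endogenous non-coding RNAs about 22 nt (nucleotides) long. They play key regulatory roles in many important biological processes in animals and plants and are closely associated with the onset and progression of tumors and many other diseases. Bioinformatics has played an important role in miRNA research and has greatly accelerated the field. This thesis studies computational prediction methods for miRNA-related problems, namely pre-miRNA (miRNA precursor) classification, prediction of mature miRNA positions, and prediction of disease-related miRNAs. The work consists of the following four parts: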
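     (1) An efficient pre-miRNA classification method based on support vector machines (SVM).
     Studying miRNA function first requires finding the miRNAs. Identifying miRNAs through biological experiments is time-consuming and expensive, and it is hard to detect miRNAs that are expressed at low levels or only in specific tissues or developmental stages. Using computational prediction to screen a set of likely miRNA candidates can therefore guide biological experiments and is important for advancing miRNA identification. Based on the characteristics of pre-miRNAs, this thesis proposes an SVM-based pre-miRNA classification method. Good features and good positive/negative (real/pseudo pre-miRNA) datasets are the basis of an efficient classification model. Sequence-related, structure-related, and energy-related features are therefore extracted from real and pseudo pre-miRNAs, and a feature selection method based on a genetic algorithm is proposed to select a representative feature subset. Because negative datasets for plant pre-miRNAs are scarce, this thesis is the first to extract stem-loop-like sequences from the protein-coding sequences of Arabidopsis thaliana, rice, and soybean as pseudo pre-miRNA sequences and to build a negative dataset from them. To address the class imbalance between real and pseudo plant pre-miRNAs, the ensemble classifier PlantMiRNAPred is built by combining ensemble learning with the AdaBoost idea. PlantMiRNAPred achieves more than 90% accuracy on each of 8 species (Arabidopsis thaliana, rice, Populus trichocarpa, Physcomitrella patens, Medicago, sorghum, maize, and soybean), which makes it valuable for research on plant pre-miRNA identification. In addition, a classification model, HumanMiRNAPred, is built from human pre-miRNA data; it also achieves improved prediction performance and helps advance the identification of human pre-miRNAs.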
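     (2) An accurate method for predicting the position of the mature miRNA within newly predicted pre-miRNA candidates.
     Machine-learning-based pre-miRNA classification methods can usually only classify new pre-miRNAs; they cannot predict where the mature miRNA lies within them. Before follow-up experimental validation, however, the position of the mature miRNA usually has to be given, so this thesis proposes an SVM-based method for predicting mature miRNA positions. First, the miRNA:miRNA* duplex is treated as a whole, which better reflects how a miRNA and its miRNA* bind to each other. Second, features are extracted from real and pseudo miRNA:miRNA* duplexes and a representative feature subset is selected. Third, because the numbers of real and pseudo miRNA:miRNA* duplexes differ enormously, a two-stage sample selection method is proposed: representative negative samples (pseudo miRNA:miRNA* duplexes) are selected according to the distribution density of the negative samples and their prediction errors, and the mature miRNA position prediction model MaturePred is built on them. Compared with existing methods, MaturePred predicts more accurately and can provide more reliable animal and plant mature miRNA candidates for follow-up experiments.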
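     (3) A disease-related miRNA prediction algorithm based on the k most similar miRNA nodes, built on an accurate measure of miRNA functional similarity.
     Abnormal miRNA regulation is an important cause of tumors and many other diseases, so studying miRNA-disease associations is very important for understanding pathogenesis. Studies show that functionally similar miRNAs tend to take part in the processes of similar diseases, that is, they are associated with similar diseases, and vice versa. The functional similarity of two miRNAs can therefore be assessed by measuring the semantic similarity between the two groups of diseases associated with them. This thesis further improves the measure of miRNA functional similarity by taking the information content of each disease term into account. It proposes HDMP, a disease-related miRNA prediction algorithm based on the k most similar neighboring miRNA nodes, which can systematically predict miRNA candidates associated with a specific disease. In addition, because miRNAs in the same miRNA family or the same miRNA cluster are functionally more similar, family and cluster information is further taken into account during prediction, giving the prediction algorithm HDMPW. Experiments on 18 common human diseases confirm that HDMP and HDMPW can effectively predict disease-related miRNA candidates. As miRNA-disease association data grow rapidly, HDMP can in future be extended to other human diseases.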
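     (4) A disease-related miRNA prediction algorithm based on random walk, built on a miRNA functional similarity graph.
     A miRNA functional similarity graph is built from the computed functional similarities between miRNAs. The prediction of disease-related miRNAs is then recast as a random walk problem, giving the random-walk-based prediction algorithm HDMPR. Unlike HDMP and HDMPW, HDMPR does not rely on the information of the k most similar neighboring nodes; instead it uses the global structure of the miRNA functional similarity graph. The effectiveness of HDMPR is verified on association data between 18 common human diseases and miRNAs. The experimental results show that, for most of these diseases, HDMPR achieves better prediction performance than HDMP and HDMPW. Overall, HDMP, HDMPW, and HDMPR can all provide reliable candidates for miRNAs associated with a specific disease and can guide biologists in validating possible disease-related miRNAs.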
MicroRNAs (miRNAs) are a set of short (about 22 nucleotides) non-coding RNAs that play significant regulatory roles in various biological processes of animals and plants. Furthermore, accumulating evidence indicates that miRNAs are associated with various human diseases. The application of bioinformatics to miRNA research has greatly promoted the development of this cutting-edge area of biology. In this thesis, we study pre-miRNA classification, mature miRNA position prediction, and disease-related miRNA identification. The work consists of the following four parts.
     (1) A novel classification method based on support vector machine (SVM) is proposed specifically for predicting plant pre-miRNAs.
     Identification of miRNAs is the first step in miRNA functional studies. Detecting miRNAs by experimental techniques is expensive and time-consuming, and it is difficult to identify miRNAs that are expressed at low levels or only in specific tissues or developmental stages. Computational prediction can therefore provide potential pre-miRNA candidates for biologists. Considering the characteristics of pre-miRNAs, a classification method based on SVM is proposed. Good features and good positive/negative (real/pseudo pre-miRNA) datasets are the basis for constructing an efficient classification model. Therefore, sequence-related, structure-related, and energy-related features are extracted from real/pseudo plant pre-miRNAs, and a set of informative features is selected by our feature selection method based on a genetic algorithm. Because pseudo plant pre-miRNAs are lacking, we extract pseudo hairpin sequences from the protein-coding sequences of Arabidopsis thaliana, Oryza sativa, and Glycine max and use them as negative samples. To address the class imbalance between real and pseudo pre-miRNAs, the classification model PlantMiRNAPred is constructed by combining ensemble learning and the AdaBoost method. PlantMiRNAPred achieves more than 90% accuracy on datasets from 8 plant species: Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Physcomitrella patens, Medicago truncatula, Sorghum bicolor, Zea mays, and Glycine max. PlantMiRNAPred is therefore valuable for identifying plant pre-miRNAs. In addition, we construct a classification model, HumanMiRNAPred, from human pre-miRNA data. HumanMiRNAPred also achieves improved prediction performance, which helps facilitate the identification of human pre-miRNAs.
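     One common way to handle the real/pseudo class imbalance described above is to train several base classifiers, each on all positives plus an equally sized random subset of negatives, and to combine them by majority vote. The sketch below illustrates only that general idea; it is not the PlantMiRNAPred procedure, it does not use AdaBoost weighting, and the base learner is a nearest-centroid stand-in rather than an SVM. All function names and the toy feature vectors are hypothetical.

```python
import random
from typing import Callable, List, Sequence

Vector = Sequence[float]
Classifier = Callable[[Vector], int]  # returns 1 (real pre-miRNA) or 0 (pseudo)


def train_centroid(pos: List[Vector], neg: List[Vector]) -> Classifier:
    """Stand-in base learner (NOT an SVM): label a sample by the nearer class centroid."""
    def centroid(rows: List[Vector]) -> List[float]:
        return [sum(col) / len(rows) for col in zip(*rows)]

    c_pos, c_neg = centroid(pos), centroid(neg)

    def dist2(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return lambda x: 1 if dist2(x, c_pos) <= dist2(x, c_neg) else 0


def train_balanced_ensemble(pos: List[Vector], neg: List[Vector],
                            n_members: int = 5, seed: int = 0,
                            train: Callable = train_centroid) -> Classifier:
    """Train n_members classifiers, each on all positives and a balanced negative subset."""
    rng = random.Random(seed)
    k = min(len(pos), len(neg))  # size of each undersampled negative subset
    members = [train(pos, rng.sample(neg, k)) for _ in range(n_members)]

    def predict(x: Vector) -> int:
        votes = sum(m(x) for m in members)
        return 1 if 2 * votes > len(members) else 0  # strict majority vote

    return predict


# Toy data: 3 positives near (1, 1) and 20 negatives near the origin.
pos = [(1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
neg = [(0.02 * i, 0.01 * i) for i in range(20)]
predict = train_balanced_ensemble(pos, neg)
```

     Using an odd `n_members` avoids tied votes. Replacing `train_centroid` with a real SVM trainer only requires a callable with the same `(pos, neg) -> classifier` shape.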
     (2) A machine learning method based on support vector machine is proposed to predict the positions of mature miRNAs within new pre-miRNA candidates.
     Most pre-miRNA classification methods based on machine learning can distinguish real pre-miRNAs from pseudo pre-miRNAs, but few can predict the positions of the mature miRNAs within them. However, to identify the actual miRNAs efficiently, their positions usually need to be given before subsequent biological experiments. Therefore, a position prediction method is proposed. First, a miRNA:miRNA* duplex is regarded as a whole to capture the binding characteristics between a miRNA and its corresponding miRNA*. Second, we extract features from real/pseudo miRNA:miRNA* duplexes and select the informative features to improve prediction accuracy. Third, a two-stage sample selection algorithm is proposed to combat the serious imbalance between real and pseudo miRNA:miRNA* duplexes: representative negative training samples (pseudo miRNA:miRNA* duplexes) are selected according to their distribution density in the high-dimensional sample space and their prediction deviations. The resulting prediction method, MaturePred, achieves higher prediction accuracy than existing methods and can provide more reliable animal and plant miRNA candidates for subsequent experiments.
     (3) On the basis of accurately measuring the functional similarity of two miRNAs, a method based on the k most similar neighboring miRNAs is proposed for predicting disease-related miRNAs.
     The abnormal expression of miRNAs is one of the important causes of various diseases. Therefore, identifying human disease-related miRNAs is important for investigating their involvement in the pathogenesis of diseases. It is known that miRNAs with similar functions are often associated with similar diseases, and vice versa. The functional similarity of two miRNAs has therefore been successfully inferred by measuring the semantic similarity of their associated diseases. We achieve a more accurate measurement of miRNA functional similarity by considering the information content of disease terms. A new prediction algorithm, HDMP, based on the k most similar neighboring miRNAs is presented for predicting disease-related miRNAs. In addition, miRNAs that belong to the same miRNA family or are located in the same cluster are more similar to each other. We therefore further propose a prediction algorithm, HDMPW, that also uses miRNA family and cluster information. HDMP and HDMPW proved successful in predicting potential disease-related miRNA candidates for 18 human diseases. As miRNA-disease association data grow rapidly, HDMP can be readily extended to other diseases.
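     The two ideas in this paragraph can be sketched in a few lines of Python, under stated assumptions. `set_similarity` uses a best-match average between the disease sets of two miRNAs, a form often used for this kind of set-to-set semantic similarity; the thesis's information-content weighting is not reproduced, and `term_sim` is a hypothetical pairwise disease-term similarity supplied by the caller. `knn_scores` is a simplified neighbor-voting score, not the exact HDMP or HDMPW formula, which the abstract does not give.

```python
from typing import Callable, Dict, Set

Similarity = Dict[str, Dict[str, float]]  # miRNA -> {other miRNA: functional similarity}


def set_similarity(d1: Set[str], d2: Set[str],
                   term_sim: Callable[[str, str], float]) -> float:
    """Best-match average similarity between two non-empty disease sets."""
    best1 = sum(max(term_sim(a, b) for b in d2) for a in d1)
    best2 = sum(max(term_sim(b, a) for a in d1) for b in d2)
    return (best1 + best2) / (len(d1) + len(d2))


def knn_scores(sim: Similarity, known: Set[str], k: int) -> Dict[str, float]:
    """Score each unlabelled miRNA by the similarity it shares with those of its
    k most similar neighbours that are already known to be disease-related."""
    scores: Dict[str, float] = {}
    for m, row in sim.items():
        if m in known:
            continue  # only rank candidates without a known association
        neighbours = sorted((n for n in row if n != m),
                            key=lambda n: row[n], reverse=True)[:k]
        scores[m] = sum(row[n] for n in neighbours if n in known)
    return scores


# Toy similarity matrix over four hypothetical miRNAs; "a" is the only known one.
sim = {
    "a": {"b": 0.9, "c": 0.2, "d": 0.1},
    "b": {"a": 0.9, "c": 0.3, "d": 0.2},
    "c": {"a": 0.2, "b": 0.3, "d": 0.8},
    "d": {"a": 0.1, "b": 0.2, "c": 0.8},
}
scores = knn_scores(sim, known={"a"}, k=2)
ranking = sorted(scores, key=scores.get, reverse=True)
```

     In this toy case "b" ranks first because its most similar neighbor is the known miRNA "a", while "c" and "d" have no known miRNA among their two nearest neighbors. Family or cluster information, as in HDMPW, could be folded in by up-weighting selected neighbors, but the weighting scheme is not specified in the abstract.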
     (4) On the basis of constructing a miRNA functional similarity graph, a method based on random walk is proposed for predicting disease-related miRNAs.
     The miRNA functional similarity graph is constructed by calculating the functional similarity of each pair of miRNAs. A prediction algorithm based on random walk with restart, HDMPR, is proposed for predicting disease-related miRNAs. Unlike HDMP and HDMPW, HDMPR does not restrict itself to the k most similar neighboring miRNAs but considers the global structure of the miRNA functional similarity graph. The effectiveness of HDMPR is validated on the association data of 18 human diseases. The experimental results indicate that HDMPR achieves higher prediction performance than HDMP and HDMPW for most of the 18 diseases. Overall, HDMP, HDMPW, and HDMPR are useful in providing reliable disease-related miRNA candidates for subsequent biological testing.
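     Random walk with restart can be illustrated with a generic power iteration on a weighted graph: at each step the walker moves to a neighbor with probability proportional to edge weight, or jumps back to a seed node with probability `restart`. The sketch below is a minimal generic version, assuming the seeds are the miRNAs already known to be associated with the disease; the restart probability, the tolerance, and the toy graph are illustrative choices, not values from the thesis.

```python
from typing import Dict, Set

Graph = Dict[str, Dict[str, float]]  # node -> {neighbour: similarity weight}


def random_walk_with_restart(graph: Graph, seeds: Set[str], restart: float = 0.3,
                             tol: float = 1e-10, max_iter: int = 1000) -> Dict[str, float]:
    """Return the steady-state visiting probability of every node.

    Assumes `seeds` is non-empty and that every seed and every neighbour
    is also a key of `graph`.
    """
    nodes = list(graph)
    out = {u: sum(graph[u].values()) for u in nodes}  # total outgoing weight
    p0 = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {u: restart * p0[u] for u in nodes}  # restart mass goes to the seeds
        for u in nodes:
            if out[u] == 0:
                # Isolated node: return its probability mass to the seeds.
                for s in seeds:
                    nxt[s] += (1 - restart) * p[u] / len(seeds)
                continue
            for v, w in graph[u].items():
                nxt[v] += (1 - restart) * p[u] * w / out[u]
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy chain a - b - c - d with unit weights; "a" is the only seed.
chain = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0},
}
p = random_walk_with_restart(chain, seeds={"a"})
```

     Candidates are then ranked by their steady-state probability. Because the walk propagates over the whole graph, a node several hops from the seeds still receives a non-zero score, which is the contrast with a k-nearest-neighbor rule that the paragraph draws.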
