Genomic data mining for functional annotation of human long noncoding RNAs (in English)
  • English title: Genomic data mining for functional annotation of human long noncoding RNAs
  • Authors: Brian L. GUDENAS; Jun WANG; Shu-zhen KUANG; An-qi WEI; Steven B. COGILL; Liang-jiang WANG (Department of Genetics and Biochemistry, Clemson University)
  • Keywords: Long noncoding RNA (lncRNA); Functional annotation; Genomic data mining; Machine learning
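  • Chinese journal name: ZDYW
  • English journal name: Journal of Zhejiang University-Science B (Biomedicine & Biotechnology), English edition
  • Institution: Department of Genetics and Biochemistry, Clemson University
  • Publication date: 2019-05-21
  • Publisher: Journal of Zhejiang University-Science B (Biomedicine & Biotechnology)
  • Year: 2019
  • Issue: v.20
  • Funding: Supported by the Self Regional Healthcare Foundation, USA
  • Language: English
  • Pages: ZDYW201906003
  • Page count: 12
  • CN: 06
  • ISSN: 33-1356/Q
  • Classification number: 23-34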
Abstract
Life may have begun in an RNA world, a hypothesis supported by increasing evidence of the vital roles that RNAs play in biological systems. In the human genome, most genes do not encode proteins; they are noncoding RNA genes. The largest class of noncoding genes is the long noncoding RNAs (lncRNAs), which are transcripts longer than 200 nucleotides with no protein-coding capacity. While some lncRNAs have been demonstrated to be key regulators of gene expression and 3D genome organization, most lncRNAs are still uncharacterized. We thus propose several data mining and machine learning approaches for the functional annotation of human lncRNAs by leveraging the vast amount of data from genetic and genomic studies. Recent results from our studies and those of other groups indicate that genomic data mining can give insights into lncRNA functions and provide valuable information for experimental studies of candidate lncRNAs associated with human disease.