SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences
  • Authors: Federico Agostini (1) (2)
    Davide Cirillo (1) (2)
    Riccardo Delli Ponti (1) (2)
    Gian Gaetano Tartaglia (1) (2) (3)

    1. Gene Function and Evolution, Centre for Genomic Regulation (CRG), C/ Dr. Aiguader 88, 08003 Barcelona, Spain
    2. Universitat Pompeu Fabra (UPF), C/ Dr. Aiguader 88, 08003 Barcelona, Spain
    3. Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Passeig Lluís Companys, 08010 Barcelona, Spain
  • Keywords: Discriminative motif discovery; Nucleic acids; ChIP-seq; CLIP-seq
  • Journal: BMC Genomics
  • Publication year: 2014
  • Publication date: December 2014
  • Volume: 15
  • Issue: 1
  • Full-text size: 1,333 KB
  • Journal subjects: Life Sciences, general; Microarrays; Proteomics; Animal Genetics and Genomics; Microbial Genetics and Genomics; Plant Genetics & Genomics
  • Publisher: BioMed Central
  • ISSN: 1471-2164
Abstract
Background: The large amount of data produced by high-throughput sequencing poses new computational challenges. In the last decade, several tools have been developed for the identification of transcription and splicing factor binding sites.

Results: Here, we introduce the SeAMotE (Sequence Analysis of Motifs Enrichment) algorithm for discovery of regulatory regions in nucleic acid sequences. SeAMotE provides (i) a robust analysis of high-throughput sequence sets, (ii) a motif search based on pattern occurrences and (iii) an easy-to-use web-server interface. We applied our method to recently published data including 351 chromatin immunoprecipitation (ChIP) and 13 crosslinking immunoprecipitation (CLIP) experiments and compared our results with those of other well-established motif discovery tools. SeAMotE shows an average accuracy of 80% in finding discriminative motifs and outperforms other methods available in the literature.

Conclusions: SeAMotE is a fast, accurate and flexible algorithm for the identification of sequence patterns involved in protein-DNA and protein-RNA recognition. The server can be freely accessed at http://s.tartaglialab.com/new_submission/seamote.
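
The abstract mentions a discriminative motif search based on pattern occurrences. As a rough illustration of that general idea only (not the SeAMotE algorithm, whose details are not given in this record), the sketch below ranks fixed-length k-mers by how much more often they occur in a set of bound sequences than in a background set. The function names, the coverage-difference score and the toy sequences are all assumptions made for the example.

```python
from collections import Counter


def kmer_coverage(sequences, k):
    """Return the fraction of sequences that contain each k-mer at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    total = len(sequences)
    return {kmer: n / total for kmer, n in counts.items()}


def discriminative_kmers(positive, background, k, top=5):
    """Rank k-mers by coverage in the positive set minus coverage in the background."""
    pos = kmer_coverage(positive, k)
    neg = kmer_coverage(background, k)
    scored = [(kmer, cov - neg.get(kmer, 0.0)) for kmer, cov in pos.items()]
    # Sort by descending score; break ties alphabetically for a stable order.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]


# Toy data invented for illustration: "TGACT" is planted in every bound sequence.
bound = ["AATGACTCC", "GGTGACTAA", "CTGACTGTT"]
unbound = ["AACCGGTTA", "GGTTAACCA", "CCAATTGGC"]
ranked = discriminative_kmers(bound, unbound, k=5, top=3)
print(ranked[0])
```

A real tool would also need to handle degenerate (IUPAC) patterns, reverse complements for DNA, and a statistical test for enrichment; the simple coverage difference here only conveys the notion of a pattern that separates two sequence sets.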
