SOAPfuse: an algorithm for identifying fusion transcripts from paired-end RNA-Seq data
  • Authors: Wenlong Jia (1) (2)
    Kunlong Qiu (1) (2)
    Minghui He (1) (2)
    Pengfei Song (2)
    Quan Zhou (1) (2) (3)
    Feng Zhou (2) (4)
    Yuan Yu (2)
    Dandan Zhu (2)
    Michael L Nickerson (5)
    Shengqing Wan (1) (2)
    Xiangke Liao (6)
    Xiaoqian Zhu (6) (7)
    Shaoliang Peng (6) (7)
    Yingrui Li (1) (2)
    Jun Wang (1) (2) (8) (9)
    Guangwu Guo (1) (2)
  • Journal: Genome Biology
  • Publication year: 2013
  • Publication date: February 2013
  • Volume: 14
  • Issue: 2
  • Full-text size: 457 KB
  • Author affiliations:
    1. BGI Tech Solutions Co., Ltd, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
    2. BGI-Shenzhen, Beishan Industrial Zone, Yantian District, Shenzhen, 518083, China
    3. School of Life Science and Technology, University of Electronic Science and Technology of China, No.4, Section 2, North Jianshe Road, Chengdu, 610054, China
    4. School of Bioscience and Bioengineering, South China University of Technology, Guangzhou Higher Education Mega Centre, Panyu District, Guangzhou, 510006, China
    5. Cancer and Inflammation Program, National Cancer Institute, National Institutes of Health, 1050 Boyles Street, Frederick, MD 21702, USA
    6. School of Computer Science, National University of Defense Technology, No.47, Yanwachi street, Kaifu District, Changsha, Hunan, 410073, China
    7. State Key Laboratory of High Performance Computing, National University of Defense Technology, No.47, Yanwachi street, Kaifu District, Changsha, Hunan, 410073, China
    8. The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, DK-1165, Copenhagen, Denmark
    9. Department of Biology, University of Copenhagen, DK-1165, Copenhagen, Denmark
  • ISSN:1465-6906
Abstract
We have developed a new method, SOAPfuse, to identify fusion transcripts from paired-end RNA-Seq data. SOAPfuse applies an improved partial exhaustion algorithm to construct a library of fusion junction sequences, which can be used to efficiently identify fusion events, and employs a series of filters to nominate high-confidence fusion transcripts. Compared with other released tools, SOAPfuse achieves higher detection efficiency and consumes fewer computing resources. We applied SOAPfuse to RNA-Seq data from two bladder cancer cell lines, and confirmed 15 fusion transcripts, including several novel events common to both cell lines. SOAPfuse is available at http://soap.genomics.org.cn/soapfuse.html.
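
The abstract does not spell out how the junction library is built, so the following Python sketch is only one plausible reading of the idea, not SOAPfuse's actual implementation. It assumes that candidate breakpoints have already been narrowed to a small window in each gene, enumerates every breakpoint pair inside those windows (exhaustive only within the restricted regions), and then looks for reads that span a junction. The function names, the `flank` and `min_anchor` parameters, and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Tuple

Breakpoint = Tuple[int, int]


def build_junction_library(
    up_seq: str,
    down_seq: str,
    up_region: Tuple[int, int],
    down_region: Tuple[int, int],
    flank: int,
) -> Dict[Breakpoint, str]:
    """Enumerate junction sequences for breakpoints inside the given windows.

    A breakpoint (i, j) keeps up_seq[:i] and joins it to down_seq[j:].
    Only `flank` bases on each side of the junction are stored.
    """
    library: Dict[Breakpoint, str] = {}
    for i in range(up_region[0], up_region[1] + 1):
        if i < flank or i > len(up_seq):
            continue  # not enough upstream sequence for a full flank
        left = up_seq[i - flank:i]
        for j in range(down_region[0], down_region[1] + 1):
            if j < 0 or j + flank > len(down_seq):
                continue  # not enough downstream sequence for a full flank
            library[(i, j)] = left + down_seq[j:j + flank]
    return library


def find_spanning_reads(
    reads: List[str],
    library: Dict[Breakpoint, str],
    flank: int,
    min_anchor: int,
) -> Dict[Breakpoint, List[str]]:
    """Report reads that match a junction with enough bases on both sides."""
    hits: Dict[Breakpoint, List[str]] = {}
    for read in reads:
        for breakpoint, junction in library.items():
            pos = junction.find(read)  # exact match only; first occurrence
            if pos == -1:
                continue
            left_overlap = flank - pos
            right_overlap = pos + len(read) - flank
            if left_overlap >= min_anchor and right_overlap >= min_anchor:
                hits.setdefault(breakpoint, []).append(read)
    return hits


# Toy example (made-up sequences): the first read crosses breakpoint (8, 4).
up_gene = "ATGCGTACCTGA"
down_gene = "TTCAGGCTAAGC"
library = build_junction_library(up_gene, down_gene, (6, 10), (2, 6), flank=6)
hits = find_spanning_reads(["GTACGGCT", "ATGCGTAC"], library, flank=6, min_anchor=3)
```

Restricting enumeration to the two windows keeps the library small (25 junctions here instead of every position pair), which is the general motivation for limiting the search space. A real tool would use an aligner that tolerates mismatches rather than exact substring search, and would apply further filters before reporting a fusion.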
