The Prediction of Human Genes in DNA Based on a Generalized Hidden Markov Model
  • Keywords: Gene prediction; WWAM; IMM; GHMM; Prefix sum arrays; Method based on similarity weighting of sequence patterns
  • Journal: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9967
  • Issue: 1
  • Pages: 747-755
  • Full-text size: 918 KB
  • Author affiliations: Rui Guo (21)
    Ke Yan (21)
    Wei He (21)
    Jian Zhang (21)

    21. Bio-Computing Research Center, Shenzhen Graduate School, Harbin Institute of Technology, Shenzhen, China
  • Series title: Biometric Recognition
  • ISBN: 978-3-319-46654-5
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
  • Volume sort order: 9967
Abstract
The Generalized Hidden Markov Model (GHMM) has proven to be an excellent general probabilistic model of gene structure in human genomic sequences. It can simultaneously incorporate signal descriptions, such as splice sites, and content descriptions, such as the compositional features of exons and introns. Taking advantage of its flexibility and sound probabilistic underpinnings, we integrate several modified submodels and implement a program for predicting human genes in DNA. The program can predict multiple genes in a sequence, handle partial as well as complete genes, and predict consistent sets of genes occurring on either or both DNA strands. More importantly, it also performs well on longer sequences containing an unknown number of genes. Experimental results show that the proposed method achieves better prediction accuracy than some existing methods, and over 70% of exons are identified exactly.
