A Genetic Algorithm for Motif Finding Based on Statistical Significance

Authors: Josep Basha Gutierrez (20)
Martin Frith (21)
Kenta Nakai (22)

20. Department of Medical Bioinformatics, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba-ken 277-8561, Japan
21. Computational Biology Research Center, AIST Tokyo Waterfront Bio-IT Research Building, 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
22. Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo 108-8639, Japan
Keywords: Motif finding; Genetic algorithm; Transcription factor binding site; Statistical significance
Publication title: Lecture Notes in Computer Science
Publication year: 2015
Volume: 9043
Issue: 1
Pages: 438-449
Book title: Bioinformatics and Biomedical Engineering
ISBN: 978-3-319-16482-3
Publication category: Computer Science
Publication subjects: Artificial Intelligence and Robotics
Computer Communication Networks
Software Engineering
Data Encryption
Database Management
Computation by Abstract Devices
Algorithm Analysis and Problem Complexity
Publisher: Springer Berlin / Heidelberg
ISSN: 1611-3349

Abstract

Understanding transcriptional regulation through the discovery of transcription factor binding sites (TFBS) is a fundamental problem in molecular biology research. Here we propose a new computational method for motif discovery that combines a genetic algorithm structure with several statistical coefficients. The algorithm was tested on 56 data sets from four different species. The motifs obtained were compared with the known motifs for each data set, and the prediction accuracy was compared with that of 14 other methods at both the nucleotide and site levels. Although the results did not stand out in the detection of false positives, the algorithm performed remarkably well in most cases in sensitivity and in overall performance at the site level, generally outperforming the other methods on these statistics. This suggests that the algorithm can be a useful tool for predicting motifs in different kinds of DNA sequence sets.
