Abstract
Motif discovery is one of the core problems in bioinformatics and is of great biological significance for studying the regulatory mechanisms of gene expression. Planted (l,d) motif search (PMS) is a widely accepted problem model in the field of motif discovery. This paper studies four basic exact algorithms for the motif discovery problem; these algorithms can help readers understand the problem. The four exact algorithms are: (1) an implementation that solves the site-alignment PMS problem by depth-first search with pruning over candidate motif-instance strings; (2) an implementation that solves the site-alignment PMS problem by depth-first search with pruning over candidate motif characters; (3) an implementation that solves the site-alignment PMS problem by breadth-first search with pruning over candidate motif characters; (4) an implementation of the PMSP algorithm.
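To make the second approach concrete (depth-first search over candidate motif characters with pruning), a minimal sketch is given below. It is not the implementation studied in the paper: the function name `planted_motif_dfs`, the DNA alphabet `ACGT`, and the per-sequence dictionary of surviving start positions are assumptions chosen for illustration. The sketch assumes the usual PMS formulation, in which an l-mer is a motif if every input sequence contains some l-mer within Hamming distance d of it.

```python
from typing import Dict, List

ALPHABET = "ACGT"  # assumed DNA alphabet


def planted_motif_dfs(seqs: List[str], l: int, d: int) -> List[str]:
    """Return every l-mer that occurs in each sequence with at most d mismatches."""
    found: List[str] = []
    # For each sequence: start position -> mismatches between the current
    # motif prefix and the sequence window beginning at that position.
    initial = [{p: 0 for p in range(len(s) - l + 1)} for s in seqs]

    def extend(prefix: str, state: List[Dict[int, int]]) -> None:
        k = len(prefix)
        if k == l:
            found.append(prefix)  # every sequence still has a valid site
            return
        for c in ALPHABET:
            next_state = []
            for s, candidates in zip(seqs, state):
                kept = {}
                for p, mismatches in candidates.items():
                    m = mismatches + (s[p + k] != c)
                    if m <= d:
                        kept[p] = m
                if not kept:
                    break  # prune: this sequence has no site left for prefix + c
                next_state.append(kept)
            else:
                extend(prefix + c, next_state)

    extend("", initial)
    return found
```

For example, calling `planted_motif_dfs(["ACGTACGT", "TTACGATT", "GGACGTGG"], 4, 1)` returns a list that includes `ACGT`, since the second sequence contains `ACGA`, one mismatch away. Pruning on the prefix means a branch is abandoned as soon as any one sequence can no longer host an instance, which is what keeps the search well below the full 4^l enumeration in practice.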