Mining and Analysis of Mammalian Transcription Factors and Their Target Genes
Abstract
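Transcription factors are core functional proteins in transcriptional regulation. They bind to cis-regulatory elements and tightly control the expression of downstream genes, and they play an indispensable role in many important biochemical processes of living organisms. Given their importance in transcriptional regulation, the identification of transcription factors and their downstream target genes has become one of the research hotspots of the post-genomic era.
     Traditionally, biologists have relied mainly on experimental methods to identify transcription factors and their downstream target genes. Experimental methods yield fairly accurate data, but their long experimental cycles mean they cannot supply abundant transcriptional regulation data in a short time. In recent years, biologists have begun to introduce computational methods to speed up research on transcriptional regulation, with work concentrated on identifying transcription factors and building models of cis-regulatory elements. For transcription factor identification, the tools are mainly built with machine learning algorithms. Methods based on the BLAST algorithm and the nearest neighbor algorithm have been established, but they do not perform very well on mammals. For cis-regulatory elements, biologists have tried to build models from various indicators to characterize the recognition preference between transcription factors and cis-regulatory elements, but the binding rules are complex and are still being explored.
     This thesis uses protein domain and functional site information to form feature vectors for protein sequences, and on that basis builds an automatic transcription factor detector based on the support vector machine algorithm. Support vector machines are then coupled with the error-correcting output coding algorithm to build an automatic transcription factor classifier. Both tools were tested on experimentally validated data; the results show excellent performance, with accuracies of 88.22% for identification and 97.83% for classification. To evaluate the tools further, the detector and classifier were compared with transcription factor identification and classification tools built on BLAST and the nearest neighbor algorithm; the comparison shows that the detector and classifier identify and classify transcription factors better than those tools. The detector and classifier were then applied to protein sequences from the human, mouse, and rat genomes, yielding a large number of putative transcription factors.
     Building on the identification work, and in order to obtain the downstream target genes of transcription factors, the reverse-engineering approach was introduced and a tool for mining transcription factor-target gene pairs was developed. The tool was then applied to human, mouse, and rat gene expression data, yielding abundant information on downstream target genes. Fisher's exact test was used to check the reliability of this information; the results show that, to a certain extent, the mining tool is effective and the target gene information obtained is credible.
     To further study how transcription factors regulate their downstream target genes, a preliminary exploration of the binding rules between transcription factors and cis-regulatory elements was carried out: by integrating multiple biological indicators, a combined descriptive model of cis-regulatory elements was built with the decision tree algorithm. The combined model was tested on multiple sets of transcription factor-cis-regulatory element interaction data from the human, mouse, and rat genomes; the results show that it describes the recognition preference between transcription factors and cis-regulatory elements well, and thereby helps explain how the two bind.
     On the basis of the above work, and to make the mined transcription factor and target gene information convenient for biologists to use, a comprehensive analysis platform for mammalian transcription factors was built. The platform contains abundant transcriptional regulation data and also provides a convenient online tool for predicting transcription factors. The platform will become an important resource for the transcriptional regulation field and will give strong support to research in related areas.
     This thesis mined and analyzed mammalian transcription factors and their target genes, effectively addressing the current shortage of accumulated transcriptional regulation data for mammals. On this basis, a preliminary study of the binding rules between transcription factors and cis-regulatory elements was carried out, improving the understanding of transcriptional regulation mechanisms at the molecular level. We believe that a panoramic study of transcription factors will help people interpret genomic information at the systems level.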
A transcription factor (TF) is a core functional protein of transcriptional regulation. It controls the expression level of downstream genes (TF targets) by interacting with cis-regulatory elements (CREs), and plays a significant role in vital biological processes of an organism. Because of this importance to transcription, the investigation of TFs and their targets has become a hot research area in the post-genome era.
     Traditionally, biologists have used experimental approaches to investigate TFs and their targets. Experimental approaches yield accurate information about transcriptional regulation, but they are time-consuming and cannot provide abundant information in a short time. Biologists have therefore recently begun to explore transcriptional regulation with computational methods, and most of this work focuses on TF identification and CRE modeling. For TF identification, machine learning algorithms are generally used to build analysis tools. Identification methods based on BLAST and the nearest neighbour algorithm (NNA) have been built, but their performance is not satisfactory when applied to mammals. For CRE modeling, biologists try to describe the binding preference between TF and CRE by constructing models from various features. CRE modeling is nevertheless still in progress because the interaction mechanism between TF and CRE is complicated.
     In our work, the support vector machine (SVM) algorithm was used to construct an automatic detector for TF identification, with protein domains and functional sites forming the feature vectors. A TF classifier was then built by combining the error-correcting output coding (ECOC) algorithm with SVM. Datasets validated by biological experiments were used to test the detector and the classifier. The tests showed that both tools perform well for TF analysis: the overall success rates of TF identification and classification reached 88.22% and 97.83%, respectively. To evaluate the tools further, we compared them with tools built from BLAST and NNA; our tools were superior to both for TF analysis. The detector and classifier were then used to analyse protein sequences of human, mouse, and rat, yielding a large number of putative TFs.
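     As a rough illustration of the two ideas in this paragraph, the sketch below encodes a protein as a binary domain-composition vector and decodes the outputs of several binary classifiers with an error-correcting code. It is a minimal sketch under stated assumptions, not the tools described in the thesis: the domain labels, the class names, and the 5-bit code matrix are all hypothetical, and the binary SVMs themselves are left out (their 0/1 outputs are passed in directly).

```python
from typing import Dict, List, Sequence

# Illustrative domain labels only (assumption); the actual feature set
# used in the thesis is not listed in the abstract.
DOMAIN_VOCAB: List[str] = ["homeobox", "bHLH", "bZIP", "zf-C2H2"]


def domain_vector(hits: Sequence[str],
                  vocab: Sequence[str] = DOMAIN_VOCAB) -> List[int]:
    """Encode a protein as a binary vector: 1 if the domain was found, else 0."""
    present = set(hits)
    return [1 if name in present else 0 for name in vocab]


# Hypothetical ECOC code matrix: one 5-bit codeword per TF class, one bit per
# binary classifier. Any two codewords differ in at least 3 bits, so a single
# wrong classifier output can still be decoded to the right class.
CODEBOOK: Dict[str, List[int]] = {
    "class_A": [0, 0, 0, 0, 0],
    "class_B": [1, 1, 1, 0, 0],
    "class_C": [0, 0, 1, 1, 1],
    "class_D": [1, 1, 0, 1, 1],
}


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(bits: Sequence[int],
                codebook: Dict[str, List[int]] = CODEBOOK) -> str:
    """Return the class whose codeword is closest to the observed bits."""
    return min(codebook, key=lambda label: hamming(codebook[label], bits))


if __name__ == "__main__":
    print(domain_vector(["bHLH", "zf-C2H2"]))  # feature vector for one protein
    print(ecoc_decode([1, 0, 1, 0, 0]))        # class_B codeword with one bit flipped
```

     In this toy setting the second call still returns `class_B` even though one of the five binary outputs is wrong, which is the property that makes ECOC attractive for multi-class problems built from binary SVMs.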
     Subsequently, a mining tool for TF-target pairs was developed on the basis of reverse engineering, so as to obtain the genes regulated by TFs. The mining tool was used to analyse microarray data of human, mouse, and rat, and a large number of TF-target pairs were obtained. Fisher's exact test was carried out to assess the reliability of these TF-target pairs. The test results indicated that the approach used here to predict TF-target pairs was valid, and that the inferred information on the downstream genes of TFs was credible to some extent.
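     The abstract does not say how the contingency table for Fisher's exact test was built, so the following is only a sketch of the test itself. It assumes a hypothetical 2x2 table in which `a` counts predicted targets that are also supported by independent evidence, and it computes the one-sided (enrichment) p-value from the hypergeometric distribution; the function name and the table layout are illustrative choices, not taken from the thesis.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the top-left cell under the null
    hypothesis of independence with all row and column totals fixed.
    """
    row1 = a + b          # total of the first row
    col1 = a + c          # total of the first column
    n = a + b + c + d     # grand total
    upper = min(row1, col1)
    # Sum hypergeometric probabilities for every table at least as extreme.
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


if __name__ == "__main__":
    # Hypothetical counts: rows = predicted / not predicted,
    # columns = supported / not supported by independent evidence.
    print(fisher_exact_greater(3, 1, 1, 3))
```

     A small p-value would indicate that predicted pairs overlap with the independent evidence more often than expected by chance, which is one way to read the statement that the predictions are credible to some extent.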
     To further explore how TFs regulate their targets, we investigated the interaction mechanism between TF and CRE. A combinational model of CRE was constructed with a decision tree by assembling several biological features. Many TF-CRE interaction pairs from human, mouse, and rat were then used to estimate the performance of the combinational model. The results showed that the model describes the binding preference and interaction mechanism between TF and CRE well.
     Finally, an integrated TF platform was built so that biologists can conveniently use the information on TFs and their targets acquired in our work. The platform contains abundant transcriptional regulation data and also provides a prediction tool for TFs. We believe that the platform will serve as an important resource for the community of transcription researchers and provide strong support for the exploration of transcriptional regulation.
     Currently, data on transcriptional regulation in mammals are far from sufficient. To address this problem, we mined and presented a great deal of information about TFs and their targets in human, mouse, and rat. Moreover, we investigated the binding characteristics between TF and CRE, which will increase knowledge of the transcriptional regulation mechanism. In summary, we think this comprehensive study of TFs will help people interpret genome information at the systems level.
