Computational Analysis of Gene Expression and Its Transcriptional Regulatory Mechanisms
Abstract
The wide application of high-throughput experimental techniques has produced large amounts of gene expression data, which reflect the dynamic expression levels of genes under different temporal and spatial conditions. Analyzing these data with computational methods to extract their intrinsic expression patterns and features has become an important way to understand gene function and the related mechanisms, and it is a central and active topic in current bioinformatics research. Focusing on gene expression and its transcriptional regulatory mechanisms, this dissertation studies missing value imputation for gene expression data, class discovery of tumor samples, clustering of gene expression data, and prediction of transcription factor activity together with the analysis of transcriptional regulation. The main results and contributions are summarized as follows:
     1) A novel missing value imputation algorithm is proposed for time-series gene expression data. Missing value imputation is an important preprocessing step in expression data analysis. By analyzing the temporal specificity of co-regulation/co-expression during gene expression, and by extracting both the expression-level association and the expression-trend association in time-series expression data, the dissertation proposes a new imputation algorithm. The algorithm uses a time window to capture the temporal specificity of co-regulation/co-expression, combines the level and trend associations to score the similarity between gene expression profiles, and uses these similarity scores as weights to impute missing values from the neighboring gene expression profiles. Numerical experiments on several data sets compare the algorithm with existing imputation algorithms, and the results confirm its effectiveness and accuracy.
     2) Nonlinear dimensionality reduction is applied to class discovery of tumor samples. Extracting the intrinsic features of tumor gene expression data provides a basis for accurate class discovery and also helps to improve the molecular-level understanding of how tumors arise and evolve. However, gene expression data are high-dimensional with few samples, which limits the performance of feature extraction and clustering algorithms designed for low-dimensional data. To address this problem, the dissertation uses nonlinear dimensionality reduction to reduce or remove the effect of noise and other interference, and performs class discovery in the reduced data space. The work focuses on parameter selection for the nonlinear dimensionality reduction algorithm and compares how linear and nonlinear dimensionality reduction affect the performance of tumor class discovery algorithms. Results from several numerical experiments show that nonlinear dimensionality reduction better captures the intrinsic nonlinear structure of tumor gene expression data and improves the accuracy of class discovery.
     3) A novel fuzzy clustering algorithm is designed by combining gene expression data with functional annotation data. Cluster analysis helps to extract the functional patterns hidden in gene expression data and is an important means of understanding expression data and identifying gene function. The dissertation combines expression profiles with the semantic information in functional annotations to assess the similarity between genes, designs a new fuzzy clustering algorithm, and proposes an initialization strategy for the algorithm based on the functional annotation data. Numerical experiments show that the algorithm can accurately predict gene function and identify the multiple functional categories to which a gene belongs, producing clustering results that are more biologically meaningful.
     4) A novel method is proposed for predicting transcription factor activity. Transcription factors are important regulators of gene transcription, and predicting their activity helps to further understand the transcriptional regulatory mechanisms of gene expression. Based on a nonlinear partial least-squares regression model, the dissertation combines gene expression data with ChIP-chip data to design a new method for predicting transcription factor activity. Numerical experiments demonstrate the validity of the method. The method is also used to predict the activities of 11 transcription factors known to be involved in regulating the yeast (S. cerevisiae) cell cycle; the periodicity of these activities and the correlations among them are then analyzed, and a transcriptional regulatory network for the yeast cell cycle is constructed.
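
The imputation idea in contribution 1) can be illustrated with a minimal sketch. The abstract does not give the actual similarity formulas, so everything concrete here is an assumption made for illustration only: the level term (`1 / (1 + mean absolute difference)`), the trend term (fraction of consecutive steps whose direction agrees), the mixing weight `alpha`, and the names `window_similarity`, `impute`, `half_width`, and `k`. It is not the dissertation's algorithm, only one way a time window, a combined level/trend similarity, and similarity-weighted neighbors might fit together.

```python
from typing import List, Optional

# A time-series expression profile; None marks a missing value.
Profile = List[Optional[float]]


def _sign(x: float) -> int:
    return (x > 0) - (x < 0)


def window_similarity(a: Profile, b: Profile, t: int, half_width: int,
                      alpha: float = 0.5) -> float:
    """Similarity of two profiles inside a time window centred on t.

    Level term: 1 / (1 + mean absolute difference)      (assumed form).
    Trend term: share of consecutive steps whose direction agrees (assumed form).
    alpha mixes the two terms and is a hypothetical parameter.
    """
    lo, hi = max(0, t - half_width), min(len(a) - 1, t + half_width)
    # Time points in the window, other than t, observed in both profiles.
    shared = [i for i in range(lo, hi + 1)
              if i != t and a[i] is not None and b[i] is not None]
    if len(shared) < 2:
        return 0.0
    level = 1.0 / (1.0 + sum(abs(a[i] - b[i]) for i in shared) / len(shared))
    steps = list(zip(shared, shared[1:]))
    agree = sum(_sign(a[j] - a[i]) == _sign(b[j] - b[i]) for i, j in steps)
    trend = agree / len(steps)
    return alpha * level + (1.0 - alpha) * trend


def impute(profiles: List[Profile], gene: int, t: int,
           half_width: int = 2, k: int = 2) -> Optional[float]:
    """Estimate profiles[gene][t] as the similarity-weighted average of the
    k most similar profiles that are observed at time t."""
    scored = []
    for h, other in enumerate(profiles):
        if h == gene or other[t] is None:
            continue
        s = window_similarity(profiles[gene], other, t, half_width)
        if s > 0.0:
            scored.append((s, other[t]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    top = scored[:k]
    if not top:
        return None  # no usable neighbour inside the window
    total = sum(s for s, _ in top)
    return sum(s * v for s, v in top) / total


if __name__ == "__main__":
    data = [
        [1.0, 2.0, None, 4.0, 5.0],   # target gene, missing at t = 2
        [1.0, 2.0, 3.0, 4.0, 5.0],    # same level and same trend
        [5.0, 4.0, 3.5, 2.0, 1.0],    # opposite trend
    ]
    print(impute(data, gene=0, t=2, half_width=2, k=2))
```

Restricting the comparison to a window around the missing time point means that two genes only need to behave alike locally to count as neighbors, which is one way to reflect the time-specific co-regulation the abstract describes. In this sketch the profile with the opposite trend still contributes, but with a much smaller weight.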
引文
[1] Watson, J.D. and Jordan, E. The Human Genome Program at the National Institutes of Health[J]. Genomics, 1989, 5(3):654-6.
    [2] Pasteris, N.G., Bialecki, M.D. and Gorski, J.L. YAC subclone contig assembly by serial interspersed repetitive sequence (IRS)-PCR product hybridizations[J]. Nucleic Acids Res, 1993, 21(22):5275-6. [3JShe, X., Jiang, Z., Clark, R.A., et al. Shotgun sequence assembly and recent segmental duplications within the human genome[J]. Nature, 2004,431(7011):927-30.
    [4] Brenner, S.E. BLAST, Blitz, BLOCKS and BEAUTY: sequence comparison on the net[J]. Trends Genet, 1995, ll(8):330-l.
    [5] McGinnis, S. and Madden, T.L. BLAST: at the core of a powerful and diverse set of sequence analysis tools[J]. Nucleic Acids Res, 2004, 32(Wcb Server issue):W20-5.
    [6] Agarwal, P. and States, D.J. Comparative accuracy of methods for protein sequence similarity search[J]. Bioinformatics, 1998, 14(l):40-7.
    [7] Lefevre, C. and lkeda, J.E. A fast word search algorithm for the representation of sequence similarity in genomic DNA[J]. Nucleic Acids Res, 1994, 22(3):404-l 1.
    [8] Kim, D.S., Huh, J.W. and Kim, H.S. Transposable elements in human cancers by genome-wide EST alignment[J]. Genes Genet Syst, 2007, 82(2): 145-56.
    [9] Rubin, G.M., Yandell, M.D., Wortman, J.R., et al. Comparative genomics of the eukaryotes[J]. Science, 2000, 287(5461):2204-15.
    [10] Dacks, J.B. and Doolittle, W.F. Reconstructing/deconstructing the earliest eukaryotes: how comparative genomics can help[J]. Cell, 2001, 107(4):419-25.
    [11] Crameri, A., Raillard, S.A., Bermudez, E., et al. DNA shuffling of a family of genes from diverse species accelerates directed evolution[J]. Nature, 1998, 391(6664):288-91.
    [12] Schuler, G.D., Boguski, M.S., Stewart, E.A., et al. A gene map of the human genome[J]. Science, 1996, 274(5287):540-6.
    [13] Frederick R. Blattner, G.P.I., Craig A. Bloch, Nicole T. Perna, Valerie Burland, Monica Riley, Julio Collado-Vides, Jeremy D. Glasner, Christopher K. Rode, George F. Mayhew, Jason Gregor, Nelson Wayne Davis, Heather A. Kirkpatrick, Michael A. Goeden, Debra J. Rose, Bob Mau, Ying Shao. The Complete Genome Sequence of Escherichiacoli K-12[J]. Science, 1997, 277(5331): 1453-1462.
    [14] Venter, J.C., Adams, M.D., Myers, E.W., et al. The sequence of the human genome[J]. Science, 2001, 291 (5507): 1304-51.
    [15] Worley, K.C., Wiese, B.A. and Smith, R.F. BEAUTY: an enhanced BLAST-based search tool that integrates multiple biological information resources into sequence similarity search results[J], Genome Res. 1995,5(2):173-84.
    [16] Keim, P., Heinrikson, R.L. and Fitch, W.M. An examination of the expecteddegree of sequence similarity that might arise in proteins that have converged to similar conformational states. The impact of such expectations on the search for homo logy between the structurally similar domains of rhodanese[J], J Mol Biol, 1981, 151(l):179-97.
    [17] Smith, T.F. and Waterman, M.S. Identification of common molecular subsequences[J].J Mol Biol, 1981, 147(1): 195-7.
    [18] Corel, E., Pitschi, F. and Morgenstern, B. A min-cut algorithm for the consistency problem in multiple sequence alignment[J]. Bio informatics, 2010, 26(8):1015-21.
    [19] Lenhof, H.P., Morgenstern, B. and Reinert, K. An exact solution for the segment-to-segment multiple sequence alignment problem[J], Bioinformatics.1999, 15(3):203-10.
    [20] Pennisi, E. Ideas flyat gene-finding jamboree[J], Science, 2000.287(5461):2182-4.
    [21] Cawley, S.L. and Pachter, L. HMM sampling and applications to gene finding and alternative splicing[J]. Bioinformatics, 2003, 19 Suppl 2:ii36-41.
    [22] Deutsch, J.M. Evolutionary algorithms for finding optimal gene sets in microarray prediction[J]. Bioinformatics, 2003, 19(l):45-52.
    [23] Murakami, K. and Takagi, T. Gene recognition by combination of several gene-finding programs[J]. Bioinformatics, 1998, 14(8):665-75.
    [24] Choi, K. and Gomez, S.M. Comparison of phylogenetic trees through alignment of embedded evolutionary distances[J], BMC Bioinformatics, 2009, 10:423.
    [25] Chor, B. and Tuller, T. Maximum likelihood of evolutionary trees: hardness and approximation[J]. Bioinformatics, 2005, 21 Suppl l:i97-106.
    [26] Purdom, P.W., Jr., Bradford, P.G., Tamura, K., et al. Single column discrepancy and dynamic max-mini optimizations for quickly finding the most parsimonious evolutionary trees[J], Bioinformatics, 2000. 16(2):140-51.
    [27] Davies, K. Cloning the Menkes disease gene[J]. Nature, 1993, 361(6407):98.
    [28] Ueda, H., Howson, J.M., Esposito, L., et al. Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease[J]. Nature, 2003, 423(6939):506-ll.
    [29] Emilsson, V., Thorleifsson, G., Zhang, B., et al. Genetics of gene expression and its effect on disease[J]. Nature, 2008, 452(7186):423-8.
    [30] Sleegers, K. and Van Broeckhoven, C. Motor-neuron disease: Rogue gene in the family[J], Nature. 2009, 458(7237):415-7.
    [31] Fattore, M. and Arrigo, P. Knowledge discovery and system biology in molecular medicine: an application on neurodegenerative diseases[J], Tn Silico Biol, 2005, 5(2): 199-208.
    [32] Baitaluk, M. System biology of gene regulation[J]. Methods Mol Biol, 2009, 569:55-87.
    [33] offeau, A. DNA technology: Molecular fish on chips[J]. Nature. 1997, 385(6613):202-3.
    [34] Abbott. A. DNA chips intensify the sequence search[J]. Nature, 1996, 379(6564):392.
    [35] Ramsay, G. DNA chips: state-of-the art[J]. Nat Biotechnol, 1998, 16(l):40-4.
    [36] Marshall, A. and Hodgson, J. DNA chips: an array of possibilities^]. Nat Biotechnol, 1998, 16(1):27-31.
    [37] http://www.imaxia.com/.
    [38] http://www.moleculardevices.com,
    [39] http://www.genespotter.de/.
    [40] Rebecka Jornsten, H.-Y.W., William J. Welsh and Ming Ouyang. DNA microarray data imputation and significance analysis of differential expression[J]. Bioinformatics. 2005, 21(20):4155-4161.
    [41] Zhang, M., Yao, C, Guo, Z., et al. Apparently low reproducibility of true differential expression discoveries in microarray studies[J]. Bioinformatics, 2008. 24(18):2057-63.
    [42] Zhang, M., Zhang, L., Zou, J., et al. Evaluating reproducibility of differential expression discoveries in microarray studies by considering correlated molecular changes[j]. Bioinformatics, 2009, 25(13):1662-8.
    [43] Golub, T.R., Slonim, D.K., Tamayo, P., et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring[J]. Science, 1999. 286(5439):531-7.
    [44] Bittner, M., Meltzer, P.. Chen, Y., et al. Molecular classification of cutaneous malignant melanoma by gene expression profiling[J]. Nature, 2000, 406(6795):536-40.
    [45] Nguyen, D.V. and Rocke, D.M. Tumor classification by partial least squares using microarray gene expression data[J]. Bioinformatics, 2002, 18(l):39-50.
    [46] Jornsten, R. and Yu, B. Simultaneous gene clustering and subset selection for sample classification via MDL[J]. Bioinformatics, 2003, 19(9): 1100-9.
    [47] Wang, J., Bo, T.H., Jonassen. I., et al. Tumor classification and marker gene prediction by feature selection and fuzzy c-means clustering using microarray data[J]. BMC Bioinformatics, 2003, 4:60.
    [48] Alon, U., Barkai, N., Notterman. D.A., et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays[J]. ProcNatl Acad SciU S A. 1999, 96(12.):6745-50.
    [49] Yeung, K.Y., Fraley, C, Murua, A., et al. Model-based clustering and data transformations for gene expression data[J]. Bioinformatics, 2001, 17(lO):977-87.
    [50] Gasch, A.P. and Eisen, M.B. Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering[J]. Genome Bio I, 2002. 3(ll):RESEARCH0059.
    [51] Ye, J., Pavlicek, A., Lunney, E.A., et al. Statistical method on nonrandom clustering with application to somatic mutations in cancer[J], BMC Bioinformatics, 2010, 11(1):11.
    [52] Tavazoie, S., Hughes, J.D., Campbell, M.J., et al. Systematic determination of genetic network architecture [J]. Nat Genet, 1999, 22(3):281-5.
    [53] Zhao, W., Serpedin, E. and Dougherty, E.R. Inferring gene regulatory networks from time series data using the minimum description length principle[J]. Bioinformatics, 2006, 22(17):2129-35.
    [54] Yu, J., Smith, V.A., Wang, P.P., et al. Advances to Bayesian network inference for generating causal networks from observational biological data[J]. Bioinformatics, 2004, 20(18):3594-603.
    [55] Missal, K., Cross, M.A. and Drasdo, D. Gene network inference from incomplete expression data: transcriptional control of hematopoietic commitment [J]. Bioinformatics, 2006, 22(6):731-8.
    [56] D'Haeseleer, P., Liang, S. and Somogyi, R. Genetic network inference: from co-expression clustering to reverse engineering[J]. Bioinformatics, 2000, 16(8):707-26.
    [57] Yoon, D., Yi, S.G., Kim, J.H., et al. Two-stage normalization using background intensities in cDNA microarray data[J]. BMC Bioinformatics, 2004, 5:97.
    [58] Bengtsson, A. and Bengtsson, H. Microarray image analysis: background estimation using quantile and morphological filters[J]. BMC Bioinformatics. 2006, 7:96.
    [59]李瑶.基因芯片数据分析与处理[M].北京:化学工业出版社,现代生物技术与医药科技出版中心,2006.
    [60] Troyanskaya, O., Cantor, M, Sherlock, G., et al. Missing value estimation methods for DNA microarrays[J]. Bioinformatics, 2001, 17(6):520-5.
    [61] Kim, H., Golub, G.H. and Park, H. Missing value estimation for DNA microarray gene expression data: local least squares imputation[J]. Bioinformatics, 2005, 21(2):187-98.
    [62] Tuikkala, J., Elo, L., Nevalainen, O.S., et al. Improving missing value estimation in microarray data with gene ontology[J]. Bioinformatics, 2006, 22(5):566-72.
    [63] Choong, M.K., Charbit, M. and Yan, H. Autoregressive-model-based missing value estimation for DNA microarray time series data[J]. IEEE Trans Inf Technol Biomed, 2009. 13(l):131-7.
    [64] Zhao, Y., Li, M.C. and Simon, R. An adaptive method for cDNA microarray normalization[J]. BMC Bioinformatics, 2005, 6:28.
    [65] Baird, D., Johnstone, P. and Wilson, T. Normalization of microarray data using aspatial mixed model analysis which includes splines[J]. Bioinformatics, 2004, 20(17):3196-205.
    [66] Matthias Futschik, T.C. Model selection and efficiency testing for normalization of cDNA microarray data[J]. Genome Biology, 2004, 5(8):R60-R79.
    [67] Baird, D. Normalization of microarray data using a spatial mixed model analysis which includes splines[J]. Bioinformatics. 2004, 20(17):3196-3205.
    [68] Scheel, I., Aldrin, M., Glad, I.K., et al. The influence of missing value imputation on detection of differentially expressed genes from microarray data[J]. Bioinformatics, 2005, 21(23):4272-9.
    [69] Celton, M., Malpertuy, A., Lelandais, G., et al. Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments[J]. BMC Genomics, 2010, 11(1):15.
    [70] Nitsch, D., Tranchevent, L.C., Thienpont, B.. et al. Network analysis of differential expression for the identification of disease-causing genes[J]. PloS One, 2009, 4(5):e5526.
    [71] Manda, S.O., Walls, R.E. and Gilthorpe, M.S. A full Bayesian hierarchical mixture model for the variance of gene differential expressionfJ]. BMC Bioinformatics, 2007, 8:124.
    [72] Brynedal, B., Bomfim, I.L., Olsson, T., et al. Differential expression, and genetic association, of CD58 in Swedish multiple sclerosis patients[J]. Proc Natl Acad Sci USA, 2009, 106(23):E58; author reply E59.
    [73] Allocco, D.J., Kohane, I.S. and Butte, A.J. Quantifying the relationship between co-expression, co-regulation and gene function[J]. BMC Bioinformatics. 2004, 5:18.
    [74] Carter, S.L., Brechbuhler, CM., Griffin, M., et al. Gene co-expression network topology provides a framework for molecular characterization of cellular state[J]. Bioinformatics, 2004, 20(14):2242-50.
    [75] Kim, R.S., Ji, H. and Wong, W.H. An improved distance measure between the expression profiles linking co-expression and co-regulation in mo use [J]. BMC Bioinformatics, 2006, 7:44.
    [76] Bandyopadhyay, S. and Bhattacharyya, M. Analyzing miRNA co-expression networks to explore TF-miRNA regulation[J]. BMC Bioinformatics, 2009, 10:163.
    [77] Wagner, A. How to reconstruct a large genetic network from n gene perturbations in fewer than n(2) easy steps[J]. Bioinformatics, 2001. 17(12):! 183-97.
    [78] Bansal, M., Belcastro, V., Ambesi-Impiombato, A., et al. How to infer gene networks from expression profiles[J], Mol Syst Biol, 2007, 3:78.
    [79] Sontag, E., Kiyatkin, A. and Kholodenko, B.N. Inferring dynamic architecture of cellular networks using time series of gene expression, protein and metabolite data[J]. Bioinformatics, 2004, 20(12): 1877-86.
    [80] Irons, D.J. and Monk. N.A. Identifying dynamical modules from genetic regulatory systems: applications to the segment polarity network[J], BMC Bioinformatics, 2007, 8:413.
    [81] Bickel, D.R., Montazeri, Z., Hsieh, P.C., et al. Gene network reconstruction from transcriptional dynamics under kinetic model uncertainty: a case for the second derivative[J]. Bioinformatics, 2009, 25(6):772-9.
    [82] Tomovic, A. and Oakeley, E.J. Position dependencies in transcription factor binding sites[J]. Bioinformatics, 2007, 23(8):933-41.
    [83] Zhou, Q. and Liu, J.S. Modeling within-motif dependence for transcription factor binding site predictions[J]. Bioinformatics, 2004, 20(6):909-16.
    [84] Gunewardena, S. and Zhang, Z. A hybrid model for robust detection of transcription factor binding sites[J]. Bioinformatics, 2008, 24(4):484-91.
    [85] Liao, J.C., Boscolo, R., Yang, Y.L., et al. Network component analysis: reconstruction of regulatory signals in biological systems[J].Proc Natl Acad Sci USA, 2003, 100(26):15522-7.
    [86] Kao, K.C., Yang, Y. L., Boscolo, R., Sabatti, C, Roychowdhury, V., Liao, J. C. Transcriptome-based determination of multiple transcription regulator activities in Escherichia coli by using network component analysis[J]. Proc Natl Acad Sci U S A, 2004, 101(2):641-6.
    [87] Sanguinetti, G., Lawrence, N.D. and Rattray, M. Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities[J]. Bioinformatics, 2006. 22(22)2775-81.
    [88] Qian, J., Lin, 1, Luscombe, N.M., et al. Prediction of regulatory networks: genome-wide identification of transcription factor targets from gene expression data[J]. Bioinformatics, 2003, 19(15): 1917-26.
    [89] Wu, W.S., Li, W.H. and Chen, B.S. Identifying regulatory targets of cell cycle transcription factors using gene expression and ChlP-chip data[J]. BMC Bioinformatics, 2007, 8:188.
    [90] Das, D., Banerjee, N. and Zhang, M.Q. Interacting models of cooperative gene regulation[J]. Proc Natl Acad Sci USA, 2004, 101(46):16234-9.
    [91] Nagamine, N., Kawada. Y. and Sakakibara, Y. Identifying cooperative transcriptional regulations using protein-protein interactions[J]. Nucleic Acids Res. 2005, 33(15):4828-37.
    [92] Balaji, S.. Babu, M.M., Iyer, L.M., et al. Comprehensive analysis of combinatorial regulation using the transcriptional regulatory network of yeast[J]. J Mol Biol, 2006, 360(l):213-27.
    [93] Wang, J. A new framework for identifying combinatorial regulation of transcription factors: a case study of the yeast cell cycle[J]. J Biomed Inform. 2007, 40(6):707-25.
    [94] van Dijk, A.D., ter Braak, C.J., Immink, R.G., et al. Predicting and understanding transcription factor interactions based on sequence level determinants of combinatorial control[J]. Bioinformatics, 2008, 24(l):26-33.
    [95] Laurila. K., Yli-Harja, O. and Lahdesmaki, H. A protein-protein interaction guided method for competitive transcription factor binding improves target predictions[J]. Nucleic Acids Res, 2009, 37(22):el46.
    [96] Chen, K.C., Wang, T.Y., Tseng, H.H., et al. A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae[J]. Bioinformatics, 2005, 21(12):2883-90.
    [97] Barrett C.L. and Palsson, B.O. Iterative reconstruction of transcriptional regulatory networks: an algorithmic approach[J]. PLoS Comput Biol, 2006, 2(5):e52.
    [98] Xiao, Y. and Segal, M.R. Identification of yeast transcriptional regulation networks using multivariate random forests[J]. PLoS Comput Biol, 2009, 5(6):el000414.
    [99] Lewin, B. Genes VIII[M]. London: Pearson Prentice Hall, 2004.
    [100]朱玉贤,李毅。现代分子生物学(第二版)[M]。北京:高等教育出版社,2002.
    [101] Crick, F. Central dogma of molecular biology [J]. Nature. 1970, 227(5258):561-3.
    [102] Lockhart, D.J., Dong, H., Byrne, M.C., et al. Expression monitoring by hybridization to high-density oligonucleotide arrays[J]. Nat Biotechnol, 1996, 14(13):1675-80.
    [103] Schena, M., Shalon, D., Davis, R.W., et al. Quantitative monitoring of gene expression patterns with a complementary DNA microarrayfJ]. Science, 1995, 270(5235):467-70.
    [104] lvis Brazma, P.H., John Quackenbush, et al. Minimum information about a microarray experiment (MIAME)—toward standards for microarray data[J]. nature genetics, 2001, 29(12):365-371.
    [105] Eisen, M.B.a.B., P.O. http://rana.lbl.gov/EisenSoftware.htm. 1999.
    [106] Medigue, C, Rechenmann, F., Danchin, A., et al. Imagene: an integrated computer environment for sequence annotation and analysis[J], Bioinformatics, 1999, 15(1):2-15.
    [107] Fielden. M.R.. Halgren, R.G., Dere, E., et al. GP3: GenePix post-processing program for automated analysis of raw microarray data[J]. Bioinformatics, 2002. 18(5):771-3.
    [108] Soille, P. Morphological Image Analysis: Principles and Applications[M]. New York: Springer Publishing Company, 1999.
    [109] Yee Hwa Yang, M.J.B., Sandrine Dudoit, Terence P Speed. Comparison of Methods for Image Analysis on cDNA Microarray DatafJ], Journal of Computational and Graphical Statistics, 2002, 11 (1): 108-136.
    [110] Speed, G.K.S.a.Y.H.Y.a.T. Statistical issues in cDNA microarray data analysis [J]. Methods in Molecular Biology, 2006, 323(IV):359-366.
    [111] Yin, W., Chen, T., Zhou, S.X., et al. Background correction for cDNA microarray images using the TV+L1 model[J]. Bioinformatics, 2005, 21 (10):2410-6.
    [112] Kerr, M.K., Martin, M. and Churchill, G.A. Analysis of variance for gene expression microarray data[J]. J Comput Biol, 2000, 7(6):819-37.
    [113] Alizadeh, A.A., Eisen; M.B., Davis, R.E., et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profi!ing[J]. Nature. 2000, 403(6769):503-ll.
    [114] Butte AJ. Y.J., Haring HU, Stumvoll M. White MF, Kohane IS. Determining significant fold differences in gene expression analysis[C].Pac. Symp. Biocomput. Hawaii. 2001.
    [115] Xiaobo Zhou, X.W.a.E.R.D. Missing-value estimation using linear and non-linear regression with Bayesian gene selection[J]. Bioinformatics. 2003, 19(17):2302-2307.
    [116] Ouyang, M., Welsh, W.J. and Georgopoulos, P. Gaussian mixture clustering and imputation of microarray data[J]. Bioinformatics, 2004, 20(6):917-23.
    [117] Control genes and variability: absence of ubiquitous reference transcripts in diverse mammalian expression studies[J], Genome Res, 2002.
    [118] Yang, Y.H., Dudoit, S., Luu, P., et al. Normalization for cDNA microarray data: arobust composite method addressing single and multiple slide systematic variation[J]. Nucleic Acids Res. 2002, 30(4):el5.
    [119] Bolstad, B.M.. Irizarry, R.A., Astrand, M., et al. A comparison of normalization methods for high density oligonucleotide array data based on variance and bias[J], Bioinformatics, 2003, 19(2):185-93.
    [120] Workman, C, Jensen, L.J., Jarmer, H., et al. A new non-linear normalization method for reducing variability in DNA microarray experiments[J]. Genome Biol, 2002, 3(9):research0048.
    [121] Ren, B., Robert, F., Wyrick, J.J., et al. Genome-w:ide location and function of DNA binding proteins[J]. Science, 2000, 290(5500):2306-9.
    [122] Shannon. M.F. and Rao, S. Transcription. Of chips and ChIPs[J]. Science, 2002, 296(5568):666-9.
    [123] Matys. V.. Kel-Margoulis. O.V.. Fricke, E., et al. TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes[J]. Nucleic Acids Res, 2006, 34(Database issue):D108-10.
    [124] Teixeira, M.C., Monteiro, P., Jain, P., et al. The YEASTRACT database: a tool for the analysis of transcription regulatory associations in Saccharomyces cerevisiae[J]. Nucleic Acids Res, 2006, 34(Database issue):D446-51.
    [125] Sobering, A.K., Romeo, M.J., Vay, H.A., et al. A novel Ras inhibitor, Eril, engages yeast Ras at the endoplasmic reticulum[J]. Mol Cell Biol, 2003. 23(14):4983-90.
    [126] de Brevern, A.G., Hazout, S. and Malpertuy, A. Influence of microarray sexperiments missing values on the stability of gene groups by hierarchical clustering [J]. BMC Bioinformatics, 2004, 5:114.
    [127] Yeung, K.Y., Medvedovic, M. and Bumgarner, R.E. From co-expression to co-regulation: how many microarray experiments do we need?[J]. Genome Biol. 2004, 5(7):R48.
    [128] Oba, S., Sato, M.A., Takemasa, I., et al. A Bayesian missing value estimation method for gene expression profile data[J]. Bioinformatics, 2003, 19(16):2088-96.
    [129] Bo, T.H., Dysvik, B. and Jonassen, I. LSimpute: accurate estimation of missing values in microarray data with least squares methods[J], Nucleic Acids Res. 2004. 32(3):e34.
    [130] Sehgal, M.S., Gondal, I. and Dooley, L.S. Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data[J]. Bioinformatics, 2005, 21(10):2417-23.
    [131] Wang, X., Li, A., Jiang, Z., et al. Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme[J]. BMC Bioinformatics, 2006, 7:32.
    [132] Johansson, P. and Hakkinen, J. Improving missing value imputation of microarray data by using spot quality weights[J], BMC Bioinformatics, 2006, 7:306.
    [133] Xiang, Q., Dai, X., Deng, Y., et al. Missing value imputation for microarraygene expression data using histone acetylation information[J]. BMC Bioinformatics, 2008, 9:252.
    [134] Ki-Yeol Kim, B.-J.K.a.G.-S.Y. Reuse of imputed data in microarray analysis increases imputation efficiency [J]. BMC Bioniformatics. 2004, 5:160-168.
    [135] Bras, L.P. and Menezes, J.C. Improving cluster-based missing value estimation of DNA microarray data[J]. Biomol Eng, 2007, 24(2):273-82.
    [136] Spellman, P.T., Sherlock, G., Zhang, M.Q., et al. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization[Jj. Mol Biol Cell, 1998, 9(12):3273-97.
    [137]Tsai, H.K.. Lu, H.H. and Li, W.H. Statistical methods for identifying yeast cell cycle transcription factors[J]. Proc Natl Acad Sci USA, 2005, 102(38): 13532-7.
    [138] Ji, L. and Tan, K.L. Mining gene expression data for positive and negative co-regulated gene clusters[J]. Bioinformatics, 2004. 20(16):2711-8.
    [139]印莹,赵宇海,张斌,王国仁.时间微阵列数据中的同步和异步共调控基因聚类[]].计算机学报,2007,30(8):1302-1314
    [140] Stuart, J.M., Segal, E., Koller, D., et al. A gene-coexpression network for global discovery of conserved genetic modules[J]. Science, 2003, 302(5643):249-55.
    [141] Choi, J.K., Yu, U., Yoo, O.J., et al. Differential coexpression analysis using microarray data and its application to human cancer[J]. Bioinformatics, 2005, 21(24):4348-55.
    [142]王广云.肿瘤基因芯片表达数据分析的相关问题研究[D].长沙:国防科学技术大学,2009.
    [143] Ben-Dor, A., Shamir, R. and Yakhini, Z. Clustering gene expression patterns[J]. J ComputBioL 1999, 6(3-4):281-97.
    [144] Barenco, M, Tomescu, D., Brewer, D., et al. Ranked prediction of p53 targets using hidden variable dynamic modeling[J]. Genome Biol, 2006, 7(3):R25.
    [145] Michael Steinbach, L.E., Vipin Kumar. The challenges of clustering high-dimensional data[C]: Springer-Verlag. 2003.
    [146] Furey, T.S., Cristianini, N., Duffy, N., et al. Support vector machine classification and validation of cancer tissue samples using microarray expression data[J]. Bioinformatics, 2000, 16(10):906-14.
    [147] Khan. J., Wei, J.S., Ringner, M., et al. Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks[J]. Nat Med, 2001. 7(6):673-9.
    [148] Dettling, M. BagBoosting for tumor classification with gene expression data[J]. Bioinformatics, 2004, 20(18):3583-93.
    [149] Alexandridis, R., Lin, S. and Irwin, M. Class discovery and classification of tumor samples using mixture modeling of gene expression data-a unified approach[J]. Bioinformatics, 2004, 20(16):2545-52.
    [150] Tamayo, P., Slonim, D., Mesirov, J., et al. Interpreting patterns of gene expression with self-organizing maps: methods and application to hematopoietic differentiation[J]. ProcNatl Acad SciU S A, 1999, 96(6):2907-12.
    [151] Liu, Y. and Ringner, M. Multiclass discovery in arraydata[J]. BMC Bioinformatics, 2004, 5:70.
    [152] von Heydebreck, A., Huber, W., Poustka, A., et al. Identifying splits with clear separation: a new class discovery method for gene expression data[J]. Bioinformatics, 2001, 17 Suppl l:S107-14.
    [153] Yu, Z., Wong, H.S. and Wang, H. Graph-based consensus clustering for class discovery from gene expression data[J]. Bioinformatics, 2007, 23(21):2888-96.
    [154] Steinfeld. I., Navon, R., Ardigo, D., et al. Clinically driven semi-supervised class discovery in gene expression data[J]. Bioinformatics, 2008, 24(16):i90-7.
    [155] Zheng, C.H., Huang, D.S., Zhang, L.. et al. Tumor clustering using nonnegative matrix factorization with gene selection[J]. IEEE Trans Inf Technol Biomed, 2009, 13(4):599-607.
    [156] Yeung, K.Y. and Ruzzo, W.L. Principal component analysis for clustering gene expression data[J]. Bioinformatics, 2001, 17(9):763-74.
    [157] Hornquist, M., Hertz, J. and Wahde, M. Effective dimensionality of large-scale expression data using principal component analysis[J]. Biosystems, 2002, 65(2-3):147-56.
    [158] Hornquist, M., Hertz, J. and Wahde, M. Effective dimensionality for principal component analysis of time series expression data[J]. Biosystems, 2003,
    [159] Song, J.J., Ren, Y. and Yan, F. Classification for high-throughput data with an optimal subset of principal components[J]. Comput Biol Chem, 2009, 33(5):408-13.
    [160] Tenenbaum, J.B., de Silva, V. and Langford, J.C. A global geometric framework for nonlinear dimensionality reduction[J]. Science, 2000, 290(5500):2319-23.
    [161] Cormen, T.H., Leiserson, C.E., Rivest, R.L., et al. Introduction to algorithms[M]. The MIT Press, 2001.
    [162] Samko, O., Marshall, A.D. and Rosin, P.L. Selection of the optimal parameter value for the Isomap algorithm[J]. Pattern Recognition Letters, 2006, 27(9):968-979.
    [163] Wang, Y. and Wu, Y. An estimation algorithm for the optimal embedding dimension of Isomap[J]. Journal of System Simulation, 2008, 20(22):6066-6069.
    [164] Shipp, M.A., Ross, K.N., Tamayo, P., et al. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning[J]. Nat Med, 2002, 8(1):68-74.
    [165] Staunton, J.E., Slonim, D.K., Coller, H.A., et al. Chemosensitivity prediction by transcriptional profiling[J]. Proc Natl Acad Sci USA, 2001, 98(19):10787-92.
    [166] Pomeroy, S.L., Tamayo, P., Gaasenbeek, M., et al. Prediction of central nervous system embryonal tumour outcome based on gene expression[J]. Nature, 2002, 415(6870):436-42.
    [167] Hubert, L. and Arabie, P. Comparing partitions[J]. Journal of Classification, 1985, 2(1):193-218.
    [168] Amato, R., Ciaramella, A., Deniskina, N., et al. A multi-step approach to time series analysis and gene expression clustering[J]. Bioinformatics, 2006, 22(5):589-96.
    [169] Balasubramaniyan, R., Hullermeier, E., Weskamp, N., et al. Clustering of gene expression data using a local shape-based similarity measure[J]. Bioinformatics, 2005, 21(7):1069-77.
    [170] Bandyopadhyay, S., Mukhopadhyay, A. and Maulik, U. An improved algorithm for clustering gene expression data[J]. Bioinformatics, 2007, 23(21):2859-65.
    [171] Bar-Joseph, Z., Gifford, D.K. and Jaakkola, T.S. Fast optimal leaf ordering for hierarchical clustering[J]. Bioinformatics, 2001, 17 Suppl 1:S22-9.
    [172] Bensmail, H., Golek, J., Moody, M.M., et al. A novel approach for clustering proteomics data using Bayesian fast Fourier transform[J]. Bioinformatics, 2005, 21(10):2210-24.
    [173] Yona, G., Dirks, W., Rahman, S., et al. Effective similarity measures for expression profiles[J]. Bioinformatics, 2006, 22(13):1616-22.
    [174] Eisen, M.B., Spellman, P.T., Brown, P.O., et al. Cluster analysis and display of genome-wide expression patterns[J]. Proc Natl Acad Sci USA, 1998, 95(25):14863-8.
    [175] Medvedovic, M. and Sivaganesan, S. Bayesian infinite mixture model based clustering of gene expression profiles[J]. Bioinformatics, 2002, 18(9):1194-206.
    [176] Ghosh, D. and Chinnaiyan, A.M. Mixture modelling of gene expression data from microarray experiments[J]. Bioinformatics, 2002, 18(2):275-86.
    [177] McLachlan, G.J., Bean, R.W. and Peel, D. A mixture model-based approach to the clustering of microarray expression data[J]. Bioinformatics, 2002, 18(3):413-22.
    [178] Perou, C.M., Jeffrey, S.S., van de Rijn, M., et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers[J]. Proc Natl Acad Sci USA, 1999, 96(16):9212-7.
    [179] Ding, C.H.Q. Analysis of gene expression profiles: class discovery and leaf ordering[C]. RECOMB, 2002.
    [180] Luo, F., Khan, L., Bastani, F., et al. A dynamically growing self-organizing tree (DGSOT) for hierarchical clustering gene expression profiles[J]. Bioinformatics, 2004, 20(16):2605-17.
    [181] Buehler, E.C., Sachs, J.R., Shao, K., et al. The CRASSS plug-in for integrating annotation data with hierarchical clustering results[J]. Bioinformatics, 2004, 20(17):3266-9.
    [182] Savage, R.S., Heller, K., Xu, Y., et al. R/BHC: fast Bayesian hierarchical clustering for microarray data[J]. BMC Bioinformatics, 2009, 10:242.
    [183] Levenstien, M.A., Yang, Y. and Ott, J. Statistical significance for hierarchical clustering in genetic association and microarray expression studies[J]. BMC Bioinformatics, 2003, 4:62.
    [184] Suzuki, R. and Shimodaira, H. Pvclust: an R package for assessing the uncertainty in hierarchical clustering[J]. Bioinformatics, 2006, 22(12):1540-2.
    [185] Wang, H., Zheng, H. and Azuaje, F. Poisson-based self-organizing feature maps and hierarchical clustering for serial analysis of gene expression data[J]. IEEE/ACM Trans Comput Biol Bioinform, 2007, 4(2):163-75.
    [186] Kim, T.M., Yim, S.H., Jeong, Y.B., et al. PathCluster: a framework for gene set-based hierarchical clustering[J]. Bioinformatics, 2008, 24(17):1957-8.
    [187] Dotan-Cohen, D., Melkman, A.A. and Kasif, S. Hierarchical tree snipping: clustering guided by prior knowledge[J]. Bioinformatics, 2007, 23(24):3335-42.
    [188] Dotan-Cohen, D., Kasif, S. and Melkman, A.A. Seeing the forest for the trees: using the Gene Ontology to restructure hierarchical clustering[J]. Bioinformatics, 2009, 25(14):1789-95.
    [189] Yeung, K.Y., Haynor, D.R. and Ruzzo, W.L. Validating clustering for gene expression data[J]. Bioinformatics, 2001, 17(4):309-18.
    [190] Lu, Y., Lu, S., Fotouhi, F., et al. Incremental genetic K-means algorithm and its application in gene expression data analysis[J]. BMC Bioinformatics, 2004, 5:172.
    [191] Wu, F.X. Genetic weighted k-means algorithm for clustering large-scale gene expression data[J]. BMC Bioinformatics, 2008, 9 Suppl 6:S12.
    [192] Tseng, G.C. Penalized and weighted K-means for clustering with scattered objects and prior information in high-throughput biological data[J]. Bioinformatics, 2007, 23(17):2247-55.
    [193] Kraj, P., Sharma, A., Garge, N., et al. ParaKMeans: Implementation of a parallelized K-means algorithm suitable for general laboratory use[J]. BMC Bioinformatics, 2008, 9:200.
    [194] Kim, E.Y., Kim, S.Y., Ashlock, D., et al. MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering[J]. BMC Bioinformatics, 2009, 10:260.
    [195] Herrero, J. and Dopazo, J. Combining hierarchical clustering and self-organizing maps for exploratory analysis of gene expression patterns[J]. J Proteome Res, 2002, 1(5):467-70.
    [196] Nikkila, J., Toronen, P., Kaski, S., et al. Analysis and visualization of gene expression data using self-organizing maps[J]. Neural Netw, 2002, 15(8-9):953-66.
    [197] Garrigues, G.E., Cho, D.R., Rubash, H.E., et al. Gene expression clustering using self-organizing maps: analysis of the macrophage response to particulate biomaterials[J]. Biomaterials, 2005, 26(16):2933-45.
    [198] Brameier, M. and Wiuf, C. Co-clustering and visualization of gene expression data and gene ontology terms for Saccharomyces cerevisiae using self-organizing maps[J]. J Biomed Inform, 2007, 40(2):160-73.
    [199] Chen, X. Curve-based clustering of time course gene expression data using self-organizing maps[J]. J Bioinform Comput Biol, 2009, 7(4):645-61.
    [200] Dembele, D. and Kastner, P. Fuzzy C-means method for clustering microarray data[J]. Bioinformatics, 2003, 19(8):973-80.
    [201] Kim, S.Y., Lee, J.W. and Bae, J.S. Effect of data normalization on fuzzy clustering of DNA microarray data[J]. BMC Bioinformatics, 2006, 7:134.
    [202] Xiang, Z., Qin, Z.S. and He, Y. CRCView: a web server for analyzing and visualizing microarray gene expression data using model-based clustering[J]. Bioinformatics, 2007, 23(14):1843-5.
    [203] Medvedovic, M., Yeung, K.Y. and Bumgarner, R.E. Bayesian mixture model based clustering of replicated microarray data[J]. Bioinformatics, 2004, 20(8):1222-32.
    [204] Ng, S.K., McLachlan, G.J., Wang, K., et al. A mixture model with random-effects components for clustering correlated gene-expression profiles[J]. Bioinformatics, 2006, 22(14):1745-52.
    [205] Dortet-Bernadet, J.L. and Wicker, N. Model-based clustering on the unit sphere with an illustration using gene expression profiles[J]. Biostatistics, 2008, 9(1):66-80.
    [206] Lauretto, M.S., Pereira, C.A. and Stern, J.M. The full Bayesian significance test for mixture models: results in gene expression clustering[J]. Genet Mol Res, 2008, 7(3):883-97.
    [207] Joshi, A., Van de Peer, Y. and Michoel, T. Analysis of a Gibbs sampler method for model-based clustering of gene expression data[J]. Bioinformatics, 2008, 24(2):176-83.
    [208] Yuan, Y., Li, C.T. and Wilson, R. Partial mixture model for tight clustering of gene expression time-course[J]. BMC Bioinformatics, 2008, 9:287.
    [209] Pan, W. Incorporating gene functions as priors in model-based clustering of microarray gene expression data[J]. Bioinformatics, 2006, 22(7):795-801.
    [210] Huang, D., Wei, P. and Pan, W. Combining gene annotations and gene expression data in model-based clustering: weighted method[J]. OMICS, 2006, 10(1):28-39.
    [211] Dai, X., Erkkila, T., Yli-Harja, O., et al. A joint finite mixture model for clustering genes from independent Gaussian and beta distributed data[J]. BMC Bioinformatics, 2009, 10:165.
    [212] Teschendorff, A.E., Wang, Y., Barbosa-Morais, N.L., et al. A variational Bayesian mixture modelling framework for cluster analysis of gene-expression data[J]. Bioinformatics, 2005, 21(13):3025-33.
    [213] Jung, Y.Y., Oh, M.S., Shin, D.W., et al. Identifying differentially expressed genes in meta-analysis via Bayesian model-based clustering[J]. Biom J, 2006, 48(3):435-50.
    [214] Hvidsten, T.R., Laegreid, A. and Komorowski, J. Learning rule-based models of biological process from gene expression time profiles using gene ontology[J]. Bioinformatics, 2003, 19(9):1116-23.
    [215] Al-Shahrour, F., Diaz-Uriarte, R. and Dopazo, J. FatiGO: a web tool for finding significant associations of Gene Ontology terms with groups of genes[J]. Bioinformatics, 2004, 20(4):578-80.
    [216] King, O.D., Lee, J.C., Dudley, A.M., et al. Predicting phenotype from patterns of annotation[J]. Bioinformatics, 2003, 19 Suppl 1:i183-9.
    [217] King, O.D., Foulger, R.E., Dwight, S.S., et al. Predicting gene function from patterns of annotation[J]. Genome Res, 2003, 13(5):896-904.
    [218] Dahlquist, K.D., Salomonis, N., Vranizan, K., et al. GenMAPP, a new tool for viewing and analyzing microarray data on biological pathways[J]. Nat Genet, 2002, 31(1):19-20.
    [219] Khatri, P., Bhavsar, P., Bawa, G., et al. Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments[J]. Nucleic Acids Res, 2004, 32(Web Server issue):W449-56.
    [220] Robinson, M.D., Grigull, J., Mohammad, N., et al. FunSpec: a web-based cluster interpreter for yeast[J]. BMC Bioinformatics, 2002, 3:35.
    [221] Doherty, J.M., Carmichael, L.K. and Mills, J.C. GOurmet: a tool for quantitative comparison and visualization of gene expression profiles based on gene ontology (GO) distributions[J]. BMC Bioinformatics, 2006, 7:151.
    [222] Khatri, P. and Draghici, S. Ontological analysis of gene expression data: current tools, limitations, and open problems[J]. Bioinformatics, 2005, 21(18):3587-95.
    [223] Fang, Z., Yang, J., Li, Y., et al. Knowledge guided analysis of microarray data[J]. J Biomed Inform, 2006, 39(4):401-11.
    [224] Resnik, P. Using information content to evaluate semantic similarity in a taxonomy. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1995.
    [225] Lin, D. An information-theoretic definition of similarity. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1998:296-304.
    [226] Jiang, J.J. Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy. Taiwan, 1997:19-33.
    [227] Dunn, J.C. A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters[J]. Journal of Cybernetics, 1973, 3(3):32-57.
    [228] Bezdek, J.C. Pattern Recognition with Fuzzy Objective Function Algorithms[M]. Norwell, MA, USA: Kluwer Academic Publishers, 1981.
    [229] Gibbons, F.D. and Roth, F.P. Judging the quality of gene expression-based clustering methods using gene annotation[J]. Genome Res, 2002, 12(10):1574-81.
    [230] Ronen, M., Rosenberg, R., Shraiman, B.I., et al. Assigning numbers to the arrows: parameterizing a gene regulation network by using accurate expression kinetics[J]. Proc Natl Acad Sci USA, 2002, 99(16):10555-60.
    [231] Lee, T.I., Rinaldi, N.J., Robert, F., et al. Transcriptional regulatory networks in Saccharomyces cerevisiae[J]. Science, 2002, 298(5594):799-804.
    [232] Boscolo, R., Sabatti, C., Liao, J.C., et al. A generalized framework for network component analysis[J]. IEEE/ACM Trans Comput Biol Bioinform, 2005, 2(4):289-301.
    [233] Tran, L.M., Brynildsen, M.P., Kao, K.C., et al. gNCA: a framework for determining transcription factor activity based on transcriptome: identifiability and numerical implementation[J]. Metab Eng, 2005, 7(2):128-41.
    [234] Galbraith, S.J., Tran, L.M. and Liao, J.C. Transcriptome network component analysis with limited microarray data[J]. Bioinformatics, 2006, 22(15):1886-94.
    [235] Wang, C., Xuan, J., Chen, L., et al. Motif-directed network component analysis for regulatory network inference[J]. BMC Bioinformatics, 2008, 9 Suppl 1:S21.
    [236] Ye, C., Galbraith, S.J., Liao, J.C., et al. Using network component analysis to dissect regulatory networks mediated by transcription factors in yeast[J]. PLoS Comput Biol, 2009, 5(3):e1000311.
    [237] Chang, C., Ding, Z., Hung, Y.S., et al. Fast network component analysis (FastNCA) for gene regulatory network reconstruction from microarray data[J]. Bioinformatics, 2008, 24(11):1349-58.
    [238] Golub, G.H., Hoffman, A. and Stewart, G.W. A generalization of the Eckart-Young-Mirsky matrix approximation theorem[J]. Linear Algebra and its Applications, 1987, 88-89:317-327.
    [239] Alter, O. and Golub, G.H. Integrative analysis of genome-scale data by using pseudoinverse projection predicts novel correlation between DNA replication and RNA transcription[J]. Proc Natl Acad Sci USA, 2004, 101(47):16577-82.
    [240] Gao, F., Foat, B.C. and Bussemaker, H.J. Defining transcriptional networks through integrative modeling of mRNA expression and transcription factor binding data[J]. BMC Bioinformatics, 2004, 5:31.
    [241] Boulesteix, A.L. and Strimmer, K. Predicting transcription factor activities from combined analysis of microarray and ChIP data: a partial least squares approach[J]. Theor Biol Med Model, 2005, 2:23.
    [242] Pournara, I. and Wernisch, L. Factor analysis for gene regulatory networks and transcription factor activity profiles[J]. BMC Bioinformatics, 2007, 8:61.
    [243] Sanguinetti, G., Rattray, M. and Lawrence, N.D. A probabilistic dynamical model for quantitative inference of the regulatory mechanism of transcription[J]. Bioinformatics, 2006, 22(14):1753-9.
    [244] Nachman, I., Regev, A. and Friedman, N. Inferring quantitative models of regulatory networks from expression data[J]. Bioinformatics, 2004, 20 Suppl 1:i248-56.
    [245] Li, Z., Shaw, S.M., Yedwabnick, M.J., et al. Using a state-space model with hidden variables to infer transcription factor activities[J]. Bioinformatics, 2006, 22(6):747-54.
    [246] Khanin, R., Vinciotti, V., Mersinias, V., et al. Statistical reconstruction of transcription factor activity using Michaelis-Menten kinetics[J]. Biometrics, 2007, 63(3):816-23.
    [247] Rogers, S., Khanin, R. and Girolami, M. Bayesian model-based inference of transcription factor activity[J]. BMC Bioinformatics, 2007, 8 Suppl 2:S2.
    [248] Barenco, M., Papouli, E., Shah, S., et al. rHVDM: an R package to predict the activity and targets of a transcription factor[J]. Bioinformatics, 2009, 25(3):419-20.
    [249] Gao, P., Honkela, A., Rattray, M., et al. Gaussian process modelling of latent chemical species: applications to inferring transcription factor activities[J]. Bioinformatics, 2008, 24(16):i70-5.
    [250] Cheng, C., Yan, X., Sun, F., et al. Inferring activity changes of transcription factors by binding association with sorted expression profiles[J]. BMC Bioinformatics, 2007, 8:452.
    [251] Ashburner, M., Ball, C.A., Blake, J.A., et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium[J]. Nat Genet, 2000, 25(1):25-9.
    [252] Savage, R.S., Ghahramani, Z., Griffin, J.E., de la Cruz, B.J. and Wild, D.L. Discovering transcriptional modules by Bayesian data integration[J]. Bioinformatics, 2010, 26(ISMB 2010):i158-i167.
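    [253] Wang, H.W., Wu, Z.B. and Meng, J. Linear and nonlinear methods of partial least squares regression[M]. Beijing: National Defense Industry Press, 2006.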
    [254] Wichert, S., Fokianos, K. and Strimmer, K. Identifying periodically expressed transcripts in microarray time series data[J]. Bioinformatics, 2004, 20(1):5-20.
    [255] Yang, Y.L., Suen, J., Brynildsen, M.P., et al. Inferring yeast cell cycle regulators and interactions using transcription factor activities[J]. BMC Genomics, 2005, 6(1):90.
