Correlation Analysis of Gene Expression Profiles and Protein-Protein Interactions in Yeast and Escherichia coli
Abstract
Organisms face a wide variety of external stimuli. Responding to them while maintaining homeostasis requires coordination among many proteins, that is, protein-protein interactions, which have been a focus of proteomics research for more than a decade. Protein-protein interactions triggered by environmental stimuli can be classified by strength as transient or stable. Previous studies have reported that interacting proteins usually respond in a relatively stable way to different stimuli, whereas non-interacting proteins change more markedly and have a significant effect on regulating the response to transient stimuli. We therefore hypothesize that, after an external stimulus, non-interacting proteins regulate the cellular response through various associations in order to cope with the transient stimulus. To examine this question, we used bioinformatics methods to analyze gene expression profiles together with protein-protein interaction data.
     This study used two model organisms, yeast and E. coli. First, by integrating gene expression profiles with protein-protein interaction data for both organisms, we constructed positive and negative sample sets of protein-protein interactions. Using co-expression measured by the Pearson correlation coefficient, we then compared interacting protein pairs with non-interacting protein pairs after external stimulation. In both species, over the period from the onset of the stress response to the return to steady state, the change differed highly significantly between interacting and non-interacting pairs (P < 2.2e-16), and the gene expression profiles of non-interacting pairs showed the larger change.
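The comparison described above can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the gene names, the expression values, and the helper names `pearson`, `coexpression_change`, and `mean_abs` are all hypothetical, and the abstract does not say which statistical test produced the reported P value, so none is applied here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("constant profile: correlation is undefined")
    return sxy / math.sqrt(sxx * syy)


def coexpression_change(expr_before, expr_after, pairs):
    """Map each gene pair to r_after - r_before."""
    return {
        (g1, g2): pearson(expr_after[g1], expr_after[g2])
        - pearson(expr_before[g1], expr_before[g2])
        for g1, g2 in pairs
    }


def mean_abs(values):
    """Mean absolute value, used here as a simple measure of change."""
    values = list(values)
    return sum(abs(v) for v in values) / len(values)


# Toy expression profiles (hypothetical values, four samples per condition).
before = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 4.0],
}
after = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.1, 3.9, 6.2, 7.8],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [4.0, 2.0, 3.0, 1.0],
}

# Hypothetical positive (interacting) and negative (non-interacting) sets.
interacting = [("A", "B")]
non_interacting = [("A", "C"), ("A", "D")]

delta_pos = coexpression_change(before, after, interacting)
delta_neg = coexpression_change(before, after, non_interacting)

print("interacting:    ", mean_abs(delta_pos.values()))
print("non-interacting:", mean_abs(delta_neg.values()))
```

In this toy data the non-interacting pairs show the larger change in co-expression, mirroring the direction of the reported result. With real data, the two sets of changes would be compared with a suitable two-sample test.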
     Second, we set thresholds and selected genes with a Pearson correlation coefficient r ≥ 0.75 whose co-expression coefficient rose by more than 0.1 after environmental stimulation, and used them to construct a co-expression network. Enrichment analysis showed that most genes in the network were enriched in ribosome synthesis, amino acid synthesis/metabolism, and energy metabolism. These pathways are all related to protein synthesis and degradation, which suggests that cells change state rapidly by replacing proteins.
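The thresholding step might look like the sketch below. The two cut-offs (r ≥ 0.75 and an increase greater than 0.1) come from the abstract. Everything else is an assumption made for illustration: that the r ≥ 0.75 criterion applies to the post-stimulus coefficient, that the criteria are applied per gene pair, and the names `select_edges` and `build_network` along with the toy coefficients.

```python
from collections import defaultdict

R_MIN = 0.75      # minimum Pearson r (from the abstract)
DELTA_MIN = 0.1   # minimum increase in r after stimulation (from the abstract)


def select_edges(r_before, r_after, r_min=R_MIN, delta_min=DELTA_MIN):
    """Keep pairs with r_after >= r_min and r_after - r_before > delta_min.

    Assumption: the r threshold is applied to the post-stimulus value.
    """
    edges = []
    for pair, ra in r_after.items():
        rb = r_before.get(pair)
        if rb is None:
            continue  # no pre-stimulus value, so the increase is unknown
        if ra >= r_min and (ra - rb) > delta_min:
            edges.append(pair)
    return sorted(edges)


def build_network(edges):
    """Build an undirected adjacency map from a list of gene pairs."""
    adjacency = defaultdict(set)
    for g1, g2 in edges:
        adjacency[g1].add(g2)
        adjacency[g2].add(g1)
    return dict(adjacency)


# Hypothetical co-expression coefficients before and after stimulation.
r_before = {
    ("g1", "g2"): 0.60,
    ("g1", "g3"): 0.80,
    ("g2", "g3"): 0.20,
    ("g3", "g4"): 0.50,
}
r_after = {
    ("g1", "g2"): 0.82,
    ("g1", "g3"): 0.85,
    ("g2", "g3"): 0.40,
    ("g3", "g4"): 0.90,
}

edges = select_edges(r_before, r_after)
network = build_network(edges)
print(edges)
print(network)
```

Here the pair ("g1", "g3") is dropped even though its correlation is high, because the correlation rose by only about 0.05. The genes in the resulting network would then be passed to an enrichment tool.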
     Third, to look for common features of co-expression regulation, we combined the co-expressed genes with the corresponding protein-protein interaction network. Analysis of the interaction network together with gene co-expression indicates that cellular expression regulation does not act on single genes; instead, some mechanism links a cluster of genes so that they are regulated as a whole.
     This study provides important clues for understanding how organisms respond to external environmental stimuli and, by integrating and analyzing different types of data, deepens our understanding of the importance of co-expression in biological macromolecular networks.
