Study of the Human Protein Co-evolution Network and Construction of an Interactive Transcriptome Annotation System
Abstract
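With the successful completion of the Human Genome Project, omics research, including genomics, transcriptomics and proteomics, has entered a period of rapid development. Innovation and progress in DNA sequencing technology have in turn led to explosive growth in biological data. A central question in current bioinformatics research is how to store, organize, mine and efficiently use these omics data. This thesis addresses two specific problems, one in proteomics and one in transcriptomics, and approaches both with bioinformatics data-mining methods and data-management models.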
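     The protein co-evolution network is an important direction in proteomics research and an important method for revealing protein-protein interactions. Current approaches to studying protein interactions fall into two broad classes, experimental methods and bioinformatics methods; compared with experiments, bioinformatics methods save time and are better suited to in-depth mining of omics data. In recent years many species have had their whole genomes sequenced, which provides the basis for studying the human protein co-evolution network. This project therefore constructs a human protein co-evolution network from the evolutionary distances between genome-wide homologous genes in eukaryotes. Using the mirror tree method, and taking 18,283 homologous protein families from 18 eukaryotic species in the NCBI HomoloGene database as the study set, we built between-species distance matrices for the protein families and calculated the Pearson correlation coefficient and the vector count for every pair of protein families, which yielded the human protein co-evolution network. We then validated the network against protein complex data, protein interaction data from the DIP and HPRD databases, and metabolic regulatory network data; the results indicate that the co-evolution network can be used to reveal interactions between proteins. We further analyzed why the correlation coefficients of the co-evolution model are so tightly clustered by comparing evolutionary distances across species sets of different breadth. We conclude that the main causes are that few eukaryotic species currently have genome-wide homology annotation, that the spread of evolutionary distances among these species is narrow, and that few of them are distantly related to human. As more species are sequenced, research on eukaryotic protein co-evolution networks should improve.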
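     As systematic studies of the proteome and of gene function progress, the demand for transcriptome information keeps growing. In particular, for studying gene regulation and function under different cellular physiological and pathological states, the association between transcripts and the distribution and function of the proteins they encode is especially important. Our main goal is to organize, summarize, annotate, store and make good use of these transcriptome data. In recent years, comprehensive transcriptome databases have collected and stored transcriptome data produced by many different sequencing technologies and have been widely used. However, such databases cannot support interactive annotation and deep mining of transcriptome data. We therefore built an interactive annotation system for the human transcriptome. The system uses a directed graph of human body structure as its organizing framework, and stores and traverses that graph with linked (adjacency) list storage and a depth-first root-path traversal algorithm; the root path found for a cell or tissue makes data easy to locate and retrieve. Most importantly, the system is built on an interactive Web 2.0 platform and so has ample room for expansion. At the time of this work, EST sequencing technology was relatively mature and EST data had broad coverage and heavy use, so we chose ESTs as the system's primary data source. Using EST library information, we assigned ESTs to the corresponding cells or tissues according to their expression in healthy and pathological human cells. In addition, we mined human housekeeping genes, tissue-specific genes, gene expression information by chromosome, and GO functional classifications of genes, and combined all of these processed data to supplement the annotation system. The system is based on the mediawiki engine and provides interactive services: users can search, browse and download data, and can also upload and annotate, which allows the data to be updated in real time and lets every user act as an administrator, so that the system runs efficiently and in good order. The latest database statistics show a high registration rate and a high number of visits within a short period, which indicates that the human transcriptome annotation system is of practical use.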
With the successful completion of the Human Genome Project, research in genomics, transcriptomics and proteomics has entered a period of rapid development. Meanwhile, innovation and progress in DNA sequencing technology have led to massive growth in bioinformatics data. An important question in bioinformatics is how to store, manage, mine and use these omics data efficiently. This study focuses on two specific problems, one in proteomics and one in transcriptomics: a protein co-evolution model and an interactive transcriptome annotation system.
     The protein co-evolution network is an important method for revealing protein-protein interactions (PPIs). Current PPI investigation methods fall into two categories: biological experimental methods and bioinformatics methods. Compared with biological experiments, bioinformatics methods are more time-efficient and better suited to mining omics data. In recent years many more whole genomes have been published, which supports the study of the human protein co-evolution network. On this basis we constructed a human protein co-evolution model, starting from 18,283 homologous protein families of 18 eukaryotic species in the NCBI HomoloGene database. We computed the evolutionary distances between genome-wide homologous genes in eukaryotes, used the mirror tree method to build between-species distance matrices for the protein families, and calculated the Pearson correlation coefficient and the vector count for each pair of protein families. Finally, we validated the protein co-evolution model with data on human protein complexes, PPIs from the DIP and HPRD databases, and proteins from human metabolic networks. The results show that the protein co-evolution network model can be used to reveal interactions between proteins. We further analyzed why the correlation coefficients in the co-evolution model are so tightly clustered, comparing evolutionary distances across species sets of different breadth. We found that few eukaryotic species have genome-wide homology annotations and that few of those species are distantly related to human, which may explain the tight clustering of the correlation coefficients. We expect that more completed whole genomes will improve research on eukaryotic protein co-evolution networks.
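     The mirror tree comparison described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis code: the function names (`symmetric`, `upper_triangle`, `pearson`, `mirror_tree_score`), the dict-of-dicts matrix layout and the toy distances are all hypothetical, and returning the vector length alongside the correlation is only one possible reading of the "vector count".

```python
import math
from itertools import combinations


def symmetric(pairs):
    """Build a symmetric dict-of-dicts distance matrix from {(a, b): d}."""
    dist = {}
    for (a, b), d in pairs.items():
        dist.setdefault(a, {})[b] = d
        dist.setdefault(b, {})[a] = d
    return dist


def upper_triangle(dist, species):
    """Flatten the pairwise distances for a fixed species order into a vector."""
    return [dist[a][b] for a, b in combinations(species, 2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no co-evolution signal
    return cov / math.sqrt(var_x * var_y)


def mirror_tree_score(dist_a, dist_b):
    """Correlate two families' distance matrices over their shared species.

    Returns (correlation, number of species pairs compared).
    """
    shared = sorted(set(dist_a) & set(dist_b))
    if len(shared) < 3:
        raise ValueError("need at least 3 shared species")
    vec_a = upper_triangle(dist_a, shared)
    vec_b = upper_triangle(dist_b, shared)
    return pearson(vec_a, vec_b), len(vec_a)


# Toy data (invented): family B evolves exactly twice as fast as family A.
toy = {
    ("human", "mouse"): 0.10, ("human", "chicken"): 0.30,
    ("human", "zebrafish"): 0.50, ("mouse", "chicken"): 0.32,
    ("mouse", "zebrafish"): 0.52, ("chicken", "zebrafish"): 0.55,
}
fam_a = symmetric(toy)
fam_b = symmetric({pair: 2 * d for pair, d in toy.items()})
print(mirror_tree_score(fam_a, fam_b))
```

     Because the second toy family is the first with every distance doubled, the correlation comes out at 1 (up to floating-point rounding) over 6 species pairs. With only 18 species there are at most 153 species pairs per comparison, which is consistent with the abstract's remark that a narrow species set limits how well the coefficients separate.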
     With the rapid development of proteomics and the systematic study of gene function, the demand for transcriptome information is constantly increasing. The association between transcripts and the distribution and function of the proteins they encode is especially important for studying gene regulation and function under different physiological and pathological states. The main problem is how to collate, store, process and annotate these transcriptome data. In recent years, well-known comprehensive transcriptome databases have collated and stored various types of transcriptomic data, but these databases cannot support interactive annotation or deep data mining. We therefore built an interactive annotation system for the human transcriptome, Wikicell, which is organized around a graph of human body structure. Wikicell stores this graph as adjacency lists and traverses it by depth-first search, which makes it convenient to search and access both the data and the body structure graph. The whole system is built on an interactive Web 2.0 platform, which leaves ample room for expansion. The major data source is EST data, because EST sequencing technology was relatively mature and EST data have wide coverage and are heavily used. We classified EST data into the appropriate cells or tissues according to their library information. We also supplied tables of housekeeping and tissue-specific genes, a gene expression information table divided by chromosome, and GO functional classification tables. The system is built on the mediawiki engine and provides interactive services: users can not only search, browse and download data, but also upload, comment and perform other operations that keep the data up to date in real time. Every user acts as an administrator, so that the system runs efficiently and in good order. The latest database statistics show a high registration rate and a high number of page views, which suggests that the human transcriptome annotation system is of practical use.
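     The adjacency-list storage and depth-first root-path search mentioned above can be illustrated as follows. This is a sketch only, assuming the body structure graph is acyclic; the function name `root_paths`, the dictionary layout and the toy anatomy entries are hypothetical and are not taken from Wikicell.

```python
def root_paths(graph, root, target):
    """Return every path from root to target in a directed acyclic graph.

    graph maps each node to the list of its child nodes (adjacency lists).
    """
    paths = []

    def dfs(node, path):
        path.append(node)
        if node == target:
            paths.append(list(path))  # record a copy of the current root path
        else:
            for child in graph.get(node, []):
                dfs(child, path)
        path.pop()  # backtrack before trying the next sibling

    dfs(root, [])
    return paths


# Toy body structure graph (invented): one tissue can sit under two systems,
# which is why a directed graph rather than a tree is needed.
body = {
    "Human body": ["Digestive system", "Endocrine system"],
    "Digestive system": ["Liver", "Pancreas"],
    "Endocrine system": ["Pancreas"],
    "Liver": ["Hepatocyte"],
    "Pancreas": ["Islet beta cell"],
}

for path in root_paths(body, "Human body", "Islet beta cell"):
    print(" > ".join(path))
```

     Each returned path can serve as a breadcrumb from the whole body down to a cell or tissue page, so data attached to that page can be reached from any of its parent structures.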
