Bioinformatics Study of the Transcriptome of Human Fetal Liver at 22 Weeks of Gestation and of the SARS-CoV (BJ-01) Genome
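Abstract
Objective: The liver performs important physiological functions in the human body. At 4 to 6 months of gestation the human fetal liver is also the main source of stem/progenitor cells for the hematopoietic and immune systems, and it expresses many genes related to cell engraftment, homing, and migration. Using bioinformatics tools, this study analyzes the human fetal liver EST data sequenced in our laboratory together with microarray data from public databases, in order to characterize the human fetal liver transcriptome and to lay a sound foundation for proteome and gene-function research.
     In addition, following the SARS outbreak, a second line of work was undertaken to identify the proteins expressed by SARS-CoV and their functions and to support proteome identification: gene prediction on SARS-CoV (BJ-01) and functional inference for the predicted proteins.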
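     Research content: (1) preprocess the ESTs to obtain the valid human fetal liver EST sequences; (2) cluster the ESTs correctly to obtain EST abundance information, and identify the ESTs; (3) classify the known genes by GO and by KEGG, which avoids the drawbacks of manual functional classification, and build a standardized human fetal liver expression profile; (4) compare the known-gene data of human fetal liver with microarray data to identify the characteristics of human fetal liver and to analyze the relationships among related tissues; (5) assemble the function-unknown ESTs in silico and validate the assemblies to obtain full-length cDNAs or complete ORFs; (6) infer the functions of the unknown genes as a basis for gene-function research; (7) build a human fetal liver transcriptome database and a system for identifying proteome mass-spectrometry peptides; (8) perform gene prediction and functional inference on SARS-CoV (BJ-01).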
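     Methods: (1) The human fetal liver ESTs were preprocessed: repeatedly sequenced sequences, foreign sequences, and sequences shorter than 100 bp were excluded so that only valid sequences entered later analysis; vector sequences were removed with VecScreen and repetitive sequences with a locally installed repeat_masker. (2) The ESTs were aligned against the NT database and, using a score of at least 200 together with whether the function of the hit is known, divided into function-known and function-unknown classes. (3) BLAST was used to align the ESTs against the UniGene, DoTs, and MGC databases and the human transcriptome database predicted by Twinscan, giving more accurate EST abundance information. (4) The function-known genes were classified by GO with DAVID and also classified by KEGG. (5) Five related tissues were selected from the microarray data and analyzed with DAVID to identify the characteristics of the human fetal liver transcriptome, and the relationships among the five tissues were analyzed by hierarchical clustering. (6) The unknown ESTs were assembled in silico with Phrap and validated by alignment against the original ESTs and the four transcriptome databases; ATGpr was used to check completeness and to detect ORFs, and a corresponding protein database was built. (7) The function-unknown genes obtained were analyzed with Prosite, Pfam, PSORT, SOSUI, and in silico gene mapping. (8) For the SARS-CoV (BJ-01) genome, 12 gene prediction methods were first compared, and then heuristic models, gene identification,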
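The first two method steps can be illustrated with a minimal Python sketch. It is not the pipeline used in the thesis: only the two thresholds (100 bp and a score of 200) come from the abstract, while the `Est` record, the function names, and the treatment of "repeatedly sequenced" reads as exact duplicate sequences are assumptions made for illustration. Vector and repeat masking (VecScreen, repeat_masker) are not modeled.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 100  # from the abstract: sequences shorter than 100 bp are excluded
MIN_SCORE = 200      # from the abstract: an NT hit needs a score of at least 200


@dataclass(frozen=True)
class Est:
    est_id: str    # hypothetical identifier
    sequence: str  # nucleotide sequence of the read


def preprocess(ests):
    """Keep ESTs of at least MIN_LENGTH_BP, dropping exact duplicate sequences.

    Assumption: duplicates are detected by case-insensitive sequence identity;
    the first occurrence is kept.
    """
    seen = set()
    kept = []
    for est in ests:
        seq = est.sequence.upper()
        if len(seq) < MIN_LENGTH_BP or seq in seen:
            continue
        seen.add(seq)
        kept.append(est)
    return kept


def classify(best_score, hit_has_known_function):
    """Label an EST 'known' or 'unknown' from its best NT hit.

    best_score is None when the search returned no hit. This two-way split
    is a simplification of the grouping described in the abstract.
    """
    if best_score is not None and best_score >= MIN_SCORE and hit_has_known_function:
        return "known"
    return "unknown"
```

In practice `best_score` and `hit_has_known_function` would be parsed from the alignment output; how that is done depends on the search tool and is left out of the sketch.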
Bioinformatics Study on Transcriptome of Human Fetal Liver Aged 22 Weeks of Gestation
     Background: Human fetal liver aged 22 weeks of gestation (HFL22w) consists of hepatic parenchymal cells and hematopoietic stem/progenitor cells, and corresponds to the turning point between immigration and emigration of the hematopoietic system. We studied HFL22w previously, but the data sources have since improved: (1) the rapid growth of expressed sequence tags (ESTs) in dbEST; (2) the renewal of the GenBank non-redundant database; (3) the establishment of Gene Ontology (GO); (4) the increase in tissue expression profiling data from microarrays; (5) the continued refinement of UniGene, DoTs, MGC and the Twinscan program. These improvements call for a renewed study of HFL22w, in support of protein identification in proteomics, protein-protein interaction network research, and the study of new gene function.
     Aims: (1) to cluster the ESTs to obtain EST frequency information, and to identify genes; (2) to classify known genes by GO in order to build a standard expression profile of HFL22w and compare it with those of other tissues; (3) to validate the results and obtain predicted proteins and their functional information from unknown ESTs.
     Methods: The ESTs were first searched against the GenBank non-redundant database and the UniGene, DoTs, MGC and Twinscan databases to identify genes and to refine the clustering of the ESTs. After the known ESTs were classified with GO, the unknown ESTs were assembled with PHRAP and validated, yielding a full-length ORF database for HFL22w. The encoded proteins were then analyzed for functional information. Finally, the profile of known ESTs was compared with the expression profiles of five tissues from microarray data.
     Results: A total of 16674 ESTs were sequenced from a 3'-directed cDNA library of HFL22w. Among them, 8097 (48.6%) (Group I) matched known genes or had partial homology to known genes; 4271 (25.6%) (Group II) exhibited no significant homology to known genes; and the remaining 4306 (25.8%) (Group III) were genomic sequences of unknown function, mitochondrial genomic sequences, bacterial DNA and repetitive sequences. The 2483 genes corresponding to Group I can be divided into 425 gene categories by GO classification. Some of the genes are related to metabolism, biosynthesis, development, cell proliferation, defense response, cell migration, hemopoiesis and endocytosis. The correlation coefficient (0.994) between Group I and the fetal liver microarray data indicates that they are highly similar. Comparison of microarray data from five tissues (fetal liver, bone marrow, liver, thymus and lymph node) indicates that fetal liver has more genes than the other four tissues in the categories of reproduction, coagulation, homeostasis, regulation of gene expression (epigenetic), biosynthesis, energy pathways, cell migration, response to pathogenic bacteria, and natural killer cell mediated cytolysis. Hierarchical clustering of these tissues shows that thymus and lymph node are the most closely related, followed by thymus and bone marrow and by liver and fetal liver; fetal liver and bone marrow cluster last. The 2416 genes corresponding to Group II were assembled, increasing their average length from 342 bp to 1682 bp; 2098 of them (86.84%) were extended. Of these 2098 genes, 1037 (49.43%) were validated against the UniGene, DoTs, MGC and Twinscan databases. We then predicted the characteristics of the proteins (1921 genes) with a length of not less than 30 aa and obtained 277 profiles or patterns, of which more than 10 types are discussed.
     Conclusions: (1) EST clustering shows that highly expressed genes are few in number but account for more ESTs than the others; (2) we obtained 1379 new genes; (3) GO analysis showed that human fetal liver displays typical gene expression patterns related to its special physiological functions; (4) comparison of gene expression across the five tissues showed that human fetal liver and liver are more closely related to each other than to the other tissues; (5) 1037 full-length cDNAs and ORFs were obtained by assembling and validating unknown ESTs.

Bioinformatics Study on Genome of SARS-CoV (BJ-01)
     Significance: SARS, an atypical pneumonia of unknown aetiology, was recognized at the end of February 2003. To understand and cure the disease, we must obtain its genetic information.
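For the HFL22w assemblies, the Results mention proteins with a length of not less than 30 aa. The sketch below shows one naive way such a length filter could be applied to an assembled sequence. It is an illustration under stated assumptions, not the method used in the study: it scans only the three forward reading frames for the longest ATG-to-stop ORF, and the function names are hypothetical. It does not reproduce ATGpr or any other tool named in the abstract.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_PROTEIN_AA = 30  # threshold quoted in the abstract


def longest_orf(sequence):
    """Return (start, end) of the longest ATG...stop ORF on the forward strand.

    end is exclusive and includes the stop codon. Returns None when no frame
    contains an ATG followed by an in-frame stop codon.
    """
    seq = sequence.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best


def protein_length(orf):
    """Number of amino acids encoded by an ORF, excluding the stop codon."""
    start, end = orf
    return (end - start) // 3 - 1


def passes_length_filter(sequence):
    """True when the longest forward ORF encodes at least MIN_PROTEIN_AA residues."""
    orf = longest_orf(sequence)
    return orf is not None and protein_length(orf) >= MIN_PROTEIN_AA
```

A fuller treatment would also scan the reverse complement and handle ORFs truncated at either end of the sequence, both of which this sketch ignores.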
引文
1 http://www.ornl.gov/sci/techresources/Human_Genome/project/clintonl.shtml
    2 International Human Genome Sequencing Consortium. Initial sequencing and analysis of thehuman genome. Nature. 2001.409: 860 - 921.
    3 J. Craig Venter, Mark D. Adams, Eugene W. Myers, Peter W. Li, Richard J. Mural, Granger G. Sutton, Hamilton 0. Smith, Mark Yandell, Cheryl A. Evans, Robert A. Holt, Jeannine D. Gocayne, Peter Amanatides, Richard M. Ballew, Daniel H. Huson, Jennifer Russo Wortman, Qing Zhang, Chinnappa D. Kodira, Xiangqun H. Zheng, Lin Chen,et al. The Sequence of the Human Genome. Science. 2001. 291 (5507): 1304-1351.
    4 http://www.genome.gov/10005139
    5 http://www.sciam.com/article.cfm?articleID=000064A6-4A21-lC6F-84A9809EC588EF21
    6 Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, et al. Complementary DNA sequencing: expressed sequence tags and human genome project. Science. 1991 Jun 21. 252(5013):1651-6.
    7 http://genome.wustl.edu/est/
    8 Ramirez M, Graham MA, Bianco-Lopez L, Silvente S, Medrano-Soto A, Blair MW, Hernandez G, Vance CP, Lara M. Sequencing and analysis of common bean ESTs. Building a foundation for functional genomics. Plant Physiol. 2005 Apr; 137(4): 1211-27.
    9 Tuggle CK, Green JA, Fitzsimmons C, Woods R, Prather RS, Malchenko S, Soares BM, Kucaba T, Crouch K, Smith C, Tack D, Robinson N, O'Leary B, Scheetz T, Casavant T, Pomp D, Edeal BJ, Zhang Y, Rothschild MF, Garwood K, Beavis W. EST-based gene discovery in pig: virtual expression patterns and comparative mapping to human. Mamm Genome. 2003 Aug;14(8):565-79.
    10 Lippert D, Zhuang J, Ralph S, Ellis DE, Gilbert M, Olafson R, Ritland K, Ellis B, Douglas CJ, Bohlmann J. Proteome analysis of early somatic embryogenesis in Picea glauca. Proteomics. 2005 Feb;5(2):461-73.
    11 Suliman-Pollatschek S, Kashkush K, Shats H, Hillel J, Lavi U. Generation and mapping of AFLP, SSRs and SNPs in Lycopersicon esculentum. Cell Mol Biol Lett. 2002;7(2A):583-97.
    12 http://www.ncbi.nlm.nih.gov/genome/flcdna/
    13 Mao M, Fu G, Wu JS, et al. Identification of genes expressed in human CD34(+) hematopoietic stem/progenitor cells by expressed sequence tags and efficient full-length cDNA cloning. Proc Natl Acad Sci U S A,1998,95:8175-8180.
    14 http://www.cdgdc.edu.cn/yxbslw/pxjg/2001/zhangweiping.htm
    15 Yongtao Yu, Chenggang Zhang, Gangqiao Zhou, et al. Gene expression profiling in human fetal liver and identification of tissue-and developmental-stage-specific genes through compiled expression profiles and efficient cloning of full-length cDNAs. Genome Research, 2001, 11:1392-1403.
    16 Gelfand M.S., Mironov A.A., Pevzner P.A. Gene recognition via spliced sequence alignment, Proc. Natl. Acad. Sci. USA. 1996. 93: 9061-9066.
    17 http://www-hto.usc.edu/software/procrustes/
    18 Birney,E. and Durbin,R. Dynamite: a flexible code generating language for dynamic programming methods used in sequence comparison. Proc. Int. Conf. Intell. Syst. Mol. Biol., 1997. 5: 56-64.
    19 Florea,L., Hartzell,G., Zhang,Z., Rubin,G.M. and Miller,W. A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res. 1998. 8: 967-974.
    20 Batzoglou,S., Pachter,L., Mesirov,J., Berger,B. and Lander,E.S. Human and mouse gene structure: comparative analysis and application to exon prediction. Genome Res., 2000. 10: 950-958.
    21 Burge,C. and Karlin,S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol., 1997. 268:78-94.
    22 Reese,M.G., Eeckman,F.H., Kulp,D. and Haussler,D. Improved splice site detection in Genie. In Istrail,S., Pevzner,P. and Waterman,M. (eds), First Annual International Conference on Computational Molecular Biology (RECOMB). ACM Press, New York, NY, 1997. pp. 232-240.
    23 Zhang,M.Q. Identi?cation of protein coding regions in the human genome by quadratic discriminant analysis. Proc. Natl Acad. Sci. USA, 1997. 94: 565-568. [Erratum (1997) Proc. Natl Acad. Sci. USA, 94, 5495]
    24 Snyder,E.E. and Stormo,G.D. Identi?cation of protein coding regions in genomic DNA. J. Mol. Biol., 1995. 248:1-18.
    25 Korf I, Flicek P, Duan D, Brent MR.Integrating genomic homology into gene structure prediction. Bioinformatics. 2001 ;17 Suppl 1:S140-8.
    26 Meyer IM, Durbin R. Gene structure conservation aids similarity based gene prediction. Nucleic Acids Res. 2004 Feb 4; 32(2): 776-83. Print 2004.
    27 http://www.who.int/csr/sars/country/2003_06_05/en/
    28 http://database.cpst.net.cn/popul/special/sars/artic/30517133107.html
    29 J S M Peiris, S T Lai, L L M Poon, Y Guan, L Y C Yam, W Lim, J Nicholls, W K S Yee, W W Yan, M T Cheung, V C C Cheng,K H Chan, D N C Tsang, RWH Yung, T K Ng, K Y Yuen, and members of the SARS study group. Coronavirus as a possible cause of severe acute respiratory syndrome. Lancet, 2003, 361: 1319~1325.
    30 Drosten C, Gunther S, Preiser W, van der Werf S, Brodt HR, Becker S, Rabenau H, Panning M, Kolesnikova L, Fouchier RA, Berger A, Burguiere AM, Cinatl J, Eickmann M, Escriou N, Grywna K, Kramme S, Manuguerra JC, Muller S, Rickerts V, Stunner M, Vieth S, Klenk HD, Osterhaus AD, Schmitz H, Doerr HW. Identification of a novel coronavirus in patients with severe acute respiratory syndrome. N Engl J Med, 2003,348:1967~1976.
    31 Thomas G Ksiazek, D V M, Ph.D., Dean Erdman, Dr. P.H.,Cynthia Goldsmith, M.S., Sherif R. Zaki, M.D., Ph.D., Teresa Peret, Ph.D.,Shannon Emery, B.S., Suxiang Tong, Ph.D., Carlo Urbani, M.D.James A. Comer, Ph.D., M.P.H., Wilina Lim, Pierre E. Rollin, M.D,Kim Ha Nghiem, B.A., Scott Dowell, M.D., M.P.H., Ai-Ee Ling, M.D.,Charles Humphrey, Ph.D., Wun-Ju Shieh, M.D., Jeannette Guarner, M.D.,Christopher D. Paddock, M.D., Paul Rota, Ph.D., Barry Fields, Ph.D.,Joseph DeRisi, Ph.D., Jyh-Yuan Yang, Ph.D., Nancy Cox, Ph.D., James Hughes, M.D.,James W. LeDuc, Ph.D., William J. Bellini, Ph.D., Larry J. Anderson, M.D.,and the SARS Working Group. A novel coronavirus associated with severe acute respiratory syndrome. N Engl J Med, 2003,348:1947~1958.
    32 Avendano M, Derkach P, Swan S. Clinical course and management of SARS in health care workers in Toronto: a case series.CMAJ,2003,168(13):1649~1660.
    33 QIN Lei, XIONG Bin, LUO Cheng, GUO Zong-Ming, HAO Pei, SU Jiong, NAN Peng, FENG Ying, SHI Yi-Xiang, YU Xiao-Jing, LUO Xiao-Min, CHEN Kai-Xian, SHEN Xu3, SHEN Jian-Hua, ZOU Jian-Ping, ZHAO Guo-Ping, SHI Tie-Liu, HE Wei-Zhong, ZHONG Yang, JIAGN Hua-Liang, LI Yi-Xue. Identification of probable genomic packaging signal sequence from SARS- CoV genome by bioinformatics analysis. Acta Pharmacol Sin, 2003,24 (6): 489~ 496.
    34 Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde RF,Moreno RF. Complementary DNA sequencing: expressed sequence tags and human genome project.Science. 1991. 252(5013): 1651-1656.
    35 http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen.html
    36 http://www.ncbi.nlm.nih.gov/VecScreen/contam.html#Definition
    37 http://www.ncbi.nlm.nih.gov/VecScreen/UniVec.html
    38 ftp://ftp.ncbi.nih.gov/pub/UniVec/
    39 http://www.ncbi.nlm.nih.gov/VecScreen/VecScreen_docs.html
    40 http://www.ebi.ac.uk/blastall/vectors.html
    41 http://firstmarket.com/firstmarket/cutter/cut2.html
    42 Smith, T.F. and Waterman, M.S. Identification of common molecular subsequences. J Mol Biol. 1981.147:195-197.
    43 http://bozeman.mbt.washington.edu/phrap.docs/phrap.html
    44 Pontius, J.U., Wagner, L., and Schuler, G.D. UniGene: A unified view of the transcriptome. In The NCBI handbook. National Center for Biotechnology Information, Bethesda, MD.
    45 Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. "Basic local alignment search tool." J. Mol. Biol. 1990. 215:403-410.
    46 Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res.1997. 25:3389-3402.
    47 ftp://ftp.ncbi.nih.gov/repository/UniGene/
    48 Schuler GD. Pieces of the puzzle: expressed sequence tags and the catalog of human genes. J Mol Med.,1997. 75(10): 694-698.
    49 Schuler GD et al. A gene map of the human genome. Science 1996; 274: 540-546.
    50 Boguski MS, Schuler GD. ESTablishing a human transcript map. Nat Gene, 1995.10: 369-371.
    51 Harris, N.L. Genotator: A workbench for sequence annotation. Genome Research 1997. 7(7): 754-762.
    52 http://www.fruitfly.org/~nomi/genotator/need.html
    53 http://binfo.ym.edu.tw/core/intro/genotator_guid.html
    54 http://www.cgr.ki.se/cgr/groups/sonnhammer/Blixem.html
    55 Sonnhammer, ELL and Durbin, R. "A workbench for Large Scale Sequence HomologyAnalysis". Comput. Applic. Biosci, 1994. 10:301-307.
    56 Sonnhammer, ELL and Durbin, R. "An expert system for processing sequence homology data". ISMB 1994.2:363-368
    57 http://binfo.ym.edu.tw/core/intro/genotator_guid.html
    58 http://www.cgb.ki.se/cgb/groups/sonnhammer/Belvu.html
    59 http://www.ncbi.nlm.nih.gov/BLAST/
    60 http://binfo.ym.edu.tw/core/mtro/gene_class_intro.htm
    61 Yongtao Yu, Chenggang Zhang, Gangqiao Zhou, et al. Gene expression profiling in human fetal liver and identification of tissue-and developmental-stage-specific genes through compiled expression profiles and efficient cloning of full-length cDNAs. Genome Research, 2001, 11:1392-1403.
    62 http://www.ncbi.nlm.nih.gov/UniGene/FAQ.shtml
    63 张成岗,贺福初,生物信息学方法与实践,科学出版社, 2002. 第一版, P63。
    64 Schuler GD, Boguski MS, Stewart EA, Stein LD, Gyapay G, Rice K, White RE, Rodriguez -Tome P, Aggarwal A, Bajorek E, Bentolila S, Birren BB, Butler A, Castle AB, Chiannilkulchai N, Chu A, Clee C, Cowles S, Day PJ, Dibling T, Drouot N, Dunham I, Duprat S, East C, Hudson TJ, et al. A gene map of the human genome.Science. 1996 Oct 25: 274(5287):540-6.
    65 http://www.ncbi.nlm.nih.gov/UniGene/FAQ.shtml
    66 Yudate HT, Suwa M, Irie R, Matsui H, Nishikawa T, Nakamura Y, Yamaguchi D, Peng ZZ, Yamamoto T, Nagai K, Hayashi K, Otsuki T, Sugiyama T, Ota T, Suzuki Y, Sugano S, Isogai T, Masuho Y. HUNT: launch of a full-length cDNA database from the Helix Research Institute. Nucleic Acids Res. 2001 Jan l;29(l):185-8.
    67 Glasscock AE, Singhania A, Tanouye MA. The mei-P26 gene encodes an RBCC-NHL protein that regulates seizure susceptibility in Drosophila. Genetics. 2005 Jun 3; [Epub ahead of print]
    68 Yin LL, Li JM, Zhou ZM, Sha JH. Identification of a novel testis-specific gene and its potential roles in testis development/spermatogenesis. Asian J Androl. 2005 Jun;7(2):127-37.
    69 Moh MC, Zhang C, Luo C, Lee LH, Shen S. Structural and functional analyses of a novel Ig-like cell adhesion molecule, hepaCAM, in the human breast carcinoma MCF7 cells. J Biol Chem. 2005 May 25; [Epub ahead of print]
    70 Trevaskis J, Walder K, Foletta V, Kerr-Bayles L, McMillan J, Cooper A, Lee S, Bolton K, Prior M, Fahey R, Whitecross K, Morton GJ, Schwartz MW, Collier GR. SH3-domain GRB2-like (endophilin) interacting protein 1 (SGIP1), a novel neuronal protein that regulates energy balance. Endocrinology. 2005 May 26; [Epub ahead of print]
    71 The Gene Ontology Consortium. Gene Ontology: tool for the unification of biology. nature genetics, 2003. 25: 25-29
    72 http://www.geneontology.org/
    73 http://godatabase.org/cgi-bin/go.cgi?view=blast&session_id=14791084782536
    74 http://www.genome.ad.jp/kegg/
    75 Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. The KEGG resource for deciphering the genome. Nucleic Acids Res. 2004 Jan 1: 32 Database issue:D277-80.
    76 http://www.genome.ad.jp/kegg/
    77 Kanehisa M. The KEGG database.Novartis Found Symp. 2002; 247:91-101; discussion 101-3, 119-28, 244-52. Review.
    78 http://www.geneontology.org/GO.tools.html
    79 Sauer I, Dunay IR, Weisgraber K, Bienert M, Dathe M.An apolipoprotein E-derived peptide mediates uptake of sterically stabilized liposomes into brain capillary endothelial cells.Biochemistry. 2005 Feb 15; 44(6):2021-9.
    80 Zhu Y, Hui DY. Apolipoprotein E binding to low density lipoprotein receptor-related protein-1 inhibits cell migration via activation of cAMP-dependent protein kinase A.J Biol Chem. 2003 Sep 19;278(38):36257-63.
    81 Qu X., Wei H., Zhai Y., Que H., Chen Q., Tang F., Wu Y., Xing G., Zhu Y, Liu S., Fan M., He F.. Identification, Characterization, and Functional Study of the Two Novel Human Members of the Semaphorin Gene Family. J. Biol. Chem., 2002, 277(): 35574-35585
    82 Klostermann A, Lutz B, Gertler F, Behl C. The orthologous human and murine semaphorin 6A-1 proteins (SEMA6A-l/Sema6A-l) bind to the enabled/vasodilator-stimulated phosphoprotein-like protein (EVL) via a novel carboxyl-terminal zyxin-like domain. J Biol Chem. 2000 Dec 15;275(50):39647-53.
    83 Deuel TF, Senior RM, Chang D, Griffin GL, Heinrikson RL, Kaiser ET. Platelet factor 4 is chemotactic for neutrophils and monocytes. Proc Natl Acad Sci U S A. 1981 Jul;78(7):4584-7.
    84 Sironen RK, Karjalainen HM, Torronen KJ, Elo MA, Hyttinen MM, Helminen HJ, Lammi MJ. Reticulon 4 in chondrocytic cells: barosensitivity and intracellular localization. Int J Biochem Cell Biol. 2004 Aug;36(8): 1521 -31.
    85 Watari A, Yutsudo M. Multi-functional gene ASY/Nogo/RTN-X/RTN4: apoptosis, tumor suppression, and inhibition of neuronal regeneration. Apoptosis. 2003 Jan;8(1):5-9.
    86 Zermati Y, Garrido C, Amsellem S, Fishelson S, Bouscary D, Valensi F, Varet B, Solary E, Hermine 0. Caspase activation is required for terminal erythroid differentiation. J Exp Med. 2001 Jan 15;193(2):247-54.
    87 Sordet O, Rebe C, Plenchette S, Zermati Y, Hermine O, Vainchenker W, Garrido C, Solary E, Dubrez-Daloz L. Specific involvement of caspases in the differentiation of monocytes into macrophages. Blood. 2002 Dec 15;100(13):4446-53. Epub 2002 Aug 8.
    88 Stifani S, Blaumueller CM, Redhead NJ, Hill RE, Artavanis-Tsakonas S. Human homologs of a Drosophila Enhancer of split gene product define a novel family of nuclear proteins.Nat Genet. 1992 Oct;2(2): 119-27. Erratum in: Nat Genet. 1992 Dec;2(4):343.
    89 Baudin-Creuza V, Vasseur-Godbillon C, Pato C, Prehu C, Wajcman H, Marden MC. Transfer of human alpha- to beta-hemoglobin via its chaperone protein: evidence for a new state. J Biol Chem. 2004 Aug 27;279(35):36530-3. Epub 2004 Jun 26.
    90 Kihm AJ, Kong Y, Hong W, Russell JE, Rouda S, Adachi K, Simon MC, Blobel GA, Weiss MJ. An abundant erythroid protein that stabilizes free alpha-haemoglobin. Nature. 2002 Jun 13; 417 (6890): 758-63.
    91 Maione TE, Gray GS, Petro J, Hunt AJ, Dormer AL, Bauer SI, Carson HF, Sharpe RJ. Inhibition of angiogenesis by recombinant human platelet factor-4 and related peptides.Science. 1990 Jan 5;247(4938):77-9.
    92 Han ZC, Bellucci S, Tenza D, Caen JP. Negative regulation of human megakaryo- cytopoiesis by human platelet factor 4 and beta thromboglobulin: comparative analysis in bone marrow cultures from normal individuals and patients with essential thrombocythaemia and immune thrombocytopenic purpura. Br J Haematol. 1990 Apr; 74(4): 395 -401.
    93 M. Kanehisa, A database for post-genome analysis., Trends Genet., 13:375{376,1997.
    94 Ogata, H., Goto, S., Sato, K., Fujibuchi, W., Bono, H., and Kanehisa, M., KEGG: Kyoto Ency-clopedia of Genes and Genomes, Nucleic. Acids Res., 1999. 27:29-34.
    95 http://wombat.gnf.org/index.html
    96 ftp://ftp.ncbi.nih.gov/repository/UniGene/
    97 Yee SB, Harkema JR, Ganey PE, Roth RA. The coagulation system contributes to synergistic liver injury from exposure to monocrotaline and bacterial lipopolysaccharide. Toxicol Sci. 2003 Aug;74(2):457-69.
    98 Bolandrina E, Soardi F. Liver And Blood Coagulation. I. Behavior Of Factor Xii (Hageman), Of Factor X (Stuart) And Of Fibrinogen In Liver Patients.Gazz Int Med Chir. 1964 Jan 15. 69: 63-79.
    99 Meyer C, Dostou JM, Welle SL, Gerich JE. Role of human liver, kidney, and skeletal muscle in postprandial glucose homeostasis. Am J Physiol Endocrinol Metab. 2002 Feb;282(2):E419-27.
    100 ftp://ftp.ncbi.nih.gov/blast/db/FASTA/
    101 http://www.genome.washington.edu/UWGC/analysistools/Phrap.cfm
    102 http://www.phrap.org/phrap.docs/phrap.html
    103 http://www.phrap.org/consed/consed.html#howToGet
    104 http://deepc2.zool.iastate.edu/aat/cap/capdoc.html
    105 http://deepc2.zool.iastate.edu/aat/cap/cap.html
    106 Huang, X. On Global Sequence Alignment. Computer Applications in the Biosciences. 1994. 10,227-235.
    107 Huang, X. and Madan, A. CAP3: A DNA Sequence Assembly Program. Genome Research, 1999.9:868-877.
    108 http://www.zmdb.iastate.edu/zmdb/EST/assembly.html
    109 Sutton G., White, O., Adams, M., and Kerlavage, A. TIGR Assembler: A new tool for assembling large shotgun sequencing projects. Genome Science & Technology 1995. 1:9-19.
    110 Smith SW, Overbeek R, Woese CR, Gilbert W, Gillevet PM., The Genetic Data Environment: An expandable GUI for multiple sequence analysis. CABIOS. 1994. 10: 671-675.
    111 Sutton GG, White O, Adams MD, Kerlavage AR,(1995), TIGR Assembler: A New Tool for Assembling Large Shotgun Sequencing Projects. Genome Science & Technology. 1995. 1: 9-19.
    112 http://www.tigr.org/tdb/hcd/overview.html
    113 ftp://ftp.tigr.org/pub/software/Assembler/
    114 http://mapage.noos.fr/hubert.wassner/ESTclusters/zEST_intro.html#comparison
    115 http://www.ncbi.nlm.nih.gov/UniGene/FAQ.shtml
    116 http://mapage.noos.fr/hubert.wassner/technology.html
    117 Genome Sequence Assembly: Algorithms and Issues
    118 http://www.chevreux.org/mira_downloads.html
    119 http://www.chevreux.org/uploads/media/mira.html
    120 http://www.cse.ucsc.edu/~learithe/browser/goldenPath/algo.html
    121 Myers, E. et al. Toward simplifying and accurately formulating fragment assembly. J.Comp. Biol, 1995. 2, 275-290.
    122 Serafim Batzoglou, David B. Jaffe, Ken Stanley, Jonathan Butler, Sante Gnerre, Evan Mauceli, Bonnie Berger, Jill P. Mesirov, Eric S. Lander, ARACHNE: A Whole-Genome Shotgun Assembler, Genome Research 177-189
    123 Pavel A. Pevzner, Haixu Tang, and Michael S. Waterman, An Eulerian path approach to DNA fragment assembly, PNAS, 2001. 98(17), 9748-9753, August 14.
    124 Mullikin JC, Ning Z., The phusion assembler. Genome Res., 2003. 13 (1): 81-90.
    125 http://www.hgmp.mrc,ac.uk/ESTBlast/
    126 http://www.genome.ad.jp/manuscripts/GIW93/WS/GIW93W04.html
    127 Dear S., Staden R.; "A sequence assembly and editing program for efficient management of large projects."; Nucleic Acids Res. 1991. 19: 3907-3911.
    128 Staden R.; "Software for sequence analysis."; Genome News 1993. 13: 21-23.
    129 http://www.allgenes.org
    130 http://www.allgenes.org/docs/Posters/GenolnfoCSHL03.htm
    131 Strausberg RL, Feingold EA, Grouse LH, Derge JG, Klausner RD, Collins FS, Wagner L, Shenmen CM, Schuler GD, Altschul SF, Zeeberg B, Buetow KH, Schaefer CF, Bhat NK, Hopkins RF, Jordan H, Moore T, Max SI, Wang J, Hsieh F, Diatchenko L, Marusina K, Farmer AA, Rubin GM, Hong L, Stapleton M, Soares MB, Bonaldo MF, Casavant TL, Scheetz TE, Brownstein M J, Usdin TB, Toshiyuki S, Caminci P, Prange C, Raha SS, Loquellano NA, Peters GJ, Abramson RD, Mullahy SJ, Bosak SA, McEwan PJ, McKernan KJ, Malek JA, Gunaratne PH, Richards S, Worley KC, Hale S, Garcia AM, Gay LJ, Hulyk SW, Villalon DK, Muzny DM, Sodergren EJ, Lu X, Gibbs RA, Fahey J, Helton E, Ketteman M, Madan A, Rodrigues S, Sanchez A, Whiting M, Madan A, Young AC, Shevchenko Y, Bouffard GG, Blakesley RW, Touchman JW, Green ED, Dickson MC, Rodriguez AC, Grimwood J, Schmutz J, Myers RM, Butterfield YS, Krzywinski MI, Skalska U, Smailus DE, Schnerch A, Schein JE, Jones SJ, Marra MA. Mammalian Gene Collection Program Team. Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc Natl Acad Sci U S A., 2002.99(26): 16899-903. Epub 2002 Dec 11.
    132 http://mgc.nci.nih.gov
    133 http://genome.gsc.riken.go.jp/hgmis/publicat/02santa/function.html
    134 http://www.phrap.org/green_group/est_assembly/
    135 http://www.ncbi.nlm.nih.gov/UniGene/FAQ.shtml
    136 http://www.phrap.org/green_group/est_assembly/
    137 Yudate HT, Suwa M, Irie R, Matsui H, Nishikawa T, Nakamura Y, Yamaguchi D, Peng ZZ, Yamamoto T, Nagai K, Hayashi K, Otsuki T, Sugiyama T, Ota T, Suzuki Y, Sugano S, Isogai T, Masuho Y. HUNT: launch of a full-length cDNA database from the Helix Research Institute. Nucleic Acids Res. 2001. 29(1): 185-8.
    138 Suzuki Y, Yamashita R, Nakai K, Sugano S., DBTSS: DataBase of human Transcriptional Start Sites and full-length cDNAs. Nucleic Acids Res. 2002. 1; 30(1): 328-31.
    139 The FANTOM Consortium and the RIKEN Genome Exploration Research Group Phase Ⅰ & Ⅱ Team, Analysis of the mouse transcriptome based on functional annotation of 60, 770 full-length cDNAs. Nature. 2002. 420: 563-573.
    140 http://www.sanbi.ac.za/Dbases.html
    141 http://www.tigr.org/tdb/tgi.shtml
    142 Amos Bairoch,Rolf Apweiler, The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000,Nucleic Acids Research, 2000. 28(1), 45-48.
    143 http://www.ornl.gov/sci/techresources/Human_Genome/publicat/hgn/v8n2/12genpep.shtml
    144 http://genes.cs.wustl.edu/
    145 Korf, I., P. Flicek, D. Duan, and M.R. Brent. Integrating genomic homology into gene structure prediction. Bioinformatics 2001. 17: S140-148.
    146 Hu, P. and M.R. Brent. Using Twinscan to Predict Gene Structures in Genomic DNA Sequences. Wiley, New York.2003.
    147 http://www.tigem.it/Bioinformatics/TheESTMachine.htm
    148 http://image.llnl.gov
    149 Salamov AA, Nishikawa T, Swindells MB. Assessing protein coding region integrity in cDNA sequencing projects. Bioinformatics. 1998 Jun;14(5):384-90.
    150 http://www.ncbi.nlm.nih.gov/UniGene/FAQ.shtml
    151 Kozak,M.. Point mutations define a sequence flanking the AUG initiation codon that modulates translation by eukaryotic ribosomes. Cell, 1986. 44,283-292.
    152 Kozak, M. Interpreting cDNA sequences:some insights from studies on translation. Mamm. Genome,1996. 7,563-574.
    153 http://www.hri.co.jp/atgpr/
    154 Salamov AA, Nishikawa T, Swindells MB. Assessing protein coding region integrity in cDNA sequencing projects.Bioinformatics. 1998 Jun;14(5):384-90.
    155 Nishikawa T, Ota T, Isogai T.,Prediction whether a human cDNA sequence contains initiation codon by combining statistical information and similarity with protein sequences. Bioinformatics. 2000 Nov;16(ll):960-7. Erratum in: Bioinformatics 2001 Mar;17(3):290.
    156 Furuno M, Kasukawa T, Saito R, Adachi J, Suzuki H, Baldarelli R, Hayashizaki Y, Okazaki Y. CDS annotation in full-length cDNA sequence.Genome Res. 2003 Jun;13(6B):1478-87.
    157 Furuno M, Kasukawa T, Saito R, Adachi J, Suzuki H, Baldarelli R, Hayashizaki Y, Okazaki Y. CDS annotation in full-length cDNA sequence. Genome Res.2003. 13(6B):1478-87.
    158 Claverie, J.-M. Computational methods for the identification of genes in vertebrate genomic sequences. Human Mol. Genet. 1997. 6: 1735-1744.
    159 http://bioweb.uwlax.edu/GenWeb/Molecular/Seq_Anal/Translation/translation.html
    160 http://csmres.jmu.edu/bioweb/Bio480/Spring03/groupl/results.htm
    161 http://www.hri.co.jp/atgpr/
    162 Salamov AA, Nishikawa T, Swindells MB. Assessing protein coding region integrity in cDNA sequencing projects.Bioinformatics. 1998 Jun;14(5):384-90.
    163 http://www.hri.co.jp/atgpr/table_help.html
    164 http://www.ncbi.nlm.nih.gov/gorf/gorf.html
    165 Green P . Against a whole-genome shotgun. Genome Res.1997. 7(5):410-417.
    166 Full-length cDNAs: more than just reaching the ends,Manjula Das, Isabelle Harvey, Lee Lee Chu, Manisha Sinha,And Jerry Pelletier,Physiol Genomics. 2001. 6: 57-80.
    167 Michael R. Brent, (2002). Predicting full-length transcripts,TRENDS in biotechnology 20(7).
    168 Levinson,B., Kenwrick,S., Gamel,P, Fisher, K., Gitschier,J., Evidence for a third transcript from the human factor VIII gene.Genomics. 1992. 14,585-589.
    169 De Backer,O., Verheyden,A.M., Martin, B., et al. Structure,chromosomal location,and expression pattern of three mouse genes homologous to the human MAGE gene. Genomics. 1995.28,74-83.
    170 Legouis,R., Hardelin, J.P., Levilliers,J., Claverie,J.M., et al. the candidate gene for the X-linked Kallmann syndrome encodes a protein related to adhesion molecules . Cell 1991. 67,423-435.
    171 Tatusov RL, Koonin EV, Lipman DJ. A genomic perspective on protein families. Science. 1997 Oct 24;278(5338):631-7.
    172 http://au.expasy.org/tools/pi_tool.html
    173 http://au.expasy.org/prosite/
    174 Falquet L., Pagni M., Bucher P., Hulo N., Sigrist C.J, Hofmann K., Bairoch A.The PROSITE database, its status in 2002 Nucleic Acids Res.2002. 30:235-238(2002).
    175 Sigrist C.J.A., Cerutti L., Hulo N., Gattiker A., Falquet L., Pagni M., Bairoch A., Bucher P. PROSITE: a documented database using patterns and profiles as motif descriptors.Brief Bioinform. 2002. 3:265-274.
    176 Hulo N., Sigrist C.J.A., Le Saux V., Langendijk-Genevaux P.S., Bordoli L., Gattiker A., De Castro E., Bucher P., Bairoch A. Recent improvements to the PROSITE databaseNucl. Acids. Res. 2004. 32:D134-D137.
    177 ftp://au.expasy.org/databases/prosite/tools/ps_scan/
    178 http://www.sanger.ac.uk/Software/Pfam/index.shtml
    179 http://sosui.proteome.bio.tuat.ac.jp/sosuiframeO.html
    180 http://sosui.proteome.bio.tuat.ac.jp/sosuisubmit.html
    181 Hirokawa, T., Boon-Chieng, S., Mitaku, S., SOSUI: classification and secondary structure prediction system for memebrane proteins, Bioinformatics, 1998. 14,378-379.
    182 http://www.ebi.ac.uk/InterProScan/
    183 http://www.ebi.ac.uk/interpro/User-FAQ-Scan.htm1#N418
    184 ftp://ftp.ebi.ac.uk/databases/interpro/iprscan/
    185 http://www.ncbi.nlm.nih.gov/genome/sts/epcr.cgi
    186 Lu D, Searles MA, Klug A. Crystal structure of a zinc-finger-RNA complex reveals two modes of molecular recognition. Nature. 2003 Nov 6;426(6962):96-100.
    187 Ze-Guang Han, Qing-Hua Zhang, Min Ye, Li-Xin Kan, Bai-Wei Gu, Kai-Li He, Shao-Lin Shi, Jun Zhou, Gang Fu, Mao Mao, Sai-Juan Chen, Long Yu, and Zhu Chen. Molecular Cloning of Six Novel Kruppel-like Zinc Finger Genes from Hematopoietic Cells and Identification of a Novel Transregulatory Domain KRNB. J Biol Chem,1999,274(50): 35741-35748.
    188 Gilman A.G. G proteins: transducers of receptor-generated signals. Annu. Rev. Biochem. 1987;56:615-649.
    189 Smith T.F., Gaitatzes C, Saxena K., Neer E.J. The WD repeat: a common architecture for diverse functions. Trends Biochem. Sci. 1999; 24:181-185.
    190 Neer E.J., Schmidt C.J., Nambudripad R., Smith T.F. The ancient regulatory-protein family of WD-repeat proteins. Nature 1994; 371:297-300.
    191 http://au.expasy.org/cgi-bin/nicedoc.pl?PDOC00271
    192 Hershko, A., and A. Ciechanover. The ubiquitin system. Annu. Rev. Biochem. 1998; 67: 425-479.
    193 http://au.expasy.org/cgi-bin/nicedoc.pl?PDOC50330
    194 Gorina S., Pavletich N.P. Structure of the p53 tumor suppressor bound to the ankyrin and SH3 domains of 53BP2. Science. 1996. 274:1001-1005.
    195 Luh F.Y., Archer S.J., Domaille P.J., Smith B.O., Owen D., Brotherton D.H., Raine A.R., Xu X., Brizuela L., Brenner S.L., Laue E.D. Nature 1997. 389:999-1003.
    196 Batchelor A.H., Piper D.E., De La Brousse F.C., McKnight S.L., Wolberger C. Science 1998. 279:1037-1041.
    197 Jacobs M.D., Harrison S.C. Structure of an IkappaBalpha/NF-kappaB complex. Cell. 1998. 95: 749-758.
    198 http://prospector.ucsf.edu/ucsfhtml4.0/msdigest.htm
    199 http://nbrfa.georgetown.edu/pir/
    200 George D.G., Barker W.C., Mewes H.-W., Pfeiffer F., Tsugita A. The PIR-International Protein Sequence Database. Nucleic Acids Res. 1996. 24(1): 17-20.
    201 http://www.mips.biochem.mpg.de/
    202 http://bioinf.proteomics.com.cn/Search/SWISS-PROT-index.htm
    203 Bairoch A., Apweiler R. The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res. 1997. 25: 31-36.
    204 http://expasy.hcuge.ch/sprot/sprot-top.html
    205 Bairoch A., Apweiler R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 2000. 28(1): 45-48.
    206 http://www.ebi.ac.uk/trembl/
    207 O'Donovan C, Martin MJ, Gattiker A, Gasteiger E, Bairoch A, Apweiler R. High-quality protein knowledge resource: SWISS-PROT and TrEMBL. Brief Bioinform. 2002 Sep; 3(3): 275-84.
    208 http://www.ebi.ac.uk/IPI/IPIhelp.html
    209 Pruitt K.D., Maglott D.R. RefSeq and LocusLink: NCBI gene-centered resources. Nucleic Acids Res. 2001. 29(1): 137-140.
    210 http://www.ensembl.org/
    211 ftp://ftp.ncbi.nlm.nih.gov/blast/db/
    212 Berman H.M., Westbrook J., Feng Z., Gilliland G., Bhat T.N., Weissig H., Shindyalov I.N., Bourne P.E. The Protein Data Bank. Nucleic Acids Res. 2000. 28: 235-242.
    213 http://www.prf.or.jp/en/dbi.html
    214 Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS. UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 2004. D115-9.
    215 http://www.cbi.pku.edu.cn/tools/emboss/digest.html
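    216 Fukami-Kobayashi K, Saito N. How to make good use of CLUSTALW. Tanpakushitsu Kakusan Koso, 2002. 47(9): 1237~1239.
    217 Zhang Wenguang, Li Jinquan, Zhou Huanmin. Genomic characteristics of SARS-CoV, a new member of the coronaviruses. Acta Genetica Sinica (遗传学报), 2003, 30(6): 501~508.
    218 Chen Yunjia, Gao Ge, Bao Ming, Rodrigo LOPEZ, Wu Jianmin, Cai Tao, Ye Zhiqiang, Gu Xiaocheng, Luo Jingchu. Preliminary analysis of the complete genome sequence of the SARS coronavirus. Acta Genetica Sinica (遗传学报), 2003, 30(6): 493~500.
    219 Li Wuju, Liu Tao. Prediction of antigenic epitopes of the SARS virus. Medical Journal of Chinese People's Liberation Army (解放军医学杂志), special issue on SARS. 2003, 28: s7~s8.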
    220 http://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?val=NC_004718
    221 Marra M A, Jones S J, Astell C R, Holt R A, Brooks-Wilson A, Butterfield Y S, Khattra J, Asano J K, Barber S A, Chan S Y, Cloutier A, Coughlin S M, Freeman D, Girn N, Griffith O L, Leach S R, Mayo M, McDonald H, Montgomery S B, Pandoh P K, Petrescu A S, Robertson A G, Schein J E, Siddiqui A, Smailus D E, Stott J M, Yang G S, Plummer F, Andonov A, Artsob H, Bastien N, Bernard K, Booth T F, Bowness D, Czub M, Drebot M, Fernando L, Flick R, Garbutt M, Gray M, Grolla A, Jones S, Feldmann H, Meyers A, Kabani A, Li Y, Normand S, Stroher U, Tipples G A, Tyler S, Vogrig R, Ward D, Watson B, Brunham R C, Krajden M, Petric M, Skowronski D M, Upton C, Roper R L. The genome sequence of the SARS-associated coronavirus. Sciencexpress, www.sciencexpress.org, Page 1, 10.1126/science.1085953, 2003.
    222 QIN E'de, ZHU Qingyu, YU Man, FAN Baochang, CHANG Guohui, SI Bingyin, YANG Bao'an, PENG Wenming, JIANG Tao, LIU Bohua, DENG Yongqiang, LIU Hong, ZHANG Yu, WANG Cui'e, LI Yuquan, GAN Yonghua, LI Xiaoyu, LU Fushuang, TAN Gang, CAO Wuchun, YANG Ruifu, A complete sequence and comparative analysis of a SARS-associated virus (Isolate BJ01). Chinese Science Bulletin, 2003, 48(10): 941~948.
    223 YiJun Ruan, Chia Lin Wei, Ling Ai Ee, Vinsensius B Vega, Herve Thoreau, Se Thoe Su Yun, Jer-Ming Chia, Patrick Ng, Kuo Ping Chiu, Landri Lim, Zhang Tao, Chan Kwai Peng, Lynette
