Development, Evaluation and Application of New Non-redundant Cotton EST-SSR Markers Based on Public Databases
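Abstract
In recent years, the exponential growth of public databases, especially nucleic acid databases, together with the revolution in bioinformatics technology, has driven rapid progress in genomics and proteomics. How to use these data rationally and efficiently in genomics research is a pressing problem. Redundancy is an important issue in molecular marker development, yet it has rarely been reported. To date no suitable software can analyse the redundancy of both primers in a pair at the same time, which leads to duplicated development effort and wastes time and money. To integrate cotton EST resources systematically, this study developed non-redundant functional markers and carried out related application studies, accumulating technical material for handling the massive information produced by genome and transcriptome sequencing.
     In this study, the 393753 cotton EST sequences available in public databases were analysed, yielding 349815 non-redundant ESTs. Using SSRmine, software developed in-house, 11372 SSR loci were identified in 10507 ESTs; the EST-SSR frequency was 3%, with one SSR every 21 kb on average. Primers were then designed from these non-redundant ESTs, which had not previously been used for marker development in Gossypium. Using SSRD, also developed in-house, a six-step procedure including SSR primer sequence download and preprocessing removed primers derived from partially homologous sequences within the set itself and primers similar to the SSR primers released in the cotton CMD (http://www.cottonmarker.org/), giving 1000 non-redundant primer pairs, named CICRXXX. One hundred of these pairs were evaluated on representative accessions of 12 cotton species for polymorphism information content (PIC) and transferability, and were preliminarily mapped in an Upland × Upland population. Of the 100 SSR primer pairs, 56 amplified stable, clear bands in all 12 accessions; 35 were polymorphic, a polymorphism rate of 35%. PIC ranged from 0.097 to 0.888, with a mean of 0.482. Transferability across the 12 accessions was 100% for the 1 Sea Island cotton (G. barbadense) EST-SSR primer pair, 81% for the 25 G. arboreum pairs and 80.1% for the 74 Upland cotton (G. hirsutum) pairs.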
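The SSR search and the frequency figure above can be illustrated with a minimal Python sketch. It is not SSRmine: the motif lengths (2 to 6 bp) and minimum repeat counts are assumed search criteria that the abstract does not state, and `find_ssrs` and `est_ssr_frequency` are hypothetical helper names. Only the EST counts in the usage lines come from the text.

```python
import re

# Assumed minimum repeat counts per motif length (not given in the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def is_compound_of_shorter_unit(motif):
    """True if the motif is itself a shorter unit repeated, e.g. 'AA' or 'ATAT'."""
    n = len(motif)
    return any(
        n % k == 0 and motif == motif[:k] * (n // k)
        for k in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_compound_of_shorter_unit(motif):
                continue  # counted under its shorter unit, or a homopolymer
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)


def est_ssr_frequency(n_est_with_ssr, n_est_total):
    """Fraction of ESTs that contain at least one SSR."""
    return n_est_with_ssr / n_est_total


# Counts quoted in the abstract: 10507 SSR-containing ESTs out of 349815.
print(round(est_ssr_frequency(10507, 349815) * 100, 1), "%")
```

With the counts from the abstract the frequency rounds to the reported 3%. The 21 kb spacing cannot be reproduced here because the total EST length is not given.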
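     The 1000 new primer pairs were further applied in three studies: genetic map placement in an interspecific (Upland × Sea Island) BC1 population, evaluation of genetic diversity in 67 wild cotton accessions, and functional annotation and KEGG pathway analysis of the ESTs corresponding to the new markers. The results were as follows:
     1. The 380 polymorphic SSR primer pairs selected from the 1000 new EST-SSR primer pairs all amplified stable, clear bands in the 67 wild cotton accessions, detecting 660 fragments in total, an average of 1.73 per primer pair. PIC ranged from 0.026 to 0.824, with a mean of 0.384; the effective number of alleles (Ne) ranged from 1.024 to 5.698, with a mean of 2.64; and gene diversity (H′) averaged 4.38. UPGMA cluster analysis divided the 67 accessions into eight groups at a genetic similarity coefficient of 0.8, broadly consistent with Wendel (2010). Accessions of the same species from different places of origin clustered according to genome group, with no significant association with geographic origin.
     2. The laboratory of Professor Yuan Youlu at the Institute of Cotton Research, Chinese Academy of Agricultural Sciences, has constructed a genetic map of more than 4000 cM from a BC1 population of Zhongmiansuo 36 (ZMS36) × Hai1 (unpublished). Of the 1000 CICR markers, 132 were placed on this map at 136 loci, 63 in the A subgenome and 73 in the D subgenome. The loci covered all 26 chromosomes of the cotton genome; chromosome 19 carried the most, with 17 markers. Forty-two loci showed segregation distortion.
     3. The ESTs corresponding to the 1000 CICR markers were functionally annotated. At level 2, the annotated sequences fell into three categories: cellular component, molecular function and biological process, with 976 assigned to cellular component, 597 to molecular function and 1126 to biological process. Of the corresponding ESTs, 239 (about 23.9%) were assigned to metabolic pathways, most often carbohydrate and energy metabolism, followed by amino acid metabolism.
     The preliminary evaluation and application studies indicate that the newly developed non-redundant EST-SSR markers perform acceptably, and that the redundancy detection and evaluation procedure, constructed here for the first time, is practicable and can support genomics and related applied research.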
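The PIC and Ne statistics reported for the markers can be computed from allele frequencies as in the sketch below. The abstract does not say which formulas were used; this assumes the PIC definition of Botstein et al. (reference 25) and the usual reciprocal-homozygosity form of Ne. The function names are illustrative only.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    (formula assumed here; the abstract does not state it).
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross_term


def effective_alleles(freqs):
    """Effective number of alleles, Ne = 1 / sum(p_i^2)."""
    return 1 / sum(p * p for p in freqs)


# Two equally frequent alleles at a locus.
print(pic([0.5, 0.5]), effective_alleles([0.5, 0.5]))
```

Per-marker values would then be averaged over all scored primer pairs to obtain means comparable to those quoted above.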
In recent years, the exponential growth of public databases, especially nucleic acid data, and the revolution in bioinformatics technology have driven rapid development of genomics and proteomics. How to use these data reasonably and efficiently in genomics research is an urgent problem. Redundancy has become an important problem in molecular marker development, but there are few relevant reports. Until now there has been no suitable software for analysing the redundancy of both primers of a pair simultaneously, which causes repeated research and development and wastes time and cost. To integrate cotton EST (Expressed Sequence Tag) resources systematically, this work developed non-redundant functional markers and carried out related applied research, accumulating technical material for the abundant information from genome and transcriptome sequencing.
     The software Clustal X was used to analyse the redundancy of 393753 Gossypium ESTs available in public databases. By mining the resulting 349815 non-redundant ESTs with SSRmine, software developed by ourselves, a total of 11372 SSR (Simple Sequence Repeat) loci were found in 10507 ESTs. The frequency of ESTs containing SSRs was 3%, with an average of one SSR in every 21 kb of EST sequence. One thousand new non-redundant EST-SSR primer pairs were developed from these non-redundant ESTs, which had not previously been used for marker development in Gossypium. To obtain non-similar primers, designated CICR (China Institute of Cotton Research) XXX, we used SSRD, software developed by ourselves, in six steps: SSR primer sequence download, pretreatment, Blastn, extraction of primer numbers with a similarity score above 81%, extraction of redundant primer pairs, and listing of redundant primers on one line. These steps removed homologous sequences within the set itself and primers similar to those released in CMD (Cotton Marker Database) from different cotton species. Of these, 100 primer pairs were evaluated for polymorphism information content (PIC) and transferability using twelve cotton species, comprising seven representative diploid species and five tetraploid species, and were preliminarily mapped in an F2 population of G. hirsutum L. × G. hirsutum. In total, 56 of the 100 SSR primer pairs amplified stable and clear bands in the 12 accessions, and 35 of these 56 were polymorphic, a primer polymorphism ratio of 35%. PIC of these primers ranged from 0.097 to 0.888, with an average of 0.482. Transferability among the twelve cotton species was 100% for the one EST-SSR primer pair from Gossypium barbadense L., 81% for the 25 primer pairs from G. arboreum and 80.1% for the 74 primer pairs from G. hirsutum. These results indicate that the new non-redundant EST-SSR markers are effective and that the method is feasible.
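The pair-level redundancy check described above might look roughly like the following sketch. It is not SSRD. Python's `difflib.SequenceMatcher` ratio is swapped in for the Blastn similarity score purely so the example runs without external tools, and the rule that a pair is redundant only when both its forward and reverse primers exceed the 81% threshold against the same known pair is an assumption. All names and sequences are hypothetical.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.81  # similarity cut-off quoted in the abstract


def similarity(a, b):
    """Stand-in for a Blastn similarity score: difflib ratio in [0, 1]."""
    return SequenceMatcher(None, a.upper(), b.upper()).ratio()


def is_redundant(pair, known_pairs, threshold=THRESHOLD):
    """Assumed rule: both primers of the pair match one known pair."""
    fwd, rev = pair
    return any(
        similarity(fwd, k_fwd) > threshold and similarity(rev, k_rev) > threshold
        for k_fwd, k_rev in known_pairs
    )


def filter_pairs(candidates, known_pairs):
    """Keep candidates not redundant with known pairs or with earlier candidates."""
    kept = {}
    for name, pair in candidates.items():
        if not is_redundant(pair, list(known_pairs) + list(kept.values())):
            kept[name] = pair
    return kept


# Made-up primer sequences, for illustration only.
known = [("ACGTACGTACGTACGTACGT", "TTGGCCAATTGGCCAATTGG")]
candidates = {
    "c1": ("ACGTACGTACGTACGTACGT", "TTGGCCAATTGGCCAATTGG"),  # same as a known pair
    "c2": ("AAAAAAAAAAAAAAAAAAAA", "CCCCCCCCCCCCCCCCCCCC"),
    "c3": ("ACGTACGTACGTACGTACGT", "CCCCCCCCCCCCCCCCCCCC"),  # only forward matches
    "c4": ("AAAAAAAAAAAAAAAAAAAA", "CCCCCCCCCCCCCCCCCCCC"),  # duplicate of c2
}
print(list(filter_pairs(candidates, known)))
```

Comparing each candidate against both the external set and the candidates already kept mirrors the two kinds of redundancy the text mentions: similarity to released CMD primers and homology within the new set itself.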
     The 1000 new primer pairs were then applied to genetic mapping with a BC1 population of G. hirsutum × G. barbadense, assessment of genetic diversity in 67 wild cotton accessions, and functional annotation and KEGG metabolic pathway analysis of the corresponding ESTs. The results were as follows:
     1. A total of 380 polymorphic primer pairs selected from the 1000 SSR primer pairs were used to amplify 67 wild cotton accessions and produced stable and clear bands. Six hundred and sixty DNA fragments were obtained across all accessions, an average of 1.73 per primer pair. The polymorphism information content (PIC) of these primers ranged from 0.026 to 0.824, with an average of 0.384; the effective number of alleles (Ne) varied from 1.024 to 5.698, with an average of 2.64; and the Shannon-Weaver diversity index (H) averaged 4.38. UPGMA cluster analysis classified the accessions into eight groups at a genetic similarity coefficient of 0.8. The clustering was broadly in accord with the results of Wendel (2010). It also showed that accessions of the same cotton species from different places clustered by genome group, with no significant correlation with geographical origin.
     2. A genetic map with a total genetic distance of over 4000 cM had been constructed in the laboratory of Professor Yuan Youlu at the Cotton Research Institute, Chinese Academy of Agricultural Sciences, based on linkage analysis of a BC1 population (135 individuals) of ZMS36 × Hai1 with screened SSR primers (unpublished data). One hundred and thirty-two CICR markers were mapped onto this genetic map at 136 loci (63 in the A subgenome and 73 in the D subgenome), covering all 26 chromosomes, with the most markers on chromosome 19. Forty-two of the loci related to CICR markers showed segregation distortion.
     3. The ESTs corresponding to the 1000 CICR markers were functionally annotated, and at level 2 they were classified into three types: cellular component, molecular function and biological process. Among them, 976 belonged to cellular component, 597 to molecular function and 1126 to biological process. Two hundred and thirty-nine of the ESTs (23.9%) were associated with metabolic pathways. Carbohydrate and energy metabolism was the most frequent, followed by amino acid metabolism.
     The preliminary evaluation and application research showed that the new non-redundant EST-SSR markers were effective, and that the redundancy identification and evaluation method is feasible and can be used in related genomics applications.
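References
1. An Zewei, Zhao Yanhong, Cheng Han, et al. Development and application of EST-SSR markers in rubber tree. 2009,31(3):311-319
    2. Cai Caiping. Construction and application of a high-density genetic map of tetraploid cultivated cotton species. [PhD dissertation]. Nanjing: Nanjing Agricultural University,2009
    3. Chen Xiangyan, Li Wei, Dai Haiying, et al. Analysis of SSR information in soybean EST resources. Soybean Science,2009,28(3):394-399
    4. Guo Wangzhen, Wang Kai, Zhang Tianzhen. Evolution of the A and D genomes of Gossypium studied with SSR markers. Acta Genetica Sinica,2003,30(2):183-188
    5. Lü Yuanda, Cai Caiping, Wang Lei, et al. Distribution characteristics of Sea Island cotton EST-SSRs and development and utilization of new markers. Chinese Science Bulletin (Chinese edition),2010,55(19):1886-1890.
    6. Meng Yanyan. Effects of nitric oxide on antioxidant enzymes and the leaf proteome during cotton leaf senescence. [PhD dissertation]. Beijing: Chinese Academy of Agricultural Sciences,2011
    7. Song Guoli, Cui Rongxia, Wang Kunbo, et al. Rapid extraction of cotton DNA by a modified CTAB method. Cotton Science,1998,10(5):273-275
    8. Wang Changbiao, Guo Wangzhen, Cai Caiping, et al. Distribution characteristics, development and utilization of Gossypium raimondii EST-SSRs. Chinese Science Bulletin (Chinese edition),2006,21(3):316-320
    9. Wang Changbiao. Bioinformatic analysis of ESTs related to cotton fiber development. [Master's thesis]. Nanjing: Nanjing Agricultural University,2007
    10. Wei Libin, Zhang Haiyang, Zheng Yongzhan, et al. Development and preliminary study of EST-SSR markers in sesame. Acta Agronomica Sinica,2008,34(12):2077-2084.
    11. Xu Zhaolong, Yi Jinxin, Yu Guihong, et al. EST-SSR analysis of genetic diversity in six salt-tolerant Chenopodiaceae species. Journal of Plant Genetic Resources,2011,12(1):113-120
    12. Yu Yu, Wang Zhiwei, Feng Changhui, et al. Genetic evaluation of Gossypium herbaceum EST-SSRs. Acta Agronomica Sinica,2008,34(12):2085-2091
    13. Yu Yu. Differences in gamete recombination rate and segregation distortion in interspecific cotton populations, and construction of a high-density molecular marker genetic map. [PhD dissertation]. Wuhan: Huazhong Agricultural University,2010
    14. Zhang Jun, Wu Yaoting, Guo Wangzhen, et al. Rapid detection of cotton microsatellite markers by PAGE/silver staining. Cotton Science,2000,12(5):267-269
    15. Zhang Peipei, Wang Xiaqing, Yu Yang, et al. Isolation, evaluation and mapping of the first batch of microsatellite markers derived from the Sea Island cotton genome. Acta Agronomica Sinica,2009,35(6):1013-1020
    16. Zhang Wei, Liu Fang, Li Shaohui, et al. QTL analysis of yield and its components in Upland cotton recombinant inbred lines. Acta Agronomica Sinica,2011,37(3):433-442
    17. Zhang Yanxin, Lin Zhongxu, Li Wu, et al. Development and application of Sea Island cotton EST-SSR primers. Chinese Science Bulletin (Chinese edition),2007,52(15):1779-1787.
    18. Zhao Liang, Cai Caiping, Zhang Tianzhen, et al. Fine mapping of the red plant gene (R1) in Upland cotton. Chinese Science Bulletin (Chinese edition),2009,54:888-891
    19. Zhu Huayu. Origin of the D genome of tetraploid cotton species and origin and divergence of genes related to cotton fiber development. [PhD dissertation]. Nanjing: Nanjing Agricultural University,2010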
    20. Adams K L, Cronn R, Percifield R. Genes duplicated by polyploidy show unequal contributions to the transcriptome and organ-specific reciprocal silencing. Proc Natl Acad Sci USA,2003,100(8): 4649-4654
    21. Baker R.J., Longmire J.L., Van den Bussche R. A. Organization of repetitive elements in the upland cotton genome (Gossypium hirsutum). Journal of Heredity,1995,86:178-185.
    22. Bassam BJ, Caetano-Anollés G, Gresshoff PM. Fast and sensitive silver staining of DNA in polyacrylamide gels. Anal Biochem,1991,196:80-83
    23. Beasley JO. Meiotic chromosome behavior in species hybrids, haploids, and induced polyploids of Gossypium. Genetics,1942,27:25-54.
    24. Beasley JO. The origin of the American tetraploid Gossypium species. Am. Nat,1940,74:285-286.
    25. Botstein D, White R L, Skolnick M. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Human Genet,1980,32:314-31
    26. Cardle L, Ramsay L, Milbourne D. Computational and experimental characterization of physically clustered simple sequence repeats in plants. Genetics,2000,156:847-854
    27. Chen Z.J., Scheffler B.E., Dennis E., Triplett B.A., Zhang T.Z., Guo W.Z., Chen X.Y., Stelly D.M., Rabinowicz P.D., Town C.D., Arioli T., Brubaker C., Cantrell R.G., Lacape J.M., Ulloa M., Chee P., Gingle A.R., Haigler C.H., Percy R., Saha S., Wilkins T., Wright R.J., Deynze A.V., Zhu Y., Yu S., Abdurakhmonov I., Katageri I., Kumar P.A., Rahman M., Zafar Y., Yu J.Z., Kohel R.J., Wendel J.F., and Paterson A.H.,2007, Toward sequencing cotton (Gossypium) genomes, Plant Physiology,145:1303-1310.
    28. Connell JP, Pammi S, Iqbal MJ, et al. A high throughput procedure for capturing microsatellites from complex plant genomes. Plant Molecular Biology Reporter,1998,16:341-349
    29. Cordeiro G M, Casu R, McIntyre C L, Manners J M, Henry R J. Microsatellite markers from sugarcane (Saccharum spp.) ESTs cross transferable to Erianthus and sorghum. Plant Sci,2001,160:1115-1123
    30. Daojun Yuan, Lili Tu, Xianlong Zhang. Generation, Annotation and Analysis of First Large-Scale Expressed Sequence Tags from Developing Fiber of Gossypium barbadense L. PLoS ONE,2011,6(7):1-15
    31. Deyong Lai, Huaizhu Li, Shuli Fan, et al. Generation of ESTs for Flowering Gene Discovery and SSR Marker Development in Upland Cotton. PLoS ONE,2011,6(12):1-11
    32. Dong CG, Ding YZ, Guo WZ, et al. Fine mapping of the dominant glandless gene Gl2e in Sea Island cotton (Gossypium barbadense L.). Chinese Sci Bull,2007,52:3105-3109
    33. Eujayl I, Sorrells M E, Baum M. Isolation of EST-derived microsatellite markers for genotyping the A and B genomes of wheat. Theor Appl Genet,2002,104:399-407
    34. Feingold S, Lloyd J, Norero N. Mapping and characterization of new EST-derived microsatellites for potato (Solanum tuberosum L.). Theor Appl Genet,2005,111:456-466
    35. Frelichowski Jr JE, Palmer MB, Main D, et al. Cotton genome mapping with new microsatellites from Acala 'Maxxa' BAC-ends. Molecular Genetics and Genomics,2006,275(5):479-491
    36. Fryxell PA. A revised taxonomic interpretation of Gossypium (Malvaceae). Rheedea,1992,2:108-165.
    37. Fryxell PA. The natural history of the cotton tribe. Texas A&M University Press, College Station, Texas,1979.
    38. Guo WZ, Cai CP, Wang C, et al. A preliminary analysis of genome structure and composition in Gossypium hirsutum. BMC Genomics,2008,9:314
    39. Guo WZ, Cai CP, Wang CB, et al. A microsatellite-based, gene-rich linkage map reveals genome structure, function and evolution in Gossypium. Genetics,2007,176:527-541
    40. Guo W Z, Wang W, Zhou B L, Zhang T Z. Cross-species transferability of G. arboreum-derived EST-SSRs in the diploid species of Gossypium. Theor Appl Genet,2006,112:1573-1581
    41. Hackauf B, Wehling P. Identification of microsatellite polymorphisms in an expressed portion of the rye genome. Plant Breed,2002,121:17-25
    42. Han Z G, Guo W Z, Song X L, Zhang T Z. Genetic mapping of EST-derived microsatellites from the diploid Gossypium arboreum in allotetraploid cotton. Mol Genet Genom,2004,272:308-327
    43. Han Z, Wang C, Song X, Guo W, Gou J, Li C, Chen X, Zhang T. Characteristics, development and mapping of Gossypium hirsutum derived EST-SSRs in allotetraploid cotton. Theor Appl Genet,2006,112:430-439
    44. Hoffman SM, Yu JZ, Grum DS, et al. Identification of 700 new microsatellite loci from cotton (G. hirsutum L.). Journal of Cotton Science,2007,11(4):208-241
    45. Hua-Yu Zhu, Tian-Zhen Zhang, Lu-Ming Yang, et al. EST-SSR sequences revealed the relationship of D-genome in diploid and tetraploid Species in Gossypium. Plant Science,2009,176:397-405
    46. Jonathan F. Wendel and Richard C. Cronn. Polyploidy and the evolutionary history of cotton. Advances in Agronomy,2003,78:139-186.
    47. Junkang Rong, Colette Abbey, John E. Bowers, et al. A 3347-locus genetic recombination map of sequence tagged sites reveals features of genome organization, transmission and evolution of cotton (Gossypium). Genetics,2004,166:389-417.
    48. K. Arumuganathan, E. D. Earle. Nuclear DNA content of some important plant species. Plant Molecular Biology Reporter,1991,9(3):208-218
    49. Kosambi DD. The estimation of map distance from recombination values. Ann Eugen,1944,12:172-175.
    50. Liu RZ, Wang BH, Guo WZ, et al. Quantitative trait loci mapping for yield and its components by using two immortalized populations of a heterotic hybrid in Gossypium hirsutum L. Mol Breeding,2011, DOI:10.1007/s11032-011-9547-0
    51. Liu S, Saha S, Stelly D, et al. Chromosomal assignment of microsatellite loci in cotton. Journal of Heredity,2000b,91(4):326-32.
    52. Metzgar D, Bytof J, Wills C. Selection against frameshift mutations limits microsatellite expansion in coding DNA. Genome Res,2000,10:72-80
    53. Nguyen TB, Giband M, Brottier P, et al. Wide coverage of the tetraploid cotton genome using newly developed microsatellite markers. Theor. Appl. Genet.,2004,109:167-175
    54. Park Y-H, Alabady MS, Ulloa M, et al. Genetic mapping of new cotton fiber loci using EST-derived microsatellites in an interspecific recombinant inbred line (RIL) cotton population. Molecular Genetics and Genomics,2005,274:428-441.
    55. Peng J H, Nore L, Lapitan V. Characterization of EST-derived microsatellites in the wheat genome and development of eSSR markers. Funct Integr Genom,2005,5:80-96
    56. Qian N, Zhang X W, Guo W Z, et al. Fine mapping of open bud duplicate genes in homoeologous chromosomes of tetraploid cotton. Euphytica,2009,165:325-331
    57. Qin HD, Guo WZ, Zhang YM, et al. QTL mapping of yield and fiber traits based on a four-way cross population in Gossypium hirsutum L. Theor Appl Genet,2008,117:883-894
    58. Qureshi SN, Saha S, Kantety RV, et al. EST-SSR: A new class of genetic markers in cotton. Journal of Cotton Science,2004,8:112-123
    59. Reddy OUK, Pepper AE, Abdurakhmonov I, et al. New dinucleotide and trinucleotide microsatellite marker resources for cotton genome research. Journal of Cotton Science,2001,5:103-113
    60. Scott K D, Eggler P, Seaton G, Rossetto M, Ablett E M, Lee S L, Henry R J. Analysis of SSRs derived from grape ESTs. Theor Appl Genet,2000,100:723-726
    61. Shen X L, Guo W Z, Zhu X F, Yuan Y L, Kohel R J, Zhang T Z. Molecular mapping of QTLs for qualities in three diverse lines in Upland cotton using SSR markers. Mol Breed,2005,15:169-181
    62. Song X L, Zhang T Z. Identification of quantitative trait loci controlling seed physical and nutrient traits in cotton. Seed Sci Res,2007,17:243-251
    63. Taliercio E, Allen RD, Essenberg M, et al. Analysis of ESTs from multiple Gossypium hirsutum tissues and identification of SSRs. Genome,2006,49:306-319.
    64. Thiel T, Michalek W, Varshney R K. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L). Theor Appl Genet,2003,106:411-422
    65. Van Ooijen, J. W., and R. E. Voorrips,2001, JoinMap Version 3.0: Software for the Calculation of Genetic Linkage Maps. CPRO-DLO, Wageningen, The Netherlands.
    66. Wang CB, Guo WZ, Cai CP, et al. Characterization, development and exploitation of EST-derived microsatellites in Gossypium raimondii Ulbrich. Chinese Science Bulletin,2006,51(5):557-561.
    67. Wendel J F. New World cottons contain Old World cytoplasm. Proc Natl Acad Sci USA,1989,86:4132-4136
    68. Wendel J. F., Schnabel A, Seelanan T. An unusual ribosomal DNA sequence from Gossypium gossypioides reveals ancient, cryptic, intergenomic introgression. Molecular Phylogenetics and Evolution,1995,4(3):298-313.
    69. Wendel JF, Albert VA. Phylogenetics of the cotton genus (Gossypium): Character-state weighted parsimony analysis of chloroplast-DNA restriction site data and its systematic and biogeographic implications. Syst Bot,1992,17:115-143.
    70. Wendel, J.F., Brubaker, C. & Seelanan, T. (2010). The Origin and Evolution of Gossypium. Physiology of Cotton. pp. 1-18.
    71. Yang C, Guo W Z, Li G Y, Gao F, Lin S S, Zhang T Z. QTLs mapping for verticillium wilt resistance at seedling and maturity stages in Gossypium barbadense L. Plant Sci,2008,174:290-298
    72. Yu J, Kohel RJ, Dong RJ. Development of integrative SSR markers from TM-1 BACs. Proceedings of Beltwide Cotton Conference CD-Rom,2002
    73. Yu Y, Yuan DJ, Liang SG, et al. Genome structure of cotton revealed by a genome-wide SSR genetic map constructed from a BC1 population between Gossypium hirsutum and G. barbadense. BMC Genomics,2011,12:15
    74. Zaitzev GS. A contribution to the classification of the genus Gossypium L. Bull. Appl. Bot., Genet. Plant Breeding,1928,18:1-65. (in Russian, in English on p.39-65)
    75. Zhao L, Cai C P, Zhang T Z, Guo W Z. Fine mapping of the red plant gene R1 in upland cotton (Gossypium hirsutum). Chin Sci Bull,2009,54(9):1529-1533
    76. Zhao X P, Si Y, Hanson R E, et al. Dispersed repetitive DNA has spread to new genomes since polyploidy formation in cotton. Genome Res,1998,8:479-492.
    77. Zhao Xin-ping, Wing R A, Paterson A. H. Cloning and characterization of the majority of repetitive DNA in cotton (Gossypium L.). Genome,1995,38(6):1177-1188.
    78. Zheng-Sheng Zhang, Mei-Chun Hu, Jian Zhang, et al. Construction of a comprehensive PCR-based marker linkage map and QTL mapping for fiber quality traits in upland cotton (Gossypium hirsutum L.). Mol Breeding,2009,24:49-61
    79. Zhongxu Lin, Yanxin Zhang, Xianlong Zhang, et al. A high-density integrative linkage map for Gossypium. Euphytica,2009,66:35-45
    80. Zhu H Y, Han X Y, Lu J H, Zhao L, Xu X Y, Zhang T Z, Guo W Z. Structure, expression differentiation and evolution of duplicated fiber developmental genes in Gossypium barbadense and G. hirsutum. BMC Plant Biol,2011,11:40
