A Mathematical Analysis of HIV Evolution
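  • Author: Chen Zekai (陈泽凯)
  • Keywords: HIV; natural vector; Python; MEGA; phylogenetic tree
  • Chinese journal title (code): TXWL
  • English journal title: China New Telecommunications
  • Institution: The High School Affiliated to Shaanxi Normal University
  • Publication date: 2019-04-20
  • Publisher: China New Telecommunications (中国新通信)
  • Year: 2019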
  • Volume: v.21
  • Language: Chinese
  • Record ID: TXWL201908174
  • Page count: 3
  • Issue: 08
  • CN: 11-5402/TN
  • Pages: 218-220
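Abstract
This paper applies the natural vector method proposed by Yau (2011) to an evolutionary analysis of three groups of viruses: HIV-1, HIV-2, and PLV. Python is used to compute the natural vector of each sequence and the pairwise distances between sequences, and MEGA is then used to draw a phylogenetic tree from the resulting distance matrix. To test the feasibility of the method, 20 complete genome sequences were selected from the HIV Sequence Database, and phylogenetic trees were built both with our method and with traditional multiple sequence alignment (MSA). The results show that our method clearly outperforms MSA and also takes less time. Our method can therefore serve as a useful tool for research on the evolution of HIV.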
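The workflow described in the abstract (one natural vector per sequence, then pairwise distances) can be sketched in Python as follows. This is a minimal illustration and not the author's code. It assumes the common 12-dimensional form of the natural vector (per-nucleotide count, mean position, and normalized second central moment), 1-based positions, and Euclidean distance. The function names and the toy sequences are hypothetical.

```python
import math


def natural_vector(seq):
    """Return a 12-dimensional natural vector for a DNA sequence.

    Assumed layout: counts of A, C, G, T; mean position of each base;
    second central moment of each base scaled by 1 / (n_k * n).
    Characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        # 1-based positions at which this base occurs (an assumption)
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            # A base that never occurs contributes zeros
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(sequences):
    """Map {name: sequence} to (names, symmetric distance matrix)."""
    names = list(sequences)
    vectors = [natural_vector(sequences[name]) for name in names]
    matrix = [[euclidean(u, v) for v in vectors] for u in vectors]
    return names, matrix


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the paper
    toy = {"seq1": "ACGTACGTAC", "seq2": "ACGTTCGTAC", "seq3": "GGGTTTAAAC"}
    names, matrix = distance_matrix(toy)
    for name, row in zip(names, matrix):
        print(name, " ".join(f"{d:.3f}" for d in row))
```

Because each sequence is reduced to a fixed-length vector, no alignment step is needed, and the cost of building the matrix grows with the number of sequence pairs rather than with an alignment search. The resulting matrix would still need to be written in a format that the tree-building tool accepts before a tree can be drawn.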
        
References
[1] Amano K, Nakamura H, Ichikawa H. Self-organizing clustering: a novel non-hierarchical method for clustering large amount of DNA sequences[J]. Genome Informatics, 2003, 14:575-576.
[2] Emrich S J, Kalyanaraman A, Aluru S. Algorithms for large-scale clustering and assembly of biological sequence data[J]. Handbook of Computational Molecular Biology, 2006:13.1-13.30.
[3] FitzGerald P C, Shlyakhtenko A, Mir A A, et al. Clustering of DNA sequences in human promoters[J]. Genome Research, 2004, 14(8):1562-1574.
[4] Waterman M S. Introduction to computational biology: maps, sequences and genomes[M]. CRC Press, 1995.
[5] Abe T, Kanaya S, Kinouchi M, et al. Informatics for unveiling hidden genome signatures[J]. Genome Research, 2003, 13(4):693-702.
[6] Chuzhanova N A, Jones A J, Margetts S. Feature selection for genetic sequence classification[J]. Bioinformatics (Oxford, England), 1998, 14(2):139-143.
[7] Karlin S, Ladunga I. Comparisons of eukaryotic genomic sequences[J]. Proceedings of the National Academy of Sciences, 1994, 91(26):12832-12836.
[8] Nakashima H, Ota M, Nishikawa K, et al. Genes from nine genomes are separated into their organisms in the dinucleotide composition space[J]. DNA Research, 1998, 5(5):251-259.
[9] Yau S S T, Wang J, Niknejad A, et al. DNA sequence representation without degeneracy[J]. Nucleic Acids Research, 2003, 31(12):3078-3080.
[10] Liu L, Ho Y, Yau S. Clustering DNA sequences by feature vectors[J]. Molecular Phylogenetics and Evolution, 2006, 41(1):64-69.
[11] Yau S S T, Yu C, He R. A protein map and its application[J]. DNA and Cell Biology, 2008, 27(5):241-250.
[12] Carr K, Murray E, Armah E, et al. A rapid method for characterization of protein relatedness using feature vectors[J]. PLoS One, 2010, 5(3):e9550.
[13] Yu C, Liang Q, Yin C, et al. A novel construction of genome space with biological geometry[J]. DNA Research, 2010, 17(3):155-168.
[14] Larkin M A, Blackshields G, Brown N P, et al. Clustal W and Clustal X version 2.0[J]. Bioinformatics, 2007, 23(21):2947-2948.
[15] Edgar R C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity[J]. BMC Bioinformatics, 2004, 5(1):113.
[16] Katoh K, Misawa K, Kuma K, et al. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform[J]. Nucleic Acids Research, 2002, 30(14):3059-3066.
[17] Wang L, Jiang T. On the complexity of multiple sequence alignment[J]. Journal of Computational Biology, 1994, 1(4):337-348.
[18] Musto H, Cacciò S, Rodríguez-Maseda H, et al. Compositional constraints in the extremely GC-poor genome of Plasmodium falciparum[J]. Memórias do Instituto Oswaldo Cruz, 1997, 92(6):835-841.
[19] Deng M, Yu C, Liang Q, et al. A novel method of characterizing genetic sequences: genome space with biological distance and applications[J]. PLoS One, 2011, 6(3):e17293.
