Machine Learning-Based Prediction of Interspecies Transmission and Antigenic Relationships of Influenza A Viruses
Abstract
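Avian influenza viruses are avian-adapted influenza A viruses. Over the past decade or so, their interspecies transmission has caused heavy loss of life and property and drawn close public attention. The H3N2 subtype is another influenza A virus with a major impact on human society: its antigenic variation renders vaccines ineffective and makes global influenza surveillance considerably harder. Studying the interspecies transmission and antigenic relationships of these two classes of influenza A virus is therefore of both theoretical and practical importance. Using machine learning, information theory, and feature selection, this work develops and improves prediction models for avian-to-human transmission of avian influenza viruses and for antigenic relationships among H3N2 influenza viruses. It also identifies 90 signature amino acid positions for avian-to-human transmission and 18 key amino acid positions for H3N2 antigenic variation, which can provide early warning for public health and offer leads for research on the relevant molecular determinants and underlying mechanisms.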
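     First, because no avian influenza virus has yet been experimentally verified as unable to transmit from birds to humans, and because the one-class SVM suits problems where negative samples are hard to define, we explored the feasibility of using a one-class SVM to predict avian-to-human transmission. Avian influenza virus protein sequences were encoded by amino acid composition, dipeptide composition, and autocorrelation coefficients, and a one-class SVM prediction model was built whose accuracy exceeds that of the existing back-propagation neural network model.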
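     Second, while building the negative test samples in the preceding work, we found that our negative samples were more reliable than those used in existing prediction models. We therefore enlarged both sample classes and adopted a conventional two-class approach to improve the prediction of avian-to-human transmission while mining biologically meaningful features. Ninety signature amino acid positions were first selected by an information-entropy method. After these positions were encoded by physicochemical properties, several feature selection methods (Relief, mRMR, information gain, and a genetic algorithm) were used to choose the optimal feature subset. The model built on this subset performed substantially better, and the finally selected physicochemical properties differ markedly between the two sample classes, which supports their validity. Two of these properties are additionally supported by multiple biological studies.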
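     Third, H3N2 antigenic variation data recorded in the literature were collected manually, nearly doubling the amount of data used in the three most recent H3N2 antigenic variation studies. Several scoring strategies (odds ratio, mutual information, and the phi coefficient) were then compared and combined with multiple linear regression, which identified 18 key positions for H3N2 antigenic variation. All 18 lie in the five antigenic epitopes of the HA protein, and 8 coincide with previously identified positively selected positions, indicating that these positions play an important role in H3N2 antigenic variation.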
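     Finally, building on the previous part, we sought to improve the H3N2 antigenic relationship prediction model and lower its false positive rate. Prompted by the observation that some amino acid mutations may not cause antigenic variation unless they change physicochemical properties, we integrated changes in multiple physicochemical properties into the prediction. Candidate properties were screened by mutual information and hierarchical clustering. Experiments show that the resulting model performs considerably better than the model from the previous part and outperforms three other current H3N2 antigenic relationship predictors: the Hamming distance model, the group-scoring multiple linear regression model, and the decision tree. A web tool for H3N2 antigenic relationship prediction was also built to provide an online service for researchers.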
Avian influenza viruses are a class of avian-adapted influenza A viruses. During the past decade, avian influenza viruses have claimed many lives, caused widespread alarm, and drawn close public attention. Influenza H3N2 viruses are another class of influenza A viruses with a significant impact on public health. Their antigenic variants reduce or even eliminate the effectiveness of the current vaccine, complicating global influenza surveillance. Research on the interspecies transmission and antigenic variants of these two kinds of influenza A viruses is of great importance in both theoretical and practical terms. Based on machine learning, information theory, and feature selection methods, the prediction models of avian-to-human transmission of avian influenza viruses and of the antigenic relationship of influenza H3N2 viruses are improved. In addition, 90 signature amino acid positions for avian-to-human transmission of avian influenza viruses and 18 critical amino acid positions for antigenic variants of influenza H3N2 viruses are identified. This study can thereby provide early warning for public health and valuable clues for related research on molecular determinants and underlying mechanisms.
     First, no avian influenza viruses have been experimentally confirmed as unable to infect humans directly, so no reliable negative samples exist, and the one-class support vector machine has been applied successfully to problems where the negative class is not well defined. We therefore explored the feasibility of using a one-class support vector machine to predict avian-to-human transmission of avian influenza viruses. The final prediction model, constructed with amino acid composition, dipeptide composition, and autocorrelation features, achieves a prediction accuracy higher than that of the previous back-propagation neural network model.
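The first two sequence encodings named above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names are hypothetical, non-standard residues are simply dropped before counting (so dipeptides may span a dropped residue), and the autocorrelation features and the one-class SVM itself are omitted.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def _clean(seq):
    """Upper-case the sequence and drop anything that is not a standard residue."""
    return "".join(r for r in seq.upper() if r in AMINO_ACIDS)


def amino_acid_composition(seq):
    """Return the fraction of each standard residue (20 values)."""
    seq = _clean(seq)
    n = len(seq)
    return [seq.count(a) / n if n else 0.0 for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the fraction of each ordered residue pair (400 values)."""
    seq = _clean(seq)
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs)
    return [pairs.count(a + b) / n if n else 0.0
            for a, b in product(AMINO_ACIDS, repeat=2)]


def encode(seq):
    """Concatenate both encodings into one 420-dimensional feature vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)
```

A vector produced by `encode` could then be passed to any one-class classifier; which implementation and kernel the thesis used is not stated in this abstract.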
     Secondly, when we established the negative testing dataset in the preceding study, we found that our negative data are more reliable than the negative data used in the previous prediction model. We therefore increased the number of samples in both classes and constructed a traditional binary-class model to improve the prediction of avian-to-human transmission of avian influenza viruses. The 90 signature positions were selected with an entropy method. Four feature selection methods (Relief, mRMR, information gain, and a genetic algorithm) were then used to mine the optimal physicochemical feature subset. The final prediction model constructed with the optimal feature subset greatly outperforms the other existing prediction models. The results of cross-validation and an independent test show that the final features and the model are effective for predicting the transmission of avian influenza viruses from birds to humans.
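An entropy-based view of alignment positions can be illustrated with per-column Shannon entropy. The sketch below assumes equal-length, pre-aligned sequences and uses hypothetical function names; the abstract does not say how entropy values were turned into the 90 selected positions, so no selection rule is shown.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the residues observed in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def position_entropies(aligned_seqs):
    """Return one entropy value per alignment position.

    Assumes all sequences have the same length (already aligned).
    """
    return [column_entropy(col) for col in zip(*aligned_seqs)]
```

A fully conserved column scores 0 bits, and a column split evenly between two residues scores 1 bit, so comparing these values within and between host classes is one plausible way to rank candidate positions.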
     Thirdly, 394 antigenic relationship records of H3N2 influenza virus were collected from related publications. Different scoring methods, including the phi coefficient, odds ratio, and mutual information, were then compared. Based on a multiple linear regression model and the better scoring method (i.e., the phi coefficient), 18 amino acid positions were identified as critical for antigenic variants of H3N2 influenza virus. All 18 critical positions are located in the five epitopes of the HA protein. Additionally, 8 positions coincide with positive selection positions identified in other studies. The results indicate that the 18 positions play important roles in antigenic variants of H3N2 influenza virus.
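The phi coefficient for a single position can be computed from a 2x2 contingency table. The sketch below assumes each virus pair is described by two binary flags, whether the position differs between the two strains and whether the pair is antigenically distinct; this framing and the function name are illustrative assumptions, not details given in the abstract.

```python
from math import sqrt


def phi_coefficient(mutated, variant):
    """Phi coefficient between two equal-length binary sequences.

    mutated[i] is truthy if the position differs in virus pair i;
    variant[i] is truthy if pair i is labelled an antigenic variant.
    Returns 0.0 when a row or column of the 2x2 table is empty.
    """
    n11 = n10 = n01 = n00 = 0
    for m, v in zip(mutated, variant):
        if m and v:
            n11 += 1
        elif m:
            n10 += 1
        elif v:
            n01 += 1
        else:
            n00 += 1
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0
```

Positions whose mutations coincide with antigenic change score close to 1, while positions that mutate independently of the antigenic label score close to 0.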
     Finally, based on the aforementioned work, we tried to improve the prediction model of the antigenic relationship of H3N2 influenza virus and reduce false positives. Following the hint that physicochemical property changes are more informative for antigenic variants of H3N2 influenza virus, we used physicochemical feature candidates selected by mutual information and hierarchical clustering and constructed the final prediction model with stepwise multiple linear regression. The experimental results on training and testing datasets indicate that our prediction model surpasses the existing prediction models, including the Hamming distance model, the group scoring model, and the decision tree model. Furthermore, we developed a web tool named H3N2-AR that provides an online service for predicting the antigenic relationship of H3N2 influenza virus for researchers in this field.
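For orientation, the simplest of the baselines mentioned above, a Hamming distance model, can be sketched as counting differing residues between two aligned HA sequences and comparing the count with a cutoff. The default threshold of 4 and the function names are hypothetical placeholders; the abstract does not give the cutoff or the set of positions used by the published baseline.

```python
def hamming_distance(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def is_antigenic_variant(seq_a, seq_b, threshold=4):
    """Predict an antigenic variant when the distance reaches the threshold.

    The default threshold is an illustrative assumption, not a published value.
    """
    return hamming_distance(seq_a, seq_b) >= threshold
```

Because every substitution counts equally here, a model of this kind cannot distinguish conservative from radical changes, which is the gap the physicochemical features described above are meant to address.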
