Research on Problems Related to the Discrimination of Outer Membrane Protein Sequences and Structures
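Abstract
Proteomics is one of the main research fields of bioinformatics. Membrane proteins, widely used as drug targets, are an important subject of proteomics research. Outer membrane proteins (OMPs), a family of membrane proteins, are located in the outer membranes of gram-negative bacteria, chloroplasts, and mitochondria, fold into barrel-shaped transmembrane structures, and constitute one of the two major classes of transmembrane proteins. OMPs are closely associated with the pathogenicity and immune function of gram-negative bacteria, which makes them drug targets of great research value, and they take part in physicochemical processes such as non-specific regulation, substance transport, and the formation of selective ion channels. This dissertation takes the bioinformatics of OMPs as its subject: by improving and innovating on protein sequence encoding methods, classification algorithms, and structure prediction models, it aims to raise the level of OMP sequence and structure discrimination and to solve some related problems. The main research contents and contributions are as follows:
     (1) Methods for OMP sequence discrimination and genome mining
     This part studies methods for discriminating OMPs from other protein folding types. The main purposes are to mine new OMPs and their coding genes in genomes, and to accumulate new data for sequence analysis and structure prediction. Using the theory of measures of diversity, the dissertation proposes an OMP sequence discrimination method based on the minimum increment of diversity, and further improves it into a weighted-voting method over the predictions of multiple increments of diversity. This method offers a new means of protein sequence discrimination that is easy to implement and easy to extend to multiclass problems. In addition, to meet the needs of mining OMPs in genomes, several combined feature encoding methods for protein sequences are proposed; weighted amino acid index correlation coefficient features are introduced into the combined features, and the selected feature encoding method is combined with the support vector machine classification algorithm to build the classifier. Both in tests on datasets and in genome mining, this method reaches the best prediction level reported to date, making it an effective tool for OMP mining. Furthermore, feature selection techniques are used to analyze the optimization of the high-dimensional combined features: a filter method selects an effective feature subset, which improves computation speed and even prediction performance.
     (2) Multiclass protein classification algorithms
     The support vector machine (SVM) is a machine learning technique with excellent generalization performance, but it does not handle multiclass classification well and suffers from shortcomings such as blind regions and error accumulation. The emergence of fuzzy SVMs provides a new means of remedying these shortcomings. This dissertation adopts a fuzzy membership calculation based on sample closeness, computes for each sample two errors, one as a positive example and one as a negative example, reconstructs the optimal separating hyperplane of the SVM, and builds bidirectional fuzzy classifiers in "one-versus-one" and directed-graph forms. When applied to membrane protein classification, the algorithm reduces sensitivity to outliers and noisy points and improves classification to some extent, representing a new development of fuzzy multiclass SVMs.
     (3) Combined prediction of signal peptides and topologies of OMPs
     Predicting the topology of transmembrane proteins matters for two reasons: it provides a model framework for inferring tertiary structure from secondary structure, and it helps to revise secondary and tertiary structures. Existing OMP topology prediction methods, when applied to precursor sequences, do not predict signal peptides, and their topology prediction performance degrades because of the signal peptide. Applying hidden Markov model theory, this dissertation builds a combined prediction model for the signal peptides and topologies of OMP precursor sequences, in which the signal peptide becomes part of the topology, and optimizes the model architecture using the latest knowledge. The model achieves the best OMP topology prediction performance reported to date and serves as a convenient tool that integrates signal peptide cleavage site prediction, topology prediction, and sequence discrimination.
     (4) Subcellular localization prediction for transmembrane proteins
     Most existing methods for protein subcellular localization prediction are designed around the properties of water-soluble proteins and cannot effectively predict the subcellular location of transmembrane proteins. Topology prediction methods based on hidden Markov models exploit transmembrane topology information but offer no subcellular localization prediction. This dissertation adapts a transmembrane protein topology prediction model into a subcellular localization prediction tool. When predicting the subcellular location of transmembrane proteins along the cellular secretory pathway, it performs significantly better than general prediction methods, fills a gap in subcellular localization prediction for transmembrane proteins, and opens a new application direction for topology predictors.
     (5) Prediction of small non-coding RNAs that regulate OMPs
     Predicting small non-coding RNAs is a difficult problem of great biological value. No method yet exists specifically for predicting the small non-coding RNAs that regulate a particular class of proteins. This dissertation proposes a principal component analysis-neural network prediction model. By using principal component analysis to remove feature correlation and reduce feature dimensionality, the model improves the performance of the neural network predictor and becomes an effective tool for identifying bacterial small non-coding RNAs. In addition, since base pairing is the main mode of interaction between small non-coding RNAs and OMP mRNAs, a two-stage screening system is designed to predict the small non-coding RNAs that regulate OMPs. The system uses a base-pairing scoring function to search the genome for non-coding regions that pair with known OMP mRNAs with high scores, and then uses the principal component analysis-neural network model to filter out most of the redundancy in the search results. Its advantage is that it can lower the cost of experimental screening and provide low-redundancy candidates for experiments.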
Proteomics is one of the main fields of bioinformatics research. Membrane proteins hold a prominent place in proteomics because of their importance as drug targets for disease treatment and as the main functional components of biomembranes. As a special family of membrane proteins, outer membrane proteins (OMPs) reside in the outer membranes of gram-negative bacteria, chloroplasts, and mitochondria. Most of them fold into beta-barrel structures of 8-22 beta-strands, and together with alpha-helical membrane proteins they make up the two types of transmembrane proteins. OMPs perform a variety of functions, such as mediating non-specific, passive transport of ions and small molecules, selectively allowing the passage of molecules, and taking part in voltage-dependent anion channels. Further, OMPs are related to bacterial adhesion, toxin release, and immunity, and so are becoming valuable drug targets against gram-negative bacteria. Discriminating the sequences and structures of OMPs remains challenging because of the difficulty of experimental validation and structure determination. Various computational approaches are emerging to address these problems. Focusing on the bioinformatics of OMPs, this dissertation studies protein sequence encoding, classification algorithms, and new prediction models in order to improve the accuracy of OMP discrimination and to solve other related problems. The main contents and contributions of the dissertation are summarized as follows:
     (1) The research on new approaches for discrimination of OMPs from other protein folding types, and for OMPs mining in genomes.
     OMP discrimination methods have two main applications: mining new OMPs and their corresponding genes in genomes, and accumulating new data for predicting the secondary and tertiary structures of OMPs. Two new approaches to OMP discrimination have been developed in this research. The first is a prediction method based on the theory of measures of diversity from biomathematics, in which the increment of diversity measures the difference between OMPs and other proteins. This method is easy to implement and to extend to multiclass protein classification. The second is built on combined sequence features and the support vector machine (SVM) algorithm. A protein sequence is encoded by a combined feature encoding scheme that joins weighted amino acid index correlation coefficients with amino acid composition and dipeptide composition. This method outperforms existing methods in the literature for OMP discrimination and provides an effective tool for mining new OMPs in genomes. Furthermore, feature selection techniques are studied to improve the combined feature encoding scheme. A filter method is presented to select the most effective features among the combined features, which helps to accelerate classification and can even improve prediction performance.
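The increment-of-diversity idea can be illustrated with a minimal Python sketch. It assumes the usual definitions from the measure-of-diversity literature, D(X) = N ln N - sum(n_i ln n_i) and ID(X, Y) = D(X + Y) - D(X) - D(Y), and applies them to dipeptide counts. The function names, the choice of dipeptide counts as the only feature, and the toy class profiles are assumptions made for illustration; the dissertation's actual feature set and weighted voting are not reproduced here.

```python
import math
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def dipeptide_counts(seq):
    """Return the 400 dipeptide counts of a sequence in a fixed order."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    return [counts.get(a + b, 0) for a, b in product(AMINO_ACIDS, repeat=2)]


def diversity(counts):
    """Measure of diversity D = N ln N - sum(n_i ln n_i) (assumed definition)."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); it is 0 when X and Y are proportional."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def classify(query_seq, class_profiles):
    """Assign the query to the class whose profile gives the smallest ID."""
    query = dipeptide_counts(query_seq)
    return min(class_profiles,
               key=lambda label: increment_of_diversity(query, class_profiles[label]))
```

In this sketch each class profile is simply the summed dipeptide counts of that class's training sequences, so adding a further class only requires adding one more profile, which is one way to read the claim that the method extends easily to multiclass problems.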
     (2) The research on algorithms for multiclass protein classification problems
     SVMs often perform better than other machine learning techniques in binary classification, but some problems remain unsolved for multiclass SVMs, such as blind regions and error accumulation. Several fuzzy SVM algorithms have therefore been introduced in the literature to improve multiclass SVMs. This research presents a bidirectional fuzzy SVM algorithm, which treats each sample as both a positive sample and a negative sample, so that a sample contributes two errors, one from being positive and one from being negative. Further, the fuzzy membership is defined not only by the relation between a sample and its cluster center but also by the relations among samples, described by the fuzzy connectedness among samples. The bidirectional fuzzy SVM algorithm is implemented in "one-vs-one" or directed-graph frameworks. In tests on membrane protein classification, it is less sensitive to outliers and noise, and outperforms traditional "one-vs-rest" and "one-vs-one" multiclass SVMs.
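To make the notion of fuzzy membership concrete, the sketch below computes a commonly used center-distance membership, s_i = 1 - d_i / (r + delta), where d_i is the distance of a sample to its class center and r is the largest such distance. This is a simplified stand-in: it covers only the sample-to-center relation, not the connectedness among samples or the bidirectional error terms described above. The function name and the `delta` parameter are assumptions for illustration.

```python
import math


def fuzzy_memberships(samples, delta=1e-6):
    """Center-distance fuzzy membership for the samples of one class.

    Samples far from the class center (likely outliers or noise) receive
    memberships close to 0; samples near the center receive values close to 1.
    """
    dim = len(samples[0])
    center = [sum(s[k] for s in samples) / len(samples) for k in range(dim)]
    dists = [math.dist(s, center) for s in samples]
    radius = max(dists)
    # delta > 0 keeps every membership strictly positive.
    return [1.0 - d / (radius + delta) for d in dists]
```

Memberships of this kind are typically used to scale each sample's error term during training, so that an isolated point shifts the separating surface less than a point in the core of its class.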
     (3) The research on methods for combined prediction of signal peptides and topologies of OMPs
     Topology prediction for transmembrane proteins contributes in two ways: first, it offers a framework for investigating the tertiary structures of OMPs from their secondary structures; second, it helps to revise structural predictions for OMPs. However, existing topology predictors cannot predict the signal peptides of OMP precursors, and their accuracy declines because of the influence of signal peptide sequences. In this research, a predictor based on hidden Markov models is developed for the combined prediction of signal peptides and topologies of OMPs. In the model, the signal peptide is treated as part of the whole topology of an OMP precursor, and the architecture is optimized to fit the natural structure of OMPs. The model performs better than other models for topology prediction, and can further be applied to signal peptide prediction and to the discrimination of OMPs in genomes.
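The idea of treating the signal peptide as part of the topology can be sketched with a toy hidden Markov model decoded by the Viterbi algorithm. Everything numeric here is an assumption for illustration: the four states, the reduced two-letter alphabet (`h` for hydrophobic, `p` for polar), and all probabilities are invented, and the architecture is far simpler than a real OMP model (for example, it does not force strands to alternate between the two sides of the membrane).

```python
import math

# S: signal peptide, I: periplasmic loop, M: membrane strand, O: extracellular loop
STATES = ["S", "I", "M", "O"]
START = {"S": 1.0, "I": 0.0, "M": 0.0, "O": 0.0}  # precursors begin in the signal peptide
TRANS = {
    "S": {"S": 0.9, "I": 0.1, "M": 0.0, "O": 0.0},
    "I": {"S": 0.0, "I": 0.7, "M": 0.3, "O": 0.0},
    "M": {"S": 0.0, "I": 0.1, "M": 0.8, "O": 0.1},
    "O": {"S": 0.0, "I": 0.0, "M": 0.3, "O": 0.7},
}
EMIT = {
    "S": {"h": 0.7, "p": 0.3},
    "I": {"h": 0.2, "p": 0.8},
    "M": {"h": 0.6, "p": 0.4},
    "O": {"h": 0.3, "p": 0.7},
}


def _log(p):
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs):
    """Return the most probable state path for a non-empty h/p string."""
    score = [{s: _log(START[s]) + _log(EMIT[s][obs[0]]) for s in STATES}]
    back = []
    for symbol in obs[1:]:
        prev = score[-1]
        col, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] + _log(TRANS[r][s]))
            col[s] = prev[best] + _log(TRANS[best][s]) + _log(EMIT[s][symbol])
            ptr[s] = best
        score.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))
```

Because the signal peptide is just another state, a single decoding pass yields both the cleavage position (the last `S` in the path) and the strand and loop assignment of the mature protein.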
     (4) The research on methods for transmembrane protein subcellular localization prediction
     Existing methods for protein subcellular localization prediction are mainly designed for soluble proteins and are usually not accurate for transmembrane proteins. Topology predictors, on the other hand, are designed for transmembrane proteins but do not provide subcellular localization prediction. This research describes a new approach to predicting the subcellular localization of transmembrane proteins, which adapts existing topology predictors and gives better accuracy than existing methods. It fills a gap in subcellular localization prediction for transmembrane proteins, and is also a new application of topology predictors.
     (5) The research on methods for recognizing small non-coding RNAs in OMPs regulation
     Predicting regulatory small non-coding RNAs (sRNAs) is a difficult problem of great biological value. No approach has yet been presented for predicting the sRNAs that regulate a given protein type. This research describes a method for predicting bacterial sRNAs. In this method, principal component analysis (PCA) is performed to reduce the dimensionality and eliminate the correlation of sRNA sequence features, and a back-propagation (BP) neural network (NN) is constructed for classification. This PCA-NN classifier can effectively predict bacterial sRNAs, and is therefore adopted in a two-phase filtering system for predicting sRNA regulators of OMPs. In the first phase, the system searches non-coding regions for sRNA candidates by base-pair scoring between OMP mRNAs and genomic non-coding regions; in the second phase, it filters out redundant candidates using the PCA-NN classifier. The prediction system can provide less redundant candidates for experiments than general methods.
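The first phase of the two-phase system can be sketched as a simple base-pair scoring function. The version below is an assumption-laden illustration rather than the scoring function used in the dissertation: it considers only ungapped, antiparallel duplexes, and the pair weights (G-C 3, A-U 2, G-U wobble 1) are hypothetical. The second phase (the PCA-NN filter) is not shown.

```python
# Hypothetical pair weights: Watson-Crick pairs plus the G-U wobble pair.
PAIR_SCORE = {
    ("G", "C"): 3, ("C", "G"): 3,
    ("A", "U"): 2, ("U", "A"): 2,
    ("G", "U"): 1, ("U", "G"): 1,
}


def best_duplex_score(mrna, candidate):
    """Best ungapped antiparallel pairing score of a candidate against an mRNA.

    Both sequences are written 5'->3'. The candidate is reversed so that its
    3' end faces the 5' end of each mRNA window, as in an antisense duplex.
    Returns 0 if the candidate is longer than the mRNA.
    """
    rev = candidate[::-1]
    best = 0
    for start in range(len(mrna) - len(rev) + 1):
        window = mrna[start:start + len(rev)]
        score = sum(PAIR_SCORE.get((a, b), 0) for a, b in zip(window, rev))
        best = max(best, score)
    return best
```

For example, with the toy mRNA fragment `AAGGAGGUAAAC`, the candidate `CCUCC` pairs fully with the `GGAGG` stretch, whereas `AAAAA` can pair only with the single `U`. Candidates scoring above a chosen threshold would then be passed to the second-phase classifier.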
    [183]Stirk H J, Woolfson D N, Hutchinson E G, et al. Depicting topology and handedness in jellyroll structures. FEBS Lett,1992,308(1):1-3.
    [184]Flores T P, Moss D S, Thornton J M. An algorithm for automatically generating protein topology cartoons. Protein Eng,1994,7(1):31-37.
    [185]Efimov A V. Structural trees for protein superfamilies. Proteins,1997,28(2): 241-260.
    [186]Mezey P G. Topological shape analysis of chain molecules:An application of the GSTE principle. J. Math. Chem.,1993,12(3):365-373.
    [187]Tusnady G E, Simon I. The HMMTOP transmembrane topology prediction server. Bioinformatics,2001,17(9):849-850.
    [188]Kahsay R Y, Gao G, Liao L. An improved hidden Markov model for transmembrane protein detection and topology prediction and its applications to complete genomes. Bioinformatics,2005,21(9):1853-1858.
    [189]Rabiner L R, Juang B H. An Introduction to Hidden Markov Models. IEEE ASSP Magazine,1986,3(1):4-16.
    [190]Von H G. The signal peptide. J Membr Biol,1990,115(3):195-201.
    [191]Chamberlain A K, Bowie J U. Asymmetric amino acid compositions of transmembrane beta-strands. Protein Sci,2004,13(8):2270-2274.
    [192]刘琪.跨膜蛋白拓扑预测研究:学位论文.上海:上海交通大学,2003.
    [193]Krogh A, Brown M, Mian l S, et al. Hidden Markov models in computational biology. Applications to protein modeling. J Mol Biol,1994,235(5):1501-1531.
    [194]Krogh A. Hidden Markov models for labelled sequences. Proceedings of the 12th I APR International Conference on PatternRecognition,1994,:140-144.
    [195]Krogh A. Two methods for improving performance of an HMM and their application for gene finding. Proc Int Conf Intell Syst Mol Biol,1997,5: 179-186.
    [196]Barrett C, Hughey R, Karplus K. Scoring hidden Markov models. Comput Appl Biosci,1997,13(2):191-199.
    [197]Hua S, Sun Z. Support vector machine approach for protein subcellular localization prediction. Bioinformatics,2001,17(8):721-728.
    [198]Reinhardt A, Hubbard T. Using neural networks for prediction of the subcellular location of proteins. Nucleic Acids Res,1998,26(9):2230-2236.
    [199]Huang Y, Li Y. Prediction of protein subcellular locations using fuzzy k-NN method. Bioinformatics,2004,20(1):21-28.
    [200]Xie D, Li A, Wang M, et al. LOCSVMPSI:a web server for subcellular localization of eukaryotic proteins using SVM and profile of PSI-BLAST. Nucleic Acids Res,2005,33(Web Server issue):105-110.
    [201]Park K J, Kanehisa M. Prediction of protein subcellular locations by support vector machines using compositions of amino acids and amino acid pairs. Bioinformatics,2003,19(13):1656-1663.
    [202]Claverie J M. Fewer genes, more noncoding RNA. Science,2005,309(5740): 1529-1530.
    [203]Deng W, Zhu X, Skogerbo G, et al. Organization of the Caenorhabditis elegans small non-coding transcriptome:genomic features, biogenesis, and expression. Genome Res,2006,16(1):20-29.
    [204]Mattick J S. The functional genomics of noncoding RNA. Science,2005, 309(5740):1527-1528.
    [205]Lau N C, Seto A G, Kim J, et al. Characterization of the piRNA complex from rat testes. Science,2006,313(5785):363-367.
    [206]袁曾任.人工神经网络及其应用.北京:清华大学出版社,1999.
    [207]Charalambous C. Conjugate gradient algorithm for efficient of artificial neural network. IEEE Proceedings,1992,139(3):301-310.
    [208]Battiti R. First-and second-order methods for learning:between steepest descent and newton's method. Neural Computation,1992,4:141-166.
    [209]Hagan M T. Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks,1994,5(5):989-993.
    [210]Uzilov A V, Keegan J M, Mathews D H. Detection of non-coding RNAs on the basis of predicted secondary structure formation free energy change. BMC Bioinformatics,2006,7:173.
    [211]Rivas E, Eddy S R. Noncoding RNA gene detection using comparative sequence analysis. BMC Bioinformatics,2001,2:8.
    [212]Kim S K, Nam J W, Rhee J K, et al. miTarget:microRNA target gene prediction using a support vector machine. BMC Bioinformatics,2006,7:411.
    [213]Eddy S R. Computational genomics of noncoding RNA genes. Cell,2002,109(2): 137-140.
    [214]齐震.Non-coding RNA基因预测及用信息差异度进行种系进化分析:学位论文.北京:中国科学院生物物理研究所,2004.
    [215]Liu C, Bai B, Skogerbo G, et al. NONCODE:an integrated knowledge database of non-coding RNAs. Nucleic Acids Res,2005,33(Database issue):112-115.
    [216]He S, Liu C, Skogerbo G, et al. NONCODE v2.0:decoding the non-coding. Nucleic Acids Res,2008,36(Database issue):170-172.
    [217]Karlin S, Campbell A M, Mrazek J. Comparative DNA analysis across diverse genomes. Annu Rev Genet,1998,32:185-225.
    [218]Storz G, Gottesman S. Versatile roles of small RNA regulators in bacteria. The RNA World:Gesteland R F, Cech T R, Atkins J F, Plainview, NY:Cold Spring Harbor Lab Press,2006:,567-594.
    [219]Zuker M, Mathews D H. Turner Algorithms and Thermodynamics for RNA Secondary Structure Prediction:A Practical Guide in RNA Biochemistry and Biotechnology. NATO ASI Series:Barciszewski J, Clark B F C, Kluwer Academic Publishers,1999.
    [220]Markham N R, Zuker M. DINAMelt web server for nucleic acid melting prediction. Nucleic Acids Res,2005,33(Web Server issue):577-581.
    [221]Andronescu M, Zhang Z C, Condon A. Secondary structure prediction of interacting RNA molecules. J Mol Biol,2005,345(5):987-1001.
    [222]Altschul S F, Erickson B W. A nonlinear measure of subalignment similarity and its significance levels. Bull Math Biol,1986,48(5-6):617-632.
    [223]Vogel J, Wagner E G. Target identification of small noncoding RNAs in bacteria. Curr Opin Microbiol,2007,10(3):262-270.
