WRF-TMH: predicting transmembrane helix by fusing composition index and physicochemical properties of amino acids


Authors: Maqsood Hayat (1) (2)
Asifullah Khan (2)
Keywords: Transmembrane helix; Physicochemical properties; Compositional index; Weighted random forest; Structures of membrane proteins
Journal: Amino Acids
Publication year: 2013
Publication date: May 2013
Volume: 44
Issue: 5
Pages: 1317-1328
Author affiliations: Maqsood Hayat (1) (2)
Asifullah Khan (2)

1. Abdul Wali Khan University, Mardan, Pakistan
2. Pattern Recognition Lab, Department of Computer and Information Sciences, Pakistan Institute of Engineering and Applied Sciences, Nilore, 45650, Islamabad, Pakistan
ISSN: 1438-2199

Abstract

Membrane proteins are prime constituents of a cell and mediate between intracellular and extracellular processes. Predicting transmembrane (TM) helices and their topology provides essential information about the function and structure of membrane proteins. However, this prediction remains a challenging problem in bioinformatics and computational biology because of experimental complexities and the scarcity of experimentally established structures. Therefore, the location and orientation of TM helix segments are predicted from topogenic sequences. In this regard, we propose the WRF-TMH model for effectively predicting TM helix segments. In this model, information is extracted from membrane protein sequences using the compositional index and physicochemical properties. Redundant and irrelevant features are eliminated through singular value decomposition. The selected features from these feature extraction strategies are then fused to develop a hybrid model. A weighted random forest is adopted as the classifier. We used two benchmark datasets, one low-resolution and one high-resolution. Tenfold cross-validation is employed to assess the performance of the WRF-TMH model at different levels, including per protein, per segment, and per residue. The success rates of the WRF-TMH model are quite promising and are the best reported so far on the same datasets. The WRF-TMH model might play a substantial role in, and provide essential information for, further structural and functional studies of membrane proteins. The accompanying web predictor is accessible at http://111.68.99.218/WRF-TMH/.
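
As a rough illustration of the compositional-index idea mentioned in the abstract, the sketch below scores each amino acid by the log-odds of its frequency in TM segments versus non-TM segments, then smooths the scores along a sequence with a sliding window. The abstract does not give the formula, so the log-odds form, the pseudocount, the window size, the zero threshold, and the names `compositional_index`, `windowed_profile`, and `label_residues` are all assumptions made for illustration. The singular value decomposition, feature fusion, and weighted random forest steps of WRF-TMH are not reproduced here.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def compositional_index(tm_segments, non_tm_segments, pseudocount=1.0):
    """Log-odds of each residue occurring in TM versus non-TM segments.

    Assumed formulation: ln(freq_TM / freq_nonTM), with a pseudocount so
    residues unseen in one class do not produce a division by zero.
    """
    tm_counts = Counter("".join(tm_segments))
    non_counts = Counter("".join(non_tm_segments))
    tm_total = sum(tm_counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    non_total = sum(non_counts[a] for a in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
    index = {}
    for a in AMINO_ACIDS:
        tm_freq = (tm_counts[a] + pseudocount) / tm_total
        non_freq = (non_counts[a] + pseudocount) / non_total
        index[a] = math.log(tm_freq / non_freq)
    return index


def windowed_profile(sequence, index, window=5):
    """Average the index over a centered window, truncated at sequence ends."""
    half = window // 2
    profile = []
    for i in range(len(sequence)):
        lo, hi = max(0, i - half), min(len(sequence), i + half + 1)
        # Unknown residue symbols contribute a neutral score of 0.0.
        values = [index.get(r, 0.0) for r in sequence[lo:hi]]
        profile.append(sum(values) / len(values))
    return profile


def label_residues(profile, threshold=0.0):
    """Mark residues above the threshold as helix ('H'), others as '-'."""
    return "".join("H" if v > threshold else "-" for v in profile)


if __name__ == "__main__":
    # Toy training segments, invented purely for this example.
    idx = compositional_index(["LLLLAVVI"], ["KKDDEERS"])
    print(label_residues(windowed_profile("LLKK", idx, window=3)))
```

In a pipeline like the one the abstract describes, such a per-residue profile would be only one feature block; it would be combined with physicochemical features and passed to a trained classifier rather than thresholded directly.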
