Identification and classification of conopeptides using profile Hidden Markov Models

Authors: Silja Laht^a (siljalaht@ebc.ee) ; Dominique Koua^b,c ; Lauris Kaplinski^a ; Frédérique Lisacek^c ; Reto Stöcklin^b ; Maido Remm^a
Keywords: Conotoxin ; Conopeptide ; Hidden Markov Model ; Conopeptide superfamilies ; Protein prediction
Journal: Biochimica et Biophysica Acta - Proteins and Proteomics
Publication year: 2012
Publication date: March 2012
Volume: 1824
Issue: 3
Pages: 488-492
Full-text size: 455 K

Abstract

Conopeptides are small toxins produced by predatory marine snails of the genus Conus. They are studied with increasing intensity due to their potential in neuroscience and pharmacology. The number of existing conopeptides is estimated to be 1 million, but only about 1000 have been described to date. Thanks to new high-throughput sequencing technologies, the number of known conopeptides is likely to increase exponentially in the near future. There is therefore a need for a fast and accurate computational method for identifying and classifying novel conopeptides in large data sets. 62 profile Hidden Markov Models (pHMMs) were built for prediction and classification of all described conopeptide superfamilies and families, based on the different parts of the corresponding protein sequences. These models showed very high specificity in detecting new peptides: 56 out of 62 models did not give a single false positive in a test against the entire UniProtKB/Swiss-Prot protein sequence database. Our study demonstrates the usefulness of these models for automatic classification, with an accuracy of 96% for the mature peptide models and 100% for the pro- and signal peptide models. Our conopeptide profile HMMs can be used for finding and annotating new conopeptides in large datasets generated by transcriptome or genome sequencing. To our knowledge, this is the first time this kind of computational method has been applied to predict all known conopeptide superfamilies and some conopeptide families.
