Statistics of N-terminal alignment as a guide for refining prokaryotic gene annotation
Abstract
Identifying the correct N-terminus of a protein is an important step in genome annotation, yet genomic databases sometimes contain incorrectly annotated N-termini. We analyzed the statistics of surplus or missing N-terminal amino acid residues in the tentatively translated coding sequences of cyanobacterial database entries and found that, on average, about 8-9% of the aligned proteins have a putatively incorrect N-terminus, although the percentage depended on the database entry. In an attempt to find more plausible N-termini for these proteins, we were able to estimate a better-aligning N-terminus in 90% of the cases. TTG was found as a putative initiation codon in most cases of recessed N-termini. This statistical approach, applicable to any group of prokaryotes, will help identify a plausible translation initiation site for each protein-coding gene in newly sequenced genomes, and it also offers a way to refine the N-termini of proteins in already published genomes.
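As an illustration only, the search for alternative initiation sites around an annotated start can be sketched in Python. This is not the authors' procedure: the function name `candidate_starts`, the `max_codons` window, and the choice of ATG, GTG and TTG as the start codon set are assumptions made for the example, and the alignment against homologous proteins that would rank the candidates is not shown.

```python
# Hypothetical sketch: list in-frame alternative start codons near an
# annotated start. The codon sets and window size are assumptions.
START_CODONS = {"ATG", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def candidate_starts(genome, annotated_start, max_codons=50):
    """Return sorted (position, codon) pairs for in-frame candidate starts.

    Positions are 0-based indices into `genome` (forward strand only).
    Positions below `annotated_start` would extend the N-terminus;
    positions above it would shorten it.
    """
    candidates = []

    # Scan upstream codon by codon; an in-frame stop codon ends the scan.
    pos = annotated_start - 3
    steps = 0
    while pos >= 0 and steps < max_codons:
        codon = genome[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        if codon in START_CODONS:
            candidates.append((pos, codon))
        pos -= 3
        steps += 1

    # Scan downstream within the annotated reading frame.
    pos = annotated_start + 3
    steps = 0
    while pos + 3 <= len(genome) and steps < max_codons:
        codon = genome[pos:pos + 3]
        if codon in STOP_CODONS:
            break
        if codon in START_CODONS:
            candidates.append((pos, codon))
        pos += 3
        steps += 1

    return sorted(candidates)


# Toy sequence: TAA TTG CCC ATG AAA GTG CCC TAA, annotated start at the ATG.
toy = "TAATTGCCCATGAAAGTGCCCTAA"
print(candidate_starts(toy, annotated_start=9))
```

In a fuller workflow, each candidate would be translated and compared with aligned homologs, and the start giving the best N-terminal agreement would be preferred.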
