Abstract
Reliable identification of posttranslational modifications is key to understanding various cellular regulatory processes. We describe a tool, InsPecT, to identify posttranslational modifications using tandem mass spectrometry data. InsPecT constructs database filters of the kind that have proved very successful in genomics searches. Given an MS/MS spectrum S and a database D, a database filter selects a small fraction of database D that is guaranteed (with high probability) to contain a peptide that produced S. InsPecT uses peptide sequence tags as efficient filters that reduce the size of the database by a few orders of magnitude while retaining the correct peptide with very high probability. In addition to filtering, InsPecT also uses novel algorithms for scoring and validating in the presence of modifications, without explicit enumeration of all variants. InsPecT identifies modified peptides with better or equivalent accuracy than other database search tools while being 2 orders of magnitude faster than SEQUEST, and substantially faster than X!TANDEM on complex mixtures. The tool was used to identify a number of novel modifications in different data sets, including many phosphopeptides in data provided by the Alliance for Cellular Signaling that were missed by other tools.
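The filtering idea can be illustrated with a minimal sketch. It is not InsPecT's implementation: it assumes the sequence tags have already been derived from the spectrum, treats each tag as a plain amino-acid substring, and ignores the flanking masses and modification handling that a real tag-based filter would need. The function name `filter_database`, the protein identifiers, and the sequences are all hypothetical.

```python
from typing import Dict, List, Tuple


def filter_database(
    database: Dict[str, str], tags: List[str]
) -> List[Tuple[str, int, str]]:
    """Return (protein_id, offset, tag) for every occurrence of any tag.

    Simplifying assumption: a tag is just a short substring of residues.
    Only the regions around these hits would then be passed on to the
    (much more expensive) scoring stage.
    """
    hits = []
    for protein_id, sequence in database.items():
        for tag in tags:
            start = sequence.find(tag)
            while start != -1:
                hits.append((protein_id, start, tag))
                # Continue searching after this hit to catch repeats.
                start = sequence.find(tag, start + 1)
    return hits


# Hypothetical toy database and tags, for illustration only.
database = {
    "P1": "MKTAYIAKQRQISFVKSHFSRQ",
    "P2": "GSHMLEDPVTAYIAG",
    "P3": "MSSNNQWLLPKD",
    "P4": "MAAAGGGEEE",
}
tags = ["TAY", "QWL"]

hits = filter_database(database, tags)
candidates = sorted({protein_id for protein_id, _, _ in hits})
print(hits)
print(candidates)
```

In this toy case the filter discards `P4` entirely and keeps only three short candidate locations for scoring. A linear scan per tag is used here for clarity; a practical filter would search all tags in a single pass over the database.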