Abstract
Using salivary proteins as biomarkers offers a great advantage for the early diagnosis and prognostic evaluation of health conditions and diseases. In this article, we present a computational method for predicting proteins secreted into human saliva. First, we collected the currently known saliva-secreted proteins, together with representative proteins deemed not to be secreted extracellularly into saliva. Second, we pruned the negative data to address class imbalance, and then extracted relevant features from the physicochemical and sequence properties of all remaining proteins. Next, we built a support vector machine classifier, which achieved an average sensitivity, specificity, precision, accuracy and Matthews correlation coefficient of 80.67%, 90.56%, 90.09%, 85.53% and 0.7168, respectively. These results indicate that the selected features and the model are effective. Finally, we applied the classifier to screen all human proteins in UniProt and obtained 5811 predicted saliva-secreted proteins, which may serve as biomarker candidates for further salivary diagnosis.
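For reference, the five performance measures reported above follow from the counts of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics`, the zero-denominator convention, and the example counts are illustrative assumptions and are not taken from the study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: positives (e.g. saliva-secreted proteins) predicted correctly/incorrectly.
    tn/fp: negatives predicted correctly/incorrectly.
    """
    def ratio(num, den):
        # Assumed convention: return 0.0 when the denominator is zero.
        return num / den if den else 0.0

    # Denominator of the Matthews correlation coefficient.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))

    return {
        "sensitivity": ratio(tp, tp + fn),   # true positive rate
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "precision": ratio(tp, tp + fp),     # positive predictive value
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Hypothetical counts for illustration only, not data from the study.
example = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Averaged values such as those in the abstract would typically be obtained by computing these metrics for each evaluation run and taking the mean, although the exact evaluation protocol is not stated here.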