In the postgenomic era, unreviewed protein sequences far outnumber reviewed ones, and their number is growing much faster. However, existing methods for protein subchloroplast localization often ignore the information in these unlabeled proteins. This paper proposes LNP-Chlo, a multi-label predictor based on ensemble linear neighborhood propagation (LNP), which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins to predict the localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features. For readers’ convenience, the LNP-Chlo web server is freely available online at http://bioinfo.eie.polyu.edu.hk/LNPChloServer/.
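
To make the idea of propagating labels from labeled to unlabeled proteins concrete, the following is a minimal sketch of generic linear neighborhood propagation with a simple ensemble over feature views. It is not the LNP-Chlo implementation: the function names, the parameters `k`, `reg`, and `alpha`, the clipping of negative weights, and the averaging of per-view scores are all assumptions made for illustration, and the conversion of scores into a multi-label decision is left out.

```python
import numpy as np

def lnp_weights(X, k=2, reg=1e-3):
    """Row-stochastic weights that reconstruct each point from its k nearest neighbors."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances; a point is never its own neighbor.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]
        Z = X[nbrs] - X[i]                      # neighbors centered on x_i
        G = Z @ Z.T                             # local Gram matrix
        G = G + (reg * np.trace(G) + 1e-9) * np.eye(k)  # regularize for stability
        w = np.linalg.solve(G, np.ones(k))      # sum-to-one solution, up to scaling
        # Simplification (assumption): clip negatives and renormalize instead of
        # solving the nonnegativity-constrained quadratic program.
        w = np.clip(w, 0.0, None)
        W[i, nbrs] = w / w.sum()
    return W

def lnp_propagate(W, Y, alpha=0.9):
    """Closed-form propagation: F = (1 - alpha) * (I - alpha * W)^{-1} Y."""
    n = W.shape[0]
    return (1.0 - alpha) * np.linalg.solve(np.eye(n) - alpha * W, Y)

def ensemble_lnp(views, Y, k=2, alpha=0.9):
    """Average the propagated scores over several feature views (hypothetical ensemble rule)."""
    scores = [lnp_propagate(lnp_weights(X, k=k), Y, alpha=alpha) for X in views]
    return np.mean(scores, axis=0)
```

Here `Y` is an n-by-M matrix with one row per protein and one column per location: labeled proteins carry ones in their known locations and unlabeled proteins are all-zero rows, so their scores come entirely from their neighbors.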