In this paper, we propose a semi-supervised approach that automatically integrates knowledge about unknown malware, drawn from readily available and inexpensive unlabeled data, into the detection system. The novelty of the proposed approach is that it requires no expert effort to update the database of the detection engine. Instead, dynamic changes in malware attack patterns are extracted from the unlabeled data by unsupervised clustering. The extracted geometric information about the intrinsic attack characteristics of the clusters is then integrated into the classification systems of the detection engine, which updates the detection system automatically. The proposed approach uses global K-means clustering with term frequency (TF) and inverse document frequency (IDF) weighting, and with cosine similarity as the distance measure, to extract the cluster information and add it to a support vector machine (SVM) classification system. The approach has been tested extensively on a real malware data set using both static and dynamic malware features. The experimental results show that the proposed semi-supervised approach achieves higher accuracy than existing supervised approaches for all classifiers. With static features alone, the semi-supervised approach improves detection accuracy significantly. When the approach is applied to the run-time characteristics obtained from dynamic feature analysis, the combined effect further increases the detection accuracy of all classifiers, by up to 100% for the SVM and random forest classifiers, thus exceeding existing supervised approaches that use similar features.
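The pipeline described above can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper, and several details are assumptions made only for illustration: the function names (`tfidf`, `global_kmeans`, `augment`) are hypothetical, the IDF uses a smoothed form, the token lists stand in for extracted malware features such as API call names, and the cluster information is integrated simply by appending each sample's cosine similarity to every centroid. The augmented vectors would then be passed to a classifier such as an SVM, which is omitted here.

```python
import math
from collections import Counter


def normalize(v):
    """Scale a vector to unit L2 length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else v


def cosine(a, b):
    """Cosine similarity; inputs are assumed to be unit-length already."""
    return sum(x * y for x, y in zip(a, b))


def tfidf(docs):
    """Turn token lists into unit-length TF-IDF vectors (smoothed IDF, an assumption)."""
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        rows.append(normalize([tf[t] / len(d) * idf[t] for t in vocab]))
    return rows


def assign_all(X, centers):
    """Assign each sample to the centroid with the highest cosine similarity."""
    return [max(range(len(centers)), key=lambda j: cosine(x, centers[j])) for x in X]


def kmeans(X, centers, iters=50):
    """Cosine-based k-means from given initial centers; returns (centers, labels, score)."""
    for _ in range(iters):
        labels = assign_all(X, centers)
        updated = []
        for j, c in enumerate(centers):
            members = [x for x, a in zip(X, labels) if a == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                updated.append(normalize(mean))
            else:
                updated.append(c)  # keep an empty cluster's center unchanged
        if updated == centers:
            break
        centers = updated
    labels = assign_all(X, centers)
    score = sum(cosine(x, centers[a]) for x, a in zip(X, labels))
    return centers, labels, score


def global_kmeans(X, k):
    """Global k-means: grow from 1 to k clusters, trying every sample as the new center."""
    centers, _, _ = kmeans(X, [X[0]])
    for _ in range(1, k):
        best = max((kmeans(X, centers + [x]) for x in X), key=lambda r: r[2])
        centers = best[0]
    return centers


def augment(X, centers):
    """Append similarity-to-centroid features (one assumed way to add cluster information)."""
    return [x + [cosine(x, c) for c in centers] for x in X]


# Hypothetical unlabeled samples, each a list of extracted feature tokens.
docs = [
    ["CreateFile", "WriteFile", "CloseHandle"],
    ["CreateFile", "WriteFile", "WriteFile"],
    ["RegOpenKey", "RegSetValue", "RegCloseKey"],
    ["RegOpenKey", "RegSetValue", "RegSetValue"],
]
X = tfidf(docs)
centers = global_kmeans(X, k=2)
labels = assign_all(X, centers)
features = augment(X, centers)  # these vectors would be fed to the classifier
```

Global K-means is deterministic because it tries every sample as the starting position of each new center, which is why the sketch needs no random seed. The price is many more K-means runs than a single randomly initialized run would need.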