Outlier-eliminated k-means clustering algorithm based on differential privacy preservation
详细信息    查看全文
文摘
Individual privacy may be compromised during the process of mining for valuable information, and the potential for data mining is hindered by the need to preserve privacy. It is well known that k-means clustering algorithms based on differential privacy require preserving privacy while maintaining the availability of clustering. However, it is difficult to balance both aspects in traditional algorithms. In this paper, an outlier-eliminated differential privacy (OEDP) k-means algorithm is proposed that both preserves privacy and improves clustering efficiency. The proposed approach selects the initial centre points in accordance with the distribution density of data points, and adds Laplacian noise to the original data for privacy preservation. Both a theoretical analysis and comparative experiments were conducted. The theoretical analysis shows that the proposed algorithm satisfies ε-differential privacy. Furthermore, the experimental results show that, compared to other methods, the proposed algorithm effectively preserves data privacy and improves the clustering results in terms of accuracy, stability, and availability.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700