Analysis of K-Means and K-Medoids Algorithm For Big Data

Authors: Preeti Arora^a (erpreetiarora07@gmail.com) ; Dr. Deepali^b (deepalivermani@gmail.com) ; Shipra Varshney^c (shipra_vin@yahoo.com)
Keywords: Clustering ; K-Means ; K-Medoids
Journal: Procedia Computer Science
Year of publication: 2016
Volume: 78
Issue: Complete
Pages: 507-512
Full-text size: 790 K

Abstract

Clustering plays a vital role in exploring data, making predictions, and overcoming anomalies in the data. Data points in a dataset that share similar or identical characteristics are grouped into clusters using iterative techniques. As real-world data grows day by day, clustering can identify interesting patterns in very large datasets with little or no background knowledge. In this paper, the two most popular clustering algorithms, K-Means and K-Medoids, are evaluated on the transaction10k dataset from KEEL. The input to these algorithms is a set of randomly distributed data points, and clusters are generated based on their similarity. The comparison results show that the time taken for cluster head selection and the space complexity of cluster overlap are much better in K-Medoids than in K-Means. K-Medoids is also better in terms of execution time, is not sensitive to outliers, and reduces noise compared to K-Means, because it minimizes the sum of dissimilarities of data objects.
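
The outlier claim in the abstract can be illustrated with a minimal Python sketch. It compares the two kinds of cluster representative for a single cluster: the coordinate-wise mean that K-Means uses, and the medoid (the member point with the smallest sum of dissimilarities to the others) that K-Medoids uses. The sample points, the function names, and the choice of Manhattan distance are assumptions made for illustration; none of them come from the paper, and this is not the authors' implementation or their dataset.

```python
def mean_center(points):
    """K-Means style representative: coordinate-wise mean of the cluster."""
    n = len(points)
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / n for d in range(dims))


def manhattan(a, b):
    """Manhattan (L1) distance, used here as the dissimilarity measure (an assumption)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def medoid(points, dist=manhattan):
    """K-Medoids style representative: the member point that minimizes
    the sum of dissimilarities to all points in the cluster."""
    return min(points, key=lambda c: sum(dist(c, p) for p in points))


# Hypothetical cluster: four nearby points plus one distant outlier.
cluster = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (100.0, 100.0)]

centroid = mean_center(cluster)    # dragged toward the outlier
representative = medoid(cluster)   # remains an actual point of the dense group

print("mean:", centroid)
print("medoid:", representative)
```

In this toy cluster the mean lands far from the four nearby points because the outlier pulls it away, while the medoid stays inside the dense group. Because the medoid must be an actual data point, a single extreme value can shift it only as far as another member of the cluster.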
