Active Distance-Based Clustering Using K-Medoids
详细信息    查看全文
  • 关键词:Active k ; medoids ; Active clustering ; Distance ; based clustering ; Centroid ; based clustering
  • 刊名:Lecture Notes in Computer Science
  • 出版年:2016
  • 出版时间:2016
  • 年:2016
  • 卷:9651
  • 期:1
  • 页码:253-264
  • 全文大小:661 KB
  • 参考文献:1.Amsterdam library of object images (aloi) (2004). http://​aloi.​science.​uva.​nl/​
    2.Example data sets for elki (2013). http://​elki.​dbs.​ifi.​lmu.​de/​wiki/​DataSets
    3.Arbelaez, A., Quesada, L.: Parallelising the k-meds clustering problem using space-partitioning. In: Proceedings of the Sixth Annual Symposium on Combinatorial Search, SOCS, Leavenworth, Washington, USA, 11–13 July 2013
    4.Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA, pp. 1027–1035, New Orleans, Louisiana, USA, 7–9 January 2007
    5.Asuncion, A., Newman, D.: UCI machine learning repository datasets (2007). https://​archive.​ics.​uci.​edu/​ml/​datasets.​html
    6.Basu, S., Banerjee, A., Mooney, R.J.: Active semi-supervision for pairwise constrained clustering. In: Proceedings of the Fourth SIAM International Conference on Data Mining, pp. 333–344, Lake Buena Vista, Florida, USA, 22–24 April 2004
    7.Biswas, A., Jacobs, D.W.: Active image clustering with pairwise constraints from humans. Int. J. Comput. Vis. 108(1–2), 133–147 (2014)MathSciNet CrossRef MATH
    8.Chen, M.: Synthesized dataset for k-medoids. http://​www.​mathworks.​com/​matlabcentral/​fileexchange/​28898-k-medoids/​
    9.Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 3rd edn. The MIT Press, Cambridge (2009)MATH
    10.Eriksson, B., Dasarathy, G., Singh, A., Nowak, R.D.: Active clustering: robust and efficient hierarchical clustering using adaptively selected similarities. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS, pp. 260–268, Fort Lauderdale, USA, 11–13 April 2011
    11.Grira, N., Crucianu, M., Boujemaa, N.: Active semi-supervised fuzzy clustering. Pattern Recogn. 41(5), 1834–1844 (2008)CrossRef MATH
    12.Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data: An Introduction to Cluster Analysis. Wiley, Hoboken (1990)CrossRef
    13.Krishnamurthy, A., Balakrishnan, S., Xu, M., Singh, A.: Efficient active algorithms for hierarchical clustering. In: Proceedings of the 29th International Conference on Machine Learning, ICML, Edinburgh, Scotland, UK, June 26-July 1, 2012
    14.Mai, S.T., He, X., Hubig, N., Plant, C., Böhm, C.: Active density-based clustering. In: IEEE 13th International Conference on Data Mining, pp. 508–517, Dallas, TX, USA, 7–10 December 2013
    15.Mobahi, H., Collobert, R., Weston, J.: Deep learning from temporal coherence in video. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 737–744, ICML, Montreal, Quebec, Canada, 14–18 June 2009
    16.Nayar, S., Nene, S.A., Murase, H.: Columbia object image library (coil 100). Department of Computer Science, Columbia University, Technical report, CUCS-006-96 (1996)
    17.Nguyen, X.V., Epps, J., Bailey, J.: Information theoretic measures for clusterings comparison: is a correction for chance necessary? In: Proceedings of the 26th Annual International Conference on Machine Learning, ICML, pp. 1073–1080, Montreal, Quebec, Canada, 14–18 June 2009
    18.Settles, B.: Active learning literature survey. Univ. Wis. Madison 52(55–66), 11 (2010)
    19.Shamir, O., Tishby, N.: Spectral clustering on a budget. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS, pp. 661–669, Fort Lauderdale, USA, 11–13 April 2011
    20.Singh, S.S., Chauhan, N.: K-means v/s k-medoids: a comparative study. In: National Conference on Recent Trends in Engineering and Technology, vol. 13 (2011)
    21.Tong, S., Koller, D.: Support vector machine active learning with applications to text classification. J. Mach. Learn. Res. 2, 45–66 (2001)MATH
    22.Ultsch, A.: Fundamental clustering problems suite (fcps). Technical report, University of Marburg (2005)
    23.Voevodski, K., Balcan, M., Röglin, H., Teng, S., Xia, Y.: Active clustering of biological sequences. J. Mach. Learn. Res. 13, 203–225 (2012)MathSciNet MATH
    24.Vu, V., Labroche, N., Bouchon-Meunier, B.: Improving constrained clustering with active query selection. Pattern Recogn. 45(4), 1749–1758 (2012)CrossRef
    25.Wagstaff, K., Cardie, C., Rogers, S., Schrödl, S.: Constrained k-means clustering with background knowledge. In: Proceedings of the Eighteenth International Conference on Machine Learning (ICML), Williams College, pp. 577–584, Williamstown, MA, USA, June 28–July 1, 2001
    26.Wang, X., Davidson, I.: Active spectral clustering. In: ICDM 2010, The 10th IEEE International Conference on Data Mining, pp. 561–568, Sydney, Australia, 14–17 December 2010
    27.Wauthier, F.L., Jojic, N., Jordan, M.I.: Active spectral clustering via iterative uncertainty reduction. In: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2012, pp. 1339–1347, Beijing, China, 12–16 August 2012
    28.Xiong, C., Johnson, D.M., Corso, J.J.: Active clustering with model-based uncertainty reduction. CoRR, abs/1402.1783 (2014)
    29.Chen, Y., Keogh, E., Batista, G.: UCR time series classification archive (2015). http://​www.​cs.​ucr.​edu/​~eamonn/​time_​series_​data/​
  • 作者单位:Amin Aghaee (19)
    Mehrdad Ghadiri (19)
    Mahdieh Soleymani Baghshah (19)

    19. Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
  • 丛书名:Advances in Knowledge Discovery and Data Mining
  • ISBN:978-3-319-31753-3
  • 刊物类别:Computer Science
  • 刊物主题:Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • 出版者:Springer Berlin / Heidelberg
  • ISSN:1611-3349
文摘
k-medoids algorithm is a partitional, centroid-based clustering algorithm which uses pairwise distances of data points and tries to directly decompose the dataset with n points into a set of k disjoint clusters. However, k-medoids itself requires all distances between data points that are not so easy to get in many applications. In this paper, we introduce a new method which requires only a small proportion of the whole set of distances and makes an effort to estimate an upper-bound for unknown distances using the inquired ones. This algorithm makes use of the triangle inequality to calculate an upper-bound estimation of the unknown distances. Our method is built upon a recursive approach to cluster objects and to choose some points actively from each bunch of data and acquire the distances between these prominent points from oracle. Experimental results show that the proposed method using only a small subset of the distances can find proper clustering on many real-world and synthetic datasets.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700