Fully polynomial-time approximation scheme for a special case of a quadratic Euclidean 2-clustering problem
详细信息    查看全文
  • 作者:A. V. Kel'manov ; V. I. Khandeev
  • 关键词:cluster analysis ; partition ; Euclidean space ; minimum of the sum of squares of distances ; NP ; hardness ; fixed space dimension ; FPTAS
  • 刊名:Computational Mathematics and Mathematical Physics
  • 出版年:2016
  • 出版时间:February 2016
  • 年:2016
  • 卷:56
  • 期:2
  • 页码:334-341
  • 全文大小:290 KB
  • 参考文献:1.M. Bern and D. Eppstein, “Approximation algorithms for geometric problems,” in Approximation Algorithms for NP-Hard Problems, Ed. by D. S. Hochbaum (PWS, Boston, 1997), pp. 296–345.
    2.G. James, D. Witten, T. Hastie, and R. Tibshirani, An Introduction to Statistical Learning (Springer Science, New York, 2013).CrossRef MATH
    3.M. C. Bishop, Pattern Recognition and Machine Learning (Springer Science, New York, 2006).MATH
    4.P. Flach, Machine Learning: The Art and Science of Algorithms That Make Sense of Data (Cambridge University Press, New York, 2012).CrossRef MATH
    5.C. Steger, M. Ulrich, and C. Wiedemann, Machine Vision Algorithms and Applications (Cambridge Univ. Press, New York, 2010).
    6.S. Kaiser, Biclustering: Methods, Software, and Application (Faculty of Math. Comput. Sci. Stat., Munich, 2011).
    7.A. K. Jain, “Data clustering: 50 years beyond k-means,” Pattern Recogn. Lett. 31 8, 651–666 (2010).CrossRef
    8.A. V. Kel’manov and A. V. Pyatkin, “On the complexity of a search for a subset of “similar” vectors,” Dokl. Math 78 1, 574–575 (2008).MathSciNet CrossRef MATH
    9.E. Kh. Gimadi, A. V. Kel’manov, M. A. Kel’manova, and S. A. Khamidullin, “A posteriori detection of a quasiperiodic fragment with a given number of repetitions in a numerical sequence,” Sib. Zh. Ind. Mat. 9 1, 55–74 (2006).MathSciNet MATH
    10.E. Kh. Gimadi, A. V. Kel’manov, M. A. Kel’manova, and S. A. Khamidullin, “A posteriori detecting a quasiperiodic fragment in a numerical sequence,” Pattern Recogn. Image Anal. 18 1, 30–42 (2008).CrossRef MATH
    11.A. V. Kel’manov, “Off-line detection of a quasi-periodically recurring fragment in a numerical sequence,” Proc. Steklov Inst. Math. 263, Suppl. 2, 84–92 (2008).MathSciNet CrossRef MATH
    12.A. V. Dolgushev and A. V. Kel’manov, “An approximation algorithm for solving a problem of cluster analysis,” J. Appl. Ind. Math. 5 4, 551–558 (2011).MathSciNet CrossRef MATH
    13.A. V. Kel’manov and V. I. Khandeev, “Randomized algorithm for two-cluster partition of a set of vectors,” Comput. Math. Math. Phys. 55 2, 330–339 (2015).MathSciNet CrossRef MATH
    14.D. Aloise, A. Deshpande, P. Hansen, and P. Popat, “NP-hardness of Euclidean sum-of-squares clustering,” Machine Learning 75 2, 245–248 (2009).CrossRef
    15.J. B. MacQueen, “Some methods for classification and analysis of multivariate observations,” Proceedings of the 5th Berkeley Symposium of Mathematical Statistics and Probability (Univ. of California Press, Berkeley, 1967), Vol. 1, pp. 281–297.
    16.M. Rao, “Cluster analysis and mathematical programming,” J. Am. Stat. Assoc. 66, 622–626 (1971).CrossRef MATH
    17.A. V. Kel’manov and A. V. Pyatkin, “Complexity of certain problems of searching for subsets of vectors and cluster analysis,” Comput. Math. Math. Phys. 49 11, 1966–1971 (2009).MathSciNet CrossRef MATH
    18.A. V. Kel’manov, “On the complexity of some data analysis problems,” Comput. Math. Math. Phys. 50 11, 1941–1947 (2010).MathSciNet CrossRef MATH
    19.A. V. Kel’manov, “On the complexity of some cluster analysis problems,” Comput. Math. Math. Phys. 51 11, 1983–1988 (2011).MathSciNet CrossRef
    20.A. E. Baburin, E. Kh. Gimadi, N. I. Glebov, and A. V. Pyatkin, “The problem of finding a subset of vectors with the maximum total weight,” J. Appl. Ind. Math. 2 1, 32–38 (2008).MathSciNet CrossRef MATH
    21.A. V. Kel’manov and A. V. Pyatkin, “NP-completeness of some problems of choosing a vector subset,” J. Appl. Ind. Math. 5 3, 352–357 (2011).MathSciNet CrossRef
    22.A. V. Kel’manov and V. I. Khandeev, “An exact pseudopolynomial algorithm for a problem of the two-cluster partitioning of a set of vectors,” J. Appl. Ind. Math. 9 4, 497–502 (2015).MathSciNet CrossRef MATH
    23.A. V. Kel’manov and V. I. Khandeev, “A 2-approximation polynomial algorithm for a clustering problem,” J. Appl. Ind. Math. 7 4, 515–521 (2013).MathSciNet CrossRef MATH
    24.A. V. Kel’manov and A. V. Pyatkin, “On a version of the problem of choosing a vector subset,” J. Appl. Ind. Math. 3 4, 447–455 (2009).MathSciNet CrossRef
    25.M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness (Freeman, San Francisco, 1979).MATH
    26.A. V. Dolgushev, A. V. Kel’manov, and V. V. Shenmaier, “A PTAS for a problem of cluster analysis,” Proceedings of the 9th International Conference on Intelligent Information Processing, Budva, Montenegro (Torus, Moscow, 2012), pp. 242–244.
    27.V. V. Vazirani, Approximation Algorithms (Springer, Berlin, 2001).MATH
    28.A. V. Kel’manov and S. M. Romanchenko, “An FPTAS for a vector subset search problem,” J. Appl. Ind. Math. 8 3, 329–336 (2014).MathSciNet CrossRef MATH
    29.I. Wirth, Algorithms + Data Structures = Programs (Prentice Hall, New Jersey, 1976).MATH
  • 作者单位:A. V. Kel’manov (1) (2)
    V. I. Khandeev (1)

    1. Sobolev Institute of Mathematics, Siberian Branch, Russian Academy of Sciences, pr. Akademika Koptyuga 4, Novosibirsk, 630090, Russia
    2. Novosibirsk State University, ul. Pirogova 2, Novosibirsk, 630090, Russia
  • 刊物类别:Mathematics and Statistics
  • 刊物主题:Mathematics
    Computational Mathematics and Numerical Analysis
    Russian Library of Science
  • 出版者:MAIK Nauka/Interperiodica distributed exclusively by Springer Science+Business Media LLC.
  • ISSN:1555-6662
文摘
The strongly NP-hard problem of partitioning a finite set of points of Euclidean space into two clusters of given sizes (cardinalities) minimizing the sum (over both clusters) of the intracluster sums of squared distances from the elements of the clusters to their centers is considered. It is assumed that the center of one of the sought clusters is specified at the desired (arbitrary) point of space (without loss of generality, at the origin), while the center of the other one is unknown and determined as the mean value over all elements of this cluster. It is shown that unless P = NP, there is no fully polynomial-time approximation scheme for this problem, and such a scheme is substantiated in the case of a fixed space dimension.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700