An optimal method for data clustering
  • Authors: Linsen Xie; Chengbo Lu; Ying Mei; Hong Du; Zhihong Man
  • Keywords: Clustering; Extreme learning machine; Feature space; Graph Laplacian
  • Journal: Neural Computing & Applications
  • Publication year: 2016
  • Publication date: February 2016
  • Volume: 27
  • Issue: 2
  • Pages: 283-289
  • Full-text size: 531 KB
  • Author affiliations: Linsen Xie (1)
    Chengbo Lu (1)
    Ying Mei (1)
    Hong Du (1)
    Zhihong Man (2)

    1. Department of Mathematics, Lishui University, Lishui, 323000, Zhejiang, China
    2. Faculty of Science, Engineering and Technology, Swinburne University of Technology, Melbourne, VIC, 3122, Australia
  • Journal category: Computer Science
  • Journal subject: Simulation and Modeling
  • Publisher: Springer London
  • ISSN: 1433-3058
Abstract
This work studies an algorithm for optimizing data clustering in feature space. Using the graph Laplacian and the extreme learning machine (ELM) mapping technique, we develop an optimal weight matrix W for feature mapping. The method explicitly maps the original data into an optimal feature space for clustering, which further increases the separability of the data while keeping pattern points in the same cluster closely grouped. The method is easy to implement and obtains better clustering results than several popular clustering algorithms (k-means on the original data, the kernel clustering method, the spectral clustering method, and ELM k-means) on data sets that include three real-world UCI benchmarks (IRIS data, Wisconsin breast cancer database, and Wine database).
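
The building blocks named in the abstract can be illustrated with a minimal sketch. It shows only the ELM k-means baseline (a random sigmoid hidden-layer mapping followed by k-means) and a Gaussian-affinity graph Laplacian. It is not the authors' optimal-weight method: the abstract does not say how the Laplacian is used to obtain W, so that step is not reproduced here. The function names, the sigmoid activation, the uniform weight range, `sigma`, `n_hidden`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def elm_feature_map(X, n_hidden=50, seed=0):
    """Map X (n_samples, n_features) through a random sigmoid hidden layer."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # random, untrained weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # random biases
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def graph_laplacian(X, sigma=1.0):
    """Unnormalized graph Laplacian L = D - A with a Gaussian affinity A."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    A = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(A, 0.0)            # no self-loops
    return np.diag(A.sum(axis=1)) - A

def kmeans(H, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means on the rows of H; returns integer labels."""
    rng = np.random.default_rng(seed)
    centers = H[rng.choice(len(H), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(H[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        new_centers = np.array([
            H[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Synthetic stand-in data (two Gaussian blobs), not one of the UCI benchmarks.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])

H = elm_feature_map(X, n_hidden=30)   # ELM feature space
labels = kmeans(H, k=2)               # ELM k-means baseline
L = graph_laplacian(X)                # Laplacian of the input-space graph
```

In this sketch the hidden-layer weights stay random; the paper's contribution, as the abstract states it, is to replace them with a weight matrix W optimized with the help of the graph Laplacian.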
