An improved clustering ensemble method based link analysis
  • Authors: Zhi-Feng Hao (1) (2)
    Li-Juan Wang (1) (2)
    Rui-Chu Cai (1)
    Wen Wen (1)

    1. Faculty of Computer, Guangdong University of Technology, Guangzhou 510006, China
    2. School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  • Keywords: K-means clustering; Clustering ensemble; Link analysis
  • Journal: World Wide Web
  • Publication year: 2015
  • Publication date: March 2015
  • Volume: 18
  • Issue: 2
  • Pages: 185-195
  • Full-text size: 662 KB
  • Journal category: Computer Science
  • Journal subjects: Information Systems Applications and The Internet
    Database Management
    Operating Systems
  • Publisher: Springer Netherlands
  • ISSN: 1573-1413
Abstract
Clustering ensemble aggregates several base clusterings into a consensus clustering that is more accurate, stable and meaningful than the result of a single standard clustering algorithm. In this paper, the ensemble information is described by a data-cluster association matrix. However, most data-cluster association matrices overlook an important type of information: the relationship between clusters. This paper proposes a new method, WETU, that refines the data-cluster association matrix with a link-based similarity measure. The refined matrix is obtained from the similarity of clusters across all base clustering results, not within a single base clustering result. In addition, WETU can provide more discriminative information than CSM and WTU. The data-cluster association matrix is refined into a high-level real-valued matrix, which can be aggregated by a real-valued method such as Global k-means. Experiments on a synthetic dataset and UCI datasets show that the proposed method outperforms standard K-means, the base clustering algorithm, CSM+Global k-means and WTU+Global k-means.
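
To make the idea of refining a data-cluster association matrix more concrete, the following is a minimal sketch in plain Python. It is not the WETU measure from the paper, whose definition the abstract does not give. As a stand-in, it assumes a simple shared-neighbor similarity: two clusters are considered similar when they overlap with the same third clusters, and the result is scaled by a hypothetical `decay` factor. All function names and the `decay` parameter are illustrative assumptions.

```python
from itertools import combinations


def cluster_members(base_clusterings):
    """List every cluster as (index of its base clustering, set of member points)."""
    clusters = []
    for m, labels in enumerate(base_clusterings):
        for label in sorted(set(labels)):
            members = frozenset(i for i, l in enumerate(labels) if l == label)
            clusters.append((m, members))
    return clusters


def binary_association(base_clusterings):
    """Binary data-cluster matrix: entry is 1.0 if the point belongs to the cluster."""
    clusters = cluster_members(base_clusterings)
    n = len(base_clusterings[0])
    return [[1.0 if i in mem else 0.0 for _, mem in clusters] for i in range(n)]


def link_similarity(clusters, decay=0.9):
    """Illustrative shared-neighbor similarity between clusters (assumed, not WETU)."""
    k = len(clusters)

    def jaccard(a, b):
        return len(a & b) / len(a | b)

    # Edge weight between two clusters: overlap of their member sets.
    w = [[jaccard(clusters[a][1], clusters[b][1]) if a != b else 0.0
          for b in range(k)] for a in range(k)]
    raw = [[0.0] * k for _ in range(k)]
    for a, b in combinations(range(k), 2):
        # Accumulate evidence from every third cluster linked to both a and b.
        shared = sum(min(w[a][c], w[b][c]) for c in range(k) if c not in (a, b))
        raw[a][b] = raw[b][a] = shared
    top = max(max(row) for row in raw) or 1.0  # avoid division by zero
    return [[decay * v / top for v in row] for row in raw]


def refined_association(base_clusterings, decay=0.9):
    """Replace the zeros of the binary matrix with cluster-to-cluster similarities."""
    clusters = cluster_members(base_clusterings)
    sim = link_similarity(clusters, decay)
    n = len(base_clusterings[0])
    refined = []
    for i in range(n):
        row = []
        for c, (m, members) in enumerate(clusters):
            if i in members:
                row.append(1.0)
            else:
                # Cluster of the same base clustering that actually holds point i.
                own = next(j for j, (mj, mem) in enumerate(clusters)
                           if mj == m and i in mem)
                row.append(sim[c][own])
        refined.append(row)
    return refined


# Two base clusterings of six points, given as label lists.
base = [[0, 0, 1, 1, 2, 2], [0, 0, 0, 1, 1, 1]]
refined = refined_association(base)
```

The rows of `refined` are real-valued feature vectors, so a real-valued clustering method (the abstract names Global k-means) could then be run on them to obtain the consensus partition. Because clusters within one base clustering are disjoint, their similarity here comes entirely from clusters in the other base clusterings, which mirrors the abstract's point that similarity is assessed across all base clustering results.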
