Multi-type Relational Data Co-clustering Approach Based on Manifold Regularization (基于流形正则化的多类型关系数据联合聚类方法)
  • English title: Multi-type Relational Data Co-clustering Approach Based on Manifold Regularization
  • Authors: HUANG Meng-ting (黄梦婷); ZHANG Ling (张灵); JIANG Wen-chao (姜文超)
  • Affiliation: School of Computers, Guangdong University of Technology (广东工业大学计算机学院)
  • Keywords: multi-type relational data; manifold regularization; nonnegative matrix factorization; correlation matrix
  • Journal: Computer Science (计算机科学; journal code JSJA)
  • Publisher: Computer Science (计算机科学)
  • Publication date: 2019-06-15
  • Year: 2019
  • Volume: 46
  • Issue: 06
  • Pages: 70-74
  • Page count: 5
  • CN: 50-1075/TP
  • Record ID: JSJA201906008
  • Funding: Natural Science Foundation of Guangdong Province (2016A030313703); Science and Technology Planning Project of Guangdong Province (2016B030305002, 2017B030305003, 2017B010124001); Industry-University-Research Cooperation Project of Guangdong Province (2017B090901005)
  • Language: Chinese
Abstract
With the growth of big data applications, multi-type relational data sampled from nonlinear manifolds keep increasing in scale, their geometric structure becomes more complex, and heterogeneous relational data become extremely sparse. As a result, data mining becomes harder and less accurate. To address this problem, this paper proposes a co-clustering approach for multi-type relational data based on manifold nonnegative matrix tri-factorization (MNMTF). First, for the smaller-scale entity type, a correlation matrix is constructed from the entities' natural relationships or content relevance; decomposing it yields the cluster indicator matrix for that entity type, which is then used as an input to the nonnegative matrix tri-factorization. Second, manifold regularization is added to fast nonnegative matrix tri-factorization (FNMTF) so that inter-type and intra-type relationships are clustered jointly, which further improves clustering accuracy. Experiments show that MNMTF outperforms traditional co-clustering algorithms based on nonnegative matrix factorization in both accuracy and overall performance.
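The abstract does not state the objective function, so the following is only a minimal sketch of the general form a manifold-regularized tri-factorization objective commonly takes: a reconstruction term ||X - F S G^T||^2 plus graph-Laplacian penalties on the row and column cluster indicators. The function names, the binary k-nearest-neighbor graph, and the weights `lam` and `mu` are all assumptions made for illustration; this is not the authors' MNMTF algorithm.

```python
import numpy as np

def knn_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized binary k-NN graph on the rows of X."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    idx = np.argsort(d2, axis=1)[:, :k]               # k nearest neighbors of each row
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                            # make the graph undirected
    return np.diag(W.sum(axis=1)) - W

def indicator(labels, c):
    """Hard cluster indicator matrix: exactly one 1 per row."""
    F = np.zeros((labels.size, c))
    F[np.arange(labels.size), labels] = 1.0
    return F

def fit_S(X, F, G):
    """Least-squares block matrix S for fixed F and G (assumes no empty cluster)."""
    return np.linalg.pinv(F) @ X @ np.linalg.pinv(G).T

def objective(X, F, S, G, L_rows, L_cols, lam=1.0, mu=1.0):
    """Reconstruction error plus manifold (graph-smoothness) penalties; lam, mu are assumed weights."""
    recon = np.linalg.norm(X - F @ S @ G.T, "fro") ** 2
    smooth_rows = np.trace(F.T @ L_rows @ F)
    smooth_cols = np.trace(G.T @ L_cols @ G)
    return recon + lam * smooth_rows + mu * smooth_cols

# Toy relation matrix with two clean 3x3 co-clusters on the diagonal.
X = np.kron(np.eye(2), np.ones((3, 3)))
L_rows, L_cols = knn_laplacian(X), knn_laplacian(X.T)
aligned = np.array([0, 0, 0, 1, 1, 1])      # matches the block structure
interleaved = np.array([0, 1, 0, 1, 0, 1])  # ignores the block structure

for name, labels in (("aligned", aligned), ("interleaved", interleaved)):
    F = G = indicator(labels, 2)
    print(name, objective(X, F, fit_S(X, F, G), G, L_rows, L_cols))
```

On this toy matrix, a labeling that follows the block structure should score lower than one that splits each block, since both the reconstruction term and the smoothness terms penalize separating neighboring rows or columns.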
