Link prediction in heterogeneous data via generalized coupled tensor factorization
  • Authors: Beyza Ermiş ; Evrim Acar ; A. Taylan Cemgil
  • Keywords: Coupled tensor factorization ; Link prediction ; Heterogeneous data ; Missing data ; Data fusion
  • Journal: Data Mining and Knowledge Discovery
  • Publication date: January 2015
  • Volume: 29
  • Issue: 1
  • Pages: 203-236
  • Full-text size: 2,101 KB
  • Journal category: Computer Science
  • Journal subjects: Data Mining and Knowledge Discovery
    Computing Methodologies
    Artificial Intelligence and Robotics
    Statistics
    Statistics for Engineering, Physics, Computer Science, Chemistry and Geosciences
    Information Storage and Retrieval
  • Publisher: Springer Netherlands
  • ISSN: 1573-756X
Abstract
This study addresses missing link prediction, the problem of predicting the existence of missing connections between entities of interest. We approach the problem as filling in missing entries in a relational dataset represented by several matrices and multiway arrays, which we simply call tensors. Accordingly, we address link prediction through data fusion, formulated as the simultaneous factorization of several observation tensors in which latent factors are shared among the observations. Previous studies on joint factorization of such heterogeneous datasets have focused on a single loss function (mainly squared Euclidean distance or Kullback–Leibler divergence) and specific tensor factorization models (CANDECOMP/PARAFAC and/or Tucker). In this paper, we instead study various alternative tensor models and loss functions, including those already studied in the literature, using the generalized coupled tensor factorization framework. Through extensive experiments on two real-world datasets, we demonstrate that (i) joint analysis of data from multiple sources via coupled factorization significantly improves link prediction performance, (ii) selecting a suitable loss function and tensor factorization model is crucial for accurate missing link prediction, and loss functions not previously studied for link prediction may outperform the commonly used ones, (iii) joint factorization of datasets can handle difficult cases, such as the cold start problem that arises when a new entity enters the dataset, and (iv) our approach is scalable to large-scale data.
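
As a rough illustration of what "simultaneous factorization with shared latent factors" can look like, the sketch below writes out one simple special case: a three-way tensor modeled by CANDECOMP/PARAFAC, coupled with a matrix through a shared factor, under squared Euclidean loss with missing entries masked out. All symbols here ($X$, $Y$, $A$, $B$, $C$, $D$, $W$, $V$, $R$) are assumptions introduced for this example and are not taken from the record above; the framework named in the abstract covers other models and loss functions that this sketch does not show.

```latex
% Assumed setup (illustrative only):
%   X \in \mathbb{R}^{I \times J \times K}  observed tensor, mask W_{ijk} \in \{0,1\}
%   Y \in \mathbb{R}^{I \times M}           observed matrix, mask V_{im}  \in \{0,1\}
%   A is the factor shared by both datasets; R is the number of components.
\min_{A,B,C,D} \;
  \sum_{i,j,k} W_{ijk} \Big( X_{ijk} - \sum_{r=1}^{R} A_{ir} B_{jr} C_{kr} \Big)^{2}
  \;+\;
  \sum_{i,m} V_{im} \Big( Y_{im} - \sum_{r=1}^{R} A_{ir} D_{mr} \Big)^{2}

% Score for a missing entry (W_{ijk} = 0), read off the fitted factors:
\hat{X}_{ijk} = \sum_{r=1}^{R} A_{ir} B_{jr} C_{kr}
```

Because $A$ appears in both terms, the rows of $A$ are informed by $Y$ even for an entity $i$ with no observed entries in $X$, which gives one way to see how a coupled model could still produce scores in a cold start setting.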
