基于SkipGram模型的链路预测方法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:A LINK PREDICTION METHOD BASED ON SKIPGRAM MODEL
  • 作者:赵超 ; 朱福喜 ; 刘世超
  • 英文作者:Zhao Chao;Zhu Fuxi;Liu Shichao;School of Computer Science,Wuhan University;School of Computer Science and Technology,Hankou University;
  • 关键词:链路预测 ; 向量表征 ; SkipGram模型 ; 节点相似性
  • 英文关键词:Link prediction;;Vector representation;;SkipGram model;;Node similarity
  • 中文刊名:JYRJ
  • 英文刊名:Computer Applications and Software
  • 机构:武汉大学计算机学院;汉口学院计算机科学与技术学院;
  • 出版日期:2017-10-15
  • 出版单位:计算机应用与软件
  • 年:2017
  • 期:v.34
  • 基金:国家自然科学基金项目(61272277)
  • 语种:中文;
  • 页:JYRJ201710043
  • 页数:7
  • CN:10
  • ISSN:31-1260/TP
  • 分类号:247-253
摘要
现有的基于节点相似性的链路预测算法,在提升预测准确度时往往无法兼顾计算复杂度。受自然语言概率图模型在词向量表征上的运用启发,提出一种基于SkipGram模型的链路预测方法。首先提出基于概率的随机游走方法,通过这种方法得到网络节点的采样序列;然后结合SkipGram模型将网络节点映射到一个低维向量空间来降低复杂度;最终以向量间的距离作为衡量网络节点间相似性的指标,进而完成链路预测。通过在6个具有代表性的真实网络中进行实验和比较发现,提出的模型在预测准确度上得到大幅提高。
        The existing link prediction algorithm based on node similarity can hardly keep low complexity of the computation when aiming to promote prediction accuracy. Inspired by the application of probabilistic graphical model of natural language,this paper proposes a link prediction method based on SkipGram model. First,the random walk based on probability method was proposed,and the sampling sequence of the network nodes was obtained by this method.Then,the network nodes were mapped to a low dimensional vector space to reduce the complexity by combining the SkipGram model. In the end,the distance between vectors was used as the index to measure the similarity between the nodes of the network to accomplish link prediction. Through experiments and comparison in six representative real networks,the model proposed in this paper can improve the accuracy of prediction a lot.
引文
[1]Lei C,Ruan J.A novel link prediction algorithm for reconstructing protein-protein interaction networks by topological similarity[J].Bioinformatics,2013,29(3):355-364.
    [2]Yu Q,Long C,Lv Y,et al.Predicting co-author relationship in medical co-authorship networks[J].Plos One,2014,9(7):101-214.
    [3]Liben-Nowell D,Kleinberg J.The link prediction problem for social networks[J].Journal of the Association for Information Science and Technology,2007,58(7):1019-1031.
    [4]Zhu J,Hong J,Hughes J G.Using Markov Chains for Link Prediction in Adaptive Web Sites[C]//International Conference on Computing in An Imperfect World.Springer-Verlag,2002:60-73.
    [5]Saruukkai B R.Link prediction and path analysis using markov chains[J].Computer Networks,2010,33(1-6):377-386.
    [6]Adamic L A,Adar E.Friends and neighbors on the Web[J].Social Networks,2003,25(3):211-230.
    [7]Zhou T,LüL,Zhang Y C.Predicting missing links via local information[J].The European Physical Journal B,2009,71(4):623-630.
    [8]Bengio Y,Vincent P,Janvin C.A neural probabilistic language model[J].Journal of Machine Learning Research,2003,3(6):1137-1155.
    [9]Mikolov T,Chen K,Corrado G,et al.Efficient Estimation of Word Representations in Vector Space[J].Computer Science,2013,4:124-132.
    [10]Shavlik J W.Proceedings of the Fifteenth International Conference on Machine Learning[C]//Fifteenth International Conference on Machine Learning.Morgan Kaufmann Publishers Inc,1998:498-517.
    [11]Subelj L,Bajec M.Groups of nodes in complex real-world networks:An algorithm and comparison of the state-of-theart[J].Physica A-statistical Mechanics&Its Applications,2013,397:102-114.
    [12]Jaccard P.Lois de distribution florale dans la zone alpine[J].Bulletin De La Societe Vaudoise Des Sciences Naturelles,1902,38(144):69-130.
    [13]Barabási A-L,Albert R.Emergence of Scaling in Random Networks[J].Science,1999,286(5439):509-512.
    [14]Cano L,Diaz R.Indirect Influences on Directed Manifolds[J].Mathematics,2015,32(5):10-26.
    [15]Wang Jing,Rong Lili.Similarity index based on the information of neighbor nodes for link prediction of complex network[J].Modern Physics Letters B,2013,27(27):793-799.
    [16]Bartusiak R,Kajdanowicz T,Wierzbicki A,et al.Cooperation Prediction in Git Hub Developers Network with Restricted Boltzmann Machine[M].Intelligent Information and Database Systems,2016.
    [17]Paparo G D,Müller M,Comellas F,et al.Quantum Google Algorithm:Construction and Application to Complex Networks[J].Eur.phys.j.plus,2014,129(7):1137-1143.
    [18]Jeh G,Widom J.Mining the space of graph properties[C]//Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.ACM,2004:187-196.
    [19]Blondel V,Gajardo A,Heymans M,et al.A measure of similarity between graph vertices[J].Siam Review,2004,46(4):647-666.
    [20]Leicht E A,Holme P,Newman M E.Vertex similarity in networks[J].Physical Review E Statistical Nonlinear&Soft Matter Physics,2006,73(2):65-93.
    [21]Clauset A,Moore C,Newman M E.Hierarchical structure and the prediction of missing links in networks[J].Nature,2008,453(7191):98-101.
    [22]Kolzsch A,Blasius B.Indications of marine bioinvasion from network theory[J].The European Physical Journal B,2011,84(4):601-612.
    [23]Cimini G,Medo M,Zhou T,et al.Heterogeneity,quality,and reputation in an adaptive recommendation model[J].The European Physical Journal B,2011,80(2):201-208.
    [24]Zhang D,Wu C,Xiong W,et al.Study on topology design for large scale service overlay networks[C]//International Conference on Wireless Communications,NETWORKING and Mobile Computing.IEEE Press,2009:3821-3824.
    [25]Pan X,Deng G,Liu J G.Weighted bipartite network and personalized recommendation[J].Physics Procedia,2010,3(5):1867-1876.
    [26]Lee S G,Lee J Y,Chmielewski J.Information filtering via weighted heat conduction algorithm[J].Physica A Statistical Mechanics&Its Applications,2011,390(12):2414-2420.
    [27]Getoor L.Link mining:a new data mining challenge[J].Acm SIGKDD Explorations Newsletter,2003,5(1):84-89.
    [28]李志宇,梁循,周小平.一种大规模网络中基于节点结构特征映射的链接预测方法[J].计算机学报,2016,24(5):149-157.
    [29]Liang Z,Xu M,Teng M,et al.Coevolution is a short-distance force at the protein interaction level and correlates with the modular organization of protein networks[J].Febs Letters,2010,584(19):4237-4240.
    [30]Newman M E.Finding community structure in networks using the eigenvectors of matrices[J].Physical Review E Statistical Nonlinear&Soft Matter Physics,2006,74(3):92-100.
    [31]Watts D J,Strogatz S H.Collective dynamics of‘smallworld’networks[J].Nature,1998,393(6684):440-442.
    [32]Ackland R,Ackland R.Mapping the U.S.Political Blogosphere:Are Conservative Bloggers More Prominent[C]//Blog Talk Downunder 2005 Conference.Berlin:Springer,2005:1-12.
    [33]Mahajan R,Spring N,Wetherall D,et al.User-level Internet Path Diagnosis[J].Acm Sigops Operating Systems Review,2003,37(5):106-119.
    [34]Surhone L M,Tennoe M T,Henssonow S F,et al.USAir Flight 427[M].Betascript Publishing,2010:165-178.
    [35]Hanley J A,Mcneil B J.The meaning and use of the area under a receiver operating characteristic(ROC)curve[J].Radiology,1982,143(1):29-36.
    [36]Herlocker J L,Konstann J A,Terveen K,et al.Evaluating collaborative filtering recommender systems[J].ACM Transactions on Information Systems,2004,22(1):5-53.

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700