基于图和改进K近邻模型的高效协同过滤推荐算法
详细信息    查看全文 | 推荐本文 |
  • 英文篇名:An Efficient Collaborative Filtering Algorithm Based on Graph Model and Improved KNN
  • 作者:孟桓羽 ; 刘真 ; 王芳 ; 徐家栋 ; 张国强
  • 英文作者:Meng Huanyu;Liu Zhen;Wang Fang;Xu Jiadong;Zhang Guoqiang;School of Computer and Information Technology,Beijing Jiaotong University;Information Technology Center,Beijing Jiaotong University;School of Computer Science and Technology,Nanjing Normal University;
  • 关键词:协同过滤 ; 社会网络 ; 图模型 ; K近邻 ; 最短路径
  • 英文关键词:collaborative filtering;;social network;;graph model;;KNN;;shortest path
  • 中文刊名:JFYZ
  • 英文刊名:Journal of Computer Research and Development
  • 机构:北京交通大学计算机与信息技术学院;北京交通大学信息中心;南京师范大学计算机科学与技术学院;
  • 出版日期:2017-07-15
  • 出版单位:计算机研究与发展
  • 年:2017
  • 期:v.54
  • 基金:国家重点研发计划项目(2016YFB1200100);; 国家自然科学基金项目(61202429,61572256);; 中央高校基本科研业务费专项资金项目(2015JBM042);; 江苏省自然科学基金项目(BK20141454)~~
  • 语种:中文;
  • 页:JFYZ201707002
  • 页数:13
  • CN:07
  • ISSN:11-1777/TP
  • 分类号:25-37
摘要
在互联网高速发展的今天,推荐系统已成为解决信息过载的有效手段,能够缓解用户在筛选感兴趣信息时的困扰,帮助用户发现有价值的信息.推荐系统中的协同过滤推荐算法,因其领域无关性及支持用户发现潜在兴趣的优点被广泛应用.由于数据的规模过大且稀疏的特点,当前协同过滤在算法实时性、推荐精确度等方面仍有较大提升空间.提出了GK-CF方法,通过建立基于图的评分数据模型,将传统的协同过滤算法与图计算及改进的KNN算法结合.通过图的消息传播及改进的相似度计算模型对用户先进行筛选再做相似度计算;以用户-项目二部图的节点结构为基础,通过图的最短路径算法进行待评分项目的快速定位.在此基础上,进一步通过并行图框架对算法进行了并行化实现及优化.在物理集群环境下进行了实验,结果表明,与已有的协同过滤算法相比,提出的GK-CF算法能够很好地提高推荐的准确度和评分预测的准确性,并具有较好的算法可扩展性和实时性能.
        With the rapid development of Internet,recommender system has been considered as a typical method to deal with the over-loading of Internet information.The recommender system can partially alleviate user's difficulty on information filtering and discover valuable information for the active user.Collaborative filtering algorithm has the advantages of domain independence and supports users'potential interests.For these reasons,collaborative filtering has been widely used.Because the user item rating matrix is sparse and in large-scale,recommender system is facing big challenges of precision and performance.This paper puts forward a GK-CF algorithm.By building agraph-based rating data model,the traditional collaborative filtering,graph algorithms and improved KNN algorithm have been integrated.Through the message propagation in the graph and the improved user similarity calculation model,candidate similar users will be selected firstly before the calculation of users similarity.Based on the topology of bipartite graph,the GK-CF algorithm ensures the quick and precise location of the candidate items through the shortest path algorithm.Under the parallel graph framework,GK-CF algorithm has been parallelized design and implement.The experiments on real world clusters show that:compared with the traditional collaborative filtering algorithm,the GK-CF algorithm can better improve recommendation precision and the rating accuracy.The GK-CF algorithm also has good scalability and real-time performance.
引文
[1]Adomavicius G,Tuzhilin A.Toward the next generation of the recommender systems:A survey of the state-of-the-art and possible extensions[J].IEEE Trans on Knowledge and Data Engineering,2005,17(6):734-749
    [2]Koren Y.Factorzation meets the neighborhood:Amultifaceted collaborative filtering model[C]//Proc of the14th ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining.New York:ACM,2008:426-434
    [3]Breese J S,Heckerman D,Kadie C.Empirical analysis of predictive algorithms for collaborative flitering[C]//Proc of the 14th Conf on Uncertainty in Artificial Intelligence.San Francisco,CA:Morgan Kaufmann,1998:43-52
    [4]Delgado J,Ishii N.Memory-based weighted-majority prediction for recommender systems[C]//Proc of the ACMSIGIR'99 Workshop Recommender Systems:Algorithms and Evaluation.New York:ACM,1999
    [5]Park Y,Park S,Jung W.Reversed CF:A fast collaborative filtering algorithm using a K-nearest neighbor graph[J].Expert Systems with Applications,2015,42(8):4022-4028
    [6]Das J,Aman A K,Gupta P,et al.Scalable hierarchical collaborative filtering using BSP trees[C]//Proc of the Int Conf on Computational Advancement in Communication Circuits and Systems.Berlin:Springer,2015:269-278
    [7]Liu Haifeng,Hu Zheng,Mian Ahmad,et al.A new user similarity model to improve the accuracy of collaborative filtering[J].Knowledge-Based Systems,2014,56(1):156-166
    [8]Xiang Liang,Yuan Quan,Zhao Shiwan,et al.Temporal recommendation on graphs via long-and short-term preference fushion[C]//Proc of the 16th ACM Int Conf on Knowledge Discovery and Data Mining.New York:ACM,2010:723-731
    [9]Ding Yi,Li Xue.Time weight collaborative filtering[C]//Proc of the 14th ACM Int Conf on Information and Knowledge Management,New York:ACM,2005:485-492
    [10]Koren Y.Collaborative filtering with temporal dynamics[J].Communications of the ACM,2010,53(4):89-97
    [11]Sun Guangfu,Wu Le,Liu Hong,el al.Recommendations based on collaborative filtering by exploiting sequential behaviors[J].Journal of Software,2013,24(11):2721-2733(in Chinese)(孙光福,吴乐,刘洪,等.基于时序行为的协同过滤推荐算法[J].软件学报,2013,24(11):2721-2733)
    [12]Matook S,Brown S A,Rolf J.Forming an intention to act on recommendations given via online social networks[J].European Journal of Information Systems,2015,24(1):76-92
    [13]Jia Dongyan,Zhang Fuzhi.A collaborative filtering recommendation algorithm based on double neighbor choosing strategy[J].Journal of Computer Research and Development,2013,50(5):1076-1084(in Chinese)(贾冬艳,张付志.基于双重邻居选取策略的协同过滤推荐算法[J].计算机研究与发展,2013,50(5):1076-1084)
    [14]Qin Jiwei,Zheng Qinghua,Tian Feng.Collaborative filtering algorithms integrating trust and preference of user's emotion[J].Journal of Software,2013,24(Suppl 2):61-72(in Chinese)(秦继伟,郑庆华,田峰.一种融合信任和用户情感偏好的协同过滤算法[J].软件学报,2013,24(增刊2):61-72)
    [15]Alami Y,Nfaoui E,Beqqali O.Toward an effective hybrid collaborative filtering:A new approach based on matrix factorization and heuristic-based neighborhood[C]//Proc of the Intelligent Systems and Computer Vision(ISCV).Piscataway,NJ:IEEE,2015:51-58
    [16]Santos A.A hybird recommendation system based on human curiosity[C]//Proc of the 9th ACM Conf on Recommender Systems.New York:ACM,2015:367-370
    [17]Fouss F,Pirotte A,Renders J M,et al.Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation[J].IEEE Trans on Knowledge and Data Engineering,2007,19(3):355-369
    [18]Huang Zan,Chen Hsinchun,Zeng Daniel.Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering[J].ACM Trans on Information System,2004,22(1):116-142
    [19]Aggarwal C C,Wolf J L,Wu Kun-Lung,et al.Horting hatches an egg:A new graph-theoretic approach to collaborative filtering[C]//Proc of the 5th ACM SIGKDDInt Conf on Knowledge Discovery and Data Mining.New York:ACM,1999:201-212
    [20]Onuma K,Tong H,Faloutsos C.TANGENT:A novel,surprise me,recommendation algorithm[C]//Proc of the15th ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining.New York:ACM,2009:657-666
    [21]Xiang Liang,Yuan Quan,Zhao Shiwan,et al.Temporal recommendation on graphs via long-and short-term preference fusion[C]//Proc of the 16th ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining.New York:ACM,2010:723-732
    [22]Zhu Xia,Song Aibo,Dong Fang,et al.A collaborative filtering recommendation mechanism for cloud computing[J].Journal of Computer Research and Development,2014,51(10):2255-2269(in Chinese)(朱夏,宋爱波,东方,等.云计算环境下基于协同过滤的个性化推荐机制[J].计算机研究与发展,2014,51(10):2255-2269)
    [23]Xin R S,Crankshaw D,Dave A,et al.GraphX unifying data-parallel and graph-parallel analytics[DB/OL].Computer Science Databases.2014[2016-04-29].https://arxiv.org/abs/1402.2394
    [24]Quick L,Wilkinson P,Hardcastle D.Using pregel-like large scale graph processing frameworks for social network analysis[C]//Proc of 2012IEEE/ACM Int Conf on Advances in Social Networks Analysis and Mining.Piscataway,NJ:IEEE,2012:457-463
    [25]Zaharia M,Chowdhury M,Das T,et al.Resilient distributed datasets:A fault-tolerant abstraction for inmemory cluster computing[C]//Proc of the 9th USENIXConf on Networked System Design and Implementation.Berkeley,CA:USENIX Association,2012:15-28
    [26]Deng lin,Pei Jian,Ma Jinwen,et al.A rank sum test method for informative gene discovery[C]//Proc of the 10th ACM SIGKDD Int Conf on Knowledge Discovery and Data Mining.New York:ACM,2004:410-419
    [27]Xie Juanying,Gao Hongchao.Statistical correlation and kmeans based distinguishable gene subset selection algorithms[J].Journal of Software,2014,25(9):2050-2075(in Chinese)(谢娟英,高红超.基于统计相关性与K-Means的区分基因子集选择算法[J].软件学报,2014,25(9):2050-2075)
    (1)http://grouplens.org/datasets/movielens/
    (1)http://grouplens.org/datasets/movielens/
    (1)http://grouplens.org/datasets/movielens/
    (2)http://webscope.sandbox.yahoo.com/catalog.php?datatype=r&did=3

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700