Abstract
With the growth of e-commerce, recommender systems are widely used to extract commercial value from user behavior data. Collaborative filtering based on k-nearest neighbors (kNN) is a classical recommendation algorithm, but it has two main problems: high time complexity, and low prediction accuracy caused by relying on a single distance metric. This paper proposes C-kNN, a collaborative filtering algorithm that combines clustering with kNN. In the preprocessing stage, items are partitioned into multiple clusters using the M-distance. In the rating prediction stage, only items within the same cluster serve as candidate neighbors for distance computation and prediction. Experimental results on four real-world datasets show that C-kNN achieves a considerable improvement over classical kNN in both MAE and RMSE.
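The two stages described in the abstract can be illustrated with a minimal sketch. The abstract gives no implementation details, so everything below is an assumption made for illustration: the M-distance is taken to be the absolute difference between two items' mean ratings, the clustering step is replaced by simple equal-width bins over the mean-rating range, and the prediction is an unweighted average over the k nearest same-cluster items the user has rated. All function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt

def item_means(ratings):
    """ratings maps (user, item) -> rating; returns item -> mean rating."""
    sums, counts = defaultdict(float), defaultdict(int)
    for (_, item), r in ratings.items():
        sums[item] += r
        counts[item] += 1
    return {item: sums[item] / counts[item] for item in sums}

def cluster_items(means, n_clusters, lo=1.0, hi=5.0):
    """Assumed preprocessing: equal-width bins over the rating scale [lo, hi]."""
    width = (hi - lo) / n_clusters
    return {item: min(int((m - lo) / width), n_clusters - 1)
            for item, m in means.items()}

def predict(ratings, means, clusters, user, item, k):
    """Average the user's ratings on the k same-cluster items that are
    closest to `item` in mean rating (the assumed M-distance)."""
    candidates = [
        (abs(means[j] - means[item]), r)
        for (u, j), r in ratings.items()
        if u == user and j != item and clusters[j] == clusters[item]
    ]
    if not candidates:
        return means[item]  # assumed fallback: the item's own mean rating
    candidates.sort(key=lambda pair: pair[0])
    nearest = candidates[:k]
    return sum(r for _, r in nearest) / len(nearest)

def mae(pairs):
    """Mean absolute error over (predicted, actual) pairs."""
    return sum(abs(p - t) for p, t in pairs) / len(pairs)

def rmse(pairs):
    """Root mean squared error over (predicted, actual) pairs."""
    return sqrt(sum((p - t) ** 2 for p, t in pairs) / len(pairs))

# Toy data, invented purely for illustration.
ratings = {
    ("u1", "a"): 5, ("u1", "b"): 4, ("u1", "c"): 1,
    ("u2", "a"): 5, ("u2", "b"): 5, ("u2", "c"): 2, ("u2", "d"): 5,
    ("u3", "d"): 4,
}
means = item_means(ratings)
clusters = cluster_items(means, n_clusters=2)
prediction = predict(ratings, means, clusters, "u1", "d", k=2)
```

Restricting candidates to a single cluster is what reduces the neighbor search cost: the distance is computed only against items that share a cluster with the target item, not against the full item set.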