Abstract
Traditional neighborhood-based collaborative filtering suffers from data sparsity, and its similarity measures can use only the ratings of co-rated items. To address both problems, a Collaborative Filtering algorithm based on the Bhattacharyya coefficient and the Jaccard coefficient (CFBJ) is proposed. Both coefficients are introduced into the item similarity measure: the Bhattacharyya coefficient uses all the ratings users have given, which removes the restriction to co-rated items, while the Jaccard coefficient increases the weight of co-rated items in the similarity. By making item similarity more accurate, the algorithm selects better nearest neighbors, which improves preference prediction and personalized recommendation for the active user. Experimental results show that the proposed algorithm yields smaller error and higher classification accuracy than the Mean Jaccard Difference (MJD), Pearson Correlation (PC), Jaccard and Mean Squared Difference (JMSD), and Proximity-Impact-Popularity (PIP) algorithms. It effectively alleviates the problems caused by sparse user rating data and improves the prediction accuracy of the recommender system.
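The two coefficients named in the abstract can be illustrated with a minimal sketch. The abstract does not state how CFBJ combines them, so the code below computes each coefficient separately and makes no claim about the full algorithm. The function names, the dictionary layout (user ID mapped to rating), and the 1 to 5 rating scale are assumptions made for illustration only.

```python
import math
from collections import Counter


def bhattacharyya_coefficient(ratings_i, ratings_j, scale=(1, 2, 3, 4, 5)):
    """Bhattacharyya coefficient between the empirical rating
    distributions of two items. It uses every rating each item
    received, so the raters do not have to overlap.
    The 1-5 scale is an assumption, not taken from the paper."""
    if not ratings_i or not ratings_j:
        return 0.0
    count_i, count_j = Counter(ratings_i), Counter(ratings_j)
    n_i, n_j = len(ratings_i), len(ratings_j)
    # Sum over rating values of sqrt(p_i(r) * p_j(r)).
    return sum(
        math.sqrt((count_i[r] / n_i) * (count_j[r] / n_j)) for r in scale
    )


def jaccard_coefficient(users_i, users_j):
    """Jaccard coefficient: share of raters common to both items
    among all users who rated at least one of them."""
    users_i, users_j = set(users_i), set(users_j)
    union = users_i | users_j
    if not union:
        return 0.0
    return len(users_i & users_j) / len(union)


# Hypothetical toy data: user ID -> rating for two items.
item_a = {"u1": 5, "u2": 4, "u3": 5}
item_b = {"u3": 4, "u4": 5, "u5": 5}

bc = bhattacharyya_coefficient(list(item_a.values()), list(item_b.values()))
jc = jaccard_coefficient(item_a.keys(), item_b.keys())
print(bc, jc)
```

In this toy data the two items share only one rater, so the Jaccard coefficient is low, while their overall rating distributions are identical, so the Bhattacharyya coefficient is high. This is the kind of information that a measure restricted to co-rated entries cannot see.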