Abstract
The cold-start problem in recommender systems is a recent research focus, and determining user activeness is the basis for addressing it. When determining user activeness, existing methods consider only the amount of information a user posts and make insufficient use of social media features such as social relationships and behavior. Targeting microblog networks, this paper proposes a systematic method for determining user activeness. Its contributions are twofold: (1) it proposes four categories of indicators that affect user activeness on microblog networks, namely user profile, social relationships, quality of posted content, and social behavior, which avoids the crude approach of judging activeness solely by the number of posts; (2) it proposes a workflow for determining user activeness, together with a model, based on the four categories of indicators, for computing the degree of difference between a user and a user set. Taking Sina microblog as an example, 900 users from five domains (academic research, business management, education, culture, and military) were selected as the test set, and precision (P), recall (R), and F-value were used as evaluation metrics for experimental analysis and comparison. The results show that the proposed method improves P, R, and F-value over the traditional determination method by 21%, 13%, and 16%, respectively. When the proposed method is applied to user recommendation, P, R, and F-value improve over the latest method by 5%, 2%, and 3%, respectively, which verifies the effectiveness of the proposed method.
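The evaluation metrics named in the abstract (precision P, recall R, and F-value) can be illustrated with a minimal sketch. It assumes that the F-value is the balanced F1 score, which the abstract does not state, and that the judgment is a binary active/inactive label per user. The function name and the user identifiers are hypothetical and are not taken from the paper.

```python
def precision_recall_f(predicted_active, truly_active):
    """Return (P, R, F) for a set of users predicted to be active.

    Assumption: F is the balanced F1 score, F = 2PR / (P + R).
    """
    predicted = set(predicted_active)
    actual = set(truly_active)
    true_positives = len(predicted & actual)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_value = 2 * precision * recall / (precision + recall)
    return precision, recall, f_value


# Hypothetical example: four users predicted active, three truly active.
p, r, f = precision_recall_f({"u1", "u2", "u3", "u4"}, {"u1", "u2", "u5"})
```

In this toy example two of the four predicted users are truly active, so P is 2/4, R is 2/3, and F is their harmonic mean.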
(1) Data collection was completed on May 28, 2015.