Evaluation of Microblog Users' Credibility Based on the HITS Algorithm (基于HITS算法的微博用户可信度评估)
  • English title: Evaluation of microblog users' credibility based on HITS algorithm
  • Authors: WU Shufang (吴树芳); XU Jianmin (徐建民)
  • Affiliations: College of Management and Economics, Tianjin University; College of Management, Hebei University; School of Computer Science and Technology, Hebei University
  • Keywords: HITS algorithm; microblog users; credibility; interaction behavior; blog posts
  • Journal code: SDGY
  • Journal: Journal of Shandong University (Engineering Science) (山东大学学报(工学版))
  • Publication date: 2016-09-18 22:48
  • Publisher: Journal of Shandong University (Engineering Science)
  • Year: 2016
  • Issue: v.46; No.219
  • Funding: Hebei Province Social Science Fund project (HB15TQ013)
  • Language: Chinese
  • Page: SDGY201605002
  • Page count: 6
  • CN: 05
  • ISSN: 37-1391/T
  • Classification number: 10-15
Abstract
Using Sina Weibo as the research platform and building on the HITS (hyperlink-induced topic search) algorithm, this paper proposes a microblog user credibility evaluation algorithm that fuses user interaction behavior with post content. Two directed link graphs of microblog users are constructed, one based on interaction behavior and one based on post content. In both graphs, nodes represent users, and directed edges capture the interaction-based or content-based pointing relationship between users. The authority and hub scores of the users are computed with HITS under each of the two topologies, and the fused authority score is used as the measure of user credibility. The experiments use data collected from Sina Weibo as the test set: a credibility threshold is obtained by repeated training, and user credibility curves for different credibility algorithms are plotted to verify the feasibility and effectiveness of the proposed algorithm.
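The pipeline described in the abstract (two directed user graphs, HITS on each, fused authority, threshold) can be illustrated with a minimal Python sketch. The abstract does not state how the two authority scores are fused or what threshold value was found, so the linear weight `alpha`, the threshold `0.5`, the function names, and the toy edge lists below are all assumptions made for illustration, not the authors' method or data.

```python
import math


def _normalize(scores):
    """Scale a score dict to unit L2 norm (left unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits(nodes, edges, iterations=50):
    """Standard HITS power iteration on a directed graph.

    nodes: list of user ids; edges: list of (source, target) pairs.
    Returns (authority, hub) dicts, each L2-normalized.
    """
    authority = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of v: sum of hub scores of the users pointing at v.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub of u: sum of the updated authority scores of the users u points at.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        authority = _normalize(new_auth)
        hub = _normalize(new_hub)
    return authority, hub


def fused_credibility(nodes, interaction_edges, content_edges, alpha=0.5):
    """Assumed fusion rule: linear mix of the two authority scores."""
    nodes = list(nodes)
    auth_interaction, _ = hits(nodes, interaction_edges)
    auth_content, _ = hits(nodes, content_edges)
    return {
        n: alpha * auth_interaction[n] + (1 - alpha) * auth_content[n]
        for n in nodes
    }


# Hypothetical toy data: an edge (u, v) means "u points at v".
users = ["a", "b", "c", "d"]
interaction_edges = [("b", "a"), ("c", "a"), ("d", "a"), ("a", "b")]
content_edges = [("b", "a"), ("c", "a"), ("c", "b")]

scores = fused_credibility(users, interaction_edges, content_edges)
threshold = 0.5  # placeholder; the paper obtains its threshold by repeated training
credible = {u for u, s in scores.items() if s >= threshold}
print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
print(credible)
```

In this toy example, user `a` receives the most incoming edges in both graphs and is therefore the only user whose fused authority clears the placeholder threshold. With real data the weight and the threshold would have to be tuned, as the abstract indicates.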