Survey on Event Detection Research in Social Media (社交媒体事件检测研究综述)
  • English title: Survey on Event Detection Research in Social Media
  • Authors: 王冰玉; 吴振宇; 沈苏彬; 陈佳颖
  • Authors (English): WANG Bing-yu; WU Zhen-yu; SHEN Su-bin; CHEN Jia-ying
  • Keywords: event detection; event; topic; social media
  • Journal code: WJFZ
  • Journal (English title): Computer Technology and Development
  • Affiliations: School of Internet of Things, Nanjing University of Posts and Telecommunications; School of Computer Science & Technology, Nanjing University of Posts and Telecommunications
  • Publication date: 2018-04-28 11:58
  • Publisher: 计算机技术与发展 (Computer Technology and Development)
  • Year: 2018
  • Volume and issue number: v.28; No.257
  • Funding: National Natural Science Foundation of China, Young Scientists Program (61502246); Nanjing University of Posts and Telecommunications Scientific Research Start-up Fund (NY215019)
  • Language: Chinese
  • File ID: WJFZ201809022
  • Page count: 7
  • Issue: 09
  • CN: 61-1450/TP
  • Pages: 111-117
Abstract
Event detection is an important part of social media mining, and many event detection methods for social media data have been proposed. However, the definition of an event and the strengths and weaknesses of the existing methods have not been clearly stated. This survey therefore first clarifies the definition of an event and analyzes the differences and connections between events and easily confused concepts such as topics: an event is narrower in scope than a topic, and a single topic may cover multiple similar or related events. Second, from the perspective of social media data types, it analyzes and summarizes the advantages, disadvantages, and applicable scenarios of social media event detection methods. Methods common in traditional media, such as those based on Single-Pass and on bursty terms, are simple in principle and implementation but limited in their applicable scenarios. Clustering-based methods can detect events automatically without supervision, but most implementations are relatively complex and time-consuming. Methods based on social user data can exploit user behavior information to discover hot events in a more timely way, and are a new direction that deserves further research. Finally, future directions for event detection are discussed.
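As an illustration of why the abstract calls Single-Pass methods simple in principle, the following is a minimal sketch of single-pass incremental clustering. It is not the surveyed authors' implementation. The whitespace tokenizer, the raw term-count vectors, the summed-count centroids, the function names, and the threshold value of 0.3 are all assumptions chosen for brevity.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(posts, threshold=0.3):
    """Assign each post, in arrival order, to the most similar cluster.

    A post whose best similarity falls below `threshold` (an assumed
    value) starts a new cluster, treated here as a candidate new event.
    Returns a list of clusters, each a list of post indices.
    """
    centroids = []  # summed term counts per cluster
    clusters = []   # post indices per cluster
    for index, text in enumerate(posts):
        vector = Counter(tokenize(text))
        best_id, best_sim = -1, 0.0
        for cluster_id, centroid in enumerate(centroids):
            sim = cosine(vector, centroid)
            if sim > best_sim:
                best_id, best_sim = cluster_id, sim
        if best_id >= 0 and best_sim >= threshold:
            # Similar enough: join the cluster and fold the post into its centroid.
            clusters[best_id].append(index)
            centroids[best_id].update(vector)
        else:
            # Nothing similar enough: open a new cluster for this post.
            clusters.append([index])
            centroids.append(vector)
    return clusters
```

Each post is compared once against the existing clusters and never revisited, which keeps the method simple but makes the result depend on arrival order and on the chosen threshold. This is one way to read the limitation in applicable scenarios that the abstract mentions.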