On the Predictive Power of Web Intelligence and Social Media
  • Keywords: Web intelligence; Open-source intelligence; Web and social media mining; Twitter analysis; Forecasting; Event extraction; Temporal analytics; Sentiment analysis
  • Journal title: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9546
  • Issue: 1
  • Pages: 26-45
  • Full-text size: 900 KB
  • Author affiliation: Nathan Kallus (18)

    18. Massachusetts Institute of Technology, 77 Massachusetts Ave E40-149, Cambridge, MA, 02139, USA
  • Book title: Big Data Analytics in the Social and Ubiquitous Context
  • ISBN: 978-3-319-29009-6
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
With more information becoming widely accessible and new content being created every day on today’s web, more people are turning to harvesting such data and analyzing it to extract insights. But how relevant such data is for seeing beyond the present is not clear. We present efforts to predict future events based on web intelligence (data harvested from the web), with specific emphasis on social media data and on timed event mentions, thereby quantifying the predictive power of such data. We focus on predicting crowd actions such as large protests and coordinated acts of cyber activism: their occurrence, specific timeframe, and location. Using natural language processing, statements about events are extracted from content collected from hundreds of thousands of open-content web sources. Attributes extracted include event type, entities involved and their roles, sentiment and tone, and, most crucially, the reported timeframe for the occurrence of the event discussed, whether it be in the past, present, or future. Tweets (Twitter posts) that mention an event reported to occur in the future prove to be important predictors. These signals are enhanced by cross-referencing with the fragility of the situation as inferred from more traditional media, allowing us to sift out the social media trends that fizzle out before materializing as crowds on the ground.
