Determining the Topic Hashtags for Chinese Microblogs Based on 5W Model
  • Keywords: Hashtag; Microblogs; Topic detection; Short message news; 5W model
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9784
  • Issue: 1
  • Pages: 55-67
  • Authors and affiliations: Zhibin Zhao (18)
    Jiahong Sun (18)
    Zhenyu Mao (18)
    Shi Feng (18)
    Yubin Bao (18)

    18. School of Computer Science and Engineering, Northeastern University, 3-11 Wenhua Road, Heping District, Shenyang, 110819, China
  • Book title: Big Data Computing and Communications
  • ISBN: 978-3-319-42553-5
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
A hashtag is an important piece of metadata in microblogs, used to mark topics or index messages. With topic-related hashtags, microblogs are well grouped, and users can retrieve them efficiently and then follow the conversations that interest them. At the same time, microblogging service providers can leverage hashtags to classify massive volumes of microblogs when building high-level applications such as event detection and tracking, sentiment analysis, and opinion mining. However, statistics show that hashtags are absent from most microblogs. In this paper, we summarize the similarities between microblogs and short-message-style news, and then propose an algorithm named 5WTAG for detecting microblog topics based on the model of the five Ws (When, Where, Who, What, hoW). Since the five-W (5W) attributes are the core components of an event description, it is theoretically guaranteed that 5WTAG can properly extract the semantic topic from a microblog. We introduce the detailed procedure of 5WTAG, including microblog segmentation and candidate hashtag construction. We propose a novel method of computing recommendation values for ranking candidate hashtags, which combines syntactic and semantic analysis and follows the distribution law of human-annotated topic tags. Using real data from Sina Weibo, we conduct comprehensive experiments to verify the semantic correctness and completeness of the candidate hashtags as well as the accuracy of the recommendations.
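
The candidate construction and ranking steps mentioned in the abstract can be pictured with a small sketch. This is not the authors' 5WTAG implementation: the abstract gives no formulas, so the slot ordering, the weights, the length penalty, and the function names below are all hypothetical, and the segmentation step is skipped by assuming the 5W attributes have already been extracted from the microblog.

```python
from itertools import combinations

# Hypothetical ordering of the 5W slots inside a hashtag (not from the paper).
HASHTAG_ORDER = ("who", "what", "where", "when", "how")

# Hypothetical weights standing in for the paper's recommendation computation.
SLOT_WEIGHTS = {"who": 3.0, "what": 3.0, "where": 2.0, "when": 1.0, "how": 1.0}

# Hypothetical penalty per extra slot, so that longer hashtags are not always preferred.
LENGTH_PENALTY = 2.5


def build_candidates(slots, max_parts=3):
    """Combine the filled 5W slots into (hashtag text, score) candidates."""
    filled = [(name, slots[name]) for name in HASHTAG_ORDER if slots.get(name)]
    candidates = []
    for size in range(1, min(max_parts, len(filled)) + 1):
        for combo in combinations(filled, size):
            text = " ".join(value for _, value in combo)
            score = sum(SLOT_WEIGHTS[name] for name, _ in combo)
            score -= LENGTH_PENALTY * (size - 1)
            candidates.append((text, score))
    return candidates


def rank_candidates(candidates):
    """Sort candidates by descending score, breaking ties alphabetically."""
    return sorted(candidates, key=lambda c: (-c[1], c[0]))


# Assumed output of an upstream 5W extraction step for one microblog.
example_slots = {
    "who": "Team A",
    "what": "wins final",
    "where": "Shenyang",
    "when": "",
    "how": None,
}
ranked = rank_candidates(build_candidates(example_slots))
```

With these made-up weights, the top-ranked candidate for the example pairs the "who" and "what" slots; a real system would replace the fixed weights with scores derived from syntactic and semantic analysis, as the abstract describes.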
