Resampling-Based Gap Analysis for Detecting Nodes with High Centrality on Large Social Network
  • Authors: Kouzou Ohara (10)
    Kazumi Saito (11)
    Masahiro Kimura (12)
    Hiroshi Motoda (13) (14)

    10. Department of Integrated Information Technology, Aoyama Gakuin University, Kanagawa, Japan
    11. School of Administration and Informatics, University of Shizuoka, Shizuoka, Japan
    12. Department of Electronics and Informatics, Ryukoku University, Shiga, Japan
    13. Institute of Scientific and Industrial Research, Osaka University, Osaka, Japan
    14. School of Computing and Information Systems, University of Tasmania, Hobart, Australia
  • Keywords: Gap analysis; Error estimation; Resampling; Node centrality
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2015
  • Volume: 9077
  • Issue: 1
  • Pages: 135-147
  • Full-text size: 344 KB
  • Book title: Advances in Knowledge Discovery and Data Mining
  • ISBN: 978-3-319-18037-3
  • Subject category: Computer Science
  • Subject topics: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
Abstract
We address the problem of identifying nodes with high centrality values in a large social network when only approximate centrality values, derived from a sample of the network's nodes, are available. More specifically, we detect gaps between nodes at a given confidence level. Nodes are ordered by descending approximate centrality, and a gap is said to exist between two adjacent nodes if it divides the ordered list into two groups such that, at the given confidence level, every node in one group has a higher true centrality value than every node in the other. To this end, we incorporate confidence intervals for the true centrality values and apply a resampling-based framework to estimate these intervals as accurately as possible. Furthermore, we devise an algorithm that detects gaps efficiently in only two passes through the nodes. Experiments on three real-world social networks show that, at the same node coverage ratio, the proposed method detects more gaps than a method based on a standard error estimation framework, and that the resulting gaps allow us to correctly identify a set of nodes with high centrality values.
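
The gap criterion in the abstract can be illustrated with a short sketch. This is not the authors' algorithm, which the record does not reproduce. It assumes that each node already has an approximate centrality value and a confidence interval for its true value (for example, produced by a resampling procedure), and it treats a position in the ranked list as a gap when every lower bound above it exceeds every upper bound below it. The function name `detect_gaps` and the tuple layout `(estimate, lower, upper)` are hypothetical choices made for this illustration.

```python
from typing import List, Tuple

# Hypothetical input layout: (estimate, lower bound, upper bound) per node.
Interval = Tuple[float, float, float]


def detect_gaps(intervals: List[Interval]) -> List[int]:
    """Return every k such that the top-k nodes are separated from the rest.

    A gap after position k means: the smallest lower bound among the top-k
    nodes is strictly greater than the largest upper bound among the others.
    """
    # Rank nodes by descending approximate centrality.
    ranked = sorted(intervals, key=lambda t: t[0], reverse=True)
    n = len(ranked)

    # Pass 1 (top to bottom): running minimum of the lower bounds.
    prefix_min_lower: List[float] = []
    running_min = float("inf")
    for _, lower, _ in ranked:
        running_min = min(running_min, lower)
        prefix_min_lower.append(running_min)

    # Pass 2 (bottom to top): running maximum of the upper bounds,
    # compared against the stored prefix minimum at each cut point.
    gaps: List[int] = []
    running_max_upper = float("-inf")
    for k in range(n - 1, 0, -1):
        running_max_upper = max(running_max_upper, ranked[k][2])
        if prefix_min_lower[k - 1] > running_max_upper:
            gaps.append(k)

    gaps.reverse()  # report cut points from the top of the ranking down
    return gaps


if __name__ == "__main__":
    # Made-up intervals for five nodes, purely for illustration.
    example = [
        (0.90, 0.85, 0.95),
        (0.80, 0.75, 0.86),
        (0.50, 0.45, 0.55),
        (0.48, 0.40, 0.52),
        (0.10, 0.05, 0.15),
    ]
    print(detect_gaps(example))  # [2, 4]
```

In this made-up example, the intervals of the first two nodes overlap, so no gap is reported after position 1, while the top two nodes are cleanly separated from the remaining three. Narrower confidence intervals make such separations more likely, which is consistent with the abstract's emphasis on estimating the intervals as accurately as possible.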
