An Online Gibbs Sampler Algorithm for Hierarchical Dirichlet Processes Prior
  • Keywords: Topic model; hierarchical Dirichlet processes; mini-batch online algorithm; generalized hierarchical Dirichlet processes
  • Publication title: Lecture Notes in Computer Science
  • Publication year: 2016
  • Volume: 9851
  • Issue: 1
  • Pages: 509-523
  • Full-text size: 599 KB
  • References: 1. Ahmed, A., Xing, E.P.: Timeline: A dynamic hierarchical Dirichlet process model for recovering birth/death and evolution of topics in text stream. arXiv preprint arXiv:1203.3463 (2012)
    2. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
    3. Chae, M., Chung, H., Kang, B., Jeun, W.C., Kim, Y.: Distributed algorithms for hierarchical Dirichlet process via partially collapsed Gibbs sampler, pp. 1–8 (2015)
    4. Ferguson, T.S.: A Bayesian analysis of some nonparametric problems. Ann. Stat. 1, 209–230 (1973)
    5. Heinrich, G.: Infinite LDA implementing the HDP with minimum code complexity. Technical note, Feb 170 (2011)
    6. Hoffman, M.D., Blei, D.M., Bach, F.R.: Online learning for latent Dirichlet allocation. In: NIPS, vol. 2, p. 5 (2010)
    7. Newman, D., Asuncion, A., Smyth, P., Welling, M.: Distributed algorithms for topic models. J. Mach. Learn. Res. 10, 1801–1828 (2009)
    8. Pitman, J.: Combinatorial Stochastic Processes: Ecole D’Eté de Probabilités de Saint-Flour XXXII-2002. Springer (2006)
    9. Ren, L., Dunson, D.B., Carin, L.: The dynamic hierarchical Dirichlet process. In: Proceedings of the 25th International Conference on Machine Learning, pp. 824–831. ACM (2008)
    10. Sethuraman, J.: A constructive definition of Dirichlet priors. Stat. Sinica 4(2), 639–650 (1994)
    11. Teh, Y.W., Jordan, M.I., Beal, M.J., Blei, D.M.: Hierarchical Dirichlet processes. J. Am. Stat. Assoc. 101, 1566–1581 (2006)
    12. Teh, Y.W., Kurihara, K., Welling, M.: Collapsed variational inference for HDP. In: NIPS (2007)
    13. Teh, Y.W., Newman, D., Welling, M.: A collapsed variational Bayesian inference algorithm for latent Dirichlet allocation. NIPS 6, 1378–1385 (2006)
    14. Wang, C., Blei, D.M.: A split-merge MCMC algorithm for the hierarchical Dirichlet process. arXiv preprint arXiv:1201.1657 (2012)
    15. Wang, C., Blei, D.M.: Truncation-free online variational inference for Bayesian nonparametric models. In: Advances in Neural Information Processing Systems, pp. 413–421 (2012)
    16. Wang, C., Paisley, J.W., Blei, D.M.: Online variational inference for the hierarchical Dirichlet process. In: International Conference on Artificial Intelligence and Statistics, pp. 752–760 (2011)
  • Authors and affiliations: Yongdai Kim (17)
    Minwoo Chae (18)
    Kuhwan Jeong (17)
    Byungyup Kang (19)
    Hyoju Chung (19)

    17. Department of Statistics, Seoul National University, Seoul, South Korea
    18. Department of Mathematics, The University of Texas at Austin, Austin, USA
    19. NAVER Corp., Seongnam, South Korea
  • Book title: Machine Learning and Knowledge Discovery in Databases
  • ISBN: 978-3-319-46128-1
  • Publication category: Computer Science
  • Publication subjects: Artificial Intelligence and Robotics
    Computer Communication Networks
    Software Engineering
    Data Encryption
    Database Management
    Computation by Abstract Devices
    Algorithm Analysis and Problem Complexity
  • Publisher: Springer Berlin / Heidelberg
  • ISSN: 1611-3349
  • Volume sort order: 9851
Abstract
The hierarchical Dirichlet process (HDP) is a Bayesian nonparametric model that provides flexible mixed membership for documents. In this paper, we develop a novel mini-batch online Gibbs sampler algorithm for the HDP that can be easily applied to massive and streaming data. For this purpose, we propose a new prior process, the generalized hierarchical Dirichlet process (gHDP). The gHDP is an extension of the standard HDP in which some prespecified topics can be included in the top-level Dirichlet process. By analyzing various datasets, we show that the proposed mini-batch online Gibbs sampler algorithm performs significantly better than the online variational algorithm for the HDP.
