Abstract
Latent Dirichlet allocation (LDA) is a popular topic modeling method that has found many multimedia applications, such as motion analysis and image categorization. Communication cost is one of the main bottlenecks for large-scale parallel learning of LDA. To reduce communication cost, we exploit Zipf's law and propose novel parallel LDA algorithms that communicate only a small amount of important information at each learning iteration. The proposed algorithms are much more efficient than the current state-of-the-art algorithms in both communication and computation cost. Extensive experiments on large-scale data sets demonstrate that our algorithms greatly reduce communication and computation costs and achieve better scalability.
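The core intuition can be illustrated with a small sketch (not the authors' exact algorithm): when the magnitudes of per-iteration updates to the global word-topic counts follow a Zipf-like distribution, communicating only the top-k largest entries preserves most of the total update mass. The `top_k_sparsify` helper and the simulated update vector below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one iteration's update to the global word-topic counts.
# Under Zipf's law, a few entries carry most of the update mass:
# the r-th largest magnitude is proportional to 1/r.
ranks = np.arange(1, 10001)
updates = rng.permutation(1.0 / ranks)  # Zipf-like magnitudes, shuffled

def top_k_sparsify(delta, k):
    """Keep only the k largest-magnitude entries; zero out the rest."""
    idx = np.argsort(np.abs(delta))[-k:]
    sparse = np.zeros_like(delta)
    sparse[idx] = delta[idx]
    return sparse

k = 100  # communicate only 1% of the 10,000 entries
sparse_updates = top_k_sparsify(updates, k)

retained = np.abs(sparse_updates).sum() / np.abs(updates).sum()
# With Zipf-distributed magnitudes, the top 1% of entries already
# cover over half of the total update mass (harmonic-sum ratio
# H(100)/H(10000) ~ 0.53), so communication shrinks ~100x while
# most of the information is preserved.
print(f"fraction of update mass communicated: {retained:.2f}")
```

In a distributed setting, each worker would send only the nonzero entries (index-value pairs) of its sparsified update, cutting per-iteration communication roughly in proportion to k while losing little of the aggregate count information.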