Neural Mixed Counting Models for Dispersed Topic Discovery

Jiemin WU, Yanghui RAO*, Zusheng ZHANG, Haoran XIE, Qing LI, Fu Lee WANG, Ziye CHEN

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed)

Abstract

Mixed counting models that use the negative binomial distribution as the prior can effectively model over-dispersed and hierarchically dependent random variables; thus they have attracted much attention for mining dispersed document topics. However, existing parameter inference methods such as Monte Carlo sampling are quite time-consuming. In this paper, we propose two efficient neural mixed counting models, i.e., the Negative Binomial-Neural Topic Model (NB-NTM) and the Gamma Negative Binomial-Neural Topic Model (GNB-NTM), for dispersed topic discovery. Neural variational inference algorithms are developed to infer model parameters by using the reparameterization of the Gamma distribution and the Gaussian approximation of the Poisson distribution. Experiments on real-world datasets indicate that our models outperform state-of-the-art baseline models in terms of perplexity and topic coherence. The results also validate that both NB-NTM and GNB-NTM can produce explainable intermediate variables by generating dispersed proportions of document topics.
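The abstract's two reparameterization devices can be illustrated in isolation. The sketch below is not the authors' implementation; it is a minimal NumPy illustration, assuming the common Marsaglia-Tsang transform for the Gamma distribution (which makes a Gamma draw a smooth function of a standard normal draw, so gradients can flow through it) and the standard Poisson(λ) ≈ N(λ, λ) approximation for moderately large λ. Function names here are illustrative, not from the paper.

```python
import numpy as np

def sample_gamma_reparam(alpha, rng, n=1):
    # Marsaglia-Tsang transform for shape alpha >= 1: a Gamma(alpha, 1)
    # sample is expressed as a smooth function of a standard normal draw,
    # which is what enables pathwise (reparameterization) gradients.
    eps = rng.standard_normal(n)
    d = alpha - 1.0 / 3.0
    c = 1.0 / np.sqrt(9.0 * d)
    return d * (1.0 + c * eps) ** 3

def sample_poisson_gaussian(lam, rng, n=1):
    # Gaussian approximation Poisson(lam) ~ N(lam, lam), written in
    # reparameterized form; clipped at zero since counts are non-negative.
    eps = rng.standard_normal(n)
    return np.maximum(lam + np.sqrt(lam) * eps, 0.0)

rng = np.random.default_rng(0)
gamma_draws = sample_gamma_reparam(5.0, rng, n=100_000)
poisson_draws = sample_poisson_gaussian(20.0, rng, n=100_000)
print(gamma_draws.mean())    # close to the shape parameter, 5.0
print(poisson_draws.mean())  # close to the rate parameter, 20.0
```

Because both samplers are deterministic functions of Gaussian noise, plugging them into a stochastic-gradient training loop avoids the Monte Carlo sampling cost the abstract mentions; frameworks such as PyTorch expose the same idea via `rsample`.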
Original language: English
Title of host publication: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Editors: Dan JURAFSKY, Joyce CHAI, Natalie SCHLUTER, Joel TETREAULT
Publisher: Association for Computational Linguistics (ACL)
Pages: 6159-6169
Number of pages: 11
Publication status: Published - Jul 2020

Bibliographical note

The first two authors contributed equally to this work, which was completed while Jiemin Wu was a final-year undergraduate student.

We are grateful to the reviewers for their constructive comments and suggestions on this study. This work has been supported by the National Natural Science Foundation of China (61972426), Guangdong Basic and Applied Basic Research Foundation (2020A1515010536), HKIBS Research Seed Fund 2019/20 (190-009), the Research Seed Fund (102367), and LEO Dr David P. Chan Institute of Data Science of Lingnan University, Hong Kong. This work has also been supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (UGC/FDS16/E01/19), Hong Kong Research Grants Council through a General Research Fund (project no. PolyU 1121417), and by the Hong Kong Polytechnic University through a start-up fund (project no. 980V).


  • Cite this

    WU, J., RAO, Y., ZHANG, Z., XIE, H., LI, Q., WANG, F. L., & CHEN, Z. (2020). Neural Mixed Counting Models for Dispersed Topic Discovery. In D. JURAFSKY, J. CHAI, N. SCHLUTER, & J. TETREAULT (Eds.), Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (pp. 6159-6169). Association for Computational Linguistics (ACL). https://www.aclweb.org/anthology/2020.acl-main.548.pdf