Parallel Dynamic Topic Modeling via Evolving Topic Adjustment and Term Weighting Scheme

Hongyu JIANG, Zhiqi LEI, Yanghui RAO*, Haoran XIE, Fu Lee WANG

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

3 Citations (Scopus)

Abstract

The parallel Hierarchical Dirichlet Process (pHDP) is an efficient topic model that exploits the equivalence between the generative processes of the Hierarchical Dirichlet Process (HDP) and the Gamma-Gamma-Poisson Process (G2PP) to achieve parallelism at the topic level. Unfortunately, pHDP loses the non-parametric property of HDP: the number of topics in pHDP is predetermined and fixed. Furthermore, under the bootstrap structure of pHDP, topic-indiscriminate words have a high probability of being assigned to different topics, which degrades the quality of the extracted topics. To achieve parallelism without sacrificing the non-parametric property of HDP, and to improve the quality of the extracted topics, we propose a parallel dynamic topic model that introduces an adjustment mechanism for evolving topics and reduces the sampling probabilities of topic-indiscriminate words. Both supervised and unsupervised experiments on benchmark datasets show the competitive performance of our model.
Original language: English
Pages (from-to): 176-193
Number of pages: 18
Journal: Information Sciences
Volume: 585
Early online date: 26 Nov 2021
Publication status: Published - Mar 2022

Bibliographical note

Publisher Copyright:
© 2021 Elsevier Inc.

Funding

The authors are thankful to the reviewers for their constructive comments and suggestions. The research described in this paper was supported by the National Natural Science Foundation of China (61972426), Guangdong Basic and Applied Basic Research Foundation (2020A1515010536), the Direct Grant (DR21A5) and the Faculty Research Grants (DB21B6 and DB21A9) of Lingnan University, Hong Kong, and a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (UGC/FDS16/E01/19).

Keywords

  • Dynamic topic model
  • Parallel Gibbs sampling
  • Term weighting scheme
