Context Reinforced Neural Topic Modeling over Short Texts

Jiachun FENG, Zusheng ZHANG, Cheng DING, Yanghui RAO*, Haoran XIE, Fu Lee WANG

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

10 Citations (Scopus)

Abstract

Neural topic modeling, one of the prevalent topic mining methods, has attracted considerable interest owing to its low training cost and strong generalisation ability. However, existing neural topic models may suffer from feature sparsity when applied to short texts, because each message provides little context. To alleviate this issue, we propose a Context Reinforced Neural Topic Model (CRNTM) with two main characteristics. First, by assuming that each short text covers only a few salient topics, CRNTM infers the topic of each word within a narrow range. Second, the model exploits pre-trained word embeddings by treating topics as multivariate Gaussian distributions or Gaussian mixture distributions in the embedding space. Extensive experiments on two benchmark short-text corpora validate the effectiveness of the proposed model on both topic discovery and text classification.
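
The two ideas in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. The function names, the diagonal covariance, the random stand-in embeddings, and the top-k truncation used to model "a few salient topics" are all assumptions made only for illustration.

```python
import numpy as np

def gaussian_log_density(embeddings, mean, var):
    """Log density of each embedding row under a diagonal Gaussian N(mean, diag(var))."""
    diff = embeddings - mean
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var, axis=1)

def topic_word_distributions(embeddings, means, variances):
    """Normalise per-topic Gaussian densities over the vocabulary.

    Returns a (num_topics, vocab_size) matrix whose rows each sum to 1.
    """
    log_dens = np.stack(
        [gaussian_log_density(embeddings, m, v) for m, v in zip(means, variances)]
    )
    log_dens -= log_dens.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    probs = np.exp(log_dens)
    return probs / probs.sum(axis=1, keepdims=True)

def restrict_to_salient_topics(doc_topic, k):
    """Assumed simplification: keep the k largest topic weights and renormalise."""
    keep = np.argsort(doc_topic)[-k:]
    mask = np.zeros_like(doc_topic)
    mask[keep] = 1.0
    restricted = doc_topic * mask
    return restricted / restricted.sum()

rng = np.random.default_rng(0)
vocab_size, dim, num_topics = 6, 4, 3
embeddings = rng.normal(size=(vocab_size, dim))  # stand-in for pre-trained word embeddings
means = rng.normal(size=(num_topics, dim))       # hypothetical topic means
variances = np.ones((num_topics, dim))           # hypothetical diagonal covariances

beta = topic_word_distributions(embeddings, means, variances)
theta = restrict_to_salient_topics(np.array([0.5, 0.1, 0.4]), k=2)
word_probs = theta @ beta  # word distribution of one short text
```

Here `beta` holds one word distribution per topic, and `theta` keeps only the two strongest topics of a hypothetical short text, so every word in that text is explained by a small set of topics.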
Original language: English
Pages (from-to): 79-91
Number of pages: 13
Journal: Information Sciences
Volume: 607
Early online date: 1 Jun 2022
Publication status: Published - Aug 2022

Bibliographical note

Funding Information:
The research described in this paper was supported by the National Natural Science Foundation of China (61972426), Guangdong Basic and Applied Basic Research Foundation (2020A1515010536), the Direct Grant (DR22A2) and the Faculty Research Grants (DB22B4 and DB22B7) of Lingnan University, Hong Kong, and a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (UGC/FDS16/E01/19).

Publisher Copyright:
© 2022 Elsevier Inc.

Keywords

  • Context reinforcement
  • Neural topic model
  • Short texts
