Text classification requires a deep understanding of the linguistic features in text; in particular, the intra-sentential (local) and inter-sentential features (global). Models that operate on word sequences have been successfully used to capture the local features, yet they are not effective in capturing the global features in long-text. We investigate graph-level extensions to such models and propose a novel architecture for combining alternative text features. It uses an attention mechanism to dynamically decide how much information to use from a sequence- or graph-level component. We evaluated different architectures on a range of text classification datasets, and graph-level extensions were found to improve performance on most benchmarks. In addition, the attention-based architecture, as adaptively-learned from the data, outperforms the generic and fixed-value concatenation ones.
|Title of host publication||Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’20)|
|Publisher||Association for Computing Machinery (ACM)|
|Publication status||Published - 25 Jul 2020|