Social emotion classification draws many natural language processing researchers’ attention in recent years, since analyzing user-generated emotional documents on the Web is quite useful in recommending products, gathering public opinions, and predicting election results. However, the documents that evoke prominent social emotions are usually mixed with noisy instances, and it is also challenging to capture the textual meaning of short messages. In this work, we focus on reducing the impact of noisy instances and learning a better representation of sentences. For the former, we introduce an “emotional concentration” indicator, which is derived from emotional ratings to weight documents. For the latter, we propose a new architecture named PCNN, which utilizes two cascading convolutional layers to model the word-phrase relation and the phrase–sentence relation. This model regards continuous tokens as phrases based on an assumption that neighboring words are very likely to have internal relations, and semantic feature vectors are generated based on the phrase representation. We also present a Bayesian-based model named WMCM to learn document-level semantic features. Both PCNN and WMCM classify social emotions by capturing semantic regularities in language. Experiments on two real-world datasets indicate that the quality of learned semantic vectors and the performance of social emotion classification can be improved by our models.
Bibliographical noteThis paper is an extended version of our previous conference paper (Li et al., 2016).
- Convolutional neural network
- Emotional concentration
- Social emotion classification
- Topic modeling