Abstract
In this study, we developed 'SACMR: Sentiment Analysis in Chinese Language using Modified RoBERTa', a modified model focusing on Chinese sentiment analysis. To address the deficiencies of existing Chinese large language models (LLMs) in sentiment analysis, we employed the RoBERTa-wwm-ext model, which is specifically pre-trained for Chinese, and fine-tuned it for sentiment detection in Chinese text. Experiments were conducted on the Chinese portion of the multilingual-sentiment-datasets, randomly split into training, validation, and test sets in an 8:1:1 ratio. We also compared our approach with a random forest classifier and a DistilBERT model designed for sentiment analysis. The experimental results show that SACMR outperformed the other methods by up to 25% in accuracy and 39% in F1 score, highlighting the importance of pre-training and fine-tuning specifically for the Chinese context.
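The 8:1:1 random split used in the experiments can be sketched as below; this is a minimal illustration of the splitting step only, and the function name `split_811` and the fixed seed are illustrative assumptions, not details from the paper:

```python
import random

def split_811(items, seed=42):
    """Randomly split a sequence into train/val/test in an 8:1:1 ratio.

    `seed` is fixed here only to make the sketch reproducible; the paper
    does not specify a seed.
    """
    items = list(items)
    rng = random.Random(seed)
    rng.shuffle(items)  # random splitting, as described in the abstract
    n = len(items)
    n_train = int(n * 0.8)
    n_val = int(n * 0.1)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train, val, test = split_811(range(1000))
```

In practice the resulting splits would feed a fine-tuning loop over a Chinese-pre-trained RoBERTa checkpoint (e.g. via the Hugging Face `transformers` library), but that training code is dataset- and hardware-specific and is omitted here.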
| Original language | English |
|---|---|
| Title of host publication | 2024 IEEE 9th International Conference on Computational Intelligence and Applications, ICCIA 2024 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 84-88 |
| Number of pages | 5 |
| ISBN (Electronic) | 9798350352214 |
| DOIs | |
| Publication status | Published - 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2024 IEEE.
Funding
The work described in this paper was substantially supported by the National Natural Science Foundation of China (Grant No. 62202128) and by a grant from Hainan University (Grant No. KYQD(ZR)23125).
Keywords
- Chinese sentiment analysis
- DistilBERT
- Performance evaluation
- Random Forest
- RoBERTa