Abstract
Limited labeled data remain a key challenge for aspect category sentiment analysis (ACSA). While recent studies leverage large language models (LLMs) with handcrafted prompts for data augmentation, these approaches often fail to preserve the semantics of the original text. We introduce a semantics-preserved, linguistically diverse data augmentation approach for ACSA that employs structured prompt templates to guide LLMs in generating predefined content. To further enhance semantic consistency, a cosine-similarity-based filtering mechanism ensures that augmented sentences remain faithful to their original meanings. Beyond data augmentation, we propose a reliability-aware fine-tuning strategy that reweights the training objective using a reliability score that combines token-level correctness and sequence-level confidence. Experimental results demonstrate that our method improves performance across benchmark datasets compared with strong baselines.
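The cosine-similarity-based filtering step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for whatever sentence encoder the paper uses, and the function names and the `threshold` value of 0.5 are hypothetical choices made only for illustration.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    """Toy bag-of-words vector (assumption: stand-in for a real sentence encoder)."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine_similarity(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty sentence is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


def filter_augmented(original: str, candidates: list[str],
                     threshold: float = 0.5) -> list[str]:
    """Keep only candidates whose similarity to the original meets the threshold.

    The threshold is a hypothetical value, not one reported in the paper.
    """
    ref = embed(original)
    return [c for c in candidates if cosine_similarity(ref, embed(c)) >= threshold]


original = "The pasta was delicious but the service was slow"
candidates = [
    "The pasta was tasty but the service was slow",  # close paraphrase
    "I love sunny weather",                          # unrelated sentence
]
kept = filter_augmented(original, candidates)
```

With these inputs, the paraphrase is kept and the unrelated sentence is discarded. In practice a dense sentence encoder would replace `embed`, since word overlap alone cannot recognize paraphrases that share few surface tokens.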
| Original language | English |
|---|---|
| Article number | 113629 |
| Journal | Pattern Recognition |
| Volume | 179 |
| Issue number | Part B |
| Early online date | 2 Apr 2026 |
| Publication status | E-pub ahead of print - 2 Apr 2026 |
Bibliographical note
Publisher Copyright: © 2026 The Authors
Funding
The research described in this article has been supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (R1015-23); the Research Impact Fund by the Research Grants Council of Hong Kong (Project No. 130272); and the Interdisciplinary & Strategic Research Grant (ISRG252606), the Faculty Research Grants (SDS24A8, SDS25A15 and SDS24A19), and the Direct Grants (DR25E8 and DR26F2) of Lingnan University, Hong Kong.
Keywords
- Aspect category sentiment analysis
- Data augmentation
- Large language models
- Reliability-aware fine-tuning
Fingerprint
Dive into the research topics of 'Semantic-preserved Augmentation with Reliability-aware Fine-tuning for Aspect Category Sentiment Analysis'. Together they form a unique fingerprint.
Projects
- 6 Active
- LLM-Augmented News Terminal: Entity Annotations, Explanation Quality, and Interest‑Aware Conversational Recommendation
  1/01/26 → 31/12/27
  Project: Grant Research
- Large Language Models for Aspect-Based Sentiment Analysis
  XIE, H. (PI)
  1/01/26 → 31/12/27
  Project: Grant Research
- An Integrated Fake Financial News Detection Framework: Knowledge Graph, Large Language Models, Uncertainty Modeling, and Contrastive Learning
  XIE, H. (PI)
  1/07/25 → 30/06/27
  Project: Grant Research