Automatic classification of Chinese programming MOOC reviews using fine-tuned BERTs and GPT-augmented data

Xieling CHEN, Haoran XIE*, Di ZOU, Lingling XU, Fu Lee WANG

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

Abstract

In massive open online course (MOOC) environments, computer-based analysis of course reviews enables instructors and course designers to develop intervention strategies and improve instruction in support of learning. This study aimed to identify, automatically and effectively, the topics that concern learners in their written reviews. First, we examined the distribution of topics in 13,660 reviews of a Chinese programming MOOC and identified “instructional skills,” “perceived course value,” “instructor characteristics,” and “perceived course difficulty” as learners’ primary concerns. Second, we proposed a GPTaug-BERT model that integrates fine-tuned bidirectional encoder representations from Transformers (BERT) models with augmented data generated using generative pre-trained Transformers (GPT), and applied it to classify these topics automatically. Results showed that, compared with machine learning and other deep learning architectures, the GPTaug-BERT model improved the F1 scores of the MOOC review topic recognition task by 7%. Third, we compared the GPTaug-BERT model with the BERT-Chinese model in distinguishing between topics. The GPTaug-BERT model performed better, with an accuracy above 67% across all categories, including “online programming tools,” “feedback and problem-solving,” and “course structure,” which the BERT-Chinese model largely misclassified. The findings offer insights into the effectiveness of combining fine-tuned BERT models with GPT-augmented data for accurate topic identification from MOOC reviews.
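The abstract reports results as F1 scores on a multilabel topic recognition task. As a minimal sketch of how such scores can be computed, the Python below calculates per-label F1 and their unweighted (macro) average for reviews that may carry several topics at once. The topic names come from the abstract; the toy gold and predicted labels, the function names, and the choice of macro averaging are illustrative assumptions, since the abstract does not state which averaging the authors used.

```python
from typing import Dict, List, Set

# Topic names taken from the abstract; the subset chosen here is arbitrary.
TOPICS = ["instructional skills", "perceived course value", "course structure"]


def per_label_f1(
    gold: List[Set[str]], pred: List[Set[str]], labels: List[str]
) -> Dict[str, float]:
    """Return the F1 score of each label over paired gold/predicted label sets."""
    scores: Dict[str, float] = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if label in g and label in p)
        fp = sum(1 for g, p in zip(gold, pred) if label not in g and label in p)
        fn = sum(1 for g, p in zip(gold, pred) if label in g and label not in p)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the label never occurs.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-label F1 scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


# Hypothetical toy data: each set holds the topics attached to one review.
gold = [
    {"instructional skills"},
    {"perceived course value", "course structure"},
    {"course structure"},
    {"instructional skills", "perceived course value"},
]
pred = [
    {"instructional skills"},
    {"perceived course value"},
    {"course structure", "instructional skills"},
    {"instructional skills"},
]

scores = per_label_f1(gold, pred, TOPICS)
macro = macro_f1(scores)
```

Per-label scores make it visible when a rare topic is poorly recognized even though the average looks acceptable, which is the kind of gap that data augmentation for under-represented categories is meant to narrow.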

Original language: English
Pages (from-to): 230-249
Number of pages: 20
Journal: Educational Technology and Society
Volume: 28
Issue number: 1
Publication status: Published - Jan 2025

Bibliographical note

Publisher Copyright:
© (2025), (International Forum of Educational Technology and Society). All rights reserved.

Funding

This work was supported by the National Natural Science Foundation of China (No. 62307010), the Philosophy and Social Science Planning Project of Guangdong Province of China (Grant No. GD24XJY17), and the Faculty Research Grant (DB24C5) of Lingnan University, Hong Kong.

Keywords

  • BERT
  • Data augmentation
  • GPT
  • Massive open online courses
  • Multilabel classification
