AI-Driven Automated Language Assessment of Picture Writing Tasks

Ruibin ZHAO, Yipeng ZHUANG, Di ZOU, Qin XIE, Philip L.H. YU

Research output: Book Chapters | Papers in Conference Proceedings > Conference (Extended Abstracts) > peer-review

Abstract

An automatic scoring tool for students’ writing tasks is highly valuable when assessing language learning progress: it reduces teachers’ workload and shortens the time needed to give students feedback. For these reasons, researchers have long studied automatic writing assessment, and various automated scoring tools have been developed over the years. However, to our knowledge, most previous studies evaluated writing quality by extracting language-related features with natural language processing (NLP), so the resulting scoring tools can only be used for text-based writing tasks such as composition and story writing.

Picture writing is commonly used as a language assessment task, particularly for K-12 students. A typical picture writing task asks students to write a sentence describing a given picture. Assessing such a task involves two information modalities, a picture and its textual description, so the text-based scoring tools mentioned above are not applicable. To address this need, we proposed an efficient AI-driven automated scoring tool. Given a picture and a textual answer, the method estimated the similarities between them using cross-modal matching AI models, and these similarities served as indices of how well the answer describes the picture. In addition, NLP algorithms were used to extract indices measuring the answer’s grammar, spelling, fluency, and sentence structure. Based on these two types of indices, we developed an automated scoring model for picture writing tasks.
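The two types of indices described above can be illustrated with a minimal sketch. It assumes that the picture and the answer have already been encoded as vectors by some cross-modal model and that the language indices are already computed; the abstract does not name the models or the similarity measure, so cosine similarity, the function names, and all numbers below are hypothetical placeholders rather than the authors' method.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_features(image_vec, text_vec, language_indices):
    """Combine one picture-answer similarity with the language indices.

    The language indices are appended in alphabetical key order so that
    every answer yields features in the same positions.
    """
    similarity = cosine_similarity(image_vec, text_vec)
    return [similarity] + [language_indices[k] for k in sorted(language_indices)]


# Hypothetical embeddings and indices, made up for illustration only.
image_vec = [0.2, 0.7, 0.1]
text_vec = [0.25, 0.65, 0.05]
language_indices = {"grammar": 0.9, "spelling": 1.0, "fluency": 0.8, "structure": 0.7}

features = build_features(image_vec, text_vec, language_indices)
```

The resulting feature list (one similarity followed by fluency, grammar, spelling, and structure) is the kind of input a scoring model could be trained on.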

In this study, we designed a picture writing test consisting of 15 tasks, each of which requires students to write a sentence describing a picture, and administered the test in K-12 schools in Mainland China. In total, nearly 4,000 sentences were written by Grade 7-8 students. Each valid sentence was graded by language experts against a set of six grading rubrics, including comprehensiveness, vividness, grammar, and spelling, yielding a total score ranging from 0 to 10. The sentences and their scores were used to develop the AI-driven automated scoring model. Various algorithms were considered for model building, including support vector machines, regression, neural networks, decision trees, and ensemble methods.
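As a rough illustration of the regression option listed above, the sketch below fits ordinary least squares on a single index and clamps predictions to the 0 to 10 score range. The abstract does not say which algorithm was finally selected or how it was configured; the one-feature setup, the function names, and the toy data are assumptions made only for this example.

```python
def fit_simple_ols(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares on one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_score(x, slope, intercept, low=0.0, high=10.0):
    """Predict a score and clamp it to the rubric range [low, high]."""
    return max(low, min(high, slope * x + intercept))


# Toy training pairs (similarity index, expert score); not data from the study.
train_x = [0.1, 0.3, 0.5, 0.7, 0.9]
train_y = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_simple_ols(train_x, train_y)
held_out_prediction = predict_score(0.6, slope, intercept)
```

A real model would use all indices at once and compare several algorithm families on held-out data; the single-feature fit only shows the shape of the training and prediction steps.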

We evaluated the scoring model with several popular measures, including mean absolute error and the exact and adjacent agreement rates. On a hold-out test set, the model obtained a small mean absolute error of 0.506 and a high adjacent agreement rate of 90.8%, demonstrating that the proposed AI model can score accurately. Grading language assignments is inherently subjective and can be time-consuming; we therefore believe that our AI model could reduce the subjective elements and save teachers’ time, so that teachers can objectively identify students’ strengths and weaknesses and help improve their performance in language learning.
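The evaluation measures named above can be sketched as follows. The abstract does not define the agreement rates, so this example assumes a common convention: predictions are rounded to the nearest integer, exact agreement means the rounded prediction equals the expert score, and adjacent agreement means it differs by at most one point. The scores below are toy values, not results from the study.

```python
def mean_absolute_error(true_scores, predicted_scores):
    """Average absolute difference between expert and predicted scores."""
    total = sum(abs(t - p) for t, p in zip(true_scores, predicted_scores))
    return total / len(true_scores)


def agreement_rate(true_scores, predicted_scores, tolerance):
    """Share of rounded predictions within `tolerance` points of the expert score.

    tolerance=0 gives the exact agreement rate; tolerance=1 gives the
    adjacent agreement rate (under the convention assumed here).
    """
    hits = sum(
        1
        for t, p in zip(true_scores, predicted_scores)
        if abs(t - round(p)) <= tolerance
    )
    return hits / len(true_scores)


# Toy expert scores and model predictions, for illustration only.
expert = [7, 5, 9, 3]
predicted = [6.6, 5.2, 7.4, 3.9]

mae = mean_absolute_error(expert, predicted)
exact = agreement_rate(expert, predicted, tolerance=0)
adjacent = agreement_rate(expert, predicted, tolerance=1)
```

On these toy values the mean absolute error is about 0.775, with exact agreement on two of four answers and adjacent agreement on three of four.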
Original language: English
Title of host publication: Proceedings of the 1st APSCE International Conference on Future Language Learning (ICFULL) 2022
Editors: Yanjie SONG, Maiga CHANG, Ting-Chia HSU, Hiroaki OGATA, Yun WEN
Publisher: The Education University of Hong Kong
Pages: 62
Number of pages: 1
ISBN (Print): 9789888636853
Publication status: Published - Jul 2022
Externally published: Yes
Event: The 1st APSCE International Conference on Future Language Learning 2022 - The Education University of Hong Kong, Hong Kong
Duration: 1 Jul 2022 to 3 Jul 2022
https://www.eduhk.hk/ICFULL2022/index.html

Conference

Conference: The 1st APSCE International Conference on Future Language Learning 2022
Country/Territory: Hong Kong
Period: 1/07/22 to 3/07/22

Keywords

  • automated scoring
  • picture writing
  • image-text matching
  • natural language processing
