Enhancing Continuous Sign Language Recognition with Self-Attention and MediaPipe Holistic

Yufeng JIANG, Fengheng LI, Zongxi LI*, Ziwei LIU, Zijian WANG

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review

1 Citation (Scopus)

Abstract

Sign language recognition (SLR) is an interdisciplinary application that combines computer vision and natural language processing. This paper focuses on continuous sign language recognition (CSLR), which refers to recognizing a continuous sequence of sign language sentences, phrases, or words expressed in a short video. Currently, most phrase-level CSLR research relies on recurrent neural networks (RNNs). However, RNNs struggle to capture global dependencies and can only model sequential actions. Sign language involves complex spatial patterns formed by hand gestures, facial expressions, and body movements, so the global dependency of the spatial features extracted from the video is crucial for this task. In this paper, we introduce a novel pipeline framework for addressing CSLR. We first employ MediaPipe Holistic to extract key points from sign language videos and convert them into sequential input data that the model can process. To overcome the disadvantages of RNNs, we use the Self-Attention mechanism, which excels at identifying relationships among key points and capturing global dependencies between sign language actions within a sequence. Combining the Self-Attention model with the extracted key points creates a more effective and efficient solution for CSLR. Additionally, we discuss the selection of MediaPipe Holistic key points, as not all key points contribute equally to recognition. Experimental results show that the proposed pipeline exhibits promising performance on the first ten glosses (classes) of the Word-Level American Sign Language (WLASL-10) dataset.
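
The pipeline outlined in the abstract (per-frame key points, flattened into a sequence, then passed through self-attention) can be illustrated with a minimal sketch. This is not the authors' model: the abstract gives no architecture details, so the code below is a generic single-head scaled dot-product self-attention in plain Python. The landmark counts (33 pose, 468 face, 21 per hand) are those reported by MediaPipe Holistic without iris refinement; everything else is an assumption made for illustration, including the dictionary layout of a frame, the use of only (x, y, z) per landmark, the zero-filling of undetected groups, the choice to drop the face group, and the sizes `T` and `d_model`. Random numbers stand in for real key points.

```python
import math
import random

# Landmark counts per frame from MediaPipe Holistic (face mesh without iris refinement).
LANDMARK_COUNTS = {"pose": 33, "face": 468, "left_hand": 21, "right_hand": 21}


def frame_features(landmarks, groups=("pose", "left_hand", "right_hand")):
    """Flatten the selected landmark groups of one frame into a single vector.

    `landmarks` is a hypothetical dict mapping a group name to a list of
    (x, y, z) tuples, or None when the group was not detected. Missing groups
    are zero-filled so that every frame has the same length.
    """
    vec = []
    for group in groups:
        points = landmarks.get(group)
        if points is None:
            vec.extend([0.0] * (3 * LANDMARK_COUNTS[group]))
        else:
            for x, y, z in points:
                vec.extend((x, y, z))
    return vec


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a sequence of frames.

    Returns the attended sequence and the (T, T) attention weights, where
    row i holds how strongly frame i attends to every frame in the clip.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


rng = random.Random(0)
T, d_model = 5, 8  # illustrative sizes, not taken from the paper


def point():
    return (rng.random(), rng.random(), rng.random())


def fake_frame():
    # Synthetic stand-in for one frame of extracted key points;
    # the left hand is marked as undetected to exercise zero-filling.
    return {
        "pose": [point() for _ in range(LANDMARK_COUNTS["pose"])],
        "left_hand": None,
        "right_hand": [point() for _ in range(LANDMARK_COUNTS["right_hand"])],
    }


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


frames = [frame_features(fake_frame()) for _ in range(T)]
d_in = len(frames[0])
w_q, w_k, w_v = (rand_matrix(d_in, d_model) for _ in range(3))
attended, attention = self_attention(frames, w_q, w_k, w_v)
```

Because every frame attends to every other frame in one step, the attention matrix has shape `T` by `T`, which is how self-attention captures dependencies across the whole clip rather than only between neighbouring time steps. Changing the `groups` argument of `frame_features` is one simple way to experiment with key point selection.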

Original language: English
Title of host publication: Proceedings of the 2023 International Conference on Instrumentation, Control, and Automation, ICA 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 97-102
Number of pages: 6
ISBN (Electronic): 9798350301274
ISBN (Print): 2379-755X
Publication status: Published - 2023
Externally published: Yes
Event: 8th International Conference on Instrumentation, Control, and Automation, ICA 2023 - Jakarta, Indonesia
Duration: 9 Aug 2023 to 11 Aug 2023

Conference

Conference: 8th International Conference on Instrumentation, Control, and Automation, ICA 2023
Country/Territory: Indonesia
City: Jakarta
Period: 9/08/23 to 11/08/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Continuous Sign Language Recognition
  • MediaPipe
  • Self-Attention
  • Sign Language Recognition
  • Video Classification
