The Elephant in the Room: Exploring the Role of Neutral Words in Language Model Group-Agnostic Debiasing

  • Xinwei GUO
  • Jiashi GAO
  • Junlei ZHOU
  • Jiaxin ZHANG
  • Guanhua CHEN
  • Xiangyu ZHAO
  • Quanying LIU
  • Haiyan WU
  • Xin YAO
  • Xuetao WEI*

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Referred Conference Paper › peer-review

Abstract

Large language models (LLMs) are increasingly integrated into our daily lives, raising significant ethical concerns, especially about perpetuating stereotypes. While group-specific debiasing methods have made progress, they often fail to address multiple biases simultaneously. In contrast, group-agnostic debiasing has the potential to mitigate a variety of biases at once, but remains underexplored. In this work, we investigate the role of neutral words, the group-agnostic component, in enhancing the group-agnostic debiasing process. We first reveal that neutral words are essential for preserving semantic modeling, and we propose ϵ-DPCE, a method that incorporates a neutral word semantics-based loss function to effectively alleviate the deterioration of the Language Modeling Score (LMS) during the debiasing process. Furthermore, by introducing the SCM-Projection method, we demonstrate that SCM-based debiasing eliminates stereotypes by indirectly disrupting the association between attribute and neutral words in the Stereotype Content Model (SCM) space. Our experiments show that neutral words, which often embed multi-group stereotypical objects, play a key role in contributing to the group-agnostic nature of SCM-based debiasing.

Original language: English
Title of host publication: The 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025: Proceedings
Editors: Wanxiang CHE, Joyce NABENDE, Ekaterina SHUTOVA, Mohammad Taher PILEHVAR
Publisher: Association for Computational Linguistics (ACL)
Pages: 20360-20371
Number of pages: 12
ISBN (Electronic): 9798891762565
DOIs
Publication status: Published - 2025
Event: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025 - Vienna, Austria
Duration: 27 Jul 2025 → 1 Aug 2025

Publication series

Name: Findings of the Association for Computational Linguistics
Publisher: Association for Computational Linguistics
Volume: ACL 2025
ISSN (Print): 0736-587X

Conference

Conference: 63rd Annual Meeting of the Association for Computational Linguistics, ACL 2025
Country/Territory: Austria
City: Vienna
Period: 27/07/25 → 1/08/25

Bibliographical note

Publisher Copyright:
© 2025 Association for Computational Linguistics.

Funding

This work was supported in part by the Key Program of Guangdong Province under Grant 2021QN02X166, and in part by the National Natural Science Foundation of China (Project No. 72031003).
