Whose Values Prevail? Bias in Large Language Model Value Alignment

Ruoxi QI, Gleb PAPYSHEV, Kellee TSAI, Antoni B. CHAN, Janet H. HSIAO*

*Corresponding author for this work

Research output: Papers in Conference Proceedings › Conference paper (refereed) › peer-review

Abstract

As large language models (LLMs) are increasingly integrated into our lives, concerns have been raised about whether they are biased towards the values of particular cultures. We show that while LLMs were biased toward the values of WEIRD populations, some non-Western populations, including East Asia and Russia, were also represented relatively well. Notably, the Rich dimension, rather than the widely discussed Western dimension, was the strongest predictor of LLMs' alignment. This suggests the need to attend to less prosperous populations instead of focusing only on easily accessible ones. We also found that one source of this bias could be unbalanced training data, as approximated by an Internet Freedom measure, and that prompting the model to act as individuals from different populations reduced the bias but could not eliminate it. These findings highlight the importance of disclosing the training process and of considering culture-specific models to ensure ethical usage of LLMs.
Original language: English
Title of host publication: Proceedings of the 47th Annual Conference of the Cognitive Science Society
Editors: David BARNER, Neil R. BRAMLEY, Azzurra RUGGERI, Caren M. WALKER
Publisher: eScholarship Publishing
Pages: 665-672
Number of pages: 8
Publication status: Published - 2025
Externally published: Yes
Event: 47th Annual Conference of the Cognitive Science Society - San Francisco, United States
Duration: 30 Jul 2025 - 2 Aug 2025

Publication series

Name: Proceedings of the Annual Meeting of the Cognitive Science Society
Publisher: eScholarship Publishing
Volume: 47
ISSN (Electronic): 1069-7977

Conference

Conference: 47th Annual Conference of the Cognitive Science Society
Abbreviated title: CogSci2025
Country/Territory: United States
City: San Francisco
Period: 30/07/25 - 2/08/25

Funding

This study was funded by the Research Grants Council of Hong Kong, Area of Excellence Scheme (Project number AoE/E-601/24-N).

Keywords

  • Large Language Model (LLM)
  • Value Alignment
  • WEIRD Population
