Association of Objects May Engender Stereotypes: Mitigating Association-Engendered Stereotypes in Text-to-Image Generation

Junlei ZHOU, Jiashi GAO, Xiangyu ZHAO, Xin YAO, Xuetao WEI*

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference ProceedingsConference paper (refereed)Researchpeer-review

Abstract

Text-to-Image (T2I) has witnessed significant advancements, demonstrating superior performance for various generative tasks. However, the presence of stereotypes in T2I introduces harmful biases that require urgent attention as the T2I technology becomes more prominent. Previous work for stereotype mitigation mainly concentrated on mitigating stereotypes engendered with individual objects within images, which failed to address stereotypes engendered by the association of multiple objects, referred to as Association-Engendered Stereotypes. For example, mentioning “black people” and “houses” separately in prompts may not exhibit stereotypes. Nevertheless, when these two objects are associated in prompts, the association of “black people” with “poorer houses” becomes more pronounced. To tackle this issue, we propose a novel framework, MAS, to Mitigate Association-engendered Stereotypes. This framework models the stereotype problem as a probability distribution alignment problem, aiming to align the stereotype probability distribution of the generated image with the stereotype-free distribution. The MAS framework primarily consists of the Prompt-Image-Stereotype CLIP (PIS CLIP) and Sensitive Transformer. The PIS CLIP learns the association between prompts, images, and stereotypes, which can establish the mapping of prompts to stereotypes. The Sensitive Transformer produces the sensitive constraints, which guide the stereotyped image distribution to align with the stereotype-free probability distribution. Moreover, recognizing that existing metrics are insufficient for accurately evaluating association-engendered stereotypes, we propose a novel metric, Stereotype-Distribution-Total-Variation (SDTV), to evaluate stereotypes in T2I. Comprehensive experiments demonstrate that our framework effectively mitigates association-engendered stereotypes.

Original languageEnglish
Title of host publicationAdvances in Neural Information Processing Systems 37 (NeurIPS 2024)
EditorsA. GLOBERSON, L. MACKEY, D. BELGRAVE, A. FAN, U. PAQUET, J. TOMCZAK, C. ZHANG
PublisherNeural Information Processing Systems Foundation
Number of pages33
Volume37
ISBN (Electronic)9798331314385
Publication statusPublished - 2024
Event38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada
Duration: 9 Dec 202415 Dec 2024

Publication series

NameAdvances in Neural Information Processing Systems
PublisherNeural information processing systems foundation
ISSN (Print)1049-5258

Conference

Conference38th Conference on Neural Information Processing Systems, NeurIPS 2024
Country/TerritoryCanada
CityVancouver
Period9/12/2415/12/24

Bibliographical note

Publisher Copyright:
© 2024 Neural information processing systems foundation. All rights reserved.

Funding

This work was supported by Key Programs of Guangdong Province under Grant 2021QN02X166. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding parties.

Fingerprint

Dive into the research topics of 'Association of Objects May Engender Stereotypes: Mitigating Association-Engendered Stereotypes in Text-to-Image Generation'. Together they form a unique fingerprint.

Cite this