Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities, surpassing human experts on a range of benchmarks and playing a vital role across industry sectors. Despite their effectiveness, a notable drawback of LLMs is their inconsistent moral behavior, which raises ethical concerns. This work examines symmetric moral consistency in LLMs and demonstrates that modern LLMs are not sufficiently consistent in moral scenarios. Our extensive investigation of twelve popular LLMs reveals that their measured consistency scores are influenced by position bias and selection bias rather than by their intrinsic abilities. We propose a new framework, tSMC, which gauges the effects of these biases and effectively mitigates their impact using the Kullback-Leibler divergence, in order to pinpoint LLMs' mitigated Symmetric Moral Consistency. We find that the ability of LLMs to maintain consistency varies across moral scenarios. Specifically, LLMs are more consistent in scenarios with clear moral answers than in those where no choice is morally perfect. The average consistency score of the twelve LLMs ranges from 60.7% in high-ambiguity moral scenarios to 84.8% in low-ambiguity moral scenarios.
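The two quantities the abstract relies on, consistency under option reordering and the Kullback-Leibler divergence, can be illustrated with a minimal sketch. This is not the tSMC framework itself, whose details are not given here. It assumes a hypothetical setup in which each two-option scenario is shown to a model twice, once in each option order, and the function names `symmetric_consistency` and `kl_divergence` are illustrative only.

```python
import math


def symmetric_consistency(original_choices, swapped_choices):
    """Fraction of scenarios in which the same underlying action is chosen
    under both option orders. Choices are labelled by action, not position.
    (Hypothetical helper for illustration, not the paper's metric.)"""
    if not original_choices or len(original_choices) != len(swapped_choices):
        raise ValueError("need two non-empty lists of equal length")
    agree = sum(a == b for a, b in zip(original_choices, swapped_choices))
    return agree / len(original_choices)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions on the same support.
    eps guards against division by zero when q has zero entries."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Assumed toy data: the action chosen in four scenarios, in each option order.
original = ["action1", "action1", "action2", "action1"]
swapped = ["action1", "action2", "action2", "action1"]
score = symmetric_consistency(original, swapped)  # 3 of 4 scenarios agree

# Assumed share of picks landing on the first-listed vs second-listed option.
# A model that always picks the same action in both orders would pick each
# position half the time, so divergence from uniform hints at position bias.
position_share = [0.75, 0.25]
bias = kl_divergence(position_share, [0.5, 0.5])
```

In this sketch a divergence of zero means the model's picks are spread evenly over positions, while larger values indicate a stronger pull toward one position. How tSMC turns such a signal into a mitigated score is specific to the paper and is not reproduced here.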
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 37 (NeurIPS 2024) |
| Editors | A. GLOBERSON, L. MACKEY, D. BELGRAVE, A. FAN, U. PAQUET, J. TOMCZAK, C. ZHANG |
| Publisher | Neural Information Processing Systems Foundation |
| Number of pages | 24 |
| Volume | 37 |
| ISBN (Electronic) | 9798331314385 |
| Publication status | Published - Dec 2024 |
| Event | 38th Conference on Neural Information Processing Systems, NeurIPS 2024 - Vancouver, Canada. Duration: 9 Dec 2024 → 15 Dec 2024 |
Publication series
| Name | Advances in Neural Information Processing Systems |
|---|---|
| Publisher | Neural Information Processing Systems Foundation |
| ISSN (Print) | 1049-5258 |
Conference
| Conference | 38th Conference on Neural Information Processing Systems, NeurIPS 2024 |
|---|---|
| Country/Territory | Canada |
| City | Vancouver |
| Period | 9/12/24 → 15/12/24 |
Bibliographical note
Publisher Copyright: © 2024 Neural Information Processing Systems Foundation. All rights reserved.
Funding
This work was supported by Key Programs of Guangdong Province under Grant 2021QN02X166. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funding parties.