Abstract
Despite significant advances in AI, unavoidable risks such as specification gaming and hallucinations remain inherent features of these systems. Current regulatory frameworks, including initiatives like the EU AI Act, focus on risk prevention but fail to adequately address liability for harms caused by these unavoidable risks. To address this gap, we develop a game-theoretic model that examines the optimal liability framework for AI developers. Our model proposes a dynamic liability regime that incentivizes developers to invest in explainability practices: liability exposure decreases as developers demonstrate higher levels of explainability, creating a direct economic incentive to improve interpretability. The regime links liability to explainability benchmarking, allowing courts to evaluate whether a harm was truly unavoidable or attributable to deficiencies in system design. The framework we advocate is flexible and adaptive, relying on industry-driven benchmarking standards so that liability rules evolve alongside technological advances.
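To make the incentive structure concrete, below is a minimal Python sketch of the developer's cost trade-off. It is illustrative only: the linear liability schedule, the quadratic investment cost, and all parameter values (`harm_cost`, `p_harm`, `k`) are assumptions for exposition, not taken from the paper's model.

```python
# Illustrative sketch (assumed functional forms, not the paper's model):
# a developer trades off explainability investment against liability exposure.

def expected_liability(harm_cost: float, p_harm: float, explainability: float) -> float:
    """Expected liability exposure for the developer.

    harm_cost:      damages awarded if harm occurs (assumed)
    p_harm:         probability of harm from unavoidable risks (assumed)
    explainability: benchmark score in [0, 1]; higher scores reduce the
                    share of liability assigned to the developer
    """
    liability_share = 1.0 - explainability  # assumed linear liability schedule
    return p_harm * harm_cost * liability_share


def total_cost(explainability: float, k: float = 3.0) -> float:
    """Developer's total expected cost: investment plus expected liability.

    A convex (quadratic) investment cost yields an interior optimum,
    capturing the game-theoretic trade-off described in the abstract.
    """
    investment = k * explainability ** 2  # assumed convex investment cost
    return investment + expected_liability(
        harm_cost=10.0, p_harm=0.3, explainability=explainability
    )


if __name__ == "__main__":
    # Under these assumed parameters the cost-minimizing score is interior
    # (here 0.5), so the regime induces a positive explainability investment.
    for e in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"explainability={e:.2f}  total cost={total_cost(e):.2f}")
```

Under this sketch, a fully opaque system (score 0) bears the full expected liability, while the marginal benefit of further investment falls as the score rises, so the developer settles at an interior explainability level rather than at either extreme.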
| Original language | English |
|---|---|
| Publication status | Published - 12 Apr 2025 |
| Externally published | Yes |
| Event | Technical AI Safety Conference 2025 - Tokyo Midtown Tower, Tokyo, Japan |
| Duration | 12 Apr 2025 → 12 Apr 2025 |
Workshop
| Workshop | Technical AI Safety Conference 2025 |
|---|---|
| Abbreviated title | TAIS 2025 |
| Country/Territory | Japan |
| City | Tokyo |
| Period | 12/04/25 → 12/04/25 |