Mind the Gap: Revealing Inconsistencies Across Heterogeneous AI Accelerators

  • Elliott WEN
  • Sean MA
  • Evan TEMPERO
  • Bruce SHAM
  • Yousong SUN
  • Hong JIA
  • Daniel LUO
  • Jiayi HUA
  • Jens DIETRICH
  • Kaiqi ZHAO
  • Jiaxing SHEN

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Refereed Conference Paper › peer-review

Abstract

While NVIDIA remains the dominant provider of AI accelerators in cloud data centers, emerging vendors such as AMD, Intel, Mac, and Huawei offer cost-effective alternatives with claims of compatibility and performance. This paper presents the first empirical study investigating divergence in machine learning model behavior across heterogeneous AI accelerators. Using an automated pipeline, we synthesize over 100,000 variant models derived from 4,000 real-world models and execute them across five different enterprise-grade accelerators. Our findings suggest that newer AI platforms from Mac and Huawei support at least 17% fewer operators than NVIDIA. These platforms also exhibit a higher rate of output discrepancies (exceeding 5%), which stem from differences in operator implementations, handling of exceptional numerical values, and instruction scheduling. They are also more susceptible to failures during compilation-based model acceleration, and in some cases the compiled models produce outputs that differ noticeably from those generated in the standard execution mode. In addition, we identify 7 implementation flaws in PyTorch and 40 platform-specific issues across vendors. These results underscore the challenges of achieving consistent machine learning behavior in an increasingly diverse hardware ecosystem.
Original language: English
Title of host publication: 2025 21st International Conference on Mobility, Sensing and Networking, MSN 2025: Proceedings
Publisher: IEEE
Pages: 359-366
Number of pages: 8
ISBN (Electronic): 9798331561802
Publication status: Published - Dec 2025
Event: 21st International Conference on Mobility, Sensing and Networking - Bandung, Indonesia
Duration: 3 Dec 2025 to 5 Dec 2025

Conference

Conference: 21st International Conference on Mobility, Sensing and Networking
Abbreviated title: MSN 2025
Country/Territory: Indonesia
City: Bandung
Period: 3/12/25 to 5/12/25
