
Prototype-based Multi-view Fine-Grained 3D Classification and Ad-Hoc Interpretability

  • Shuxian MA
  • Zihao DONG*
  • Runmin CONG
  • Sam KWONG
  • Xiuli SHAO
  • *Corresponding author for this work

Research output: Journal Publications, Journal Article (refereed), peer-review

Abstract

Deep learning-based multi-view coarse-grained 3D shape classification has achieved remarkable success over the past decade, leveraging the powerful feature learning capabilities of various backbone architectures, including both CNN-based and ViT-based models. However, fine-grained 3D classification, a challenging research area critical for detailed shape understanding, remains understudied because of the limited discriminative information captured during multi-view feature aggregation. Our analysis reveals that current state-of-the-art methods are significantly limited by subtle inter-class variations, severe class imbalance, and the lack of inherent interpretability in parametric model decision-making. To address these problems, we propose Proto-FG3D, the first prototype-based framework for fine-grained 3D shape classification, which shifts the paradigm from parametric softmax to non-parametric prototype learning while advancing interpretability from post-hoc to ad-hoc. First, Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association, where both feature types are adaptively mapped to shared learnable prototypes in a unified embedding space. Second, our framework automatically refines prototypes via Online Clustering, improving both the robustness of multi-view feature allocation and inter-subclass balance. Finally, we establish prototype-guided supervised learning with joint optimization, which enhances fine-grained discrimination via prototype-view correlation analysis and enables ad-hoc interpretability through transparent case-based reasoning. Experimental results on the FG3D and ModelNet40 datasets demonstrate that Proto-FG3D surpasses state-of-the-art methods in classification accuracy while providing transparent predictions, supported by extensive ad-hoc interpretability analysis with visualizations, challenging conventional approaches to fine-grained 3D recognition model design.
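
As a rough illustration of what non-parametric prototype classification over multiple views can look like, the sketch below scores each class by the cosine similarity between view features and that class's prototypes. It is not the Proto-FG3D implementation: the function names (`cosine`, `classify`), the fixed toy prototypes, and the aggregation rule (best-matching prototype per view, averaged over views) are all assumptions made for this example, and the Prototype Association, Online Clustering, and joint optimization steps described in the abstract are not modeled.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(view_features, prototypes):
    """Pick the class whose prototypes best match the view features.

    view_features: list of feature vectors, one per rendered view (hypothetical input).
    prototypes: dict mapping class label -> list of prototype vectors (hypothetical input).
    Assumed rule: each view is matched to its most similar prototype of a class,
    and the class score is the mean of those per-view similarities.
    """
    scores = {}
    for label, protos in prototypes.items():
        per_view = [max(cosine(v, p) for p in protos) for v in view_features]
        scores[label] = sum(per_view) / len(per_view)
    best = max(scores, key=scores.get)
    return best, scores


# Toy example: two fine-grained subclasses with two prototypes each.
prototypes = {
    "subclass_a": [[1.0, 0.0], [0.9, 0.1]],
    "subclass_b": [[0.0, 1.0], [0.1, 0.9]],
}
views = [[1.0, 0.1], [0.8, 0.2]]
label, scores = classify(views, prototypes)
```

Because the decision reduces to "which stored prototypes were closest to each view", the matched prototypes themselves can be inspected to explain a prediction, which is the general idea behind case-based reasoning in prototype models.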
Original language: English
Article number: 113596
Journal: Pattern Recognition
Volume: 179
Early online date: 26 Mar 2026
DOIs
Publication status: E-pub ahead of print - 26 Mar 2026

Bibliographical note

Publisher Copyright:
© 2026 Elsevier Ltd

Funding

This work was supported in part by the Youth Foundation of Shandong Natural Science Foundation of China under Grant ZR2021QF043, in part by the Taishan Scholar Project of Shandong Province under Grant tsqn202306079, in part by the Xiaomi Young Talents Program, in part by the Hong Kong GRF-RGC General Research Fund under Grant 11209819, Grant 11203820 and Grant 13200425, in part by the National Natural Science Foundation of China under Grant 62471278, and in part by the Research Grants Council of the Hong Kong Special Administrative Region, China under Grant STG5/E-103/24-R.

Keywords

  • 3D fine-grained shape classification
  • Interpretability
  • Multi-view images
  • Prototype learning
