TY - JOUR
T1 - Prototype-based Multi-view Fine-Grained 3D Classification and Ad-Hoc Interpretability
AU - MA, Shuxian
AU - DONG, Zihao
AU - CONG, Runmin
AU - KWONG, Sam
AU - SHAO, Xiuli
N1 - Publisher Copyright: © 2026 Elsevier Ltd
PY - 2026/3/26
Y1 - 2026/3/26
N2 - Deep learning-based multi-view coarse-grained 3D shape classification has achieved remarkable success over the past decade, leveraging the powerful feature learning capabilities of various backbone architectures, including both CNN-based and ViT-based models. However, fine-grained 3D classification, a challenging research area critical for detailed shape understanding, remains understudied because limited discriminative information is captured during multi-view feature aggregation. Our analysis reveals that current state-of-the-art methods are significantly limited by subtle inter-class variations, severe class imbalance, and the limited inherent interpretability of parametric model decision-making. To address these problems, we propose Proto-FG3D, the first prototype-based framework for fine-grained 3D shape classification, which shifts from parametric softmax to non-parametric prototype learning while simultaneously advancing interpretability from post-hoc to ad-hoc. First, Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association, where both feature types are adaptively mapped to shared learnable prototypes in a unified embedding space. Second, our framework automatically refines prototypes via Online Clustering, improving both the robustness of multi-view feature allocation and inter-subclass balance. Finally, we establish prototype-guided supervised learning with joint optimization, which enhances fine-grained discrimination via prototype-view correlation analysis and enables ad-hoc interpretability through transparent case-based reasoning. Experimental results on the FG3D and ModelNet40 datasets demonstrate that Proto-FG3D surpasses state-of-the-art methods in classification accuracy and yields transparent predictions, supported by extensive ad-hoc interpretability analysis with visualizations, challenging conventional approaches to fine-grained 3D recognition model design.
AB - Deep learning-based multi-view coarse-grained 3D shape classification has achieved remarkable success over the past decade, leveraging the powerful feature learning capabilities of various backbone architectures, including both CNN-based and ViT-based models. However, fine-grained 3D classification, a challenging research area critical for detailed shape understanding, remains understudied because limited discriminative information is captured during multi-view feature aggregation. Our analysis reveals that current state-of-the-art methods are significantly limited by subtle inter-class variations, severe class imbalance, and the limited inherent interpretability of parametric model decision-making. To address these problems, we propose Proto-FG3D, the first prototype-based framework for fine-grained 3D shape classification, which shifts from parametric softmax to non-parametric prototype learning while simultaneously advancing interpretability from post-hoc to ad-hoc. First, Proto-FG3D establishes joint multi-view and multi-category representation learning via Prototype Association, where both feature types are adaptively mapped to shared learnable prototypes in a unified embedding space. Second, our framework automatically refines prototypes via Online Clustering, improving both the robustness of multi-view feature allocation and inter-subclass balance. Finally, we establish prototype-guided supervised learning with joint optimization, which enhances fine-grained discrimination via prototype-view correlation analysis and enables ad-hoc interpretability through transparent case-based reasoning. Experimental results on the FG3D and ModelNet40 datasets demonstrate that Proto-FG3D surpasses state-of-the-art methods in classification accuracy and yields transparent predictions, supported by extensive ad-hoc interpretability analysis with visualizations, challenging conventional approaches to fine-grained 3D recognition model design.
KW - 3D fine-grained shape classification
KW - Interpretability
KW - Multi-view images
KW - Prototype learning
UR - https://www.scopus.com/pages/publications/105034387632
U2 - 10.1016/j.patcog.2026.113596
DO - 10.1016/j.patcog.2026.113596
M3 - Journal Article (refereed)
SN - 0031-3203
VL - 179
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 113596
ER -