Abstract
Accurate estimation of object poses and sizes within specific categories is crucial for applications such as robotic manipulation and scene understanding. Despite recent advances, significant intra-category shape variations pose a great challenge, reducing the accuracy and robustness of shape prior-based methods. This paper proposes a novel network that leverages shape descriptors and local geometric features for category-level object pose estimation. By capturing geometric structures of the object through shape descriptors, our approach effectively handles shape variations and efficiently distinguishes between instances within the same category. Additionally, we design a local feature detector to extract fine-grained geometric details for enhancing shape descriptor-guided learning. Moreover, an attention mechanism is employed to adaptively highlight significant features, improving the model’s robustness for objects with complex structures. Our network also includes a confidence-based pose estimator that assigns a confidence score to each pose prediction. This integration allows for the acquisition of accurate poses with high confidence by penalizing poor poses with low confidence. Experimental results on the CAMERA25 and REAL275 datasets demonstrate the effectiveness of the proposed network, which achieves accuracy improvements of 5.1 and 12.5, respectively, under the 5°2cm metric compared to state-of-the-art methods. These results underscore our network’s superiority in handling objects with large shape variations and complex structures. The code will be released at https://github.com/yliu1999/Shape-Descriptor.
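For readers unfamiliar with the evaluation metric mentioned in the abstract, the following is a minimal sketch of a 5°2cm-style check: a predicted pose counts as correct when its rotation error is at most 5 degrees and its translation error is at most 2 cm. The function names (`pose_within_threshold`, `rot_z`) and the assumption that translations are given in metres are illustrative and not taken from the paper. Benchmark implementations also treat symmetric categories specially, which this sketch ignores.

```python
import numpy as np


def pose_within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=5.0, max_cm=2.0):
    """Return True if the pose error is within both thresholds.

    Assumption: R_* are 3x3 rotation matrices, t_* are 3-vectors in metres.
    """
    # Angle of the relative rotation, recovered from its trace.
    cos_theta = (np.trace(R_pred @ R_gt.T) - 1.0) / 2.0
    rot_err_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
    # Euclidean translation error, converted from metres to centimetres.
    trans_err_cm = np.linalg.norm(np.asarray(t_pred) - np.asarray(t_gt)) * 100.0
    return bool(rot_err_deg <= max_deg and trans_err_cm <= max_cm)


def rot_z(deg):
    """Hypothetical helper: rotation about the z-axis by `deg` degrees."""
    a = np.radians(deg)
    return np.array([
        [np.cos(a), -np.sin(a), 0.0],
        [np.sin(a),  np.cos(a), 0.0],
        [0.0,        0.0,       1.0],
    ])


if __name__ == "__main__":
    R_gt, t_gt = np.eye(3), np.zeros(3)
    # 3 degrees and 1 cm off: inside both thresholds.
    print(pose_within_threshold(rot_z(3.0), np.array([0.01, 0.0, 0.0]), R_gt, t_gt))
    # 6 degrees off: rotation threshold exceeded.
    print(pose_within_threshold(rot_z(6.0), np.array([0.01, 0.0, 0.0]), R_gt, t_gt))
```

A reported score under this metric would then be the fraction (or mean average precision) of test predictions for which the check passes.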
| Original language | English |
|---|---|
| Article number | 185 |
| Number of pages | 20 |
| Journal | Multimedia Tools and Applications |
| Volume | 85 |
| Issue number | 3 |
| Early online date | 25 Feb 2026 |
| Publication status | Published - Mar 2026 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2026.
Funding
The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (No. UGC/FDS16/E14/21).
Keywords
- Attention mechanism
- Object pose estimation
- Shape descriptor-guided learning
Fingerprint
Dive into the research topics of 'Fusing shape descriptors and geometric details for robust category-level object pose estimation'. Together they form a unique fingerprint.