Abstract
Few-shot segmentation (FSS) methods aim to segment objects using only a few pixel-level annotated samples. Current approaches either derive a generalized class representation from support samples to guide the segmentation of query samples, which often discards crucial spatial contextual information, or rely heavily on spatial affinity between support and query samples, without adequately summarizing and utilizing the core information of the target class. Consequently, the former struggles with fine detail accuracy, while the latter tends to produce errors in overall localization. To address these issues, we propose a novel FSS framework, CCFormer, which balances the transmission of core semantic concepts with the modeling of spatial context, improving both macro and micro-level segmentation accuracy. Our approach introduces three key modules: (1) the Concept Perception Generation (CPG) module, which leverages pre-trained category perception capabilities to capture high-quality core representations of the target class; (2) the Concept-Feature Integration (CFI) module, which injects the core class information into both support and query features during feature extraction; and (3) the Contextual Distribution Mining (CDM) module, which utilizes a Brownian Distance Covariance matrix to model the spatial-channel distribution between support and query samples, preserving the fine-grained integrity of the target. Experimental results on the PASCAL-5^i and COCO-20^i datasets demonstrate that CCFormer achieves state-of-the-art performance, with visualizations further validating its effectiveness. Our code is available at github.com/lourise/ccformer.
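The CDM module builds on a Brownian Distance Covariance (BDC) matrix. The abstract does not spell out the computation, but a standard BDC matrix over a feature map can be sketched as below: pairwise Euclidean distances between channel vectors, followed by double centering. The function name and the (channels × spatial positions) layout are illustrative assumptions, not the paper's actual interface.

```python
import numpy as np

def bdc_matrix(X):
    """Sketch of a Brownian Distance Covariance matrix.

    X: array of shape (d, n) -- d channel vectors, each over n spatial
    positions (an assumed layout for illustration).
    Returns the (d, d) double-centered pairwise-distance matrix.
    """
    # Pairwise Euclidean distances between channel vectors.
    sq = np.sum(X ** 2, axis=1)
    dist2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    A = np.sqrt(np.maximum(dist2, 0.0))
    # Double centering: subtract row means and column means, add grand mean.
    B = (A
         - A.mean(axis=0, keepdims=True)
         - A.mean(axis=1, keepdims=True)
         + A.mean())
    return B
```

By construction the result is symmetric with zero row and column sums, which is what makes it behave like a covariance-style descriptor of the joint spatial-channel distribution.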
| Original language | English |
|---|---|
| Pages (from-to) | 9190-9204 |
| Number of pages | 15 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Volume | 35 |
| Issue number | 9 |
| Early online date | 24 Mar 2025 |
| DOIs | |
| Publication status | Published - 2025 |
Bibliographical note
Publisher Copyright: © 1991-2012 IEEE.
Funding
This work was supported in part by the Taishan Scholar Project of Shandong Province under Grant tsqn202306079, in part by the National Natural Science Foundation of China under Grant 62471278, in part by the Key Project of Science and Technology Innovation 2030 funded by the Ministry of Science and Technology of China under Grant 2018AAA0101301, and in part by the Hong Kong Research Grants Council (RGC) General Research Fund (GRF) under Grant 11209819 (CityU 9042816) and Grant 11203820 (CityU 9042598).
Keywords
- Few-shot Learning
- Few-shot Segmentation
- Semantic Segmentation