Abstract
Salient object detection (SOD) in remote sensing images faces significant challenges due to large variations in object sizes, the computational cost of self-attention mechanisms, and the limitations of convolutional neural network (CNN)-based extractors in capturing global context and long-range dependencies. Existing methods that rely on fixed convolution kernels often struggle to adapt to diverse object scales, leading to detail loss or irrelevant feature aggregation. To address these issues, this work aims to enhance robustness to scale variations and achieve precise object localization. We propose the region proportion-aware dynamic adaptive SOD network (RDNet), which replaces the CNN backbone with the Swin Transformer for global context modeling and introduces three key modules: 1) the dynamic adaptive detail-aware (DAD) module, which applies varied convolution kernels guided by object region proportions; 2) the frequency-matching context enhancement (FCE) module, which enriches contextual information through wavelet interactions and attention; and 3) the region proportion-aware localization (RPL) module, which employs cross-attention to highlight semantic details and integrates a proportion guidance (PG) block to assist the DAD module. By combining these modules, RDNet achieves robustness against scale variations and accurate localization, delivering superior detection performance compared with state-of-the-art methods.
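The idea of choosing convolution kernels from an object's region proportion can be illustrated with a minimal sketch. It is not the RDNet implementation: the function names, the 0.05 and 0.30 cut-offs, the kernel sizes 3/5/7, and the box filter that stands in for learned convolutions are all assumptions made for illustration only.

```python
from typing import List, Sequence

Grid = List[List[float]]


def region_proportion(mask: Sequence[Sequence[float]], threshold: float = 0.5) -> float:
    """Fraction of pixels whose coarse saliency score exceeds `threshold`."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0
    salient = sum(1 for row in mask for value in row if value > threshold)
    return salient / total


def select_kernel_size(proportion: float) -> int:
    """Map a region proportion to an odd kernel size.

    The cut-offs are hypothetical: small objects get a small kernel to
    preserve detail, large objects get a larger receptive field.
    """
    if proportion < 0.05:
        return 3
    if proportion < 0.30:
        return 5
    return 7


def box_filter(feature: Sequence[Sequence[float]], k: int) -> Grid:
    """Average each pixel over a k x k window, clipping the window at borders.

    A plain averaging filter is used here as a stand-in for a learned
    convolution of the same spatial extent.
    """
    height = len(feature)
    width = len(feature[0]) if height else 0
    radius = k // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            window = [feature[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out


def proportion_guided_filter(
    feature: Sequence[Sequence[float]],
    coarse_mask: Sequence[Sequence[float]],
) -> Grid:
    """Pick the kernel size from the mask's region proportion, then filter."""
    k = select_kernel_size(region_proportion(coarse_mask))
    return box_filter(feature, k)
```

In this sketch a coarse saliency mask plays the role that the abstract assigns to proportion guidance: it yields a single scalar that selects the spatial extent of the filter applied to the feature map.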
| Original language | English |
|---|---|
| Article number | 5647912 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Geoscience and Remote Sensing |
| Volume | 63 |
| Early online date | 20 Oct 2025 |
| Publication status | Published - 2025 |
Bibliographical note
Publisher Copyright: © 1980-2012 IEEE.
Funding
This work was supported in part by the opening project of the State Key Laboratory of Autonomous Intelligent Unmanned Systems under Grant ZZKF2025-2-8, in part by the Taishan Scholar Project of Shandong Province under Grant tsqn202306079, in part by the National Natural Science Foundation of China under Grant 62471278 and Grant 62271180, and in part by the Research Grants Council of the Hong Kong Special Administrative Region, China, under Grant STG5/E-103/24-R.
Keywords
- Dynamic adaptive detail-aware (DAD) module
- frequency-matching context enhancement (FCE) module
- optical remote sensing image
- region proportion-aware localization (RPL) module
- salient object detection (SOD)