Transformer with large convolution kernel decoder network for salient object detection in optical remote sensing images

Pengwei DONG, Bo WANG*, Runmin CONG, Hai-Han SUN, Chongyi LI

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

1 Citation (Scopus)


Although salient object detection in optical remote sensing images (ORSI-SOD) has made great strides in recent years, it remains a very challenging topic due to the various scales and shapes of objects, cluttered backgrounds, and diverse imaging orientations. Most previous deep learning-based methods fail to effectively capture local and global features, resulting in ambiguous localization and semantic information and inaccurate detail and boundary prediction for ORSI-SOD. In this paper, we propose a novel Transformer with large convolutional kernel decoding network, named TLCKD-Net, which effectively models the long-range dependence that is indispensable for feature extraction in ORSI-SOD. First, we utilize a Transformer backbone network to perceive global and local details of salient objects. Second, a large convolutional kernel decoding module based on a self-attention mechanism is designed to extract feature information at different scales for salient objects of different sizes. Then, a large convolutional refinement and a Salient Feature Enhancement Module are used to recover and refine the saliency features to obtain high-quality saliency maps. Extensive experiments on two public ORSI-SOD datasets show that our proposed method outperforms 16 state-of-the-art methods both qualitatively and quantitatively. In addition, a series of ablation studies demonstrates the effectiveness of the different modules for ORSI-SOD. Our source code is publicly available at
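
To illustrate why a large convolution kernel helps with long-range context, the following minimal sketch gates each position of a single-channel feature map by the response of a large window around it. It is not the TLCKD-Net implementation: the function names, the fixed averaging weights (a real module would learn them), and the multiplicative gating are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def conv2d_same(image: Grid, kernel: Grid) -> Grid:
    """Zero-padded 'same' 2D cross-correlation with an odd-sized kernel."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for u in range(kh):
                for v in range(kw):
                    y, x = i + u - ph, j + v - pw
                    # Positions outside the map count as zero padding.
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[u][v]
            out[i][j] = acc
    return out


def large_kernel_gate(feature: Grid, kernel_size: int = 7) -> Grid:
    """Hypothetical helper: weight each value by its large-window context.

    Uses a fixed averaging kernel as a stand-in for learned weights.
    """
    weight = 1.0 / (kernel_size * kernel_size)
    kernel = [[weight] * kernel_size for _ in range(kernel_size)]
    context = conv2d_same(feature, kernel)
    # Element-wise product: the large-kernel response acts as a gate.
    return [
        [f * c for f, c in zip(f_row, c_row)]
        for f_row, c_row in zip(feature, context)
    ]


if __name__ == "__main__":
    feature_map = [[1.0] * 5 for _ in range(5)]
    gated = large_kernel_gate(feature_map, kernel_size=7)
    print(gated[2][2], gated[0][0])
```

With a 7×7 window on a 5×5 map, the centre position sees the entire map while a corner sees only part of it, so a single layer already mixes information across distant positions; a 3×3 kernel would need several stacked layers to reach the same extent.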

Original language: English
Article number: 103917
Journal: Computer Vision and Image Understanding
Early online date: 4 Jan 2024
Publication status: Published - Mar 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2023 Elsevier Inc.


  • Large convolutional kernel
  • Optical remote sensing image
  • Salient object detection
  • Transformer

