A Jointly Guided Deep Network for Fine-Grained Cross-Modal Remote Sensing Text-Image Retrieval

Lei YANG, Yong FENG*, Mingling ZHOU, Xiancai XIONG, Yongheng WANG, Baohua QIANG

*Corresponding author for this work

Research output: Journal Publications > Journal Article (refereed) > peer-review

3 Citations (Scopus)


Remote sensing (RS) cross-modal text-image retrieval has significant application value in both military and civilian settings. Existing methods use deep networks to project images and texts into a common space and measure their similarity there. However, most of these methods exploit only the inter-modality information shared between modalities and ignore the rich semantic information within each modality. In addition, because RS images are complex, the representations extracted from the original features contain a large amount of interfering relation information. In this paper, we propose a jointly guided deep network for fine-grained cross-modal RS text-image retrieval. First, we capture the fine-grained semantic information within one modality and use it to guide representation learning for the other modality, which makes full use of both intra- and inter-modality information. Second, to filter out the interference in the representations extracted from the two modalities, we propose an interference filtration module based on a gating mechanism. Our experimental results show significant improvements on retrieval tasks compared with state-of-the-art algorithms. The source code is available at https://github.com/CQULab/JGDN.
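As a rough illustration of the two ideas the abstract names (gating a representation with guidance from the other modality, then ranking by similarity in a common space), the following dependency-free sketch may help. It is not the authors' implementation, which is in the repository linked above. The function names, the weight layout, the sigmoid gate, and the use of cosine similarity are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def gated_filter(features, guide, weights, bias):
    """Scale each feature by a gate in (0, 1).

    The gate for dimension i is computed from the concatenation of the
    feature vector and a guide vector from the other modality.
    `weights` holds one row per feature dimension, each of length
    len(features) + len(guide). This layout is hypothetical.
    """
    gate_input = list(features) + list(guide)
    filtered = []
    for value, row, b in zip(features, weights, bias):
        gate = sigmoid(sum(w * x for w, x in zip(row, gate_input)) + b)
        filtered.append(gate * value)
    return filtered


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_embedding, image_embeddings):
    """Return image indices ordered from most to least similar to the text."""
    scores = [cosine_similarity(text_embedding, img) for img in image_embeddings]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In a trained model the gate weights would be learned, so that dimensions carrying interfering information receive gates close to zero; here they are plain arguments so the behaviour can be inspected by hand.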

Original language: English
Article number: 2350221
Journal: Journal of Circuits, Systems and Computers
Issue number: 13
Early online date: 4 Mar 2023
Publication status: Published - 15 Sept 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2023 World Scientific Publishing Company.


  • fine-grained cross-modal retrieval
  • interference filtration
  • jointly guided deep network
  • Remote sensing image

