Deep Multiscale Fine-Grained Hashing for Remote Sensing Cross-Modal Retrieval

Jiaxiang HUANG, Yong FENG*, Mingliang ZHOU, Xiancai XIONG, Yongheng WONG, Baohua QIANG

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review


Hashing retrieval is widely used for high-spatial-resolution remote sensing (RS) images because of its fast retrieval speed and low memory overhead. However, existing hashing retrieval methods focus primarily on matching multilabel RS images and neglect the extensive fine-grained semantic information in cross-modal RS data. Moreover, objects in RS images differ notably in size, and the images contain redundant features, yet effective multiscale feature extraction methods are lacking. To address these issues, we propose a novel deep multiscale fine-grained hashing (DMFH) method for cross-modal hashing retrieval of RS data. The DMFH method comprises two modules: a feature extraction module and a hashing retrieval module. In the feature extraction module, we introduce a multiscale feature representation method to extract both low-level and high-level features from RS images while using a redundant optimizer to remove duplicate features. In addition, we use embedding vectors to extract fine-grained semantic information from description texts. The hashing retrieval module uses contrastive loss and triplet loss to guide the hash function in learning and generating hash codes from the extracted features. Extensive experiments and ablation studies show that the proposed DMFH method achieves state-of-the-art performance on two public RS image-text datasets (RSICD and RSITMD).
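The abstract describes the retrieval side only at a high level, so the following is a minimal, generic sketch of hash-based cross-modal retrieval rather than the DMFH implementation. It assumes that image and text features have already been projected to the same length, that hash codes are obtained by taking the sign of each component, and that the triplet loss is a standard hinge on squared Euclidean distances with a margin. The function names, the margin value, and the toy feature vectors are all hypothetical and chosen for illustration.

```python
def binarize(vec):
    """Map a real-valued feature vector to a {-1, +1} hash code via sign."""
    return [1 if v >= 0 else -1 for v in vec]


def hamming(a, b):
    """Hamming distance between two {-1, +1} codes of equal length."""
    if len(a) != len(b):
        raise ValueError("codes must have equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on squared Euclidean distances (assumed form)."""
    d_pos = sum((x - y) ** 2 for x, y in zip(anchor, positive))
    d_neg = sum((x - y) ** 2 for x, y in zip(anchor, negative))
    return max(0.0, d_pos - d_neg + margin)


def rank_by_hamming(query_code, database_codes):
    """Return database indices sorted by ascending Hamming distance."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Toy example: one text query against three image features (made-up values).
text_feat = [0.9, -0.2, 0.4, -0.7]
img_feats = [
    [0.8, -0.1, 0.3, -0.5],   # semantically close to the query
    [-0.6, 0.4, -0.2, 0.9],   # far from the query
    [0.7, 0.3, 0.1, -0.4],    # differs in one sign
]

query_code = binarize(text_feat)
db_codes = [binarize(f) for f in img_feats]
ranking = rank_by_hamming(query_code, db_codes)
print(ranking)  # [0, 2, 1]
```

Comparing binary codes by Hamming distance is what gives hashing its speed and memory advantage: each item is stored as a short bit string, and ranking needs only bit comparisons. During training, a loss such as the triplet hinge above pulls the codes of matching image-text pairs together and pushes non-matching pairs apart before binarization.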

Original language: English
Article number: 6002205
Pages (from-to): 1-5
Number of pages: 5
Journal: IEEE Geoscience and Remote Sensing Letters
Publication status: Published - 2024
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2004-2012 IEEE.


Keywords

  • Deep learning
  • fine-grained
  • hashing retrieval
  • multiscale
  • remote sensing (RS) cross-modal retrieval

