Clustering-Based Semi-Supervised Cross-Modal Retrieval Using Scene Graph

Yixue KONG, Yong FENG*, Mingliang ZHOU, Xiancai XIONG, Yongheng WANG, Baohua QIANG

*Corresponding author for this work

Research output: Journal Publications, Journal Article (refereed), peer-review

1 Citation (Scopus)


In this paper, we propose a clustering-based semi-supervised cross-modal retrieval method to alleviate the problem of insufficient annotation in cross-modal datasets. First, we reconstruct cross-modal data as scene graphs to filter out meaningless information. Second, we extract embedding features of images and texts and project them into a common space. Finally, we propose a clustering-based classification method with a modality-independent constraint to discriminate between samples. Experimental results on three widely used cross-modal datasets show that our method achieves a significant improvement in accuracy over state-of-the-art methods.
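The abstract gives no implementation details, so the following is only a rough illustration of the general idea behind the final step: once images and texts share one embedding space, unlabeled samples can be assigned to classes by their proximity to clusters built from the labeled samples. The sketch uses plain nearest-centroid assignment with cosine similarity as a stand-in. It is not the authors' method, it does not model the modality-independent constraint, and every function name and data value in it is hypothetical.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def class_centroids(embeddings, labels):
    """Average the unit-length embeddings of each labeled class.

    Image and text embeddings are mixed freely here, which assumes they
    already live in one common space.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(embeddings, labels):
        unit = normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
        sums[label] = [s + x for s, x in zip(sums[label], unit)]
        counts[label] += 1
    return {
        label: normalize([s / counts[label] for s in total])
        for label, total in sums.items()
    }


def pseudo_label(embedding, centroids):
    """Assign an unlabeled sample to the class with the most similar centroid."""
    unit = normalize(embedding)

    def cosine(label):
        # Both vectors are unit length, so the dot product is the cosine.
        return sum(a * b for a, b in zip(unit, centroids[label]))

    return max(centroids, key=cosine)


# Toy 2-D embeddings (made-up values): two labeled samples per class.
labeled_embeddings = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.8]]
labeled_classes = ["dog", "dog", "cat", "cat"]
centroids = class_centroids(labeled_embeddings, labeled_classes)

# Unlabeled samples from either modality receive a pseudo-label.
label_a = pseudo_label([0.2, 0.9], centroids)
label_b = pseudo_label([0.8, 0.3], centroids)
```

In a semi-supervised setting, pseudo-labels obtained this way would typically be fed back into training alongside the annotated samples; whether and how the paper does this is not stated in the abstract.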

Original language: English
Pages (from-to): 1299-1314
Number of pages: 16
Journal: Journal of Circuits, Systems and Computers
Issue number: 12
Publication status: Published - 1 Aug 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 World Scientific Publishing Company.


  • Clustering classification
  • cross-modal retrieval
  • scene graph
  • semi-supervised

