Global-and-Local Collaborative Learning for Co-Salient Object Detection

Runmin CONG, Ning YANG, Chongyi LI, Huazhu FU, Yao ZHAO, Qingming HUANG, Sam KWONG

Research output: Journal Publications › Journal Article (refereed) › peer-review

53 Citations (Scopus)

Abstract

The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images. Effectively extracting the inter-image correspondence is therefore crucial for the CoSOD task. In this article, we propose a global-and-local collaborative learning (GLNet) architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module to capture the comprehensive inter-image corresponding relationship among different images from both global and local perspectives. First, we treat different images as different time slices and use 3-D convolution to integrate all intra features intuitively, which can more fully extract the global group semantics. Second, we design a pairwise correlation transformation (PCT) to explore the similarity correspondence between pairwise images, and combine the multiple local pairwise correspondences to generate the local inter-image relationship. Third, the inter-image relationships of the GCM and LCM are integrated through a global-and-local correspondence aggregation (GLA) module to explore more comprehensive inter-image collaboration cues. Finally, the intra and inter features are adaptively integrated by an intra-and-inter weighting fusion (AEWF) module to learn co-saliency features and predict the co-saliency map. The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model, trained on a small dataset (about 3k images), still outperforms 11 state-of-the-art competitors trained on much larger datasets (about 8k–200k images).
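The abstract does not give the exact form of the pairwise correlation transformation (PCT); a common way to realize "similarity correspondence between pairwise images" is an attention-style affinity between all spatial positions of two feature maps. The following NumPy sketch illustrates that idea under those assumptions; the function name and details are hypothetical, not the paper's implementation.

```python
import numpy as np

def pairwise_correlation_transform(feat_a, feat_b):
    """Hypothetical sketch of a PCT-style pairwise correspondence.

    feat_a, feat_b: (C, H, W) feature maps from two images in the group.
    Returns feat_b's features re-aggregated onto feat_a's spatial grid via a
    softmax-normalized affinity matrix (non-local-attention style).
    """
    C, H, W = feat_a.shape
    fa = feat_a.reshape(C, H * W)  # (C, HW)
    fb = feat_b.reshape(C, H * W)  # (C, HW)

    # L2-normalize channel vectors so the affinity is cosine similarity.
    fa_n = fa / (np.linalg.norm(fa, axis=0, keepdims=True) + 1e-8)
    fb_n = fb / (np.linalg.norm(fb, axis=0, keepdims=True) + 1e-8)

    # (HW, HW): similarity of every position in feat_a to every position in feat_b.
    affinity = fa_n.T @ fb_n

    # Row-wise softmax: each position of feat_a attends over feat_b's positions.
    affinity = np.exp(affinity - affinity.max(axis=1, keepdims=True))
    affinity /= affinity.sum(axis=1, keepdims=True)

    # (C, HW): feat_b aggregated into feat_a's layout, then reshaped back.
    aligned = fb @ affinity.T
    return aligned.reshape(C, H, W)
```

For a group of N images, the local inter-image cue for image i could then be formed by averaging its pairwise correspondences with the other N-1 images, matching the abstract's "combine the multiple local pairwise correspondences" step.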
Original language: English
Pages (from-to): 1920-1931
Number of pages: 12
Journal: IEEE Transactions on Cybernetics
Volume: 53
Issue number: 3
Early online date: 22 Jul 2022
DOIs
Publication status: Published - Mar 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Funding

This work was supported in part by the National Key Research and Development Program of China under Grant 2021ZD0112100; in part by the Beijing Nova Program under Grant Z201100006820016; in part by the National Natural Science Foundation of China under Grant 62002014, Grant U1936212, Grant 62120106009, Grant U21B2038, and Grant 61931008; in part by the Beijing Natural Science Foundation under Grant 4222013; in part by the Hong Kong Research Grants Council (RGC) General Research Funds under Grant 9042816 (CityU 11209819) and Grant 9042958 (CityU 11203820); in part by the Hong Kong Innovation and Technology Commission (InnoHK Project CIMDA); in part by the Young Elite Scientist Sponsorship Program by the China Association for Science and Technology under Grant 2020QNRC001; in part by the Young Elite Scientist Sponsorship Program by the Beijing Association for Science and Technology; in part by the Hong Kong Scholars Program under Grant XJ2020040; in part by the CAAI-Huawei MindSpore Open Fund; and in part by the Fundamental Research Funds for the Central Universities under Grant 2021YJS046.

Keywords

  • 3-D convolution
  • co-salient object detection (CoSOD)
  • global correspondence modeling (GCM)
  • local correspondence modeling (LCM)
