Language Guided Local Infiltration for Interactive Image Retrieval

Research output: Book Chapters | Papers in Conference Proceedings; Conference paper (refereed); Refereed Conference Paper; peer-review

Abstract

Interactive Image Retrieval (IIR) aims to retrieve images that are generally similar to a reference image but reflect a requested text modification. Existing methods usually fuse image and text features by simple concatenation or summation, which makes it difficult to precisely change the local semantics of the image that the text intends to modify. To solve this problem, we propose a Language Guided Local Infiltration (LGLI) system, which fully utilizes the text information and infiltrates text features into image features as far as possible. Specifically, we first propose a Language Prompt Visual Localization (LPVL) module to generate a localization mask that explicitly locates the region (semantics) intended to be modified. We then introduce a Text Infiltration with Local Awareness (TILA) module, which is deployed in the network to precisely modify the reference image and generate an image-text infiltrated representation. Extensive experiments on various benchmark databases validate that our method outperforms most state-of-the-art IIR approaches.
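
The contrast the abstract draws between global fusion and mask-guided local modification can be illustrated with a minimal NumPy sketch. This is not the LPVL or TILA module from the paper: the function names `localization_mask` and `infiltrate`, the similarity-plus-sigmoid mask, and the additive blend are all assumptions chosen only to show the general idea of restricting text influence to a localized region.

```python
import numpy as np

def localization_mask(image_feats, text_feat):
    """Hypothetical soft mask: one value in (0, 1) per spatial position.

    image_feats: (N, D) array, N spatial positions with D channels.
    text_feat:   (D,) array, a pooled text embedding.
    Assumption: scaled dot-product similarity followed by a sigmoid.
    """
    scores = image_feats @ text_feat / np.sqrt(image_feats.shape[1])
    return 1.0 / (1.0 + np.exp(-scores))

def infiltrate(image_feats, text_feat, mask):
    """Hypothetical local fusion: add text features only where the mask is high.

    With mask == 1 everywhere this reduces to plain summation, the kind of
    global fusion the abstract describes as too coarse.
    """
    return image_feats + mask[:, None] * text_feat

# Toy usage with random features (4 positions, 3 channels).
rng = np.random.default_rng(0)
image_feats = rng.normal(size=(4, 3))
text_feat = rng.normal(size=3)
mask = localization_mask(image_feats, text_feat)
fused = infiltrate(image_feats, text_feat, mask)
```

Positions where the mask is close to 0 keep their original image features, while positions where it is close to 1 receive the full text feature, which is the sense in which the modification is local rather than global.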
Original language: English
Title of host publication: Proceedings: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023
Publisher: IEEE Computer Society
Pages: 6104-6113
Number of pages: 10
ISBN (Electronic): 9798350302493
Publication status: Published - Jun 2023
Externally published: Yes
Event: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops - Vancouver, Canada
Duration: 18 Jun 2023 to 22 Jun 2023

Publication series

Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Publisher: IEEE
Volume: 2023-June
ISSN (Print): 2160-7508
ISSN (Electronic): 2160-7516

Conference

Conference: 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops
Abbreviated title: CVPRW 2023
Country/Territory: Canada
City: Vancouver
Period: 18/06/23 to 22/06/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Funding

This work was partially supported by the National Natural Science Fund of China (62271090), the Chongqing Natural Science Fund (cstc2021jcyj-jqX0023), the National Key R&D Program of China (2021YFB3100800), the CCF Hikvision Open Fund (CCF-HIKVISION OF 20210002), the CAAI-Huawei MindSpore Open Fund, and the Beijing Academy of Artificial Intelligence (BAAI).
