End-to-end Image Compression with Swin-Transformer

Meng WANG*, Kai ZHANG, Li ZHANG, Yue LI, Junru LI, Yue WANG, Shiqi WANG*

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings; Conference paper (refereed); Research; peer-review

1 Citation (Scopus)

Abstract

In this paper, we propose an end-to-end image compression framework that incorporates Swin-Transformer modules to capture both localized and non-localized similarities in images. In particular, the Swin-Transformer modules are deployed in the analysis and synthesis stages, interleaved with convolution layers. The transformer layers are expected to provide more flexible receptive fields, so that spatially localized and non-localized redundancies can be eliminated more effectively. The proposed method shows an excellent capability for signal conjunction and prediction, which improves rate-distortion performance. Experimental results show that the proposed method outperforms existing methods on both natural scene and screen content images, achieving 22.46% BD-Rate savings compared with BPG. On screen content images, BD-Rate gains of over 30% are observed compared with the classical hyper-prior end-to-end coding method.
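The BD-Rate figures quoted in the abstract express the average bitrate change of one codec relative to an anchor at equal quality. The sketch below illustrates the idea only and is not the authors' evaluation code. It assumes PSNR as the quality measure and uses piecewise-linear interpolation of log-rate, a simplification of the cubic polynomial fit customarily used for BD-Rate. The function names and the rate/PSNR points are hypothetical and synthetic.

```python
import math


def _interp(x, xs, ys):
    """Linearly interpolate y at x, given ascending xs and matching ys."""
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])
    raise ValueError("x lies outside the interpolation range")


def _mean_log_rate(psnrs, rates, lo, hi):
    """Average of log(rate) over the PSNR interval [lo, hi]."""
    points = sorted(zip(psnrs, rates))
    xs = [p for p, _ in points]
    ys = [math.log(r) for _, r in points]
    # Integrate the piecewise-linear curve exactly with the trapezoid rule.
    knots = [lo] + [x for x in xs if lo < x < hi] + [hi]
    vals = [_interp(k, xs, ys) for k in knots]
    area = sum(
        (knots[i + 1] - knots[i]) * (vals[i] + vals[i + 1]) / 2.0
        for i in range(len(knots) - 1)
    )
    return area / (hi - lo)


def bd_rate_percent(anchor_psnrs, anchor_rates, test_psnrs, test_rates):
    """Average bitrate change (%) of the test codec versus the anchor.

    Negative values mean the test codec needs fewer bits at equal PSNR.
    """
    lo = max(min(anchor_psnrs), min(test_psnrs))
    hi = min(max(anchor_psnrs), max(test_psnrs))
    if hi <= lo:
        raise ValueError("the two curves do not overlap in PSNR")
    diff = (_mean_log_rate(test_psnrs, test_rates, lo, hi)
            - _mean_log_rate(anchor_psnrs, anchor_rates, lo, hi))
    return (math.exp(diff) - 1.0) * 100.0


# Synthetic example: the test codec uses 20% fewer bits at every PSNR point.
psnrs = [30.0, 33.0, 36.0, 39.0]
anchor = [0.2, 0.4, 0.8, 1.6]      # bits per pixel, made-up values
test = [0.8 * r for r in anchor]
print(bd_rate_percent(psnrs, anchor, psnrs, test))  # approximately -20.0
```

Under this convention, a saving such as the 22.46% quoted against BPG would appear as a BD-Rate of about -22.46.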

Original language: English
Title of host publication: 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022
Publisher: IEEE
ISBN (Electronic): 9781665475921
Publication status: Published - 2022
Externally published: Yes
Event: 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022 - Suzhou, China
Duration: 13 Dec 2022 to 16 Dec 2022

Conference

Conference: 2022 IEEE International Conference on Visual Communications and Image Processing, VCIP 2022
Country/Territory: China
City: Suzhou
Period: 13/12/22 to 16/12/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.

Keywords

  • convolution
  • end-to-end compression
  • Image compression
  • transformer
