Extreme Image Compression Using Fine-tuned VQGANs

Qi MAO*, Tinghan YANG, Yinuo ZHANG, Zijian WANG, Meng WANG, Shiqi WANG, Libiao JIN, Siwei MA

*Corresponding author for this work

Research output: Book Chapters | Papers in Conference Proceedings › Conference paper (refereed) › Research › peer-review

Abstract

Recent advances in generative compression methods have demonstrated remarkable progress in enhancing the perceptual quality of compressed data, especially in scenarios with low bitrates. However, their efficacy and applicability to achieve extreme compression ratios (< 0.05 bpp) remain constrained. In this work, we propose a simple yet effective coding framework by introducing vector quantization (VQ)-based generative models into the image compression domain. The main insight is that the codebook learned by the VQGAN model yields a strong expressive capacity, facilitating efficient compression of continuous information in the latent space while maintaining reconstruction quality. Specifically, an image can be represented as VQ-indices by finding the nearest codeword, which can be encoded using lossless compression methods into bitstreams. We propose clustering a pre-trained large-scale codebook into smaller codebooks through the K-means algorithm, yielding variable bitrates and different levels of reconstruction quality within the coding framework. Furthermore, we introduce a transformer to predict lost indices and restore images in unstable environments. Extensive qualitative and quantitative experiments on various benchmark datasets demonstrate that the proposed framework outperforms state-of-the-art codecs in terms of perceptual quality-oriented metrics and human perception at extremely low bitrates (≤ 0.04 bpp). Remarkably, even with the loss of up to 20% of indices, the images can be effectively restored with minimal perceptual loss.
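The indexing and codebook-clustering steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`quantize`, `kmeans`, `fixed_length_bpp`), the plain NumPy K-means loop, the fixed-length index coding, and the example values (a codebook of 1024 entries and a 16x spatial downsampling factor) are all assumptions made for the example and are not taken from the paper.

```python
import math
import numpy as np


def quantize(latents, codebook):
    """Return, for each latent vector, the index of its nearest codeword (L2).

    latents: array of shape (N, D); codebook: array of shape (K, D).
    """
    # Squared distances via ||a||^2 - 2 a.b + ||b||^2, shape (N, K).
    d2 = (
        (latents ** 2).sum(axis=1, keepdims=True)
        - 2.0 * latents @ codebook.T
        + (codebook ** 2).sum(axis=1)
    )
    return d2.argmin(axis=1)


def kmeans(codebook, k, iters=20, seed=0):
    """Cluster a large codebook into k centroids with plain Lloyd iterations."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct codewords (an illustrative choice).
    centers = codebook[rng.choice(len(codebook), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = quantize(codebook, centers)
        for j in range(k):
            members = codebook[assign == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers


def fixed_length_bpp(k, downsample):
    """Bits per pixel if each index costs log2(k) bits (no entropy coding)."""
    return math.log2(k) / downsample ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    big = rng.normal(size=(512, 8))        # stand-in for a pre-trained codebook
    small = kmeans(big, k=64)              # smaller codebook, lower bitrate
    latents = rng.normal(size=(16, 8))     # stand-in for encoder outputs
    indices = quantize(latents, small)     # VQ-indices to be losslessly coded
    print(indices.shape, fixed_length_bpp(1024, 16))
```

Under these assumed values, a 1024-entry codebook with one index per 16x16 pixel block costs log2(1024) / 256 = 0.0390625 bpp before any lossless coding of the indices, which shows how a smaller clustered codebook lowers the bitrate.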

Original language: English
Title of host publication: Proceedings : DCC 2024 : 2024 Data Compression Conference
Editors: Ali BILGIN, James E. FOWLER, Joan SERRA-SAGRISTA, Yan YE, James A. STORER
Publisher: IEEE
Pages: 203-212
Number of pages: 10
ISBN (Electronic): 9798350385878
Publication status: Published - 2024
Externally published: Yes
Event: 2024 Data Compression Conference, DCC 2024 - Snowbird, United States
Duration: 19 Mar 2024 to 22 Mar 2024

Conference

Conference: 2024 Data Compression Conference, DCC 2024
Country/Territory: United States
City: Snowbird
Period: 19/03/24 to 22/03/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.
