CSformer: Bridging Convolution and Transformer for Compressive Sensing

Dongjie YE, Zhangkai NI, Hanli WANG, Jian ZHANG, Shiqi WANG, Sam KWONG

Research output: Journal Publications › Journal Article (refereed) › peer-review

40 Citations (Scopus)

Abstract

Convolutional Neural Networks (CNNs) dominate image processing but are limited by their local inductive bias, a limitation the transformer framework addresses through its inherent ability to capture global context with self-attention. However, how to inherit and integrate their advantages to improve compressive sensing is still an open issue. This paper proposes CSformer, a hybrid framework that explores the representation capacity of local and global features. The proposed approach is designed for end-to-end compressive image sensing and is composed of adaptive sampling and recovery. In the sampling module, images are measured block-by-block by the learned sampling matrix. In the reconstruction stage, the measurements are projected into an initialization stem, a CNN stem, and a transformer stem. The initialization stem mimics the traditional reconstruction of compressive sensing but generates the initial reconstruction in a learnable and efficient manner. The CNN stem and transformer stem run concurrently, simultaneously computing fine-grained and long-range features and efficiently aggregating them. Furthermore, we explore a progressive strategy and a window-based transformer block to reduce the number of parameters and the computational complexity. The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing, which achieves superior performance compared to state-of-the-art methods on different datasets. Our code is available at: https://github.com/Lineves7/CSformer.
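To make the block-wise sampling and initialization description concrete, the sketch below shows one common way such a pipeline can be implemented: the learned sampling matrix applied block-by-block as a strided convolution, and the learnable initial reconstruction as a 1x1 convolution followed by re-tiling of the blocks. This is an illustrative assumption based on the abstract, not the authors' implementation (the official code is in the linked repository); the class name `SamplingAndInit` and its parameters are hypothetical.

```python
# Minimal sketch of learned block-wise sampling plus a learnable initial
# reconstruction, loosely following the abstract (not the authors' code).
import torch
import torch.nn as nn

class SamplingAndInit(nn.Module):
    """Block-by-block adaptive sampling and an initialization stem."""
    def __init__(self, block_size=32, sampling_ratio=0.1):
        super().__init__()
        n = block_size * block_size                  # pixels per block
        m = max(1, round(sampling_ratio * n))        # measurements per block
        # Learned sampling matrix: a conv with kernel = stride = block_size
        # measures each block independently, y = Phi * x_block.
        self.sample = nn.Conv2d(1, m, kernel_size=block_size,
                                stride=block_size, bias=False)
        # Initialization stem: a learnable linear mapping from m measurements
        # back to n pixels per block, then PixelShuffle re-tiles the blocks.
        self.init_proj = nn.Conv2d(m, n, kernel_size=1, bias=False)
        self.tile = nn.PixelShuffle(block_size)

    def forward(self, x):
        y = self.sample(x)                  # (B, m, H/b, W/b) measurements
        x0 = self.tile(self.init_proj(y))   # (B, 1, H, W) initial estimate
        return y, x0

# Example: 10% sampling of a 128x128 grayscale image.
model = SamplingAndInit(block_size=32, sampling_ratio=0.1)
img = torch.randn(1, 1, 128, 128)
measurements, init_recon = model(img)
print(measurements.shape, init_recon.shape)
# torch.Size([1, 102, 4, 4]) torch.Size([1, 1, 128, 128])
```

In the architecture described by the abstract, the concurrent CNN and transformer stems would then refine this initial estimate, with their fine-grained and long-range features aggregated to produce the final reconstruction.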
Original language: English
Pages (from-to): 2827-2842
Number of pages: 16
Journal: IEEE Transactions on Image Processing
Volume: 32
Early online date: 15 May 2023
Publication status: Published - 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 1992-2012 IEEE.

Funding

This work is supported by the Hong Kong Innovation and Technology Commission [InnoHK Project Centre for Intelligent Multidimensional Data Analysis (CIMDA)]; in part by the Hong Kong GRF-RGC General Research Fund under Grant 11209819 (CityU 9042816) and Grant 11203820 (CityU 9042598); in part by the Natural Science Foundation of China under Grant 62201387, Grant 61772344, and Grant 61672443; in part by the Shanghai Pujiang Program under Grant 22PJ1413300; in part by the Fundamental Research Funds for the Central Universities; and in part by the Key Project of Science and Technology Innovation 2030 supported by the Ministry of Science and Technology of China under Grant 2018AAA0101301. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Lisimachos P. Kondi. (Corresponding author: Sam Kwong.)

Keywords

  • CNN
  • Compressive sensing
  • image reconstruction
  • transformer
