Abstract
Learned video compression has recently attracted considerable research attention. However, existing methods rely on a single motion hypothesis for alignment, which leads to inaccurate motion estimation, especially in scenes with complex movements. Motivated by the multiple hypotheses philosophy of traditional video compression, we develop multiple hypotheses based motion compensation for learned video compression, aiming to improve motion compensation efficiency by providing diverse hypotheses with efficient temporal information fusion. In particular, a multiple hypotheses module, which produces multiple motions and warped features to mine sufficient temporal information, is proposed to provide various hypothesis inferences from the reference frame. To exploit these hypotheses more fully, a hypotheses attention module is adopted, which introduces a channel-wise squeeze-and-excitation layer and a multi-scale network. In addition, context combination is employed to fuse the weighted hypotheses and generate effective contexts with powerful temporal priors. Finally, the valid contexts are used to promote compression efficiency by merging the weighted warped features. Extensive experiments show that the proposed method significantly improves the rate-distortion performance of learned video compression. Compared with the state-of-the-art end-to-end video compression method, it achieves average bit rate reductions of over 13% in terms of both PSNR and MS-SSIM.
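The weighting-and-fusion idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that each hypothesis is a warped feature of shape C x H x W, that one squeeze-and-excitation gate with shared parameters is applied to every hypothesis, and that the weighted hypotheses are fused by a plain sum. The names `se_weights` and `fuse_hypotheses`, the tensor sizes, and the random parameters are all hypothetical, and the multi-scale network and learned context combination are omitted.

```python
import math
import random


def squeeze(feature):
    """Global average pooling: one scalar per channel of a C x H x W feature."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature]


def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def se_weights(feature, w1, b1, w2, b2):
    """Channel-wise squeeze-and-excitation gates, one value in (0, 1) per channel."""
    hidden = [max(0.0, h) for h in dense(squeeze(feature), w1, b1)]  # ReLU
    return [1.0 / (1.0 + math.exp(-z)) for z in dense(hidden, w2, b2)]  # sigmoid


def fuse_hypotheses(hypotheses, params):
    """Reweight every hypothesis channel-wise, then sum them into one context.

    Assumption: fusion is a plain sum; the paper describes a learned combination.
    """
    C, H, W = len(hypotheses[0]), len(hypotheses[0][0]), len(hypotheses[0][0][0])
    context = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for feat in hypotheses:
        gates = se_weights(feat, *params)
        for c in range(C):
            for y in range(H):
                for x in range(W):
                    context[c][y][x] += gates[c] * feat[c][y][x]
    return context


# Toy setup: K hypotheses, C channels, H x W spatial size, R hidden units.
random.seed(0)
K, C, H, W, R = 3, 4, 2, 2, 2
hypotheses = [
    [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
    for _ in range(K)
]
w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(R)]
b1 = [0.0] * R
w2 = [[random.uniform(-1, 1) for _ in range(R)] for _ in range(C)]
b2 = [0.0] * C

context = fuse_hypotheses(hypotheses, (w1, b1, w2, b2))
print(len(context), len(context[0]), len(context[0][0]))
```

In this sketch the gates only rescale channels, so a hypothesis whose channels receive gates near zero contributes little to the fused context; in a trained model the gate parameters would be learned end to end together with the codec.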
Original language | English |
---|---|
Article number | 126396 |
Number of pages | 13 |
Journal | Neurocomputing |
Volume | 548 |
Early online date | 1 Jun 2023 |
DOIs | |
Publication status | Published - 1 Sept 2023 |
Externally published | Yes |
Bibliographical note
Acknowledgement: This work is supported by the Ministry of Science and Technology of China through the Key Project of Science and Technology Innovation 2030 (Grant No. 2018AAA0101301), by the Hong Kong Innovation and Technology Commission (InnoHK Project CIMDA), and in part by the Hong Kong GRF-RGC General Research Fund under Grant 11209819 (CityU 9042816) and Grant 11203820 (9042598).
Publisher Copyright:
© 2023 Elsevier B.V.
Keywords
- Learned video compression
- Motion compensation
- Motion estimation
- Multiple hypotheses
- Temporal alignment
Fingerprint
Dive into the research topics of 'Multiple hypotheses based motion compensation for learned video compression'. Together they form a unique fingerprint.
Projects
- 1 Active
Adaptive Dynamic Range Enhancement Oriented to High Dynamic Display (面向高動態顯示的自適應動態範圍增強)
KWONG, S. T. W. (PI), KUO, C.-C. J. (CoI), WANG, S. (CoI) & ZHANG, X. (CoI)
Research Grants Council (HKSAR)
1/01/21 → 31/12/24
Project: Grant Research