Deep learning-based intra mode derivation for versatile video coding

Linwei ZHU, Yun ZHANG, Na LI, Gangyi JIANG, Sam KWONG

Research output: Journal Publications › Journal Article (refereed) › peer-review

3 Citations (Scopus)

Abstract

In intra coding, Rate Distortion Optimization (RDO) is performed to select the optimal intra mode from a pre-defined candidate list. Besides the residual signal, the optimal intra mode must also be encoded and transmitted to the decoder, which consumes a considerable number of coding bits. To further improve the performance of intra coding in Versatile Video Coding (VVC), this paper proposes an intelligent intra mode derivation method, termed Deep Learning based Intra Mode Derivation (DLIMD). Specifically, intra mode derivation is formulated as a multi-class classification task, with the aim of skipping intra mode signaling to reduce coding bits. The DLIMD architecture is designed to adapt to different quantization parameter settings and variable coding blocks, including non-square ones, so that only a single trained model is required. Unlike existing deep learning based classification approaches, hand-crafted features are fed into the intra mode derivation network alongside the learned features from the feature learning network. To compete with traditional methods, one additional binary flag is used in the video codec to indicate the scheme selected by RDO. Extensive experimental results show that the proposed method achieves average bit rate reductions of 2.28%, 1.74%, and 2.18% for the Y, U, and V components, respectively, on the VVC test model platform, outperforming state-of-the-art works.
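The flag-based choice between conventional mode signaling and derived modes can be illustrated with the standard Lagrangian cost J = D + λR. The sketch below is not taken from the paper or from the VVC test model: the names `Candidate`, `rd_cost`, and `choose_scheme`, the one-bit flag cost, and all numbers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One coding option for a block (hypothetical structure, for illustration)."""
    name: str
    distortion: float     # e.g. squared error between original and reconstruction
    residual_bits: float  # bits spent on the residual signal
    mode_bits: float      # bits spent signaling the intra mode (0 if derived)


# Assumption: the scheme flag costs exactly one bit. A real codec would use
# an entropy-coder estimate instead of a fixed value.
FLAG_BITS = 1.0


def rd_cost(c: Candidate, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    rate = c.residual_bits + c.mode_bits + FLAG_BITS
    return c.distortion + lam * rate


def choose_scheme(conventional: Candidate, derived: Candidate, lam: float) -> Candidate:
    """Pick whichever scheme has the lower rate-distortion cost."""
    return min((conventional, derived), key=lambda c: rd_cost(c, lam))


# Made-up numbers: the derived mode predicts slightly worse but saves mode bits.
conventional = Candidate("signalled", distortion=1000.0, residual_bits=120.0, mode_bits=6.0)
derived = Candidate("derived", distortion=1010.0, residual_bits=122.0, mode_bits=0.0)

best = choose_scheme(conventional, derived, lam=10.0)
print(best.name)
```

With these illustrative values the derived scheme wins at λ = 10, because the saved mode bits outweigh the small increase in distortion, while a small λ such as 1 favors the conventional scheme. This is the trade-off the binary flag lets the encoder resolve per block.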
Original language: English
Pages (from-to): 1-20
Number of pages: 20
Journal: ACM Transactions on Multimedia Computing, Communications, and Applications
Volume: 19
Issue number: 2s
Publication status: Published - 17 Feb 2023
Externally published: Yes
