TY - JOUR
T1 - Neural Network Based Multi-Level In-Loop Filtering for Versatile Video Coding
AU - ZHU, Linwei
AU - ZHANG, Yun
AU - LI, Na
AU - WU, Wenhui
AU - WANG, Shiqi
AU - KWONG, Sam
N1 - Publisher Copyright: IEEE
PY - 2024/6/28
Y1 - 2024/6/28
AB - To further improve the performance of Versatile Video Coding (VVC), this letter presents a neural network based multi-level in-loop filtering framework for luma and chroma, comprising Reference pixel Level (RL), Coding tree unit Level (CL), and Frame Level (FL) filtering. The neural network based filters at each level can be flexibly enabled. In RL, the coding performance upper bound is analyzed and an asymmetric convolution is designed. In CL, pixels in the bottom and rightmost positions are assigned greater weights in the loss calculation during training. In addition, the co-located luma is used in CL and FL chroma filtering to guide chroma enhancement, owing to the high correlation between them. For the neural network architecture, two input channel fusion schemes are combined to obtain the benefits of both. Extensive experimental results show that the proposed multi-level in-loop filtering method achieves average bit rate reductions of 6.87%, 32.8%, and 36.9% for the Y, U, and V components, respectively, under the all-intra configuration, outperforming state-of-the-art works.
KW - In-loop filtering
KW - multi-level
KW - neural network
KW - versatile video coding
UR - http://www.scopus.com/inward/record.url?scp=85197060427&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2024.3420435
DO - 10.1109/TCSVT.2024.3420435
M3 - Journal Article (refereed)
SN - 1051-8215
SP - 1
EP - 1
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
ER -