TY - JOUR
T1 - Reinforcement learning based coding unit early termination algorithm for high efficiency video coding
AU - LI, Na
AU - ZHANG, Yun
AU - ZHU, Linwei
AU - LUO, Wenhan
AU - KWONG, Sam
PY - 2019/4
Y1 - 2019/4
N2 - In this paper, we propose a Reinforcement Learning (RL) based Coding Unit (CU) early termination algorithm for High Efficiency Video Coding (HEVC). RL is utilized to learn a depth-independent CU early termination classifier for low-complexity video coding. Firstly, we model the CU decision process as a Markov Decision Process (MDP) according to the Markov property of CU decision. Secondly, based on the MDP, a depth-independent CU early termination classifier is learned from trajectories of CU decisions across different depths with an end-to-end actor-critic RL algorithm. Finally, a CU decision early termination algorithm is introduced with the learned classifier, so as to reduce the computational complexity of CU decision. We implement the proposed scheme with two different neural network structures, which reduce video coding complexity by 34.34% and 43.33%, respectively. With regard to Bjøntegaard delta peak signal-to-noise ratio and Bjøntegaard delta bit rate, the results are −0.033 dB and 0.85%, and −0.099 dB and 2.56%, respectively, on average under the low delay B main configuration, when compared with the HEVC test model version 16.5.
KW - Actor-critic
KW - Coding tree unit
KW - Early termination
KW - High efficiency video coding
KW - Markov decision process
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=85062370650&partnerID=8YFLogxK
U2 - 10.1016/j.jvcir.2019.02.021
DO - 10.1016/j.jvcir.2019.02.021
M3 - Journal Article (refereed)
SN - 1047-3203
VL - 60
SP - 276
EP - 286
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
ER -