Abstract
Learned reference picture resampling control (LRPRC) adaptively adjusts the coding scale for each frame using an offline-trained neural network. It demonstrates promising rate-distortion (R-D) performance improvements over traditional methods, particularly in high-resolution, low-bit-rate video coding scenarios. However, existing LRPRC methods rely exclusively on locally optimal decision labels derived from greedy strategies for network training, leading to suboptimal control performance. To address this limitation, we introduce a novel data-centric solution that substantially improves training label quality, thereby enhancing overall LRPRC performance. Specifically, our key contribution is a parallelized beam search-based coding scale labeling algorithm, which captures decision dependencies across coding steps and produces higher-quality training labels with enhanced R-D performance. By fully exploiting the intra-trellis and inter-trellis parallelism of beam search and hierarchical coding, our proposed labeling algorithm achieves logarithmic-squared time complexity, making it highly suitable for large-scale cluster computing. We validate this simple yet effective data-centric LRPRC approach in the Versatile Video Encoder (VVenC) using 4K video sequences. Experimental results demonstrate that merely upgrading to beam search labels (without any neural architecture redesign) consistently outperforms the state-of-the-art LRPRC method, achieving BD-rate reductions of 5.09%, 3.98%, and 3.59% under the fast, medium, and slow presets, respectively.
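The contrast between greedy labels and beam search labels can be illustrated with a minimal, sequential sketch. This is not the paper's algorithm: it omits the intra-trellis and inter-trellis parallelism and the hierarchical coding structure, and the function `beam_search_labels`, the cost function `toy_cost`, and the scale set `[1.0, 0.5]` are all hypothetical. In a real labeling pipeline the per-step cost would presumably come from actual encoder runs rather than a hand-written formula.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical signature: cost(step, previous_scale, scale) -> R-D cost of that step.
StepCost = Callable[[int, Optional[float], float], float]


def beam_search_labels(
    num_steps: int,
    scales: Sequence[float],
    step_cost: StepCost,
    beam_width: int,
) -> Tuple[float, List[float]]:
    """Return (total_cost, scale_sequence) of the best path kept in the beam.

    With beam_width=1 this degenerates to the greedy strategy.
    """
    beam: List[Tuple[float, List[float]]] = [(0.0, [])]
    for t in range(num_steps):
        candidates = []
        for total, path in beam:
            prev = path[-1] if path else None
            for s in scales:
                candidates.append((total + step_cost(t, prev, s), path + [s]))
        # Keep only the beam_width cheapest partial sequences.
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]
    return beam[0]


def toy_cost(t: int, prev: Optional[float], s: float) -> float:
    """Made-up cost in which a locally cheap first choice is expensive later."""
    if t == 0:
        return 1.0 if s == 1.0 else 2.0
    dependency = 1.0 if prev == 0.5 else 4.0  # penalty inherited from the previous scale
    return dependency + (0.0 if s == 0.5 else 0.5)


greedy = beam_search_labels(3, [1.0, 0.5], toy_cost, beam_width=1)
beam = beam_search_labels(3, [1.0, 0.5], toy_cost, beam_width=2)
print("greedy:", greedy)
print("beam  :", beam)
```

In this toy setting the greedy path commits to the locally cheapest first scale and ends with a higher total cost than the path found with a beam width of 2, which is the kind of cross-step dependency the abstract says greedy labels miss.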
| Original language | English |
|---|---|
| Number of pages | 11 |
| Journal | IEEE Transactions on Circuits and Systems for Video Technology |
| Early online date | 12 Jan 2026 |
| DOIs | |
| Publication status | E-pub ahead of print - 12 Jan 2026 |
Bibliographical note
Publisher Copyright: © 1991-2012 IEEE.
Funding
This work was supported in part by the “Pioneer and Leading +X” Science and Technology Plan, Zhejiang Provincial Science and Technology Plan Project (No. 2025C01035), in part by the National Key R&D Program of China (2023YFA1008500), in part by the National Natural Science Foundation of China (NSFC) under Grants U22B2035 and 62502116, and in part by the China Post-Doctoral Science Foundation under Grant 2025M774315.
Keywords
- Versatile video coding
- resampling-based compression
- beam search
- machine learning