Reinforcement learning-based QoE-oriented dynamic adaptive streaming framework

Xuekai WEI, Mingliang ZHOU, Sam KWONG, Hui YUAN, Shiqi WANG, Guopu ZHU, Jingchao CAO

Research output: Journal Publications, Journal Article (refereed), peer-review

18 Citations (Scopus)


The Dynamic Adaptive Streaming over HTTP (DASH) standard has been widely adopted by content providers for online video transmission and has greatly improved performance. Designing an efficient DASH system is challenging because both encoded video sequences and network traces fluctuate widely. In this paper, a reinforcement learning (RL)-based DASH technique that addresses user quality of experience (QoE) is constructed. The DASH adaptive bitrate (ABR) selection problem is formulated as a Markov decision process (MDP). Accordingly, an RL-based solution is proposed to solve the MDP, in which the DASH client acts as the RL agent and the network variation constitutes the environment. The proposed user QoE, which jointly considers video quality and buffer status, is used as the reward. The goal of the RL algorithm is to select a suitable video quality level for each video segment so as to maximize the total reward. The proposed RL-based ABR algorithm is then embedded in the QoE-oriented DASH framework. Experimental results show that the proposed RL-based ABR algorithm outperforms state-of-the-art schemes in terms of both temporal and visual QoE factors by a noticeable margin while guaranteeing application-level fairness when multiple clients share a bottlenecked network.
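The MDP formulation described in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' algorithm: the abstract does not specify the learning method, state definition, or reward formula, so everything here is an assumption made for illustration. The bitrate ladder, segment duration, buffer cap, bandwidth range, the reward (normalized bitrate minus a rebuffering penalty), and the names `buffer_bin`, `step`, and `train` are all hypothetical.

```python
import random

# All constants below are hypothetical values chosen for illustration.
BITRATES_KBPS = [300, 750, 1200, 2850]  # assumed bitrate ladder (quality levels)
SEGMENT_SEC = 2.0                       # assumed segment duration
MAX_BUFFER_SEC = 20.0                   # assumed client buffer cap
N_BUFFER_BINS = 5                       # number of discrete buffer states


def buffer_bin(buffer_sec):
    """Discretize the buffer level so it can index a Q-table row."""
    idx = int(buffer_sec / MAX_BUFFER_SEC * N_BUFFER_BINS)
    return min(idx, N_BUFFER_BINS - 1)


def step(buffer_sec, action, bandwidth_kbps):
    """Download one segment at the chosen level; return (new_buffer, reward)."""
    download_sec = BITRATES_KBPS[action] * SEGMENT_SEC / bandwidth_kbps
    # Playback stalls for however long the download outlasts the buffer.
    rebuffer_sec = max(0.0, download_sec - buffer_sec)
    drained = max(buffer_sec - download_sec, 0.0)
    new_buffer = min(drained + SEGMENT_SEC, MAX_BUFFER_SEC)
    # Assumed reward: a quality term minus a buffer-related stall penalty.
    quality = BITRATES_KBPS[action] / BITRATES_KBPS[-1]
    reward = quality - 2.0 * rebuffer_sec
    return new_buffer, reward


def train(episodes=300, segments=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table mapping buffer state to the value of each quality level."""
    rng = random.Random(seed)
    n_actions = len(BITRATES_KBPS)
    q = [[0.0] * n_actions for _ in range(N_BUFFER_BINS)]
    for _ in range(episodes):
        buffer_sec = 0.0
        for _ in range(segments):
            s = buffer_bin(buffer_sec)
            # Epsilon-greedy choice of quality level for the next segment.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])
            # The environment is the varying network (uniform bandwidth here).
            bandwidth = rng.uniform(500.0, 3000.0)
            buffer_sec, r = step(buffer_sec, a, bandwidth)
            s2 = buffer_bin(buffer_sec)
            # Standard Q-learning update toward the bootstrapped target.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
    return q
```

Calling `train()` returns a table with one row per buffer bin and one column per quality level; a client following this sketch would pick the column with the highest value for its current buffer bin. A practical system would likely need a richer state (for example, throughput history) and a reward calibrated to measured QoE.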
Original language: English
Pages (from-to): 786-803
Journal: Information Sciences
Early online date: 13 May 2021
Publication status: Published - Aug 2021
Externally published: Yes

Bibliographical note

This work was supported in part by the Natural Science Foundation of China under Grants 61801303, 61672443, and 61871342, in part by Hong Kong RGC General Research Fund 9042489 under Grant CityU 11206317, under Grant 9042958 (CityU 11203820), and under Grant 9042816 (CityU 11209819), and in part by the General Program of National Natural Science Foundation of Chongqing under Grant cstc2020jcyj-msxmX0790, the Fundamental Research Funds for the Central Universities under Grant 2020CDJ-LHZZ-052, the Guangxi Key Laboratory of Cryptography and Information Security under Grant GCIS201905, the Human Resources and Social Security Bureau project of Chongqing under Grant cx2020073, the Suzhou Institute of USTC under Grant H20201528, and the Equipment advance research fund under Grant 80915010102.


  • Machine learning
  • Quality of experience
  • Reinforcement learning

