Optimal Treatment Strategies for Critical Patients with Deep Reinforcement Learning

Simi JOB, Xiaohui TAO, Lin LI, Haoran XIE, Taotao CAI, Jianming YONG, Qing LI

Research output: Journal Publications › Journal Article (refereed) › peer-review

Abstract

Personalized clinical decision support systems are increasingly being adopted due to the emergence of data-driven technologies, with this approach now gaining recognition in critical care. The task of incorporating diverse patient conditions and treatment procedures into critical care decision-making can be challenging due to the heterogeneous nature of medical data. Advances in Artificial Intelligence (AI), particularly Reinforcement Learning (RL) techniques, enable the development of personalized treatment strategies for severe illnesses by using a learning agent to recommend optimal policies. In this study, we propose a Deep Reinforcement Learning (DRL) model with a tailored reward function and an LSTM-GRU-derived state representation to formulate optimal treatment policies for vasopressor administration in stabilizing patient physiological states in critical care settings. Using an ICU dataset and the Medical Information Mart for Intensive Care (MIMIC-III) dataset, we focus on patients with Acute Respiratory Distress Syndrome (ARDS) that has led to Sepsis, to derive optimal policies that can prioritize patient recovery over patient survival. Both the DDQN (RepDRL-DDQN) and Dueling DDQN (RepDRL-DDDQN) versions of the DRL model surpass the baseline performance, with the proposed model’s learning agent achieving an optimal learning process across our performance-measuring schemes. The robust state representation served as the foundation for enhancing the model’s performance, ultimately providing an optimal treatment policy focused on rapid patient recovery.
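
For readers unfamiliar with the two agent variants named in the abstract, the following is a minimal, generic sketch of the standard Double DQN target and the standard dueling aggregation. It is not the authors' implementation: the function names, the discount factor, and the array shapes are assumptions made for illustration, and the paper's tailored reward function and LSTM-GRU state encoder are not reproduced here.

```python
import numpy as np


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard Double DQN targets for a batch of transitions.

    rewards, dones: shape (batch,); dones is 1.0 for terminal transitions.
    q_online_next, q_target_next: shape (batch, n_actions), the Q-values of
    the next state from the online and target networks respectively.
    gamma is a hypothetical discount factor, not a value from the paper.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # The online network selects the greedy next action ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... and the target network evaluates that action.
    next_values = q_target_next[np.arange(len(rewards)), best_actions]
    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * next_values


def dueling_q_values(state_values, advantages):
    """Standard dueling aggregation: Q = V + (A - mean(A)).

    state_values: shape (batch,); advantages: shape (batch, n_actions).
    """
    state_values = np.asarray(state_values, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    return state_values[:, None] + advantages - advantages.mean(axis=1, keepdims=True)
```

In a vasopressor-dosing setting, each action index would typically correspond to a discretized dose level; that mapping is likewise an assumption here and is not stated in the abstract.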
Original language: English
Journal: ACM Transactions on Intelligent Systems and Technology
Publication status: Published - 1 Feb 2024
