TY - JOUR
T1 - LeSkill: Structured Skill Learning for Long-Horizon Robotic Manipulation Tasks
AU - HUANG, Xiucai
AU - CHEN, Shifeng
AU - SONG, Yongduan
PY - 2025/10/6
Y1 - 2025/10/6
N2 - In long-horizon tasks, leveraging prior knowledge to streamline task execution is essential. However, navigating complex environments to achieve long-term objectives poses a significant challenge due to the vast exploration space involved. To address this issue, we propose a skill-based hierarchical reinforcement learning (RL) framework, termed LeSkill. This framework utilizes a conditional generative model to pretrain a comprehensive and generalizable skill repository from heterogeneous datasets, facilitating skill inference across diverse contexts. This strategy enhances transferability to novel tasks, thereby minimizing the need for extensive, task-specific training. Subsequently, a concise set of task-specific demonstrations is employed to guide the selection process, allowing the model to efficiently sample relevant skills from the pre-existing skill repository, which effectively reduces the exploration space. This approach accelerates the acquisition of highly effective policies tailored for task completion. Our framework undergoes rigorous evaluation on two challenging long-horizon, multistep tasks: a standard task and a distribution mismatch task. The results highlight the framework’s superior performance in mastering intricate tasks and its remarkable generalization capabilities.
AB - In long-horizon tasks, leveraging prior knowledge to streamline task execution is essential. However, navigating complex environments to achieve long-term objectives poses a significant challenge due to the vast exploration space involved. To address this issue, we propose a skill-based hierarchical reinforcement learning (RL) framework, termed LeSkill. This framework utilizes a conditional generative model to pretrain a comprehensive and generalizable skill repository from heterogeneous datasets, facilitating skill inference across diverse contexts. This strategy enhances transferability to novel tasks, thereby minimizing the need for extensive, task-specific training. Subsequently, a concise set of task-specific demonstrations is employed to guide the selection process, allowing the model to efficiently sample relevant skills from the pre-existing skill repository, which effectively reduces the exploration space. This approach accelerates the acquisition of highly effective policies tailored for task completion. Our framework undergoes rigorous evaluation on two challenging long-horizon, multistep tasks: a standard task and a distribution mismatch task. The results highlight the framework’s superior performance in mastering intricate tasks and its remarkable generalization capabilities.
UR - https://www.scopus.com/pages/publications/105018375940
U2 - 10.1109/TSMC.2025.3605528
DO - 10.1109/TSMC.2025.3605528
M3 - Journal Article (refereed)
SN - 2168-2216
JO - IEEE Transactions on Systems, Man, and Cybernetics: Systems
JF - IEEE Transactions on Systems, Man, and Cybernetics: Systems
ER -