Abstract
Contextual policy search methods have shown potential for generalizing robotic skills on trajectory-shaping-based tasks. However, contact-rich manipulation remains challenging because contact force regulation, reference trajectory adaptation, and task generalization must be achieved simultaneously. To this end, a hierarchical compliance-based contextual policy search (HC-CPS) approach is proposed to learn robotic compliant skills for force, motion, and task adaptation. Specifically, a parameterized impedance-conditioned action space is proposed for the lower-level reinforcement learning policy to obtain the compliance needed for reference motion regulation and contact force control, while a linear Gaussian contextual policy is formulated as the higher-level policy to optimize the context-conditioned impedance parameters for task generalization; as a result, a family of contact-rich manipulation tasks with multiple objectives can be accomplished. Moreover, data efficiency is further improved in two ways: first, a variational encoder-decoder model is proposed to estimate the underlying constraints of the impedance parameters over the actions, mitigating extrapolation error in off-policy learning of the lower-level policy; second, a composite forward model is proposed to generate artificial trajectories and reduce the reward bias in higher-level contextual policy learning. The HC-CPS approach is validated on three simulated manipulation tasks and on real-world dual peg-in-hole assembly tasks with two kinds of objectives, and the results demonstrate its effectiveness.
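To make the higher-level policy concrete, the following is a minimal sketch of a generic linear Gaussian contextual policy, which maps a task context s to a distribution over parameters θ (here standing in for impedance parameters) as N(θ | a + Aᵀs, Σ). It is not the authors' implementation: the class name, method names, dimensions, and the reward-weighted least-squares update are all assumptions chosen for illustration, since the abstract does not specify the update rule used by HC-CPS.

```python
import numpy as np


class LinearGaussianContextualPolicy:
    """Hypothetical sketch: pi(theta | s) = N(theta | a + A^T s, Sigma)."""

    def __init__(self, context_dim, param_dim, init_std=1.0, seed=0):
        self.a = np.zeros(param_dim)                  # bias of the mean
        self.A = np.zeros((context_dim, param_dim))   # linear context gain
        self.Sigma = (init_std ** 2) * np.eye(param_dim)
        self.rng = np.random.default_rng(seed)

    def mean(self, context):
        # Mean parameters are linear in the context.
        return self.a + self.A.T @ np.asarray(context, dtype=float)

    def sample(self, context):
        # Draw one parameter vector for the given context.
        return self.rng.multivariate_normal(self.mean(context), self.Sigma)

    def weighted_fit(self, contexts, params, weights, reg=1e-6):
        # Assumed update: weighted ridge regression of params on [1, context],
        # followed by the weighted covariance of the residuals.
        contexts = np.asarray(contexts, dtype=float)
        params = np.asarray(params, dtype=float)
        w = np.asarray(weights, dtype=float)
        S = np.hstack([np.ones((len(contexts), 1)), contexts])
        SW = S * w[:, None]
        W = np.linalg.solve(S.T @ SW + reg * np.eye(S.shape[1]), SW.T @ params)
        self.a, self.A = W[0], W[1:]
        resid = params - S @ W
        self.Sigma = ((resid * w[:, None]).T @ resid / w.sum()
                      + reg * np.eye(params.shape[1]))
```

In such a scheme, the weights would typically be derived from episode rewards, so that parameter samples that performed well in their context pull the mean and covariance toward themselves.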
| Original language | English |
|---|---|
| Pages (from-to) | 5444-5455 |
| Number of pages | 12 |
| Journal | IEEE Transactions on Industrial Informatics |
| Volume | 19 |
| Issue number | 4 |
| Early online date | 19 Jul 2022 |
| DOIs | |
| Publication status | Published - Apr 2023 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2005-2012 IEEE.
Funding
This work was supported in part by the Natural Science Foundation of Beijing Municipality under Grant L192001, in part by the National Natural Science Foundation of China under Grant 51935010 and Grant 62173198, in part by the China National Postdoctoral Program for Innovative Talents under Grant BX2021152, and in part by the State Key Laboratory of Tribology of China under Grant SKLT2022C17.
Keywords
- Contextual policy search
- reinforcement learning
- robotic manipulation
- task-level generalization