Optimal robust output containment of unknown heterogeneous multiagent system using off-policy reinforcement learning

  • Shan ZUO*
  • Yongduan SONG
  • Frank L. LEWIS
  • Ali DAVOUDI

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

77 Citations (Scopus)

Abstract

This paper investigates the optimal robust output containment problem of general linear heterogeneous multiagent systems (MAS) with completely unknown dynamics. A model-based algorithm using offline policy iteration (PI) is first developed, where the p-copy internal model principle is utilized to address the system parameter variations. This offline PI algorithm requires the nominal model of each agent, which may not be available in most real-world applications. To address this issue, a discounted performance function is introduced to express the optimal robust output containment problem as an optimal output-feedback design problem with bounded L2-gain. To solve this problem online in real time, a Bellman equation is first developed to simultaneously evaluate a given control policy and find the updated control policies, using only the state/output information measured online. Then, using this Bellman equation, a model-free off-policy integral reinforcement learning algorithm is proposed to solve the optimal robust output containment problem of heterogeneous MAS in real time, without requiring any knowledge of the system dynamics. Simulation results are provided to verify the effectiveness of the proposed method.
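
As a rough illustration of the model-based policy iteration idea mentioned in the abstract, the sketch below alternates policy evaluation and policy improvement for a single scalar linear system with a discounted quadratic cost. It is not the paper's algorithm: it assumes a known model, full state feedback, and no disturbance, and it leaves out the internal model, the L2-gain condition, and the multiagent containment setting. The function names, the cost weights `q` and `r`, the discount rate `gamma`, and the numeric values are all hypothetical choices made for this example.

```python
import math


def policy_iteration(a, b, q, r, gamma, k0, iters=50, tol=1e-12):
    """Policy iteration for the scalar system dx/dt = a*x + b*u.

    Cost (assumed for illustration): integral of exp(-gamma*t) * (q*x^2 + r*u^2).
    Policy: u = -k*x. Returns the value coefficient p and the gain k.
    """
    k = k0
    p = None
    for _ in range(iters):
        # The discounted cost is finite only if a - b*k < gamma / 2.
        margin = gamma + 2.0 * (b * k - a)
        if margin <= 0:
            raise ValueError("policy yields an unbounded discounted cost")
        # Policy evaluation: solve (2*(a - b*k) - gamma)*p + q + r*k^2 = 0 for p.
        p_new = (q + r * k * k) / margin
        # Policy improvement: minimise the Hamiltonian with respect to u.
        k = b * p_new / r
        if p is not None and abs(p_new - p) < tol:
            p = p_new
            break
        p = p_new
    return p, k


def riccati_solution(a, b, q, r, gamma):
    """Positive root of (2*a - gamma)*p - b^2*p^2/r + q = 0, for comparison."""
    a_eff = a - gamma / 2.0
    return r * (a_eff + math.sqrt(a_eff ** 2 + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    p, k = policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, gamma=0.1, k0=3.0)
    print("policy iteration p:", p, "gain k:", k)
    print("closed-form p:     ", riccati_solution(1.0, 1.0, 1.0, 1.0, 0.1))
```

The evaluation step here needs the model parameters `a` and `b`, which is the limitation the abstract attributes to the offline PI algorithm; the model-free method described there replaces this step with a Bellman equation built from measured data.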
Original language: English
Pages (from-to): 3197-3207
Number of pages: 11
Journal: IEEE Transactions on Cybernetics
Volume: 48
Issue number: 11
Early online date: 30 Oct 2017
DOIs
Publication status: Published - Nov 2018
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2013 IEEE.

Funding

This work was supported in part by the State Key Development Program for Basic Research of China under Grant 2012CB215202, in part by the U.S. National Science Foundation under Grant ECCS-1405173, and in part by the Office of Naval Research under Grant N00014-17-1-2239.

Keywords

  • Heterogeneous systems
  • integral reinforcement learning (RL)
  • internal model principle
  • optimal robust output containment
  • output-feedback
