Multi-agent reinforcement learning via distributed MPC as a function approximator


Reference:
S. Mallick, F. Airaldi, A. Dabiri, and B. De Schutter, "Multi-agent reinforcement learning via distributed MPC as a function approximator," Automatica, vol. 167, p. 111803, Sept. 2024.

Abstract:
This paper presents a novel approach to multi-agent reinforcement learning (RL) for linear systems with convex polytopic constraints. Existing work on RL has demonstrated the use of model predictive control (MPC) as a function approximator for the policy and value functions. This paper is the first to extend that idea to the multi-agent setting. We propose the use of a distributed MPC scheme as a function approximator, with a structure that allows for distributed learning and deployment. We then show that Q-learning updates can be performed in a distributed manner without introducing nonstationarity, by reconstructing a centralized learning update. The effectiveness of the approach is demonstrated on a numerical example.
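For readers who want a concrete picture of the underlying idea, the sketch below illustrates MPC as a Q-function approximator in a minimal, centralized, single-agent form. It is not the paper's distributed scheme and none of it is the authors' code: the system matrices, horizon, stage cost, learnable terminal-cost weight theta, and the use of finite-difference gradients (in place of the exact sensitivities used in this line of work) are all illustrative assumptions. A convex MPC problem defines Q(s, a; theta) by fixing the first input to a; leaving the first input free gives V(s; theta) and the greedy policy; a standard Q-learning step then updates theta from the temporal-difference error.

import numpy as np
import cvxpy as cp

# Hypothetical linear system and horizon (illustrative values only).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
N = 10        # MPC prediction horizon
gamma = 0.9   # RL discount factor

def mpc_value(state, theta, action=None):
    # Parametric MPC as a function approximator: fixing the first input
    # gives Q(s, a; theta); leaving it free gives V(s; theta) and the
    # greedy action. theta scales the terminal cost.
    x = cp.Variable((2, N + 1))
    u = cp.Variable((1, N))
    constraints = [x[:, 0] == state]
    if action is not None:
        constraints.append(u[0, 0] == action)
    cost = 0
    for k in range(N):
        cost += gamma**k * (cp.sum_squares(x[:, k]) + 0.1 * cp.sum_squares(u[:, k]))
        constraints += [x[:, k + 1] == A @ x[:, k] + B @ u[:, k],
                        cp.abs(u[0, k]) <= 1.0]   # convex polytopic input constraint
    cost += gamma**N * theta * cp.sum_squares(x[:, N])  # learnable terminal cost
    problem = cp.Problem(cp.Minimize(cost), constraints)
    problem.solve()
    return problem.value, float(u[0, 0].value)

def q_learning_step(theta, s, a, alpha=1e-3, eps=1e-4):
    # One temporal-difference update of theta (cost-minimization convention).
    q, _ = mpc_value(s, theta, action=a)
    stage_cost = float(s @ s + 0.1 * a**2)
    s_next = A @ s + B.flatten() * a            # deterministic simulation step
    v_next, _ = mpc_value(s_next, theta)
    td_error = stage_cost + gamma * v_next - q
    q_pert, _ = mpc_value(s, theta + eps, action=a)
    grad = (q_pert - q) / eps                   # finite-difference dQ/dtheta
    return max(theta + alpha * td_error * grad, 1e-6)  # keep the cost convex

theta = 1.0
s = np.array([1.0, 0.0])
for _ in range(20):
    _, a = mpc_value(s, theta)   # greedy action from the MPC policy
    theta = q_learning_step(theta, s, a)
    s = A @ s + B.flatten() * a
print("learned terminal-cost weight:", theta)

A distributed version would additionally split the MPC problem across agents and reconstruct the centralized learning update from local quantities, as the abstract describes; the sketch above deliberately omits that layer.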


Downloads:
 * Online version of the paper [open access]


Bibtex entry:

@article{MalAir:24-012,
  author={S. Mallick and F. Airaldi and A. Dabiri and B. {D}e Schutter},
  title={Multi-agent reinforcement learning via distributed {MPC} as a function approximator},
  journal={Automatica},
  volume={167},
  pages={111803},
  month=sep,
  year={2024},
  doi={10.1016/j.automatica.2024.111803}
}





This page is maintained by Bart De Schutter. Last update: September 1, 2024.