Regret analysis of learning-based linear quadratic Gaussian control with additive exploration


Reference:
A. Athrey, O. Mazhar, M. Guo, B. De Schutter, and S. Shi, "Regret analysis of learning-based linear quadratic Gaussian control with additive exploration," Proceedings of the 2024 European Control Conference, Stockholm, Sweden, pp. 1795-1801, June 2024.

Abstract:
In this paper, we analyze the regret incurred by a computationally efficient exploration strategy, known as naive exploration, for controlling unknown partially observable systems within the Linear Quadratic Gaussian (LQG) framework. We introduce a two-phase control algorithm called LQG-NAIVE, which involves an initial phase of injecting Gaussian input signals to obtain a system model, followed by a second phase of an interplay between naive exploration and control in an episodic fashion. We show that LQG-NAIVE achieves a regret growth rate of Õ(T^{1/2}), i.e., O(T^{1/2}) up to logarithmic factors after T time steps, and we validate its performance through numerical simulations. Additionally, we propose LQG-IF2E, which extends the exploration signal to a 'closed-loop' setting by incorporating the Fisher Information Matrix (FIM). We provide compelling numerical evidence of the competitive performance of LQG-IF2E compared to LQG-NAIVE.


Bibtex entry:

@inproceedings{AthMaz:24-016,
        author={A. Athrey and O. Mazhar and M. Guo and B. {D}e Schutter and S. Shi},
        title={Regret analysis of learning-based linear quadratic {G}aussian control with additive exploration},
        booktitle={Proceedings of the 2024 European Control Conference},
        address={Stockholm, Sweden},
        pages={1795--1801},
        month=jun,
        year={2024},
        doi={10.23919/ECC64448.2024.10590739}
        }



This page is maintained by Bart De Schutter. Last update: September 29, 2024.