Policy search with cross-entropy optimization of basis functions


Reference:
L. Busoniu, D. Ernst, B. De Schutter, and R. Babuska, "Policy search with cross-entropy optimization of basis functions," Proceedings of the 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009), Nashville, Tennessee, pp. 153-160, Mar.-Apr. 2009.

Abstract:
This paper introduces a novel algorithm for approximate policy search in continuous-state, discrete-action Markov decision processes (MDPs). Previous policy search approaches have typically used ad-hoc parameterizations developed for specific MDPs. In contrast, the novel algorithm employs a flexible policy parameterization, suitable for solving general discrete-action MDPs. The algorithm looks for the best closed-loop policy that can be represented using a given number of basis functions, where a discrete action is assigned to each basis function. The locations and shapes of the basis functions are optimized, together with the action assignments. This allows a large class of policies to be represented. The optimization is carried out with the cross-entropy method, and policies are evaluated by their empirical return from a representative set of initial states. We report simulation experiments in which the algorithm reliably obtains good policies with only a small number of basis functions, albeit at sizable computational cost.
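
Illustrative sketch (not the authors' implementation): the abstract describes a policy built from basis functions with one discrete action each, tuned by the cross-entropy method and scored by empirical return from a set of initial states. The Python code below shows one way such a loop could look. Everything specific in it is an assumption made for illustration: the toy 1-D dynamics and reward, the Gaussian basis functions, the rule that the most active basis function decides the action, the Gaussian and categorical sampling densities, and all names (ACTIONS, X0, policy_action, empirical_return, ce_policy_search) and parameter values.

```python
import math
import random

ACTIONS = [-1.0, 0.0, 1.0]          # hypothetical discrete action set
X0 = [-1.0, -0.5, 0.0, 0.5, 1.0]    # assumed representative initial states


def policy_action(x, centers, widths, act_idx):
    """Return the action assigned to the basis function most active at x."""
    activations = [math.exp(-((x - c) / w) ** 2) for c, w in zip(centers, widths)]
    return ACTIONS[act_idx[activations.index(max(activations))]]


def empirical_return(centers, widths, act_idx, gamma=0.95, horizon=20):
    """Average discounted return over X0 on a toy 1-D problem (assumption)."""
    total = 0.0
    for x in X0:
        for k in range(horizon):
            a = policy_action(x, centers, widths, act_idx)
            x = min(1.0, max(-1.0, x + 0.2 * a))   # toy dynamics
            total += gamma ** k * (-x * x)         # toy reward: stay near 0
    return total / len(X0)


def ce_policy_search(n_bf=3, n_samples=30, n_elite=6, n_iter=10, seed=0):
    rng = random.Random(seed)
    n_act = len(ACTIONS)
    # Sampling densities: independent Gaussians over centers and log-widths,
    # plus one categorical distribution per basis function for its action.
    mu = [0.0] * (2 * n_bf)
    sigma = [1.0] * (2 * n_bf)
    probs = [[1.0 / n_act] * n_act for _ in range(n_bf)]
    best_score, best_params, history = -math.inf, None, []
    for _ in range(n_iter):
        samples = []
        for _sample in range(n_samples):
            cont = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
            acts = [rng.choices(range(n_act), weights=probs[i])[0]
                    for i in range(n_bf)]
            widths = [math.exp(v) for v in cont[n_bf:]]   # keeps widths positive
            score = empirical_return(cont[:n_bf], widths, acts)
            samples.append((score, cont, acts))
        # Keep the best-scoring samples and refit the densities to them.
        samples.sort(key=lambda s: s[0])
        elite = samples[-n_elite:]
        for j in range(2 * n_bf):
            vals = [e[1][j] for e in elite]
            mu[j] = sum(vals) / n_elite
            var = sum((v - mu[j]) ** 2 for v in vals) / n_elite
            sigma[j] = math.sqrt(var) + 1e-3              # small floor on spread
        for i in range(n_bf):
            # Smoothed elite counts so that no action probability hits zero.
            counts = [0.1 + sum(1 for e in elite if e[2][i] == a)
                      for a in range(n_act)]
            probs[i] = [c / sum(counts) for c in counts]
        top_score, top_cont, top_acts = elite[-1]
        if top_score > best_score:
            best_score = top_score
            best_params = (top_cont[:n_bf],
                           [math.exp(v) for v in top_cont[n_bf:]],
                           top_acts)
        history.append(best_score)
    return best_params, best_score, history
```

Calling ce_policy_search() returns the best (centers, widths, action indices) found, its score, and the best-so-far score per iteration. The sample count, elite size, and smoothing constants are arbitrary choices for this toy problem and would need tuning for a real MDP.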


Downloads:
 * Corresponding technical report: pdf file (241 KB)


Bibtex entry:

@inproceedings{BusErn:08-031,
        author={L. Bu{\c{s}}oniu and D. Ernst and B. {D}e Schutter and R. Babu{\v{s}}ka},
        title={Policy search with cross-entropy optimization of basis functions},
        booktitle={Proceedings of the 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL 2009)},
        address={Nashville, Tennessee},
        pages={153--160},
        month=mar # {--} # apr,
        year={2009}
        }





This page is maintained by Bart De Schutter. Last update: March 21, 2022.