A recently proposed formulation of the stochastic planning and control problem as one of parameter estimation for suitable artificial statistical models has led to the adoption of inference algorithms for this notoriously hard problem. At the algorithmic level, the focus has been on developing Expectation-Maximization (EM) algorithms. For example, Toussaint et al. (2006) use EM with optimal smoothing in the E step to solve finite state-space Markov Decision Processes. In this paper, we extend this EM approach in two directions. First, we derive a non-trivial EM algorithm for linear Gaussian models where the reward function is represented by a mixture of Gaussians, as opposed to the less flexible classical single quadratic function. Second, in order to treat arbitrary continuous state-space models, we present an EM algorithm with particle smoothing. However, by making the crucial observation that the stochastic control problem can be reinterpreted as one of trans-dimensional inference, we are able to propose a novel reversible jump Markov chain Monte Carlo (MCMC) algorithm that is more efficient than its smoothing counterparts. Moreover, this observation also enables us to design an alternative full Bayesian approach for policy search, which can be implemented using a single MCMC run.