Optimal and Approximate Stochastic Planning using Decision Diagrams

TR-2000-05

Authors

Jesse Hoey, Robert St-Aubin, Alan Hu and Craig Boutilier

Publishing date

June 10, 2000

Length

35 pages

Abstract

Structured methods for solving factored Markov decision processes (MDPs) with large state spaces have been proposed recently to allow dynamic programming to be applied without the need for complete state enumeration. We present algebraic decision diagrams (ADDs) as efficient data structures for solving very large MDPs. We describe a new value iteration algorithm for exact dynamic programming, using an ADD input representation of the MDP. We extend this algorithm with an approximate version for generating near-optimal value functions and policies with much lower time and space requirements than the exact version. We demonstrate our methods on a class of large MDPs (up to 34 billion states). We show that significant gains can be had with our optimal value iteration algorithm when compared to tree-structured representations (with up to a thirty-fold reduction in the number of nodes required to represent optimal value functions). We then demonstrate our approximate algorithm and compare results with the optimal ones. Finally, we examine various variable reordering techniques and demonstrate their use within the context of our methods.
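For context, the sketch below shows plain tabular value iteration on a toy factored MDP. It is not the ADD-based algorithm of the report. It is the flat baseline that enumerates every state, which is the step structured methods aim to avoid. The names value_iteration, reward and transition, the two-variable toy problem, and the discount factor of 0.9 are all illustrative assumptions, not taken from the report. Since the table holds 2**n_vars entries, 35 boolean variables would already give about 34 billion states.

```python
from itertools import product

def value_iteration(n_vars, actions, reward, transition, gamma=0.9, tol=1e-6):
    """Tabular value iteration over all 2**n_vars boolean states (flat baseline)."""
    # Complete state enumeration: this list is what blows up for large n_vars.
    states = list(product([False, True], repeat=n_vars))
    V = {s: 0.0 for s in states}
    while True:
        new_V = {}
        for s in states:
            # Bellman backup: best expected discounted value over all actions.
            new_V[s] = max(
                reward(s) + gamma * sum(p * V[s2] for s2, p in transition(s, a))
                for a in actions
            )
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            return V

# Hypothetical toy problem: two boolean variables, reward 1 while variable 0 is true.
def reward(s):
    return 1.0 if s[0] else 0.0

def transition(s, a):
    """Return a list of (next_state, probability) pairs."""
    if a == "noop":
        return [(s, 1.0)]
    # "set0" makes variable 0 true with probability 0.8, otherwise nothing changes.
    target = (True,) + s[1:]
    return [(target, 0.8), (s, 0.2)]

V = value_iteration(2, ["noop", "set0"], reward, transition)
for state in sorted(V):
    print(state, round(V[state], 3))
```

In this toy problem the value never depends on the second variable, so the four table entries take only two distinct values. That kind of redundancy is what a decision-diagram representation can share rather than store repeatedly.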

File(s)

View PDF version of TR-2000-05 (293452 bytes)

Save gzipped PostScript version of TR-2000-05 (145138 bytes)