• Sorted by Date • Classified by Publication Type • Sorted by First Author Last Name • Classified by Author Last Name •

Koosha Khalvati and Alan K. Mackworth.
A Fast Pairwise Heuristic for Planning under Uncertainty. In *Proceedings of The Twenty-Seventh AAAI Conference on Artificial
Intelligence, AAAI 2013*, pp. 187–193, 2013.

POMDP (Partially Observable Markov Decision Process) is a mathematical framework that models planning under uncertainty. Solving a POMDP is an intractable problem and even the state of the art POMDP solvers are too computationally expensive for large domains. This is a major bottleneck. In this paper, we propose a new heuristic, called the pairwise heuristic, that can be used in a one-step greedy strategy to find a near optimal solution for POMDP problems very quickly. This approach is a good candidate for large problems where real-time solution is a necessity but exact optimality of the solution is not vital. The pairwise heuristic uses the optimal solutions for pairs of states. For each pair of states in the POMDP, we find the optimal sequence of actions to resolve the uncertainty and to maximize the reward, given that the agent is uncertain about which state of the pair it is in. Then we use these sequences as a heuristic and find the optimal action in each step of the greedy strategy using this heuristic. We have tested our method on the available large classical test benchmarks in various domains. The resulting total reward is close to, if not greater than, the total reward obtained by other state of the art POMDP solvers, while the time required to find the solution is always much less.

@INPROCEEDINGS{KhalvatiAAAI2013, author={Koosha Khalvati and Alan K. Mackworth}, booktitle={Proceedings of The Twenty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2013}, title={A Fast Pairwise Heuristic for Planning under Uncertainty}, year={2013}, pages={187-193}, abstract={POMDP (Partially Observable Markov Decision Process) is a mathematical framework that models planning under uncertainty. Solving a POMDP is an intractable problem and even the state of the art POMDP solvers are too computationally expensive for large domains. This is a major bottleneck. In this paper, we propose a new heuristic, called the pairwise heuristic, that can be used in a one-step greedy strategy to find a near optimal solution for POMDP problems very quickly. This approach is a good candidate for large problems where real-time solution is a necessity but exact optimality of the solution is not vital. The pairwise heuristic uses the optimal solutions for pairs of states. For each pair of states in the POMDP, we find the optimal sequence of actions to resolve the uncertainty and to maximize the reward, given that the agent is uncertain about which state of the pair it is in. Then we use these sequences as a heuristic and find the optimal action in each step of the greedy strategy using this heuristic. We have tested our method on the available large classical test benchmarks in various domains. The resulting total reward is close to, if not greater than, the total reward obtained by other state of the art POMDP solvers, while the time required to find the solution is always much less.}, bib2html_pubtype ={Refereed Conference Proceeding}, bib2html_rescat ={}, }

Generated by bib2html.pl (written by Patrick Riley ) on Wed Apr 23, 2014 19:08:35