Probabilistic Planning in the Graphplan Framework
Journal contribution posted on 01.03.2012, 00:00 by Avrim Blum, John C. Langford
The Graphplan planner has enjoyed considerable success as a planning algorithm for classical STRIPS domains. In this paper we explore the extent to which its representation can be used for probabilistic planning. In particular, we consider an MDP-style framework in which the state of the world is known but actions are probabilistic, and the objective is to produce a finite-horizon contingent plan with the highest probability of success within the horizon. We describe two extensions of Graphplan in this direction. The first, PGraphplan, produces an optimal contingent plan. It typically suffers a performance hit compared to Graphplan but still appears to be fast compared with other approaches to probabilistic planning problems. The second, TGraphplan, runs at essentially the same speed as Graphplan, but produces potentially suboptimal policies: TGraphplan’s policy selects the first action on the highest-probability trajectory from its current state to the goal. Ideally, we would like an optimal planner for probabilistic domains with the same speed that Graphplan would have if the domain were made deterministic. By comparing the speed and quality of these two planners to each other and to other existing planners, we are able to estimate how far off we are from our ideal. PGraphplan is based on a forward-chaining search, unlike the backward-chaining search of the standard Graphplan algorithm. Thus, one focus of this paper is exploring the extent to which Graphplan’s representation can be used to speed up forward search in addition to the backward search for which it was originally intended.
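
The difference between the two objectives described in the abstract can be illustrated with a small sketch. It is not the PGraphplan or TGraphplan algorithm and does not build a planning graph; it only computes, by plain recursion, (a) the optimal finite-horizon probability of reaching the goal and (b) the success probability of a policy that always takes the first action on the single most likely trajectory to the goal. The toy domain `MODEL`, the state and action names, and all function names are hypothetical and chosen for illustration only.

```python
from functools import lru_cache

# Hypothetical toy domain (not from the paper):
# state -> action -> list of (probability, next_state) outcomes.
MODEL = {
    "start": {
        "risky": [(0.6, "goal"), (0.4, "dead")],
        "safe": [(0.5, "goal"), (0.5, "start")],
    },
    "dead": {},   # no actions available: the goal is unreachable from here
    "goal": {},
}
GOAL = "goal"


@lru_cache(maxsize=None)
def optimal_value(state, steps_left):
    """Highest probability of reaching GOAL within steps_left steps."""
    if state == GOAL:
        return 1.0
    if steps_left == 0:
        return 0.0
    # Maximize, over actions, the expected success probability of the successors.
    return max(
        (sum(p * optimal_value(nxt, steps_left - 1) for p, nxt in outcomes)
         for outcomes in MODEL[state].values()),
        default=0.0,
    )


@lru_cache(maxsize=None)
def best_trajectory(state, steps_left):
    """Return (probability, first_action) of the most likely single path to GOAL."""
    if state == GOAL:
        return (1.0, None)
    if steps_left == 0:
        return (0.0, None)
    best = (0.0, None)
    for action, outcomes in MODEL[state].items():
        for p, nxt in outcomes:
            # Probability of one specific path: product of its outcome probabilities.
            prob = p * best_trajectory(nxt, steps_left - 1)[0]
            if prob > best[0]:
                best = (prob, action)
    return best


@lru_cache(maxsize=None)
def trajectory_policy_value(state, steps_left):
    """Success probability when always following the most likely trajectory."""
    if state == GOAL:
        return 1.0
    if steps_left == 0:
        return 0.0
    _, action = best_trajectory(state, steps_left)
    if action is None:
        return 0.0
    return sum(p * trajectory_policy_value(nxt, steps_left - 1)
               for p, nxt in MODEL[state][action])


if __name__ == "__main__":
    horizon = 2
    print("optimal:", optimal_value("start", horizon))
    print("trajectory policy:", trajectory_policy_value("start", horizon))
```

In this toy domain with a horizon of 2, the trajectory-based policy picks `risky` because its single best path is more likely, whereas the optimal contingent plan starts with `safe`, since a failed attempt can be retried. The resulting success probabilities are roughly 0.6 and 0.8, which shows how a most-likely-trajectory policy can be suboptimal.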