## 8.5 Partial-Order Planning

The forward and regression planners enforce a total ordering on actions at all stages of the planning process. The CSP planner commits to the particular time that the action will be carried out. This means that those planners have to commit to an ordering of actions that cannot occur concurrently when adding them to a partial plan, even if there is no particular reason to put one action before another.

The idea of a partial-order planner is to have a partial ordering between actions and only commit to an ordering between actions when forced. This is sometimes also called a non-linear planner, which is a misnomer because such planners often produce a linear plan.

A partial ordering is a less-than relation that is transitive and asymmetric. A partial-order plan is a set of actions together with a partial ordering, representing a "before" relation on actions, such that any total ordering of the actions, consistent with the partial ordering, will solve the goal from the initial state. Write act0 < act1 if action act0 is before action act1 in the partial order. This means that action act0 must occur before action act1.

For uniformity, treat start as an action that achieves the relations that are true in the initial state, and treat finish as an action whose precondition is the goal to be solved. The pseudoaction start is before every other action, and finish is after every other action. The use of these as actions means that the algorithm does not require special cases for the initial situation and for the goals. When the preconditions of finish hold, the goal is solved.

An action, other than start or finish, will be in a partial-order plan to achieve a precondition of an action in the plan. Each precondition of an action in the plan is either true in the initial state, and so achieved by start, or there will be an action in the plan that achieves it.

We must ensure that the actions achieve the conditions they were assigned to achieve. Each precondition P of an action act1 in a plan will have an action act0 associated with it such that act0 achieves precondition P for act1. The triple ⟨act0,P,act1 is a causal link. The partial order specifies that action act0 occurs before action act1, which is written as act0 < act1. Any other action A that makes P false must either be before act0 or after act1.

Informally, a partial-order planner works as follows: Begin with the actions start and finish and the partial order start < finish. The planner maintains an agenda that is a set of ⟨P,A⟩ pairs, where A is an action in the plan and P is an atom that is a precondition of A that must be achieved. Initially the agenda contains pairs ⟨G,finish⟩, where G is an atom that must be true in the goal state.

At each stage in the planning process, a pair ⟨G,act1 is selected from the agenda, where P is a precondition for action act1. Then an action, act0, is chosen to achieve P. That action is either already in the plan - it could be the start action, for example - or it is a new action that is added to the plan. Action act0 must happen before act1 in the partial order. It adds a causal link that records that act0 achieves P for action act1. Any action in the plan that deletes P must happen either before act0 or after act1. If act0 is a new action, its preconditions are added to the agenda, and the process continues until the agenda is empty.

This is a non-deterministic procedure. The "choose" and the "either ...or ..." form choices that must be searched over. There are two choices that require search:

• which action is selected to achieve G and
• whether an action that deletes G happens before act0 or after act1.

non-deterministic procedure PartialOrderPlanner(Gs)
2:           Inputs
3:                     Gs: set of atomic propositions to achieve
4:           Output
5:                     linear plan to achieve Gs
6:           Local
7:                     Agenda: set of ⟨P,A⟩ pairs where P is atom and A an action
8:                     Actions: set of actions in the current plan
9:                     Constraints: set of temporal constraints on actions
10:                     CausalLinks: set of ⟨act0,P,act1 triples
11:           Agenda ←{⟨G,finish⟩:G ∈Gs}
12:           Actions ←{start,finish}
13:           Constraints ←{start<finish}
15:           repeat
16:                     select and remove ⟨G,act1 from Agenda
17:                     either
18:                               choose act0 ∈Actions such that act0 achieves G
19:                     or
20:                               choose act0 ∉Actions such that act0 achieves G
21:                               Actions ←Actions ∪{act0}
24:                                         Constraints ←protect(CL,act0,Constraints)
25:
26:                               Agenda ←Agenda ∪{⟨P,act0⟩: P is a precondition of act0 }
27:
30:                     for each A∈Actions do
31:                               Constraints ←protect(⟨acto,G,act1⟩,A,Constraints)
32:
33:           until Agenda={}
Figure 8.5: Partial-order planner

The algorithm PartialOrderPlanner is given in Figure 8.5.

The function add_const(act0<act1,Constraints) returns the constraints formed by adding the constraint act0<act1 to Constraints, and it fails if act0<act1 is incompatible with Constraints. There are many ways this function can be implemented. See Exercise 8.13.

The function protect(⟨acto,G,act1⟩,A,Constraints) checks whether A≠act0 and A≠act1 and A deletes G. If so, it returns either { A<act0 } ∪ Constraints or { act1<A } ∪ Constraints. This is a non-deterministic choice that is searched over. Otherwise it returns Constraints.

Example 8.14: Consider the goal ¬swc ∧ ¬mw, where the initial state contains RLoc=lab, swc, ¬rhc, mw, ¬rhm.

Initially the agenda is

⟨ ¬swc,finish⟩,⟨ ¬mw,finish⟩.

Suppose ⟨ ¬swc,finish⟩ is selected and removed from the agenda. One action exists that can achieve ¬swc, namely deliver coffee, dc, with preconditions off and rhc. At the end of the repeat loop, Agenda contains

⟨off,dc⟩,⟨rhc,dc⟩,⟨ ¬mw,finish⟩.

Constraints is {start<finish, start < dc, dc <finish}. There is one causal link, ⟨dc, ¬swc,finish⟩. This causal link means that no action that undoes ¬swc is allowed to happen after dc and before finish.

Suppose ⟨ ¬mw,finish⟩ is selected from the agenda. One action exists that can achieve this, pum, with preconditions mw and RLoc=mr. The causal link ⟨pum, ¬mw,finish⟩ is added to the set of causal links; ⟨mw,pum⟩ and ⟨mr,pum⟩ are added to the agenda.

Suppose ⟨mw,pum⟩ is selected from the agenda. The action start achieves mw, because mw is true initially. The causal link ⟨start,mw,pum⟩ is added to the set of causal links. Nothing is added to the agenda.

At this stage, there is no ordering imposed between dc and pum.

Suppose ⟨off,dc⟩ is removed from the agenda. There are two actions that can achieve off: mc_cs with preconditions cs, and mcc_lab with preconditions lab. The algorithm searches over these choices. Suppose it chooses mc_cs. Then the causal link ⟨mc_cs,off,dc⟩ is added.

The first violation of a causal link occurs when a move action is used to achieve ⟨mr,pum⟩. This action violates the causal link ⟨mc_cs,off,dc⟩, and so must happen after dc (the robot goes to the mail room after delivering coffee) or before mc_cs.

The preceding algorithm has glossed over one important detail. It is sometimes necessary to perform some action more than once in a plan. The preceding algorithm will not work in this case, because it will try to find a partial ordering with both instances of the action occurring at the same time. To fix this problem, the ordering should be between action instances, and not actions themselves. To implement this, assign an index to each instance of an action in the plan, and the ordering is on the action instance indexes and not the actions themselves. This is left as an exercise.