Exploiting Contextual Independence In Probabilistic Inference -- When to Split

Definition. Given confactor r₁ = <c₁,t₁> and context c, such that c₁ and c are compatible, to split r₁ on c means to split r₁ sequentially on each of the variables that are assigned in c but not assigned in c₁.

When we split r₁ on c, we end up with a single confactor with a context that is compatible with c; the contexts of all of the other confactors that are produced by the splitting are incompatible with c. These confactors that are incompatible with c are called residual confactors.

More formally, we can recursively define residual(r₁,c), where r₁ = <c₁,t₁> and c and c₁ are compatible, by:

residual(r₁,c) = {} if c is a subset of c₁.
Otherwise, select a variable X that is assigned in c but not in c₁; then
residual(r₁,c) = {<c₁&X=v_i,set(t₁,X=v_i)>: v_i in dom(X) & v_i != c^X}
    union residual(<c₁&X=c^X,set(t₁,X=c^X)>,c)

where c^X is the value assigned to X in context c. Recall (Definition *) that set(t,X=v_i) is t if t doesn't involve X and is the selection of the X=v_i values from the table, followed by the projection onto the remaining variables, if t does involve X.

The result of splitting a confactor on a context is a set of confactors:
split(<c₁,t₁>,c) = residual(<c₁,t₁>,c) union {<c₁ union c,set(t₁,c)>},
where set(t₁,c) denotes applying set(t₁,X=c^X) for each variable X assigned in c.
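
The recursive definition can be sketched in Python as follows. This is only an illustrative sketch, not the authors' implementation: the representation (a context as a dict, a table as a pair of a variable tuple and a row dict) and the names set_value, residual and split are assumptions made for this example. Variables are selected in the order in which they appear in c, and the contexts are assumed to be compatible. The table of the non-residual confactor is restricted to the values assigned in c, which agrees with the example that follows.

```python
def set_value(table, var, val):
    """set(t, X=v): keep the rows with X=v and project X away.
    The table is returned unchanged if it does not involve X."""
    scope, rows = table
    if var not in scope:
        return table
    i = scope.index(var)
    new_scope = scope[:i] + scope[i + 1:]
    new_rows = {k[:i] + k[i + 1:]: p for k, p in rows.items() if k[i] == val}
    return (new_scope, new_rows)


def residual(confactor, c, dom):
    """Residual confactors of splitting `confactor` on context c.
    Assumes the confactor's context is compatible with c."""
    ctx, table = confactor
    todo = [x for x in c if x not in ctx]
    if not todo:  # c is a subset of the confactor's context
        return []
    x = todo[0]  # selection order is an assumption: order of appearance in c
    res = [({**ctx, x: v}, set_value(table, x, v))
           for v in dom[x] if v != c[x]]
    kept = ({**ctx, x: c[x]}, set_value(table, x, c[x]))
    return res + residual(kept, c, dom)


def split(confactor, c, dom):
    """Residual confactors followed by the one confactor compatible with c."""
    ctx, table = confactor
    for x, v in c.items():
        table = set_value(table, x, v)
    return residual(confactor, c, dom) + [({**ctx, **c}, table)]


# Hypothetical numbers for a table t1[C,D]; the values are made up.
dom = {v: [True, False] for v in "ABCDE"}
t1 = (("C", "D"), {(True, True): 0.1, (True, False): 0.2,
                   (False, True): 0.3, (False, False): 0.4})
r1 = ({"A": True, "B": True}, t1)
result = split(r1, {"C": True, "E": True}, dom)
```

Here result holds two residual confactors followed by the confactor whose context is a&b&c&e.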

Example. Consider residual(<a&b,t₁[C,D]>,c&e). Suppose we split on C first, then on E. This results in two residual confactors: <a&b&~c,t₂[D]> and <a&b&c&~e,t₃[D]>. Note that t₂[D] is the projection of t₁[C,D] onto C=false and t₃[D] is the projection of t₁[C,D] onto C=true. The non-residual confactor that we want from the split is <a&b&c&e,t₃[D]>.

If instead we split on E then C, we get the residual confactors: <a&b&~e,t₁[C,D]> and <a&b&~c&e,t₂[D]>, with the same non-residual confactor.

Note that the result can depend on the order in which variables are selected (see below for some useful splitting heuristics). The algorithms that use the split will be correct no matter which order the variables are selected in; however, some orderings may result in more splitting in subsequent operations.

Example * highlights one heuristic that seems generally applicable. When we have to split a confactor on variables that appear in its body and on variables in its table, it's better to split on variables in the table first, as these simplify the confactors that need to be subsequently split.
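
The effect of the ordering on the residual tables can be illustrated with a small sketch. It is a simplification under stated assumptions: all variables are binary, only the set of variables of each table (its scope) is tracked rather than its numbers, and the function names residual_scopes and table_vars_first are hypothetical.

```python
def residual_scopes(ctx, scope, c, order):
    """Residual confactors as (context, table scope) pairs when splitting
    on the variables of c in the given order. Binary variables assumed."""
    ctx, scope = dict(ctx), frozenset(scope)
    out = []
    for x in order:
        if x in ctx:
            continue
        scope = scope - {x}  # set(t, X=v) removes X from the table, if present
        out.append(({**ctx, x: not c[x]}, scope))
        ctx[x] = c[x]
    return out


def table_vars_first(ctx, scope, c):
    """Order the variables to split on so that table variables come first."""
    todo = [x for x in c if x not in ctx]
    return sorted(todo, key=lambda x: x not in scope)  # stable sort


ctx = {"A": True, "B": True}
scope = {"C", "D"}
c = {"E": True, "C": True}

c_then_e = residual_scopes(ctx, scope, c, ["C", "E"])
e_then_c = residual_scopes(ctx, scope, c, ["E", "C"])
heuristic_order = table_vars_first(ctx, scope, c)
```

Splitting on C first leaves both residual tables over D alone, whereas splitting on E first leaves one residual confactor that still carries the full table over C and D. The heuristic ordering puts C before E.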

We can use the notion of a residual to split two confactors that are compatible and need to be multiplied. Suppose we have confactors r₁ = <c₁,t₁> and r₂ = <c₂,t₂> that both contain the variable being eliminated and where c₁ and c₂ are compatible contexts. If we split r₁ on c₂, and split r₂ on c₁, we end up with two confactors whose contexts are identical. Thus we have the prerequisite needed for multiplying.

Example. Suppose we have confactors r₁ = <a&b&~c,t₁> and r₂ = <a&d,t₂> that both contain the variable being eliminated. We can split r₁ on the body of r₂, namely a&d, producing the confactors

<a&b&~c&d,t₁>

<a&b&~c&~d,t₁>

Only the first of these is compatible with r₂. The second confactor is a residual confactor.

We can split r₂ on the body of r₁, namely a&b&~c, by first splitting r₂ on B, then on C, producing the confactors:

<a&b&c&d,t₂>

<a&b&~c&d,t₂>

<a&~b&d,t₂>

Only the second confactor (confactor (*)) is compatible with r₁ or any of the residual confactors produced by splitting r₁. Confactors (*) and (*), that is, <a&b&~c&d,t₁> and <a&b&~c&d,t₂>, have identical contexts and so can be multiplied.
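
The mutual split in this example can be traced with a short sketch. It assumes, as the example does, that the tables do not involve the variables being split on, so each table is carried along as an opaque label. The function name split_on and the dict representation of contexts are assumptions made for illustration.

```python
def split_on(confactor, c, dom):
    """Split a confactor on context c.
    Returns (residual confactors, the confactor compatible with c).
    Assumes compatible contexts and tables that do not involve the
    variables being split on, so the table label is left unchanged."""
    ctx, table = confactor
    ctx = dict(ctx)
    residuals = []
    for x, v in c.items():
        if x in ctx:
            continue
        residuals += [({**ctx, x: u}, table) for u in dom[x] if u != v]
        ctx[x] = v
    return residuals, (ctx, table)


dom = {x: [True, False] for x in "ABCD"}
r1 = ({"A": True, "B": True, "C": False}, "t1")
r2 = ({"A": True, "D": True}, "t2")

res1, keep1 = split_on(r1, r2[0], dom)  # split r1 on the body of r2
res2, keep2 = split_on(r2, r1[0], dom)  # split r2 on the body of r1

# The two non-residual confactors now share one context and can be multiplied.
same_context = keep1[0] == keep2[0]
```

After both splits, keep1 and keep2 have the context a&b&~c&d, while res1 and res2 hold the residual confactors listed above.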

Suppose we have confactors r₁ = <c₁&Y=v_i,t₁> and r₂ = <c₂&Y=v_j,t₂>, where c₁ and c₂ are compatible contexts, and v_i != v_j. If we split r₁ on c₂, and split r₂ on c₁, we end up with two confactors whose contexts are identical except for the complementary values for Y. This is exactly what we need for summing out Y.

If Y is binary with domain {v_i,v_j}, and there are confactors r₁ = <c₁&Y=v_i,t₁> and r₂ = <c₂&Y=v_j,t₂>, where c₁ and c₂ are compatible contexts, and there is no other confactor that contains Y that is compatible with c₁ and c₂, summing out Y in the context c₁ union c₂ results in the confactors:

Proposition. Splitting confactor <c₁,t₁> on c creates

SUM_{X in vars(c)-vars(c₁)} (|dom(X)|-1)

extra confactors, independently of the order in which the variables are selected to be split, where vars(c) is the set of variables assigned in context c.
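
The proposition can be checked on a small case by simulating the sequential split for every ordering and counting the confactors produced. This is a sanity check under assumptions, not a proof: only contexts are tracked (tables do not affect the count), and the domains and the helper names split_contexts and extra_confactors are made up for the example.

```python
from itertools import permutations


def split_contexts(ctx, c, dom, order):
    """Contexts of all confactors produced by splitting on c in `order`."""
    ctx = dict(ctx)
    out = []
    for x in order:
        if x in ctx:
            continue
        # One residual confactor for each value of X other than c's value.
        out.extend({**ctx, x: v} for v in dom[x] if v != c[x])
        ctx[x] = c[x]
    return out + [ctx]


def extra_confactors(ctx, c, dom, order):
    """Confactors created beyond the single confactor we started with."""
    return len(split_contexts(ctx, c, dom, order)) - 1


# Hypothetical domains of sizes 2, 3, 2 and 4.
dom = {"W": [0, 1], "X": [0, 1, 2], "Y": [0, 1], "Z": [0, 1, 2, 3]}
ctx = {"W": 0}
c = {"W": 0, "X": 2, "Y": 1, "Z": 0}

new_vars = [x for x in c if x not in ctx]
predicted = sum(len(dom[x]) - 1 for x in new_vars)
counts = {extra_confactors(ctx, c, dom, order)
          for order in permutations(new_vars)}
```

For these domains the formula predicts 2 + 1 + 3 = 6 extra confactors, and every ordering of X, Y and Z gives the same count.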

When we have to split, there is a choice as to which variable to split on first. While this choice does not influence the number of confactors created for the single split, it can influence the number of confactors created in total because of subsequent splitting. One heuristic was given above. Another useful heuristic seems to be: given a confactor with multiple possible splits, look at all of the confactors that need to be combined with this confactor to enable multiplication or addition, and split on the variable that appears most often. For those cases where the conditional probability forms a tree structure, this will tend to split on the root of the tree first.
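
One possible reading of this second heuristic is sketched below. It assumes that a variable "appears" in a partner confactor when it is assigned in that confactor's context, and it breaks ties alphabetically; both choices, like the function name most_frequent_split_variable, are assumptions made for illustration rather than something the text prescribes.

```python
from collections import Counter


def most_frequent_split_variable(ctx, partner_contexts):
    """Pick the variable, not yet assigned in ctx, that is assigned in the
    largest number of partner contexts. Returns None if no split is needed."""
    counts = Counter(x for p in partner_contexts for x in p if x not in ctx)
    if not counts:
        return None
    # sorted() makes tie-breaking deterministic (alphabetical order).
    return max(sorted(counts), key=counts.__getitem__)


ctx = {"A": True}
partners = [
    {"A": True, "B": True, "C": False},
    {"A": True, "B": False},
    {"A": True, "B": True, "C": True},
]
choice = most_frequent_split_variable(ctx, partners)
```

Here B is assigned in three partner contexts and C in two, so B is chosen first.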
