Parent Contexts and Contextual Belief Networks

As in the definition of a belief network, let's assume that we have a
total ordering of the variables, *X _{1},...,X_{n}*.

**Definition.**
*
Given variable X_{i}, we say that C=c, where
C subset {X_{i-1},...,X_{1}} and c in dom(C), is a parent context for
X_{i} if X_{i} is contextually independent of the predecessors of
X_{i} (namely {X_{i-1},...,X_{1}}) given C=c.
*
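As an illustration (our own sketch, not from the paper), whether C=c is a parent context can be checked by brute force: the conditional probability must be constant over every completion of the context by the remaining predecessors. The table `cpt`, the variable names, and the probabilities below are hypothetical.

```python
from itertools import product

def is_parent_context(cpt, predecessors, context):
    """True if P(x | predecessors) is constant across all full assignments
    of the predecessors that agree with `context` (a dict var -> bool)."""
    free = [v for v in predecessors if v not in context]
    probs = set()
    for values in product([True, False], repeat=len(free)):
        assignment = dict(context)
        assignment.update(zip(free, values))
        # cpt maps a frozenset of (var, value) pairs to P(x | assignment)
        probs.add(cpt[frozenset(assignment.items())])
    return len(probs) == 1

# Hypothetical table: X depends on B only when A is true.
cpt = {
    frozenset({("A", True),  ("B", True)}):  0.9,
    frozenset({("A", True),  ("B", False)}): 0.2,
    frozenset({("A", False), ("B", True)}):  0.5,
    frozenset({("A", False), ("B", False)}): 0.5,
}
print(is_parent_context(cpt, ["A", "B"], {"A": False}))  # True: {~a} is a parent context
print(is_parent_context(cpt, ["A", "B"], {"A": True}))   # False: B still matters given a
```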

**Example.**
*
Consider the belief network and conditional probability table of Figure
*. The predecessors of variable E are A, B, C, D, Y, Z. A set of
minimal parent contexts for E is
{{a,b}, {a,~b}, {~a,c}, {~a,~c,d,b}, {~a,~c,d,~b}, {~a,~c,~d}}.
This is a mutually exclusive and exhaustive set of parent contexts.
The probability of E given values for its predecessors can be reduced
to the probability of E given a parent context. For example:

P(e|a,b,c,~d,y,~z) = P(e|a,b)

P(e|~a,b,c,~d,y,~z) = P(e|~a,c)

P(e|~a,~b,c,d,~y,~z) = P(e|~a,c).
*
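The reduction used in the example can be sketched in code: given a mutually exclusive and exhaustive set of parent contexts, a full assignment of the predecessors picks out exactly one context. The sketch below is our own illustration; only the context structure comes from the example (True/False stand for a/~a, etc.).

```python
def matching_context(contexts, assignment):
    """Return the unique parent context consistent with a full assignment.
    Contexts are dicts var -> bool; assignment assigns every predecessor."""
    matches = [c for c in contexts
               if all(assignment[v] == val for v, val in c.items())]
    assert len(matches) == 1, "set must be mutually exclusive and exhaustive"
    return matches[0]

# The mutually exclusive and exhaustive set from the example.
contexts = [
    {"A": True, "B": True},
    {"A": True, "B": False},
    {"A": False, "C": True},
    {"A": False, "C": False, "D": True, "B": True},
    {"A": False, "C": False, "D": True, "B": False},
    {"A": False, "C": False, "D": False},
]

# P(e|a,b,c,~d,y,~z) reduces to P(e|a,b):
full = {"A": True, "B": True, "C": True, "D": False, "Y": True, "Z": False}
print(matching_context(contexts, full))  # -> {'A': True, 'B': True}
```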

We can often (but not always) represent contextual independence in terms of trees.

*Figure: Tree-structured representations of the conditional
probabilities for E, B, and D given their parents. Left branches
correspond to true and right branches to false. Thus, for example,
P(e|a&b) = 0.55, P(e|a&~b) = 0.3, and P(e|~a&~c&d&b) = 0.025.*

The left side of Figure * gives a tree-based representation for the
conditional probability of E given its parents. In this tree, internal
nodes are labelled with parents of E in the belief network. The left
child of a node corresponds to the variable labelling the node being
true, and the right child to the variable being false. The leaves are
labelled with the probability that E is true. For example,
P(e|a&~b) = 0.3, irrespective of the value of C or D. In the tree-based
representation the variable (E in this case) is contextually
independent of its predecessors given the context defined by a path
through the tree. The paths through the tree correspond to parent
contexts.
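A minimal sketch of such a tree-based lookup, representing internal nodes as (variable, true-branch, false-branch) tuples with probability leaves. The leaf values 0.55, 0.3, and 0.025 are the ones given in the figure caption; the remaining leaves are placeholders we invented, not values from the text.

```python
# Leaf = float probability; internal node = (variable, true_subtree, false_subtree).
tree_E = ("A",
          ("B", 0.55, 0.3),        # P(e|a&b)=0.55, P(e|a&~b)=0.3
          ("C", 0.5,               # P(e|~a&c): placeholder value
           ("D",
            ("B", 0.025, 0.2),     # P(e|~a&~c&d&b)=0.025; the ~b leaf is a placeholder
            0.1)))                 # P(e|~a&~c&~d): placeholder

def lookup(tree, assignment):
    """Follow the path defined by `assignment` to a leaf probability."""
    while not isinstance(tree, float):
        var, on_true, on_false = tree
        tree = on_true if assignment[var] else on_false
    return tree

# P(e|a&~b) = 0.3 irrespective of the values of C and D:
print(lookup(tree_E, {"A": True, "B": False, "C": True, "D": False}))  # 0.3
```

Each root-to-leaf path is exactly one parent context, so the lookup implements the reduction from predecessors to a parent context.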
Before showing how the structure of parent contexts can be exploited in inference, there are a few properties to note:

- The elements of a mutually exclusive and exhaustive set of parent
contexts are not always the minimal parent contexts. For example,
suppose we have a variable *A* with parents *B* and *C*, all of which
are Boolean. Suppose the probability of *a* is *p_{1}* when both *B*
and *C* are true, and *p_{2}* otherwise. One mutually exclusive and
exhaustive set of parent contexts for *A* is *{b&c, b&~c, ~b}*.
*b&~c* is not minimal, as *~c* is also a parent context. Another
mutually exclusive and exhaustive set of parent contexts for this
example is *{b&c, ~b&c, ~c}*. The set of minimal parent contexts,
*{b&c, ~b, ~c}*, isn't a mutually exclusive and exhaustive set as the
elements are not pairwise incompatible. One could imagine using
arbitrary Boolean formulae in the contexts. This was not done as it
would entail using theorem proving (or a more sophisticated
subsumption algorithm) during inference. We doubt that this would be
worth the extra overhead for the limited savings.

- A compact decision tree representation of conditional
probability tables [5] always corresponds to a
compact set of parent contexts (one context for each path through the
tree). However, a mutually exclusive and exhaustive set of parent
contexts cannot always be directly represented as a decision tree (as
there isn't always a single variable to split on). For example, the
mutually exclusive and exhaustive set of contexts
*{{a,b}, {~a,c}, {~b,~c}, {a,~b,c}, {~a,b,~c}}* doesn't directly
translate into a decision tree. More importantly, the operations we
perform don't necessarily preserve the tree structure. Section * shows
how we can do much better than an analogous tree-based formulation of
our inference algorithm.
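The partition property discussed in these notes (mutual exclusivity and exhaustiveness) can be checked by enumerating all assignments to the variables mentioned. The brute-force check below is our own sketch; the two context sets are the ones from the bullets above.

```python
from itertools import product

def is_partition(contexts, variables):
    """Every full assignment must be consistent with exactly one context."""
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        n = sum(all(assignment[v] == val for v, val in c.items())
                for c in contexts)
        if n != 1:
            return False
    return True

# {{a,b}, {~a,c}, {~b,~c}, {a,~b,c}, {~a,b,~c}} is mutually exclusive
# and exhaustive, even though it has no decision-tree form:
contexts = [{"A": True, "B": True}, {"A": False, "C": True},
            {"B": False, "C": False}, {"A": True, "B": False, "C": True},
            {"A": False, "B": True, "C": False}]
print(is_partition(contexts, ["A", "B", "C"]))  # True

# The minimal parent contexts {b&c, ~b, ~c} overlap (~b and ~c both
# cover ~b&~c), so they are not a partition:
minimal = [{"B": True, "C": True}, {"B": False}, {"C": False}]
print(is_partition(minimal, ["B", "C"]))  # False
```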

**Definition.**
*
A contextual belief network is an acyclic directed graph where
the nodes are
random variables. Associated with each node X_{i}
is a mutually exclusive
and exhaustive set of parent contexts, Pi_{i}, and, for each
pi in Pi_{i}, a probability
distribution P(X_{i}|pi) on X_{i}. Thus a contextual belief network
is like a belief network, but we only specify the probabilities for
the parent contexts.
*

For each variable *X_{i}* and for each
assignment *X_{i-1}=v_{i-1},...,X_{1}=v_{1}* of values to its
predecessors, there is exactly one element of Pi_{i} consistent with
that assignment; call it pi_{X_{i}}^{v_{i-1}...v_{1}}. The joint
distribution then factors as

P(X_{1}=v_{1},...,X_{n}=v_{n}) = PROD_{i = 1}^{n} P(X_{i}=v_{i}|X_{i-1}=v_{i-1},...,X_{1}=v_{1}) = PROD_{i = 1}^{n} P(X_{i}=v_{i}|pi_{X_{i}}^{v_{i-1}...v_{1}})

This looks like the definition of a belief network (equation (*)), but which variables act as the parents depends on the values. The numbers required are the probability of each variable for each element of the mutually exclusive and exhaustive set of parent contexts. There can be many fewer of these than the number of assignments to parents in a belief network. At one extreme, there are the same number; at the other extreme, there can be exponentially many more assignments of values to parents than elements of a mutually exclusive and exhaustive set of parent contexts.
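This factorisation can be sketched directly: represent a contextual belief network as a mapping from each variable to its (context, P(X=true|context)) pairs, and multiply, for each variable in the ordering, the probability given the one parent context consistent with the assignment. The two-variable network `cbn` below is hypothetical, purely for illustration.

```python
def joint_probability(cbn, order, assignment):
    """P(X_1=v_1,...,X_n=v_n) as the product over variables of
    P(X_i=v_i | pi), where pi is the parent context consistent with
    the assigned values of X_i's predecessors."""
    p = 1.0
    for x in order:
        # find the unique parent context consistent with the assignment
        for context, p_true in cbn[x]:
            if all(assignment[v] == val for v, val in context.items()):
                p *= p_true if assignment[x] else 1.0 - p_true
                break
    return p

# Hypothetical network: B has no predecessors; A's mutually exclusive
# and exhaustive parent contexts over {B} are {b} and {~b}.
cbn = {"B": [({}, 0.4)],
       "A": [({"B": True}, 0.9), ({"B": False}, 0.2)]}
print(joint_probability(cbn, ["B", "A"], {"B": True, "A": True}))  # 0.4*0.9, i.e. 0.36 up to float rounding
```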

David Poole and Nevin Lianwen Zhang, Exploiting Contextual Independence In Probabilistic Inference, Journal of Artificial Intelligence Research, 18, 2003, 263-313.
