ConclusionTopEmpirical ResultsComparison With Other Proposals

Comparison With Other Proposals

In this section we compare CVE with other proposals for exploiting context-specific information.

The belief network with the conditional probability table of Figure * (i.e., with the contextual independence shown in Figure *) is particularly illuminating because other algorithms do very badly on it. Under the elimination ordering B, D, C, A, Y, Z, to find the marginal on E, the most complicated confactor multiset created is the confactor multiset for E after eliminating B (see Example *) with a total table size of 16. Observations simplify the algorithm as they mean fewer partial evaluations.

In contrast, VE requires a factor with table size 64 after B is eliminated. Clique tree propagation constructs two cliques, one containing Y,Z,A,B,C,D of size 26 = 64, and the other containing A,B,C,D,E of size 32. Neither takes the structure of the conditional probabilities into account.

Note however, that VE and clique tree propagation manipulate tables which can be indexed much faster than we can manipulate confactors. There are cases where the total size of the tables of the confactors is exponentially smaller than the tables (where added variables are only relevant in narrow contexts). There are other cases where the table size for the confactors is the same as the table size in VE, and where the overhead for manipulating contexts will not make CVE competitive with the table-based methods.

[5] present two algorithms to exploit structure. For the network transformation and clustering method, the example of Figure * is the worst case; no structure can be exploited after triangulation of the resulting graph. (The tree for E in Figure * is structurally identical to the tree for X(1) in Figure 2 of [5]). The structured cutset conditioning algorithm does well on this example. However, if the example is changed so that there are multiple (disconnected) copies of the same graph, the cutset conditioning algorithm is exponential in the number of copies, whereas the probabilistic partial evaluation algorithm is linear11.

This algorithm is most closely related to the tree-based algorithms for solving MDPs [4], but these work with much more restricted networks and with stringent assumptions on what is observable. CVE is simpler than the analogous algorithm by [3] for structured MDP with correlated action effects that represents conditional probabilities as trees. Section * shows why we can do better than the tree-based algorithms.

[25] gave an earlier version of rule-based VE, but it is more complicated in that it distinguished between the head and the body of rules as part of the inference (although the confactors for X correspond to the rules with X in the head). CVE is much more efficient than the rule-based VE as it allows for faster indexing of tables. The notion of absorption was introduced by [31], which motivated the work in a very different way. [32] give a mathematical analysis of how context-specific independence can be exploited in terms of partial functions. The union-product is a formalization of the operation we are using between confactors. The current paper extends all of these in giving a specific algorithm showing how to combine the confactors and tables, gives a more general analysis of when we need to do confactor splitting and when we don't need to, gives a more sophisticated method to initialize absorption (maintaining the confactors for a variable) and gives a much more detailed empirical evaluation (with a new implementation). The ability to handle ones is also improved.

Finally the representation should be contrasted with that of [14]. They use a similarity network that gives a global decomposition of a belief network; for different contexts they allow for different belief networks. We allow for a local decomposition of conditional probabilities. The algorithms are not very similar.

David Poole and Nevin Lianwen Zhang,Exploiting Contextual Independence In Probabilistic Inference, Journal of Artificial Intelligence Research, 18, 2003, 263-313.

ConclusionTopEmpirical ResultsComparison With Other Proposals