A general theme of my research is to find natural representations for problems and to exploit those representations for efficient computation. We want a representation that is compact, so that it is easy to map between a problem and its representation, and we want general methods that exploit that compactness. In this way we seek to exploit the special features of a problem to make inference fast.

The main problem I consider is: what should an agent do based on its prior knowledge, what it observes about the world, and its values or goals?

Independent Choice Logic

Much of this work is based on the "independent choice logic" (ICL). This is a semantic framework that allows for independent choices made by various agents, with a logic program giving the consequences of the choices. It expands Probabilistic Horn abduction to include a richer logic (including negation as failure) and choices by multiple agents, and it extends logic programs, Bayesian networks, influence diagrams, MDPs, and game-theoretic representations.

This work can be reached from a number of different starting positions.

  • Abductive logic programming, but we structure the assumables into disjoint sets (alternatives) and let different agents (e.g., nature, self, adversary) choose which values get selected.
  • Game theory, where we specify the consequences of the choices by the various agents by a logic program.
  • Bayesian networks and influence diagrams, where we use rules to specify the conditional probability tables, the value function, and what information will be available when the agent must make a decision.
  • Dynamical systems where we structure the state in terms of propositions and specify the (stochastic) state transition functions in terms of logic programs.
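To make the semantics concrete, here is a minimal sketch in Python (the atoms, alternatives, and probabilities are all invented for illustration): nature picks one atom from each alternative, an acyclic logic program determines the unique model of each total choice, and the probability of a proposition is the sum of the probabilities of the total choices whose models entail it.

```python
from itertools import product

# Alternatives: disjoint sets of atoms; nature chooses exactly one from each,
# with the stated probabilities. (All names here are invented for illustration.)
alternatives = {
    "weather": {"rain": 0.3, "dry": 0.7},
    "sprinkler": {"on": 0.2, "off": 0.8},
}

# Acyclic logic program: (head, body) pairs, body a conjunction of atoms.
rules = [
    ("wet", ("rain",)),
    ("wet", ("on",)),
    ("slippery", ("wet",)),
]

def model(total_choice):
    """Forward-chain the rules to the unique model of a total choice."""
    world = set(total_choice)
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in world and all(b in world for b in body):
                world.add(head)
                changed = True
    return world

def prob(query):
    """P(query): sum the probabilities of the total choices entailing it."""
    total = 0.0
    for combo in product(*(alt.items() for alt in alternatives.values())):
        p = 1.0
        for _, p_i in combo:
            p *= p_i
        if query in model(atom for atom, _ in combo):
            total += p
    return total

print(round(prob("slippery"), 2))  # 1 - P(dry)*P(off) = 1 - 0.56 = 0.44
```

The enumeration over all total choices is only there to show the semantics; the point of the papers below is to avoid it.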

The main highlights of this work are:

  • It forms a generalization of the strategic (normal) form of a game (as studied in game theory).
  • When restricted to just choices by nature (with probabilities over the choices), it forms a generalization of Bayesian networks (see the Probabilistic Horn abduction description) with a richer language (like parametrized Bayesian networks or first-order Bayesian networks), and a natural specification of "contextual independence".
  • It generalizes, in a natural way, influence diagrams, Markov decision processes, etc. Moreover, the language lets us exploit the propositional nature of the state space (i.e., the decomposition of the state space into variables and the contextual independence).
  • It lets us use arbitrary (acyclic) logic programs to axiomatise the world. This is very flexible, as the language lets us model uncertainty in the choice space (there is no need for disjunction in the language).
  • It is essentially abductive. The explanations of a proposition form a sound and complete description of all of the worlds in which the proposition is true.
  • We can exploit the structure for computational gain.

For the best overview see:

There are also slides (in PDF format) from a talk, "The Independent Choice Logic: A pragmatic combination of logic and decision theory", March 1998.

For a more informal overview see:

The following recent paper abstracts the ideas of the ICL to more general programming languages.

Abductive Characterization

Apart from the representational advantages, there are computational advantages due to the rule-based representation. The rule structure imposes a partitioning of the state space into sets of states that do not need to be considered separately. Put another way, the framework is essentially abductive: the sets of explanations of a proposition provide a concise description of the worlds in which the proposition is true. Thus we can see a unification of abductive logic programming and decision-theoretic representations. See:
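As a sketch of the idea (in Python, with invented atoms and rules), a backward-chaining abductive engine collects, for each goal, the sets of assumables (choice atoms) from which the goal follows; in this small acyclic program those sets are exactly the explanations:

```python
# Assumables (choice atoms) and an acyclic rule base: head -> list of bodies.
# (The atoms and rules are invented for illustration.)
assumables = {"rain", "dry", "on", "off"}
rules = {
    "wet": [("rain",), ("on",)],
    "slippery": [("wet",)],
}

def explanations(goal):
    """Backward-chain: the sets of assumables from which the goal follows."""
    if goal in assumables:
        return [frozenset([goal])]
    found = set()
    for body in rules.get(goal, []):
        # An explanation of a conjunction combines explanations of its conjuncts.
        partial = [frozenset()]
        for atom in body:
            partial = [e | e2 for e in partial for e2 in explanations(atom)]
        found.update(partial)
    return sorted(found, key=sorted)

# Each explanation describes a set of worlds in which "slippery" is true.
print(explanations("slippery"))
```

Here the explanations of "slippery" are {rain} and {on}: together they cover every world in which "slippery" holds, which is the sound-and-complete property referred to above.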

This leads to a reasonably straightforward implementation based on a backward-chaining abductive engine. You can get the ICL code distribution (a gzipped tar file) that is based on this idea. This contains Prolog code, documentation, and the examples from the various papers. The implementation includes some knowledge-level debugging tools that have been found to be (very) useful in building ICL knowledge bases.

Inference

The work in this area has considered both inference algorithms and representations. The algorithms have been developed for simpler cases than the representations; this is deliberate, as we want to get the foundations of the algorithms solid before applying them to more complicated representations. The hope is that the algorithms developed for the straightforward (but quite general) case of rule-structured probability models will carry over to more complicated (but often more restricted) models such as POMDPs. It is important that we understand (and debug) the simple case that will form the foundation for the more complicated algorithms.

The rule structure can be exploited to speed up probabilistic inference. The following paper shows how to exploit the contextual independence that can easily be stated in terms of rules. In the worst case this algorithm is as good as the structured Bayesian network evaluation algorithms, but it can be exponentially better:
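The kind of contextual independence meant here can be sketched as follows (in Python; the variables, contexts, and numbers are invented): a rule-based conditional probability says that, in the context fire = true, the probability of the alarm does not depend on the other parent, so three rule parameters replace the four rows of a full table:

```python
# Context-specific rules for P(alarm = true | fire, tamper): the first rule
# whose context matches applies. (Names and numbers are invented.)
alarm_rules = [
    ({"fire": True}, 0.99),                   # tamper is irrelevant given fire
    ({"fire": False, "tamper": True}, 0.85),
    ({"fire": False, "tamper": False}, 0.01),
]

def p_alarm(parents):
    """Look up P(alarm = true) in the first rule whose context holds."""
    for context, p in alarm_rules:
        if all(parents[var] == val for var, val in context.items()):
            return p
    raise ValueError("the contexts should be exhaustive")

# Three rule parameters stand in for the four rows of the full table.
print(p_alarm({"fire": True, "tamper": True}))   # 0.99
print(p_alarm({"fire": True, "tamper": False}))  # 0.99
```

With many parents the saving compounds: a variable whose probability is determined by a few contexts needs a few rules, not a table exponential in the number of parents.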

This can also be extended to approximate reasoning, where we can make different approximations in different contexts and compute bounds on the resulting posterior probabilities:

One of the ways the rule-based structure can be exploited is in structured dynamic programming. Instead of considering all information states and optimizing for each (e.g., as in an influence diagram), we can use the rule structure to determine structure over the information states (the values for the parents in an influence diagram); we then only need to consider the structured partitioning of the information space. See:
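The flavour of the idea can be sketched as follows (in Python, with an invented utility and invented variable names): when the utility depends on only one of the information variables, the rules partition the 2^3 information states into two contexts, and we optimize once per context rather than once per state:

```python
from itertools import product

# Invented example: the utility depends only on "x" and the action, so the
# optimal decision need only distinguish two contexts, not all 2**3 states.
def utility(state, action):
    if state["x"]:
        return 10 if action == "go" else 2
    return 1 if action == "go" else 5

info_vars = ("x", "y", "z")
actions = ("go", "stay")

# Flat optimization: one decision per full information state (8 entries).
flat_policy = {}
for values in product((True, False), repeat=len(info_vars)):
    state = dict(zip(info_vars, values))
    flat_policy[values] = max(actions, key=lambda a: utility(state, a))

# Structured optimization: one decision per context in the partition {x, not-x}.
structured_policy = {}
for x in (True, False):
    state = {"x": x, "y": True, "z": True}  # y and z are irrelevant here
    structured_policy[x] = max(actions, key=lambda a: utility(state, a))

print(len(flat_policy), len(structured_policy))  # 8 2
```

The two-entry structured policy agrees with the eight-entry flat one; in general the number of contexts can be exponentially smaller than the number of information states.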

Similarly we can consider exploiting the rule structure in inference in POMDPs, where we can build structured representations of value functions using essentially regression (as used in planning).

Representation

If we want to use the ICL in a planning situation we need to combine it with a model of time. This is done by allowing time to be represented in the logic program. I have been looking at three different cases:

Learning

The independent choice logic provides a natural specification language for specifying learning problems. See:

I am currently working on learning ICL theories, combining Inductive Logic Programming and Bayesian network learning.

Applications

We are working on applications in diagnosis, robotics, user modelling, and other areas.

Building applications is important to ground the work in real problems, so that we don't work on abstractions that don't correspond to any actual problem.

Misrepresentations of the ICL

There have been a number of places where the ICL has been referenced, but where the claims made are false or misleading. Here are a few (please let me know of any more):

  • "Poole considers only models without recursion." [Laskey, Artificial Intelligence Journal, Vol 172 (2008) 140-178] This is false; ICL allows recursive models. Note that she references ICL, but only talks about parametrized belief networks, which are much more like [Horsch and Poole 1990].
  • "Bayesian logic programs [compared to ICL theories] have not as many constraints on the representation language, ..., have a richer representation language and their independence assumptions represent the causality of the domain" [K. Kersting and L. De Raedt, Bayesian logic programming: Theory and tool. In L. Getoor and B. Taskar, editors, An Introduction to Statistical Relational Learning. MIT Press, 2007]. All these claims are false. ICL can represent arbitrary logic programs with no restrictions except the acyclicity restriction that recursions have to be well-founded. (None of the textbooks on logic programming I know of contains any logic programs that are not acyclic, although they do contain logic programs with cuts and input/output, which ICL does not handle.) ICL is Turing-complete. [Pearl 2000] defines causality in terms of structural equation models with exogenous noise. The ICL represents this causality directly, with the structural equation models represented as logic programs.

Last updated 2008-01-21 - David Poole, poole@cs.ubc.ca