CPSC 532 - Topics in AI:
Logical, Probabilistic and Computational Foundations of Semantic Science

David Poole

Spring 2010

This is a Topics in AI seminar course that is meant to explore a field of AI that currently doesn't exist. This field is at the intersection of probabilistic reasoning, logical reasoning, ontologies, the semantic web and machine learning. This is not a course of lectures, but one where we all work together to understand the potential of combining ontologies, relational data and probabilistic reasoning. One of the reasons that this intersection has not been investigated is that it is very complex.

Five Introductions

Why should you believe information given on the web? Current practice gives information that is justified by popularity or appeal to authority. But scientists know this is the wrong answer; we want to base our belief on empirical evidence. This course explores how this can be done.

To make informed decisions, we need to take uncertainty into account and condition on all available knowledge. What if that knowledge is produced all over the world, at various levels of abstraction and detail? How can we know that it exists? How can we condition on it? How can we ensure that the vocabulary used in the data means the same thing as the vocabulary used in the model?

The semantic web is a way to make all human knowledge available to computers. But how can computers create new knowledge? How should this new knowledge be evaluated? How can we make the knowledge created be more than the sum of human knowledge? How can this knowledge be applied to particular cases?

Science is an activity to create new knowledge by use of theories that are empirically evaluated by data. How can computers carry out the broad activity of science? Scientists who create theories need to evaluate their theories on all data. Scientists who create data should use it to evaluate all theories. How can we distinguish good theories from bad theories? How can we distinguish good data from bad data? How can the best science be applied to a particular case? How can new experiments be derived?

There was once a field of AI called expert systems. Since then, AI research has diverged into two (almost) separate fields: that which deals with uncertainty and (statistical) learning (reasoning mostly with features), and that which deals with complex semantic relationships and ontologies (reasoning in terms of objects and relationships, but largely ignoring uncertainty). To make predictions in (most) real situations, we need to reason about uncertainty as well as with objects and relations. This course will cover relational probabilistic models, which deal with objects and relations, as well as probabilistic first-order models, which also reason about identity and existence uncertainty. These form the core technologies for making (probabilistic) predictions in complex domains.

Structure of the Course

There are three parts of the course:

Foundations
State of current research
What we don't know how to do

There will be 3 hours of in-class interaction per week. The first few weeks of these will be lectures on the foundations, and the rest will be student presentations, discussion of research papers and problem solving. This is a participatory class; everyone will be expected to participate fully: to have read the reading material before class, and to come ready to discuss and critically analyze it.

The classes will be held:

Mondays and Wednesdays: 9:30-11:00, ICCS 304. The first class will be on Wednesday, January 6.

If you are just interested in being presented with the state of the art, this course is probably not for you. The aim is to invent what may be useful in a decade or so. It is for those who want to be part of the (difficult) endeavour of working out the details of semantic science.

Topics

Probabilistic reasoning: representations, semantics, algorithms, learning
Logical reasoning: representations, semantics, algorithms, learning
Causality
Ontologies, data, data repositories
Relational probabilistic models and probabilistic programming languages: semantics, reasoning, learning
First-order probabilistic models: existence, equality
Semantic science theories, theory ensembles, and semantic science search engines
Challenges in making this a reality

It would be good to read something about the philosophy of science before the first class. We will not cover this topic, but it is good to have some knowledge about how science works.

Everyone will choose a "vertical domain" with four properties: (a) you know something about it, (b) someone may be interested in predictions in this domain, (c) there is potentially rich data about the domain, and (d) there are complex interactions among the concepts in the domain. Each participant needs to be ready to explain how the concepts covered in class can apply to their domain.

Assessment

The course assessment will be based on assignments, in-class participation, and a final research paper. The evaluation will be based on peer review. Assignments will not be traditional assignments, but will be to apply what is being covered in class to your vertical domain, and to present it to the class.

Last updated: 2009-10-23, David Poole