Computational Intelligence

Online Slides

November 21, 2002

These are slides from Computational Intelligence, A Logical Approach, Oxford University Press, 1998. Copyright ©David Poole, Alan Mackworth, Randy Goebel and Oxford University Press, 1999-2002.

Chapter 11, Lecture 1


Learning is the ability to improve one's behavior based on experience.

Components of a learning problem

The following components are part of any learning problem:

Learning task

Learning architecture

Choosing a representation

Common Learning Tasks

Example Classification Data

Example  Action  Author   Thread  Length  Where
e1       skips   known    new     long    home
e2       reads   unknown  new     short   work
e3       skips   unknown  old     long    work
e4       skips   known    old     long    home
e5       reads   known    new     short   home
e6       skips   known    old     long    work

We want to classify new examples on the property Action, given the values of their Author, Thread, Length, and Where attributes.
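The training set above can be encoded as plain records. This is one illustrative representation (the field names simply follow the table columns), a sketch rather than anything prescribed by the text:

```python
# The six training examples, one record per row of the table.
examples = [
    {"author": "known",   "thread": "new", "length": "long",  "where": "home", "action": "skips"},
    {"author": "unknown", "thread": "new", "length": "short", "where": "work", "action": "reads"},
    {"author": "unknown", "thread": "old", "length": "long",  "where": "work", "action": "skips"},
    {"author": "known",   "thread": "old", "length": "long",  "where": "home", "action": "skips"},
    {"author": "known",   "thread": "new", "length": "short", "where": "home", "action": "reads"},
    {"author": "known",   "thread": "old", "length": "long",  "where": "work", "action": "skips"},
]
# The learning task: predict "action" from the other four attributes.
```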


Learning tasks can be characterized by the feedback given to the learner.

Measuring Success


Learning as search


Characterizations of Learning

Chapter 11, Lecture 2

Learning Decision Trees

Decision trees

A decision tree is a tree where:

the nonleaf nodes are labeled with attributes,

the arcs out of a node labeled with attribute A are labeled with the possible values of A,

the leaves are labeled with classifications.

Example Decision Tree

Equivalent Logic Program

prop(Obj,user_action,skips) <-
     prop(Obj,length,long).
prop(Obj,user_action,reads) <-
     prop(Obj,length,short) & prop(Obj,thread,new).
prop(Obj,user_action,reads) <-
     prop(Obj,length,short) & prop(Obj,thread,old) &
     prop(Obj,author,known).
prop(Obj,user_action,skips) <-
     prop(Obj,length,short) & prop(Obj,thread,old) &
     prop(Obj,author,unknown).
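The same tree can be sketched in Python as nested conditionals. This assumes the tree of the running example: long articles are skipped, short new-thread articles are read, and short old-thread articles are read only when the author is known.

```python
def user_action(length, thread, author):
    """Classify an article by following the decision tree."""
    if length == "long":
        return "skips"
    if thread == "new":                  # short article: test the thread
        return "reads"
    # short article, old thread: the author decides
    return "reads" if author == "known" else "skips"
```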

Issues in decision-tree learning

Searching for a Good Decision Tree

Decision tree learning: Boolean attributes

dtlearn(Goal, Examples, Attributes, DT): given Examples and Attributes, construct a decision tree DT for Goal.

dtlearn(Goal, Exs, Atts, Val) <-
      all_examples_agree(Goal, Exs, Val).
dtlearn(Goal, Exs, Atts, if(Cond,YT,NT)) <-
      examples_disagree(Goal, Exs) &
      select_split(Goal, Exs, Atts, Cond, Rem_Atts) &
      split(Exs, Cond, Yes, No) &
      dtlearn(Goal, Yes, Rem_Atts, YT) &
      dtlearn(Goal, No, Rem_Atts, NT).
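A runnable Python analogue of dtlearn is sketched below. One assumption is loud here: the text leaves select_split unspecified, so this sketch simply takes the first attribute test that separates the examples; a practical learner would use an information-theoretic criterion.

```python
def dtlearn(goal, exs, atts):
    """Recursive decision-tree learner mirroring dtlearn/4.
    exs: list of dicts; atts: list of (attribute, value) boolean tests.
    Returns a classification, or ("att=val", yes_subtree, no_subtree)."""
    vals = {e[goal] for e in exs}
    if len(vals) == 1:                      # all_examples_agree
        return vals.pop()
    for i, (att, val) in enumerate(atts):   # select_split (first usable test)
        yes = [e for e in exs if e[att] == val]
        no  = [e for e in exs if e[att] != val]
        if yes and no:                      # split
            rem = atts[:i] + atts[i + 1:]
            return (f"{att}={val}",
                    dtlearn(goal, yes, rem),
                    dtlearn(goal, no, rem))
    raise ValueError("examples disagree but no attribute splits them")

exs = [
    {"length": "long",  "thread": "new", "author": "known",   "action": "skips"},
    {"length": "short", "thread": "new", "author": "unknown", "action": "reads"},
    {"length": "long",  "thread": "old", "author": "unknown", "action": "skips"},
    {"length": "long",  "thread": "old", "author": "known",   "action": "skips"},
    {"length": "short", "thread": "new", "author": "known",   "action": "reads"},
    {"length": "long",  "thread": "old", "author": "known",   "action": "skips"},
]
tree = dtlearn("action", exs, [("length", "long"), ("thread", "new"), ("author", "known")])
```

On this data the single test length=long already separates the classes, so the learner returns a one-split tree.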

Example: possible splits

Using this algorithm in practice

Handling Overfitting

Chapter 11, Lecture 3

Neural Networks

Why Neural Networks?

Feed-forward neural networks

The Units

A unit with k inputs is like the parameterized logic program:

prop(Obj,output,V) <-
     prop(Obj,in_1,I_1) &
     prop(Obj,in_2,I_2) &
     ... &
     prop(Obj,in_k,I_k) &
     V is f(w_0+w_1× I_1+w_2× I_2+...+w_k× I_k).

Activation function

A typical activation function is the sigmoid function:

f(x) = 1/(1+e^(-x))        f'(x) = f(x)(1-f(x))
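A minimal sketch of the sigmoid and its derivative in Python, using the identity above rather than differentiating directly:

```python
import math

def f(x):
    """Sigmoid activation function: f(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def f_prime(x):
    """Derivative via the identity f'(x) = f(x) * (1 - f(x))."""
    return f(x) * (1.0 - f(x))
```

At x = 0 the sigmoid gives 0.5 and its derivative 0.25, the slope's maximum; the identity agrees with a finite-difference estimate everywhere.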

Neural Network for the news example

Axiomatizing the Network

predicted_prop(Obj,reads,V) <-
     prop(Obj,h_1,I_1) & prop(Obj,h_2,I_2) &
     V is f(w_0+w_1× I_1+w_2× I_2).
prop(Obj,h_1,V) <-
     prop(Obj,known,I_1) & prop(Obj,new,I_2) &
     prop(Obj,short,I_3) & prop(Obj,home,I_4) &
     V is f(w_3+w_4× I_1+w_5× I_2+w_6× I_3+w_7× I_4).
prop(Obj,h_2,V) <-
     prop(Obj,known,I_1) & prop(Obj,new,I_2) &
     prop(Obj,short,I_3) & prop(Obj,home,I_4) &
     V is f(w_8+w_9× I_1+w_10× I_2+w_11× I_3+w_12× I_4).
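The forward pass of this 4-input, 2-hidden-unit network can be sketched in Python. The weight indexing follows the clauses (w_0..w_2 for the output unit, w_3..w_7 for h_1, w_8..w_12 for h_2); the function name and the flat weight list are choices made for this sketch, not notation from the text.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_reads(known, new, short, home, w):
    """Forward pass of the news network.
    w: list of the 13 weights w_0..w_12, indexed as in the clauses."""
    inputs = (known, new, short, home)
    h1 = sigmoid(w[3] + sum(w[4 + k] * inputs[k] for k in range(4)))
    h2 = sigmoid(w[8] + sum(w[9 + k] * inputs[k] for k in range(4)))
    return sigmoid(w[0] + w[1] * h1 + w[2] * h2)
```

With all weights zero, each hidden unit outputs sigmoid(0) = 0.5 and the network predicts 0.5 regardless of the input.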

Prediction Error

Neural Network Learning

Backpropagation Learning

Backpropagation Learning Algorithm

Gradient Descent for Neural Net Learning

Simulation of Neural Net Learning

Para-   iteration 0      iteration 1  iteration 80
meter   Value    Deriv   Value        Value
w0      0.2      0.768   -0.18        -2.98
w1      0.12     0.373   -0.07         6.88
w2      0.112    0.425   -0.10        -2.10
w3      0.22     0.0262   0.21        -5.25
w4      0.23     0.0179   0.22         1.98
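The trace above shows weights and error derivatives evolving under gradient descent. As an illustrative sketch only (a single sigmoid unit under sum-of-squares error, not the network or the numbers traced in the table), one gradient-descent step might look like:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sse(ws, exs):
    """Sum-of-squares error of the unit on examples [(inputs, target), ...]."""
    return sum(
        (sigmoid(sum(w * xi for w, xi in zip(ws, (1.0,) + tuple(i)))) - t) ** 2
        for i, t in exs)

def gd_step(ws, exs, eta):
    """One gradient-descent step; ws[0] is the bias weight."""
    grads = [0.0] * len(ws)
    for inputs, target in exs:
        x = (1.0,) + tuple(inputs)                   # the 1 feeds the bias weight
        out = sigmoid(sum(w * xi for w, xi in zip(ws, x)))
        delta = (out - target) * out * (1.0 - out)   # dError/d(weighted sum)
        for j, xj in enumerate(x):
            grads[j] += delta * xj
    return [w - eta * g for w, g in zip(ws, grads)]
```

Repeated calls to gd_step drive the error down, mirroring the shrinking derivatives in the trace.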

What Can a Neural Network Represent?

w0    w1    w2    Logic
-15   10    10    and
-5    10    10    or
5     -10   -10   nor

Output is f(w0 + w1×I1 + w2×I2).

A single unit can't represent xor.
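Both claims can be checked directly: the weights in the table implement and, or, and nor when the unit's output is thresholded at 0.5, and a brute-force search over a coarse weight grid finds no single unit computing xor (xor is not linearly separable, so no weights work).

```python
import math

def unit(w0, w1, w2, i1, i2):
    """Output of a single sigmoid unit, f(w0 + w1*I1 + w2*I2),
    thresholded at 0.5 to read it as a boolean."""
    return 1.0 / (1.0 + math.exp(-(w0 + w1 * i1 + w2 * i2))) > 0.5

# Weights from the table implement and / or / nor on boolean inputs.
assert all(unit(-15, 10, 10, a, b) == (a and b) for a in (0, 1) for b in (0, 1))
assert all(unit(-5, 10, 10, a, b) == bool(a or b) for a in (0, 1) for b in (0, 1))
assert all(unit(5, -10, -10, a, b) == (not (a or b)) for a in (0, 1) for b in (0, 1))

# No single unit computes xor: brute force over a coarse weight grid.
grid = range(-20, 21, 5)
assert not any(
    all(unit(w0, w1, w2, a, b) == ((a + b) == 1) for a in (0, 1) for b in (0, 1))
    for w0 in grid for w1 in grid for w2 in grid)
```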

Bias in neural networks and decision trees

Neural Networks and Logic
