Computational Intelligence

Online Slides

November 21, 2002

These are slides from Computational Intelligence, A Logical Approach, Oxford University Press, 1998. Copyright © David Poole, Alan Mackworth, Randy Goebel and Oxford University Press, 1999-2002. You may prefer the PDF interface for which these slides were designed (you can read PDF files using the free Acrobat Reader).

Chapter 11, Lecture 1


Learning

Learning is the ability to improve one's behavior based on experience.

Components of a learning problem

The following components are part of any learning problem:


Learning task


Learning architecture


Choosing a representation


Common Learning Tasks


Example Classification Data

Example  Action  Author   Thread  Length  Where
e1       skips   known    new     long    home
e2       reads   unknown  new     short   work
e3       skips   unknown  old     long    work
e4       skips   known    old     long    home
e5       reads   known    new     short   home
e6       skips   known    old     long    work

We want to classify new examples on property Action based on the examples' Author, Thread, Length, and Where.
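
For the sketches below, the same data can be written as a small Python table (attribute names follow the column headings; this representation is an assumption, not part of the slides):

# The six training examples from the table above.  Each example maps
# attribute names to values; "action" is the classification to predict.
examples = [
    {"author": "known",   "thread": "new", "length": "long",  "where": "home", "action": "skips"},  # e1
    {"author": "unknown", "thread": "new", "length": "short", "where": "work", "action": "reads"},  # e2
    {"author": "unknown", "thread": "old", "length": "long",  "where": "work", "action": "skips"},  # e3
    {"author": "known",   "thread": "old", "length": "long",  "where": "home", "action": "skips"},  # e4
    {"author": "known",   "thread": "new", "length": "short", "where": "home", "action": "reads"},  # e5
    {"author": "known",   "thread": "old", "length": "long",  "where": "work", "action": "skips"},  # e6
]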

Feedback

Learning tasks can be characterized by the feedback given to the learner.


Measuring Success


Bias


Learning as search


Noise


Characterizations of Learning



Chapter 11, Lecture 2


Learning Decision Trees


Decision trees

A decision tree is a tree where non-leaf nodes are labeled with attributes, the arcs out of a node labeled with attribute A are labeled with the possible values of A, and the leaves are labeled with classifications.

Example Decision Tree


Equivalent Logic Program

prop(Obj,user_action,skips) <-
     prop(Obj,length,long).
prop(Obj,user_action,reads) <-
     prop(Obj,length,short) & prop(Obj,thread,new).
prop(Obj,user_action,reads) <-
     prop(Obj,length,short) & prop(Obj,thread,old) &
     prop(Obj,author,known).
prop(Obj,user_action,skips) <-
     prop(Obj,length,short) & prop(Obj,thread,old) &
     prop(Obj,author,unknown).
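
Read as a nested conditional, the same tree is (a sketch in Python, using the attribute names from the data above):

def predicted_action(example):
    # Root split: long articles are skipped regardless of anything else.
    if example["length"] == "long":
        return "skips"
    # Short articles in a new thread are read.
    if example["thread"] == "new":
        return "reads"
    # Short articles in an old thread are read only if the author is known.
    return "reads" if example["author"] == "known" else "skips"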


Issues in decision-tree learning


Searching for a Good Decision Tree


Decision tree learning: Boolean attributes

dtlearn(Goal, Examples, Attributes, DT): given Examples and Attributes, construct a decision tree DT for Goal.

dtlearn(Goal, Exs, Atts, Val) <-
      all_examples_agree(Goal, Exs, Val).
dtlearn(Goal, Exs, Atts, if(Cond,YT,NT)) <-
      examples_disagree(Goal, Exs) &
      select_split(Goal, Exs, Atts, Cond, Rem_Atts) &
      split(Exs, Cond, Yes, No) &
      dtlearn(Goal, Yes, Rem_Atts, YT) &
      dtlearn(Goal, No, Rem_Atts, NT).
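
A procedural sketch of the same algorithm in Python. The tree is either a classification value or a tuple mirroring if(Cond, YT, NT); select_split here just takes the first remaining attribute and its first value, a placeholder for a real heuristic such as maximum information gain:

from collections import Counter

def dtlearn(goal, examples, attributes):
    """Return a decision tree for `goal`: either a classification value or a
    tuple (attribute, value, yes_subtree, no_subtree)."""
    values = [e[goal] for e in examples]
    if len(set(values)) == 1:              # all_examples_agree
        return values[0]
    if not attributes:                     # attributes exhausted: majority value
        return Counter(values).most_common(1)[0][0]
    attr, val, rest = select_split(goal, examples, attributes)
    yes = [e for e in examples if e[attr] == val]
    no = [e for e in examples if e[attr] != val]
    if not yes or not no:                  # useless split: drop the attribute
        return dtlearn(goal, examples, rest)
    return (attr, val, dtlearn(goal, yes, rest), dtlearn(goal, no, rest))

def select_split(goal, examples, attributes):
    # Simplest possible choice of condition: first attribute, first value seen.
    attr = attributes[0]
    return attr, examples[0][attr], attributes[1:]

With the examples above, dtlearn("action", examples, ["author", "thread", "length", "where"]) returns a tree (not necessarily the smallest one) that classifies all six training examples correctly.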


Example: possible splits


Using this algorithm in practice


Handling Overfitting



Chapter 11, Lecture 3


Neural Networks


Why Neural Networks?


Feed-forward neural networks


The Units

A unit with k inputs is like the parameterized logic program:

prop(Obj,output,V) <-
     prop(Obj,in_1,I_1) &
     prop(Obj,in_2,I_2) &
     ···
     prop(Obj,in_k,I_k) &
     V is f(w_0+w_1×I_1+w_2×I_2+···+w_k×I_k).
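
In conventional notation, the unit computes the activation function f applied to a weighted sum of its inputs. A minimal Python sketch, with f passed in as a parameter (a sigmoid is defined below):

def unit_output(weights, inputs, f):
    """Output of one unit: f(w_0 + w_1*I_1 + ... + w_k*I_k).
    `weights` has k+1 entries (w_0 is the bias); `inputs` has k entries."""
    total = weights[0] + sum(w * i for w, i in zip(weights[1:], inputs))
    return f(total)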


Activation function

A typical activation function is the sigmoid function:

f(x) = 1/(1+e^(-x))        f'(x) = f(x)(1-f(x))
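
A direct transcription into Python (math.exp is the only dependency):

import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)), squashing any real number into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_deriv(x):
    # f'(x) = f(x) * (1 - f(x)); this form is what backpropagation uses.
    fx = sigmoid(x)
    return fx * (1.0 - fx)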

Neural Network for the news example


Axiomatizing the Network


predicted_prop(Obj,reads,V) <-
     prop(Obj,h_1,I_1) & prop(Obj,h_2,I_2) &
     V is f(w_0+w_1×I_1+w_2×I_2).
prop(Obj,h_1,V) <-
     prop(Obj,known,I_1) & prop(Obj,new,I_2) &
     prop(Obj,short,I_3) & prop(Obj,home,I_4) &
     V is f(w_3+w_4×I_1+w_5×I_2+w_6×I_3+w_7×I_4).
prop(Obj,h_2,V) <-
     prop(Obj,known,I_1) & prop(Obj,new,I_2) &
     prop(Obj,short,I_3) & prop(Obj,home,I_4) &
     V is f(w_8+w_9×I_1+w_10×I_2+w_11×I_3+w_12×I_4).
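
Putting the pieces together, the network's prediction is two hidden-unit computations followed by one output-unit computation. A sketch using the unit_output and sigmoid functions above; the grouping of the thirteen weights w_0..w_12 into per-unit lists follows the clauses, and taking f to be the sigmoid is an assumption:

def predict_reads(known, new, short, home, w):
    """Forward pass for the news network.
    `w` is a list of the 13 weights w_0..w_12 used in the clauses above;
    the four inputs are 0/1 encodings of the attributes."""
    inputs = [known, new, short, home]
    h1 = unit_output([w[3], w[4], w[5], w[6], w[7]], inputs, sigmoid)
    h2 = unit_output([w[8], w[9], w[10], w[11], w[12]], inputs, sigmoid)
    return unit_output([w[0], w[1], w[2]], [h1, h2], sigmoid)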


Prediction Error


Neural Network Learning


Backpropagation Learning


Backpropagation Learning Algorithm


Gradient Descent for Neural Net Learning


Simulation of Neural Net Learning

Parameter   iteration 0        iteration 1   iteration 80
            Value    Deriv     Value         Value
w0          0.2      0.768     -0.18         -2.98
w1          0.12     0.373     -0.07          6.88
w2          0.112    0.425     -0.10         -2.10
w3          0.22     0.0262     0.21         -5.25
w4          0.23     0.0179     0.22          1.98
Error       4.6121              4.6128        0.178
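
Each weight moves against its error derivative. The step from iteration 0 to iteration 1 above is consistent with a learning rate of roughly 0.5 (inferred from the numbers, not stated on the slide); a sketch of the update:

def gradient_descent_step(weights, derivs, learning_rate=0.5):
    """One step of gradient descent: w := w - eta * dError/dw."""
    return [w - learning_rate * d for w, d in zip(weights, derivs)]

# With the iteration-0 values and derivatives from the table:
w0 = [0.2, 0.12, 0.112, 0.22, 0.23]
d0 = [0.768, 0.373, 0.425, 0.0262, 0.0179]
print(gradient_descent_step(w0, d0))
# -> roughly [-0.18, -0.07, -0.10, 0.21, 0.22], the iteration-1 column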


What Can a Neural Network Represent?

w0    w1    w2    Logic
-15   10    10    and
-5    10    10    or
5     -10   -10   nor

Output is f(w0 + w1×I1 + w2×I2).

A single unit can't represent xor.
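
A quick check of the table, treating an output above 0.5 as true (this reuses the unit_output and sigmoid sketches above):

gates = {"and": [-15, 10, 10], "or": [-5, 10, 10], "nor": [5, -10, -10]}
for name, weights in gates.items():
    truth_table = {(i1, i2): unit_output(weights, [i1, i2], sigmoid) > 0.5
                   for i1 in (0, 1) for i2 in (0, 1)}
    print(name, truth_table)
# "and" is true only for (1,1); "or" for everything except (0,0);
# "nor" only for (0,0).  No single choice of w0, w1, w2 gives xor.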


Bias in neural networks and decision trees


Neural Networks and Logic

