CPSC 322 - Lecture 37 - December 3, 2004

Perceptron Learning


I.  The perceptron

Chapter 11.2 in your textbook introduces the idea of neural
network learning.  That chapter provides a nice explanation
of how to train a collection of simulated neurons in what's
called a feed-forward backpropagation network.  Rather than
repeat what's said in Chapter 11.2 (which you should most
definitely read), I presented the evolutionary precursor
to the artificial neuron, called a perceptron.  

A perceptron is a simple computing unit which takes some
number of inputs...each of those inputs can be a 1 or a 0.
The individual inputs are multiplied by their corresponding
weights, which are real numbers that can be negative or
positive.  The perceptron takes the sum of those products,
and if the sum exceeds some threshold the perceptron outputs
a 1; otherwise, the perceptron outputs a 0.
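That computation is small enough to sketch in a few lines.
Here's a rough Python version (the function name is mine; the
Scheme original appears later in these notes), using the same
weights and threshold as the Scheme code below:

```python
def perceptron_output(weights, inputs, threshold):
    # weighted sum of binary inputs; fire (1) iff the sum EXCEEDS the threshold
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

# with weights (-0.5, 0, 0.5, 0, -0.5) and threshold 0.5, the input
# (0, 0, 1, 0, 0) sums to exactly 0.5, which does not exceed the
# threshold, so the perceptron outputs 0
```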

What makes this little device interesting is that it can
learn to respond to certain input patterns with a 1 and
other input patterns with a 0.  If those patterns represent
positive and negative examples of some concept, then the
perceptron can learn that concept...at least sometimes.

The trick to perceptron learning is this: when the
perceptron is presented with a positive example of the
concept but outputs a 0 (where a 0 means that the
perceptron is saying that the example is not representative
of the concept), the weights corresponding to the non-zero
inputs are increased to make the perceptron more sensitive to 
those inputs.  Conversely, when the perceptron is presented
with a negative example but outputs a 1, the weights on
the non-zero inputs are decreased to make the perceptron
less sensitive to those inputs.
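In Python, that update rule looks something like this (a
sketch with my own names, using the same hard-coded step of
0.1 as the Scheme code below, and run here on logical AND, a
tiny concept the rule can learn):

```python
def predict(weights, inputs, threshold):
    # fire iff the weighted sum of the inputs exceeds the threshold
    return sum(w * x for w, x in zip(weights, inputs)) > threshold

def adjust(weights, inputs, guess, answer, step=0.1):
    # correct guess: leave the weights alone
    if guess == answer:
        return weights
    # guessed 1 on a negative example: decrease weights on non-zero inputs;
    # guessed 0 on a positive example: increase them
    delta = -step if guess else step
    return [w + delta if x == 1 else w for w, x in zip(weights, inputs)]

# sweep over the examples repeatedly until the perceptron
# gets every one of them right
examples = [((1, 1), True), ((1, 0), False), ((0, 1), False), ((0, 0), False)]
weights, threshold = [0.0, 0.0], 0.5
for _ in range(100):
    for inputs, answer in examples:
        weights = adjust(weights, inputs, predict(weights, inputs, threshold), answer)
    if all(predict(weights, x, threshold) == y for x, y in examples):
        break
```

After three sweeps the weights reach roughly (0.3, 0.3), at
which point only the (1, 1) input pushes the sum past the
0.5 threshold.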

In class, we applied a perceptron to the task of learning
the concept of (what else?) an arch.  We created a 
representation language consisting of 1's and 0's, and
in that language we abstracted away all but the most
relevant, although simplified, features.  We presented
our perceptron with a series of examples and it learned
to respond correctly to both positive and negative examples
of archness.  We successfully taught the perceptron 
another concept too.  But as we found out, the perceptron
can't learn everything we might want it to.

In particular, we found out that the perceptron can't learn
concepts involving the notion of exclusive-or; a single
perceptron can learn only concepts that are linearly
separable, and the fact that it can't learn some things led
to the decline of interest in perceptrons years ago.  But
perceptrons were resurrected with some improvements
(replacing the threshold-based step function that controlled
output with a continuous function, building networks of
these units with multiple layers, and new ideas about how to
adjust the weights) and gave rise to the current excitement
about neural networks.
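One way to convince yourself of the exclusive-or limitation
is by brute force: search a whole grid of weights and
thresholds for a single two-input unit, and observe that
some settings compute inclusive-or but no setting computes
exclusive-or.  This Python sketch (my code, not from the
lecture) does exactly that:

```python
from itertools import product

def fires(w1, w2, t, x1, x2):
    # a two-input perceptron: output true iff the weighted sum exceeds t
    return w1 * x1 + w2 * x2 > t

def learnable(target):
    # try every (w1, w2, t) on a coarse grid from -2.0 to 2.0 in steps
    # of 0.1, looking for one that matches target on all four inputs
    grid = [i / 10 for i in range(-20, 21)]
    return any(all(fires(w1, w2, t, x1, x2) == target(x1, x2)
                   for x1, x2 in product((0, 1), repeat=2))
               for w1, w2, t in product(grid, repeat=3))

# inclusive-or succeeds (e.g. w1 = w2 = 1, t = 0.5), but exclusive-or
# demands w1 > t, w2 > t, t >= 0, and w1 + w2 <= t all at once,
# which no choice of weights can satisfy
```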

Here's the perceptron code, if you're interested.
It's Scheme, not CILOG.

;; very very simple perceptron learning - one cell
;;
;; wx = initial weights

(define w1 '(-0.5 0 0.5 0 -0.5))  ;; range -1.0 to 1.0
(define w2 '(0.5 0.5 0.5 0.5 0.5))

;; t = initial threshold -- the program doesn't yet know how to change this

(define t 0.5)                   ;; range 0 to 1 - make it a float just to be safe


;; example 1 - a supports c and b supports c but a doesn't touch b (arch)
(define ex1 '((#t 1 1 1 1 0) (#f 1 1 1 1 1) (#f 0 0 0 0 0) (#f 0 0 1 1 0) (#f 1 0 1 0 1)   
              (#f 1 0 1 0 0) (#f 0 1 0 1 1) (#f 0 1 0 1 0) (#f 0 0 1 0 0) (#f 0 0 0 1 0)))   

;; example 2 - a touches c and b touches c (all three blocks in contact - no islands)
(define ex2 '((#t 1 1 1 1 0) (#t 1 1 1 1 1) (#f 0 0 0 0 0) (#t 0 0 1 1 0) (#f 1 0 1 0 1)
              (#f 1 0 1 0 0) (#f 0 1 0 1 1) (#f 0 1 0 1 0) (#f 0 0 1 0 0) (#f 0 0 0 1 0)))

;; example 3 - a touches c or b touches c but not both (exclusive-or)
(define ex3 '((#f 1 1 1 1 0) (#f 1 1 1 1 1) (#f 0 0 0 0 0) (#f 0 0 1 1 0) (#t 1 0 1 0 1)   
              (#t 1 0 1 0 0) (#t 0 1 0 1 1) (#t 0 1 0 1 0) (#t 0 0 1 0 0) (#t 0 0 0 1 0))) 



(define (learn_concept init_weights init_threshold init_training_set)
  (do ((weights init_weights)
       (threshold init_threshold)
       (examples init_training_set)
       (stop_training #f))
    (stop_training (display "done")(newline))
    (set! weights (train weights threshold examples))
    (display "continue training? (y/n): ")
    (set! stop_training (equal? (read) 'n))
    (newline)
    ))


(define (train weights threshold examples) ;; this function could be better
  (do ((current_weights weights)
       (guess #f) 
       (real #f) 
       (inputs '()))
    ((null? examples) current_weights)
    (set! inputs (cdar examples))
    (set! real (caar examples))
    (set! guess (compute_g current_weights inputs threshold))
    (set! current_weights (adjust_weights current_weights
                                          inputs
                                          guess
                                          real))
    (display "inputs: ")(display inputs)(newline)
    (display_weights guess real current_weights)
    (set! examples (cdr examples))))

(define (display_weights guess real current_weights)
  (display "guess: ")
  (display guess)
  (display "   answer: ")
  (display real)
  (newline)
  (display_weights_2 current_weights)
  (newline))

(define (display_weights_2 weights)
  (cond [(null? weights) #t]
        [else (display "  ")
              (display (car weights))
              (newline)
              (display_weights_2 (cdr weights))]))
  

(define (adjust_weights weights inputs guess real) ;; increment shouldn't be hard coded
  (cond [(equal? guess real) weights]
        [guess (adjust_weights_2 weights inputs -0.1)]
        [else (adjust_weights_2 weights inputs 0.1)]))

(define (adjust_weights_2 weights inputs increment)
  (cond [(null? inputs) '()]
        [(equal? (car inputs) 1)
         (cons (+ (car weights) increment)
               (adjust_weights_2 (cdr weights) (cdr inputs) increment))]
        [else (cons (car weights)
                    (adjust_weights_2 (cdr weights) (cdr inputs) increment))]))

(define (compute_weighted_input weights inputs)
  (cond [(null? inputs) 0] 
        [else (+ (* (car weights) (car inputs))
                 (compute_weighted_input (cdr weights) (cdr inputs)))]))

(define (compute_g weights inputs threshold)
  (> (compute_weighted_input weights inputs) threshold))


II.  Learning as the search for the best representation again

As I said before, I'll let your textbook develop learning
in neural networks from there, but whether we're talking
about perceptrons or more sophisticated artificial neurons,
both approaches represent what they've learned not as
semantic networks or decision trees, but as a collection
of weights...in other words, a representation of an arch
is just a set of real numbers.  And what those numbers
mean may not be entirely obvious, but they were derived
by searching through possible sets of weights until
a set was found that produced the right output for
every training input.

Once again, as it has since September, it all boils down
to search.

Last revised: December 7, 2004