# CS322 Fall 1999 Module 12 (Neural Network Learning)

## Assignment 12

### Solution

Question 1

The following is the same data from assignment 11:

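| Example | bought | edu | first | visited | more_info |
|---|---|---|---|---|---|
| e1 | false | true | false | false | true |
| e2 | true | false | true | false | false |
| e3 | false | false | true | true | true |
| e4 | false | false | true | false | false |
| e5 | false | false | false | true | false |
| e6 | true | false | false | true | true |
| e7 | true | false | false | false | true |
| e8 | false | true | true | true | false |
| e9 | false | true | true | false | false |
| e10 | true | true | true | false | true |
| e11 | true | true | false | true | true |
| e12 | false | false | false | false | true |
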
We want to use this data to learn the value of more_info as a function of the values of the other variables.

In this assignment we will consider neural network learning for this data. We have a Java applet and a CILog program that can be used to answer this assignment.

1. Consider neural network learning with no hidden layers. After the network has converged, what are the parameter values? What is the Boolean function that the network represents? Are all the training examples classified correctly (if not, which aren't)? Give two examples, not in the training set, and specify what the predicted value is.
2. Consider neural network learning with one hidden layer containing two variables. After the network has converged, what are the parameter values? What is the Boolean function that the network represents? Are all the training examples classified correctly (if not, which aren't)? Give two examples, not in the training set, and specify what the predicted value is.
3. For the network with a hidden layer, what is a local minimum for the learning rate (to within one decimal place)? The value to minimize is the number of steps before the error gets below 1.0. Hint: there is a local minimum in the range [0.3, 7.0].

# Solution

1. Consider neural network learning with no hidden layers.
1. After the network has converged, what are the parameter values?

After 200 iterations with a learning rate of 0.5 the parameter values are:

| Parameter | Parent | Value |
|---|---|---|
| w0 | | 1.58 |
| w4 | bought | 3.96 |
| w3 | edu | 3.52 |
| w2 | first | -7.42 |
| w1 | visited | -3.40 |

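To make the arithmetic behind these values concrete, here is a minimal sketch that evaluates the linear expression for any input. It assumes true is encoded as 1 and false as 0, and that the network predicts true exactly when the linear expression is positive (equivalently, when a sigmoid output exceeds 0.5). The names `linear_value` and `predict` are illustrative and do not come from the applet or the CILog program.

```python
from itertools import product

# Parameter values copied from the table (200 iterations, learning rate 0.5).
W0 = 1.58          # bias term, no parent
W_BOUGHT = 3.96    # w4
W_EDU = 3.52       # w3
W_FIRST = -7.42    # w2
W_VISITED = -3.40  # w1


def linear_value(bought, edu, first, visited):
    """Weighted sum of the inputs plus the bias (True counts as 1, False as 0)."""
    return (W0 + W_BOUGHT * bought + W_EDU * edu
            + W_FIRST * first + W_VISITED * visited)


def predict(bought, edu, first, visited):
    """Assumed decision rule: more_info is true when the sum is positive."""
    return linear_value(bought, edu, first, visited) > 0


if __name__ == "__main__":
    # Print the sum and the prediction for all 16 input combinations.
    for values in product([True, False], repeat=4):
        print(values, round(linear_value(*values), 2), predict(*values))
```

For instance, with bought, edu and first true and visited false, the sum is 1.58 + 3.96 + 3.52 - 7.42 = 1.64, which is positive.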
2. What is the Boolean function that the network represents?

When first is true, the value of the linear expression is negative unless bought and edu are true and visited is false.

When first is false, the value of the linear expression is positive unless bought and edu are false and visited is true.

This can be written as the decision tree:

So the Boolean expression is:
(first & bought & edu & not visited) or
(not first & bought) or
(not first & edu) or
(not first & not visited).
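The equivalence between the learned weights and this Boolean expression can be checked exhaustively, since there are only 16 possible inputs. The sketch below assumes the network predicts true when the linear expression is positive; the function names are illustrative only.

```python
from itertools import product

# Parameter values from the no-hidden-layer network.
W0, W_BOUGHT, W_EDU, W_FIRST, W_VISITED = 1.58, 3.96, 3.52, -7.42, -3.40


def network_says_true(bought, edu, first, visited):
    """Assumed decision rule: threshold the linear expression at zero."""
    total = (W0 + W_BOUGHT * bought + W_EDU * edu
             + W_FIRST * first + W_VISITED * visited)
    return total > 0


def boolean_expression(bought, edu, first, visited):
    """The four-clause expression given in the solution."""
    return ((first and bought and edu and not visited)
            or (not first and bought)
            or (not first and edu)
            or (not first and not visited))


def disagreements():
    """Inputs on which the weights and the expression give different answers."""
    return [v for v in product([True, False], repeat=4)
            if network_says_true(*v) != boolean_expression(*v)]


if __name__ == "__main__":
    print(disagreements())
```

An empty list means the expression and the thresholded linear function agree on every input.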

3. Are all the training examples classified correctly (if not, which aren't)?

No. e3 is misclassified. The neural network classifies it as false.
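This can be confirmed by running every training example through the learned weights. The sketch below assumes the same zero-threshold decision rule as the linear expression above; `misclassified` is a hypothetical helper name.

```python
# Parameter values from the no-hidden-layer network.
W0, W_BOUGHT, W_EDU, W_FIRST, W_VISITED = 1.58, 3.96, 3.52, -7.42, -3.40

T, F = True, False
# name: (bought, edu, first, visited, more_info), copied from the data table.
TRAINING = {
    "e1": (F, T, F, F, T), "e2": (T, F, T, F, F), "e3": (F, F, T, T, T),
    "e4": (F, F, T, F, F), "e5": (F, F, F, T, F), "e6": (T, F, F, T, T),
    "e7": (T, F, F, F, T), "e8": (F, T, T, T, F), "e9": (F, T, T, F, F),
    "e10": (T, T, T, F, T), "e11": (T, T, F, T, T), "e12": (F, F, F, F, T),
}


def predict(bought, edu, first, visited):
    """Assumed decision rule: true when the linear expression is positive."""
    total = (W0 + W_BOUGHT * bought + W_EDU * edu
             + W_FIRST * first + W_VISITED * visited)
    return total > 0


def misclassified():
    """Names of training examples whose prediction differs from more_info."""
    return [name for name, (*inputs, target) in TRAINING.items()
            if predict(*inputs) != target]


if __name__ == "__main__":
    print(misclassified())
```

For e3 the sum is 1.58 - 7.42 - 3.40 = -9.24, so the prediction is false although more_info is true.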

4. Give two examples, not in the training set, and specify what the predicted value is.

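The following examples are not in the training set; the more_info column gives the value the network predicts:

| bought | edu | first | visited | more_info |
|---|---|---|---|---|
| true | true | true | true | false |
| true | true | false | false | true |
| true | false | true | true | false |
| false | true | false | true | true |
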
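With four Boolean inputs there are 16 possible examples, and 12 of them appear in the training set, so the unseen examples can be enumerated directly. The sketch below does this and applies the learned weights, again assuming a zero threshold on the linear expression; `unseen_predictions` is an illustrative name.

```python
from itertools import product

# Parameter values from the no-hidden-layer network.
W0, W_BOUGHT, W_EDU, W_FIRST, W_VISITED = 1.58, 3.96, 3.52, -7.42, -3.40

T, F = True, False
# Inputs (bought, edu, first, visited) of training examples e1..e12.
TRAINING_INPUTS = {
    (F, T, F, F), (T, F, T, F), (F, F, T, T), (F, F, T, F),
    (F, F, F, T), (T, F, F, T), (T, F, F, F), (F, T, T, T),
    (F, T, T, F), (T, T, T, F), (T, T, F, T), (F, F, F, F),
}


def predict(bought, edu, first, visited):
    """Assumed decision rule: true when the linear expression is positive."""
    total = (W0 + W_BOUGHT * bought + W_EDU * edu
             + W_FIRST * first + W_VISITED * visited)
    return total > 0


def unseen_predictions():
    """(inputs, prediction) for every input combination not in the training set."""
    return [(v, predict(*v)) for v in product([True, False], repeat=4)
            if v not in TRAINING_INPUTS]


if __name__ == "__main__":
    for inputs, prediction in unseen_predictions():
        print(inputs, prediction)
```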
2. Consider neural network learning with one hidden layer containing two variables.
1. After the network has converged, what are the parameter values?

run the applet....

2. What is the Boolean function that the network represents?

After 200 iterations with a learning rate of 0.5, we get the following table:

| bought | edu | first | visited | more_info |
|---|---|---|---|---|
| true | true | true | true | false |
| true | true | true | false | true |
| true | true | false | true | true |
| true | true | false | false | true |
| true | false | true | true | false |
| true | false | true | false | false |
| true | false | false | true | true |
| true | false | false | false | true |
| false | true | true | true | false |
| false | true | true | false | false |
| false | true | false | true | true |
| false | true | false | false | true |
| false | false | true | true | false |
| false | false | true | false | false |
| false | false | false | true | false |
| false | false | false | false | true |

This represents the same Boolean function as the network with no hidden layers in part 1.
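The claim can be checked row by row against the Boolean expression found for the network with no hidden layers. In the sketch below, `TABLE_OUTPUTS` lists the more_info column of the 16-row table in its original row order (all-true first, all-false last); the helper names are illustrative.

```python
from itertools import product

T, F = True, False
# more_info column of the hidden-layer table, top row to bottom row.
TABLE_OUTPUTS = [F, T, T, T, F, F, T, T, F, F, T, T, F, F, F, T]


def boolean_expression(bought, edu, first, visited):
    """Boolean function represented by the network with no hidden layers."""
    return ((first and bought and edu and not visited)
            or (not first and bought)
            or (not first and edu)
            or (not first and not visited))


def mismatches():
    """Rows of the table that differ from the Boolean expression."""
    # product([True, False], repeat=4) yields rows in the table's order.
    rows = product([True, False], repeat=4)
    return [row for row, output in zip(rows, TABLE_OUTPUTS)
            if boolean_expression(*row) != output]


if __name__ == "__main__":
    print(mismatches())
```

An empty result means the two networks compute the same function on all 16 inputs.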

3. Are all the training examples classified correctly (if not, which aren't)?

Again e3 is misclassified.

4. Give two examples, not in the training set, and specify what the predicted value is.
3. For the network with a hidden layer, what is a local minimum for the learning rate (to within one decimal place)? The value to minimize is the number of steps before the error gets below 1.0. Hint: there is a local minimum in the range [0.3, 7.0].

There is a local minimum at 1.7 or 1.8 (with 42 iterations), another at 2.7 (with 33 iterations) and another at 3.0 (with 34 iterations).
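The search for such minima can be sketched as a sweep over learning rates. The code below is not the course applet: it assumes a network with four inputs, two sigmoid hidden units and one sigmoid output, batch gradient descent on the sum-of-squares error, and small random initial weights from a fixed seed. All of these are assumptions, so the step counts it produces will not necessarily match the figures above.

```python
import math
import random

# Training data with true = 1 and false = 0:
# (bought, edu, first, visited) -> more_info
DATA = [
    ((0, 1, 0, 0), 1), ((1, 0, 1, 0), 0), ((0, 0, 1, 1), 1), ((0, 0, 1, 0), 0),
    ((0, 0, 0, 1), 0), ((1, 0, 0, 1), 1), ((1, 0, 0, 0), 1), ((0, 1, 1, 1), 0),
    ((0, 1, 1, 0), 0), ((1, 1, 1, 0), 1), ((1, 1, 0, 1), 1), ((0, 0, 0, 0), 1),
]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def steps_until_error_below(rate, threshold=1.0, max_steps=500, seed=0):
    """Count weight updates until the sum-of-squares error is below threshold.

    Returns None if that does not happen within max_steps updates.
    """
    rng = random.Random(seed)
    # Each hidden unit: [bias, w_bought, w_edu, w_first, w_visited].
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(5)] for _ in range(2)]
    out = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # [bias, w_h0, w_h1]
    for step in range(max_steps + 1):
        error = 0.0
        grad_hidden = [[0.0] * 5 for _ in range(2)]
        grad_out = [0.0] * 3
        for x, target in DATA:
            xs = (1,) + x  # prepend a constant 1 for the bias weight
            h = [sigmoid(sum(w * v for w, v in zip(unit, xs))) for unit in hidden]
            hs = [1.0] + h
            y = sigmoid(sum(w * v for w, v in zip(out, hs)))
            error += (target - y) ** 2
            delta_out = (target - y) * y * (1 - y)
            for k in range(3):
                grad_out[k] += delta_out * hs[k]
            for j in range(2):
                delta_h = delta_out * out[j + 1] * h[j] * (1 - h[j])
                for k in range(5):
                    grad_hidden[j][k] += delta_h * xs[k]
        if error < threshold:
            return step
        for k in range(3):
            out[k] += rate * grad_out[k]
        for j in range(2):
            for k in range(5):
                hidden[j][k] += rate * grad_hidden[j][k]
    return None


def local_minima(results):
    """Interior rates whose step count is no larger than both neighbours'."""
    rates = sorted(results)
    cost = [math.inf if results[r] is None else results[r] for r in rates]
    return [rates[i] for i in range(1, len(rates) - 1)
            if cost[i] < math.inf and cost[i - 1] >= cost[i] <= cost[i + 1]]


if __name__ == "__main__":
    rates = [round(0.3 + 0.1 * i, 1) for i in range(68)]  # 0.3, 0.4, ..., 7.0
    results = {r: steps_until_error_below(r, max_steps=300) for r in rates}
    print(local_minima(results))
```

Because ties count as minima in `local_minima`, two adjacent rates with the same step count (such as 1.7 and 1.8 in the answer above) are both reported.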

David Poole