Neural Networks
Overview

The purpose of the Neural Network applet is to visually demonstrate the feedforward backpropagation algorithm. There is visual feedback for weight adjustments and error analysis. The Neural Network applet supports the graphical modification and creation of neural networks. It allows for separate training and test sets: the network is trained on the training set, while the test set serves as a "control". It also has a Construction Wizard that allows the applet to load plain comma-delimited text files as data and construct an appropriate neural network for them.

Menu Help

The File Menu

The File Menu has options to create graphs, load files, and save files, as well as to quit the program. Notes on loading files: it is possible to load either 'Graph and Data' or just 'Data'. Loading 'Graph and Data' will load graph information from a standard neural network file, along with its corresponding data. Loading 'Data' opens the Construction Wizard to generate a neural network automatically. Loading 'Data' can load data from both plain comma-delimited text files and standard neural network files (discarding the graph information), while loading 'Graph and Data' can only open the standard neural network files.
The Edit Menu

The Edit Menu allows the user to view a text representation of the neural network.
The View Menu

The View Menu allows the user to modify the appearance of the applet.
The Neural Options Menu

The Neural Options Menu allows the user to modify the parameters of the backpropagation algorithm.
The Construction Wizard

The Neural Network Construction Wizard is designed to automate the creation of neural networks from raw, comma-delimited data. The only requirement is that the text file start with a line defining the categories of the data, in the same order as the data itself. This line must have the form:

T: [category1], [category2], ..., [categoryN];

For example, these are the first two lines of a data file for the Wizard:

T: price, maint-cost, doors, persons, trunk-size, safety, acceptable;
vhigh, vhigh, 2, 2, small, low, unacc

(A sketch of a complete data file is shown below, following the Create Mode section.) The Wizard can also load the normal applet data files, but it will ignore the graph information and load only the examples. Once a file is loaded, the Wizard dialog pops up and queries the user for information on the neural network to be built. The user should input the number of hidden layers needed, and the number of nodes for each hidden layer; hidden layers can be selected using the pull-down choice menu, and the number of nodes defaults to 2. The user must also choose which categories are outputs: depress the radio button to the left of a category name to make it an output. Input categories become input nodes, and output categories become output nodes. It may also be necessary to declare some non-numerical categories as "ordered" by depressing the corresponding checkbox beside the category name. This means that the category can be represented as a continuum of numbers, and the Wizard will prompt for a value mapping for each element of the category. For example, the category "University" with members "SFU, UBC, UVic" cannot be represented this way, but the category "Rating" with members "Low, Medium, High" can (one can map them to the numbers 0, .5, and 1). Numerical categories are already ordered, and hence are not affected if they are declared as ordered. Once all mappings have been declared, the Wizard creates the specified neural network and distributes the data evenly between the training and test sets.

Training and Test Sets

The neural network applet uses two sets of data for the network: the training set and the test set. The training set is the set of examples used to train the neural network. The test set is a "control" that allows the user to observe how well the network generalizes from the training set to other data. The applet graphs both training and test set errors in the Plot Window, and the user can get more detailed statistics for the test set from the Summary Statistics window.

Create Mode

Create Mode allows the user to create a neural network manually. Click a button on the toolbar to enable its function; only one button can be depressed at a time. For best results, the neural network should be totally connected: each node should be connected to all the nodes of the layer below it.
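To make the Construction Wizard's input format concrete, here is a sketch of what a complete data file might look like. The rows below are illustrative values extrapolated from the car-evaluation example above, not data shipped with the applet:

T: price, maint-cost, doors, persons, trunk-size, safety, acceptable;
vhigh, vhigh, 2, 2, small, low, unacc
high, med, 4, 4, med, high, acc
low, low, 4, more, big, high, acc
med, high, 2, 2, small, med, unacc

Here "acceptable" would typically be selected as the output category, and an ordered category such as "safety" (with members low, med, high) could be mapped to the numbers 0, .5, and 1, as described above.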
Solve Mode

Solve Mode allows the user to train the neural network that is in memory. It also allows the user to observe the training, and to add or delete examples from the training or test sets. Finally, the user can access statistics on the network through Solve Mode.
The Plot Window

The plot window shows a graph of the error of the neural network. As the backpropagation algorithm runs, the error should decrease. The plot window also has Initialize Parameter, Step, Step X, Step to Target Error, and Stop buttons, which work in the same way as their Solve Mode counterparts. In addition, there are buttons to close, redraw, clear, and print the plot window, and a checkbox to switch between logarithmic and standard display modes. The error values of the training and test sets are displayed on the right side of the plot. The blue plot is the training set error, while the orange plot is the test set error.

The Edit Examples Window

The Edit Examples window displays all the examples in both the training and test sets, and allows the user to add and remove examples and to switch examples between the training and test sets. To select an example, click on it. To select multiple examples, keep clicking on the desired examples until all of them are selected. One can also pull down the Select... choice box and choose between Select All, Select None, and Select Percentage. To transfer an example to another set, click the appropriate arrow button in the dialog.

The Summary Statistics Window

The Summary Statistics Window displays statistics for the test set, and as such can only be pulled up once there is at least one example in the test set. The window displays all the test examples in a table, along with the predicted value for each. It classifies the examples as correct or incorrect according to a classification range determined by the user; this "threshold" defaults to .5. The window also gives the percentage correct and incorrect, and allows the user to select which output's predicted value is displayed in the table.

There is a color key at the bottom of the applet window. Red indicates a weight less than 0, and green a weight greater than 0. A zero weight results in a clear edge, and the color becomes darker as the weight gets larger in magnitude. There is a message at the top of the canvas that cues the user as to what the applet is doing as it runs.

Algorithms and Theory

This is a very basic introduction to neural networks and the feedforward backpropagation algorithm. A neural network (in the context of this applet) is a set of nodes connected to each other by edges. A node can only send information (usually numeric data) through an edge. A node sums up all of its received "signals", feeds the sum into an activation function, and then sends the result to all of its children; each outgoing signal is multiplied by the weight associated with the edge it travels along before being received by the target node. An activation function is simply a function used to introduce nonlinearity into the network. Four functions are supported by the applet: linear, sigmoid, exponential, and hyperbolic tangent. Note that all of these functions are differentiable; this is because the backpropagation algorithm requires that activation functions be differentiable.
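To illustrate this node computation, here is a minimal Python sketch of how a single sigmoid node would combine its incoming signals. It is a sketch of the general idea, not code from the applet, and all the numbers in it are made up:

import math

def sigmoid(x):
    # Sigmoid activation: squashes any real input into the range [0, 1].
    return 1.0 / (1.0 + math.exp(-x))

def node_output(signals, weights):
    # Each incoming signal is multiplied by the weight of the edge it
    # arrives on; the node sums the weighted signals and feeds the
    # total into its activation function.
    total = sum(s * w for s, w in zip(signals, weights))
    return sigmoid(total)

# A node with two parents sending signals 1.0 and 0.0 over edges
# weighted 0.4 and 0.3.
print(node_output([1.0, 0.0], [0.4, 0.3]))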
Nodes without parents are input nodes; the user must provide them with input. Nodes without children are output nodes. All other nodes are hidden nodes. This applet deals with a feedforward neural network: it requires that the graph be acyclic, and that each "layer" (a set of nodes that have the same depth) be totally connected to the layer below it. This means that each node should have an edge going to every node of the layer below it, and nowhere else. The network is trained by an algorithm called the backpropagation algorithm, which is essentially a minimization technique that minimizes the error of the neural network.

The Feedforward Backpropagation Algorithm

The Neural Network applet demonstrates a widely used algorithm called the backpropagation algorithm. To train a neural network, a set of training examples is fed into the network. Each example produces an output that may be different from the expected result. This error (usually a sum-of-squares error) is "backpropagated" through the edges and hidden nodes; the magnitude of the error determines how much the weights are adjusted, and in what direction. An epoch, or iteration, is one pass of the whole training set through the network in this way. (A minimal sketch of a single weight update appears after the caveats below.)

Caveats and Warnings

If the network doesn't seem to learn, the learning rate may be too high; adjust it to something smaller. Be VERY careful with activation functions. Remember that the sigmoid function only has range [0,1] and the hyperbolic tangent has range [-1,1]; set your output node activation functions to linear if the outputs are not within these ranges. The backpropagation algorithm as implemented here is very vulnerable to numerical errors, because the total sum-of-squares error can be large if the training set is large. The algorithm requires that this error be fed into the derivative of the activation function, and as the most common activation functions involve exponentials, this leads to numerical errors. The linear activation function does not suffer from this problem, but since it is not a "squashing" function (the sigmoid and hyperbolic tangent have finite ranges), it also has numerical problems with large training sets, as the sum-of-squares errors add up to large numbers. To work around this, set the learning rate to something small (less than .005); this will usually bring the computation to a manageable level.
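For concreteness, here is a minimal Python sketch of one backpropagation weight update for a single sigmoid output node with no hidden layer. It illustrates the general technique and the role of the learning rate discussed above; it is not the applet's implementation:

import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(weights, inputs, target, learning_rate=0.005):
    # Forward pass: weighted sum of the inputs through a sigmoid node.
    total = sum(w * x for w, x in zip(weights, inputs))
    output = sigmoid(total)

    # Backpropagation: the error is multiplied by the derivative of the
    # activation function; for the sigmoid, that is output * (1 - output).
    delta = (target - output) * output * (1.0 - output)

    # Each weight moves in the direction that reduces the error, scaled
    # by the learning rate and by the signal that crossed its edge.
    return [w + learning_rate * delta * x for w, x in zip(weights, inputs)]

# One epoch feeds every training example through the network once.
weights = [0.1, 0.2]
for inputs, target in [([0.0, 1.0], 1.0), ([1.0, 1.0], 1.0)]:
    weights = train_step(weights, inputs, target)
print(weights)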
DTD Definition

<!DOCTYPE MLDBIF [
<!ELEMENT MLDBIF ( DB ) >
<!ELEMENT DB ( NETWORK, EXAMPLES ) >
<!ELEMENT EXAMPLES ( PARAMETER+, EXAMPLE+ ) >
<!ELEMENT PARAMETER ( #PCDATA ) >
<!ATTLIST PARAMETER type NMTOKEN #REQUIRED >
<!ELEMENT EXAMPLE ( VALUE+ ) >
<!ATTLIST EXAMPLE type NMTOKEN #REQUIRED >
<!ELEMENT VALUE ( #PCDATA ) >
<!ATTLIST VALUE parameter CDATA #REQUIRED >
<!ELEMENT NETWORK ( NODE+, EDGE+ ) >
<!ELEMENT NODE ( NAME, WEIGHT, XPOS, YPOS, INDEX, FUNCTION ) >
<!ELEMENT NAME ( #PCDATA ) >
<!ELEMENT WEIGHT ( #PCDATA ) >
<!ELEMENT FUNCTION ( #PCDATA ) >
<!ELEMENT INDEX ( #PCDATA ) >
<!ELEMENT XPOS ( #PCDATA ) >
<!ELEMENT YPOS ( #PCDATA ) >
<!ELEMENT EDGE ( STARTINDEX, ENDINDEX, WEIGHT ) >
<!ELEMENT STARTINDEX ( #PCDATA ) >
<!ELEMENT ENDINDEX ( #PCDATA ) >
]>

This is an example of the Neural Network XML (the Boolean example from the applet):

<?xml version="1.0" ?>
<MLDBIF>
  <DB>
    <!-- Neural Network Definition -->
    <NETWORK>
      <!-- Node Definitions -->
      <NODE>
        <NAME>Input 1</NAME>
        <WEIGHT>0.0</WEIGHT>
        <XPOS>-121.04071</XPOS>
        <YPOS>-91.9425</YPOS>
        <INDEX>0</INDEX>
        <FUNCTION>sigmoid</FUNCTION>
      </NODE>
      <NODE>
        <NAME>Input 2</NAME>
        <WEIGHT>0.0</WEIGHT>
        <XPOS>118.37389</XPOS>
        <YPOS>-90.12185</YPOS>
        <INDEX>1</INDEX>
        <FUNCTION>sigmoid</FUNCTION>
      </NODE>
      <NODE>
        <NAME>Output (and)</NAME>
        <WEIGHT>0.1</WEIGHT>
        <XPOS>-184.50099</XPOS>
        <YPOS>91.16629</YPOS>
        <INDEX>2</INDEX>
        <FUNCTION>sigmoid</FUNCTION>
      </NODE>
      <NODE>
        <NAME>Output (or)</NAME>
        <WEIGHT>0.2</WEIGHT>
        <XPOS>2.7630477</XPOS>
        <YPOS>91.9425</YPOS>
        <INDEX>3</INDEX>
        <FUNCTION>sigmoid</FUNCTION>
      </NODE>
      <NODE>
        <NAME>Output (nor)</NAME>
        <WEIGHT>0.3</WEIGHT>
        <XPOS>185.50098</XPOS>
        <YPOS>91.16629</YPOS>
        <INDEX>4</INDEX>
        <FUNCTION>sigmoid</FUNCTION>
      </NODE>
      <!-- Edge Definitions -->
      <EDGE>
        <STARTINDEX>0</STARTINDEX>
        <ENDINDEX>2</ENDINDEX>
        <WEIGHT>0.4</WEIGHT>
      </EDGE>
      <EDGE>
        <STARTINDEX>0</STARTINDEX>
        <ENDINDEX>3</ENDINDEX>
        <WEIGHT>0.1</WEIGHT>
      </EDGE>
      <EDGE>
        <STARTINDEX>0</STARTINDEX>
        <ENDINDEX>4</ENDINDEX>
        <WEIGHT>0.2</WEIGHT>
      </EDGE>
      <EDGE>
        <STARTINDEX>1</STARTINDEX>
        <ENDINDEX>2</ENDINDEX>
        <WEIGHT>0.3</WEIGHT>
      </EDGE>
      <EDGE>
        <STARTINDEX>1</STARTINDEX>
        <ENDINDEX>3</ENDINDEX>
        <WEIGHT>0.4</WEIGHT>
      </EDGE>
      <EDGE>
        <STARTINDEX>1</STARTINDEX>
        <ENDINDEX>4</ENDINDEX>
        <WEIGHT>0.5</WEIGHT>
      </EDGE>
    </NETWORK>
    <!-- Example Database -->
    <EXAMPLES>
      <!-- Parameter Definition -->
      <PARAMETER type="input">Input 1</PARAMETER>
      <PARAMETER type="input">Input 2</PARAMETER>
      <PARAMETER type="output">Output (and)</PARAMETER>
      <PARAMETER type="output">Output (or)</PARAMETER>
      <PARAMETER type="output">Output (nor)</PARAMETER>
      <!-- Examples -->
      <EXAMPLE type="training">
        <VALUE parameter="Input 1">0.0</VALUE>
        <VALUE parameter="Input 2">0.0</VALUE>
        <VALUE parameter="Output (and)">0.0</VALUE>
        <VALUE parameter="Output (or)">0.0</VALUE>
        <VALUE parameter="Output (nor)">1.0</VALUE>
      </EXAMPLE>
      <EXAMPLE type="training">
        <VALUE parameter="Input 1">0.0</VALUE>
        <VALUE parameter="Input 2">1.0</VALUE>
        <VALUE parameter="Output (and)">0.0</VALUE>
        <VALUE parameter="Output (or)">1.0</VALUE>
        <VALUE parameter="Output (nor)">0.0</VALUE>
      </EXAMPLE>
      <EXAMPLE type="training">
        <VALUE parameter="Input 1">1.0</VALUE>
        <VALUE parameter="Input 2">0.0</VALUE>
        <VALUE parameter="Output (and)">0.0</VALUE>
        <VALUE parameter="Output (or)">1.0</VALUE>
        <VALUE parameter="Output (nor)">0.0</VALUE>
      </EXAMPLE>
      <EXAMPLE type="training">
        <VALUE parameter="Input 1">1.0</VALUE>
        <VALUE parameter="Input 2">1.0</VALUE>
        <VALUE parameter="Output (and)">1.0</VALUE>
        <VALUE parameter="Output (or)">1.0</VALUE>
        <VALUE parameter="Output (nor)">0.0</VALUE>
      </EXAMPLE>
    </EXAMPLES>
  </DB>
</MLDBIF>
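As a rough illustration of how this format can be consumed outside the applet, here is a minimal Python sketch that reads the network portion of such a file with the standard library's xml.etree.ElementTree. The filename boolean.xml is an assumption for this sketch:

import xml.etree.ElementTree as ET

# Parse a network file in the MLDBIF format described above.
# "boolean.xml" is a hypothetical filename for the example above.
tree = ET.parse("boolean.xml")
network = tree.getroot().find("DB/NETWORK")

# Collect node names keyed by their INDEX value.
nodes = {node.findtext("INDEX"): node.findtext("NAME")
         for node in network.findall("NODE")}

# Print each edge as "start -> end (weight)".
for edge in network.findall("EDGE"):
    start = nodes[edge.findtext("STARTINDEX")]
    end = nodes[edge.findtext("ENDINDEX")]
    print(f"{start} -> {end} ({edge.findtext('WEIGHT')})")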