Neural Networks: What Are They?

Artificial Neural Networks (ANN), also known simply as Neural Networks (NN), take their inspiration from the human brain. Billions of neurons, each computing an extremely simple function, are interconnected in such a way as to allow for very complex computations. Siegelmann and Sontag (1992) have proven that NN are Turing equivalent.

The basic element of a NN is the neuron. Each neuron has a fixed number of input connections. These input connections, or simply inputs, come either from another neuron in the network or from the outside environment. A neuron receives an input value through each of these inputs, and each value is multiplied by a weight associated with that particular connection. The main function of a neuron is to take the weighted sum of its inputs and, by applying its particular transfer function to that sum, produce an output. This output signal then travels either to another neuron (or neurons) or to the outside world.
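
To make this concrete, here is a minimal Python sketch of a single neuron. The function names and the particular threshold transfer function are illustrative assumptions, not part of any specific library.

    # Minimal sketch of a single neuron (names and parameters are illustrative).
    def neuron_output(inputs, weights, transfer):
        # Weighted sum of the inputs: each input Xi is multiplied by its weight Wi.
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        # The transfer function maps the weighted sum to the neuron's output signal.
        return transfer(weighted_sum)

    # Example: a threshold transfer function with a cut-off of 1 (as in Figure 2(A)).
    step = lambda s: 1.0 if s >= 1.0 else 0.0
    print(neuron_output([1, 0, 1], [0.6, 0.9, 0.5], step))  # 0.6 + 0.5 = 1.1, so prints 1.0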

Figure 1: Sketch of a typical Neuron
Figure 1 shows a drawing of a typical neuron. It has N inputs (labeled X1 through XN). Each input has a corresponding weight (labeled W1 through WN). The weighted sum of these inputs is Σ WiXi, taken over i = 1 to N. The output of the neuron is the result of evaluating the transfer function with this weighted sum as input. Some typical transfer functions are shown in Figure 2.

Figure 2: Examples of some transfer functions. Figure 2(A) shows a threshold transfer function with a cut-off of one and a maximum activation value of one. Figure 2(B) depicts a similar function, this time with a cut-off value (also known as bias) of 2. Figure 2(C) shows a ramp transfer function with a maximum activation value of 0.75, while Figure 2(D) is an example of a sigmoidal transfer function.
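
For concreteness, the shapes in Figure 2 could be sketched in Python as follows; the exact cut-offs, ceilings, and slopes are assumptions taken from the caption.

    import math

    def threshold(s, cutoff=1.0, max_val=1.0):  # Figure 2(A); cutoff=2 gives Figure 2(B)
        return max_val if s >= cutoff else 0.0

    def ramp(s, max_val=0.75):                  # Figure 2(C): linear, capped at 0.75
        return max(0.0, min(s, max_val))

    def sigmoid(s):                             # Figure 2(D): smooth, S-shaped
        return 1.0 / (1.0 + math.exp(-s))
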
Minsky and Papert (1969) proved that a single neuron, by itself, can do very little. The power of the NN computational model arises when neurons are connected to each other to produce networks. Figure 3 shows an example of such a network. This is a layered network. In a layered network, nodes are divided into disjoint subsets called layers. There are no connections between nodes that belong to the same layer, and a node in layer Y can only have input connections from nodes in layer Y-1 and output connections to nodes in layer Y+1. Layer 1 receives its inputs from the environment, and the last layer sends its outputs to the environment. Layers that have no connection to the environment (that is, all layers other than the input and output layers) are called hidden layers, and nodes in hidden layers are known as hidden nodes. The layered network shown in Figure 3 is also an example of a fully connected NN: each node in layer Y is connected to every node in layers Y-1 and Y+1. Layered NN can, in general, have any number of layers, and each layer can have any number of nodes. A network that does not meet the above description is called an unlayered network.
Figure 3: Layered NN
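
A forward pass through such a fully connected layered network can be sketched in a few lines of Python; the nested-list weight layout and the choice of transfer function are assumptions made for illustration.

    # Sketch of a forward pass through a fully connected layered NN.
    # weights[k][j] holds the weights from every node in layer k to node j in layer k+1.
    def forward(inputs, weights, transfer):
        activations = inputs
        for layer in weights:
            activations = [transfer(sum(w * a for w, a in zip(node_weights, activations)))
                           for node_weights in layer]
        return activations  # outputs of the last layer go to the environment
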
Most of the attention given to NN stems from their ability to adjust their connection weights so as to learn a desired input/output mapping. By changing the weights of the connections in a NN, the same input can come to be mapped to a different output. For example, assume that we would like a network that can compute the exclusive-or (XOR) operation.

Figure 4: (A) Untrained NN; (B) Trained NN
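
The text does not give the weights in Figure 4(B), so the following is one hypothetical set of weights for a 2-2-1 threshold network that realizes the XOR mapping (one hidden node detects OR, the other AND).

    step = lambda s: 1 if s >= 0 else 0  # threshold transfer with the bias folded into the sum

    def xor_net(x1, x2):
        # Hypothetical weights; Figure 4's actual values are not given in the text.
        h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # fires when x1 OR x2
        h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # fires when x1 AND x2
        return step(1.0 * h1 - 1.0 * h2 - 0.5)  # OR and not AND = XOR

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_net(a, b))  # reproduces [(00,0), (01,1), (10,1), (11,0)]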

The network shown in Figure 4 could be initialized with all of its weights set to 1. In that case, it would not be able to compute the XOR function correctly. By applying a learning algorithm, the network can identify the weight changes that will bring it closer to the desired output. Iterating the learning algorithm gradually reduces the difference between the actual and desired outputs. For example, the network in Figure 4(A) would produce an input/output mapping of [(00,0), (01,1), (10,1), (11,1)]. Note that the correct mapping for the XOR operation is [(00,0), (01,1), (10,1), (11,0)]. After the weights are modified to those in Figure 4(B), the network produces the correct outputs.

Although there are several different learning algorithms that can be used to train a NN, almost all of them consist of iteratively changing the connection weights in such a manner as to reduce the magnitude of the error at each iteration. Training data are presented at the input connections, and the NN then computes an output based on this input and the current connection weights. The output of the NN is then compared with the desired output, and the weights are changed according to the particular learning algorithm being used. Backpropagation, the most commonly used training algorithm, was presented by Rumelhart and McClelland (1986). A brief description of several other learning algorithms is presented later in this paper.
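
As a concrete, if simplified, illustration of this present-compare-adjust cycle, here is a minimal backpropagation sketch for the XOR task. The 2-2-1 architecture, learning rate, epoch count, and random seed are all assumptions, and, like real backpropagation runs, it can stall in a local minimum for an unlucky initialization.

    import math, random

    def sigmoid(s):
        return 1.0 / (1.0 + math.exp(-s))

    # XOR training data: (inputs, desired output).
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

    random.seed(0)
    # 2-2-1 network: w_h[j] holds hidden node j's weights (2 inputs + bias),
    # w_o holds the output node's weights (2 hidden nodes + bias).
    w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
    w_o = [random.uniform(-1, 1) for _ in range(3)]
    lr = 0.5  # learning rate (an assumption)

    for epoch in range(20000):
        for (x1, x2), target in data:
            # Forward pass: compute the network's output for this input.
            h = [sigmoid(w[0] * x1 + w[1] * x2 + w[2]) for w in w_h]
            y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
            # Backward pass: compare with the desired output and adjust the weights
            # in the direction that reduces the error.
            d_y = (y - target) * y * (1 - y)
            d_h = [d_y * w_o[j] * h[j] * (1 - h[j]) for j in range(2)]
            for j in range(2):
                w_o[j] -= lr * d_y * h[j]
            w_o[2] -= lr * d_y
            for j in range(2):
                w_h[j][0] -= lr * d_h[j] * x1
                w_h[j][1] -= lr * d_h[j] * x2
                w_h[j][2] -= lr * d_h[j]

    # After training, the outputs should approximate the XOR mapping.
    for (x1, x2), target in data:
        h = [sigmoid(w[0] * x1 + w[1] * x2 + w[2]) for w in w_h]
        y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
        print((x1, x2), round(y, 2), "desired:", target)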
