ThadeusB

Categories: python, ai, robots

Artificial Neural Networks - Multi-Layer Perceptron

This will be my first real tutorial, so bear with me, as I am sure it will undergo many revisions. Please feel free to ask any questions in the comments. Note: Whether I actually finish this tutorial and provide source code depends on the response I get to the article. If you liked it and would like me to write more, then please let me know!

What will be covered:

  • Overview - Of Neural Networks
  • Multi-Layer Perceptron - A kind of neural network
  • Feed Forward Algorithm - Calculating an answer
  • Back Propagation Algorithm - Supervised Training
  • Application - Neural Network written in C#.NET

Introduction:

Let's start from the beginning. Artificial intelligence is simply defined as making a computer seem more human. The field of artificial intelligence is vast, and every month there are innovations that get computers one step closer to their makers. Video game opponents, optical character recognition, facial recognition, voice synthesis, data mining, robotic surgery, and much more are accomplished using artificial intelligence.

One of the most popular ways to emulate human decision making is by simulating the human brain. What better way to make a computer more human than to design it like a human?

Our brain is composed of a network of cells called neurons, and these neurons are linked to each other by dendrites. Long story short, a neuron receives an input from all the dendrites it is connected to and does some math; if the result is above a certain voltage level, the neuron fires and sends its value out to each neuron that is connected to it. By some act of God, that is how we have thoughts.

When we simulate this process in a computer, we call it an artificial neural network (ANN for short). There are a few different kinds of ANN. These are just some of them:

  • Feed-Forward
    • Perceptron
    • Multi-Layer Perceptron
  • Feedback
    • Hopfield Net
    • Self-Organizing Maps

The main difference between them is how the neurons are arranged and how each neuron interacts with every other neuron. Since this article is about ML-perceptrons, that is all I will be explaining.

There are two ways of teaching an ANN to do what you expect of it, that is, to make a reasonable decision:

  • Supervised Learning
  • Un-supervised Learning

With supervised training you give the network some input data, such as an image of a letter. It spits out an answer as to what it thinks it is (which will be wrong until it is trained). You tell it what the correct answer was, and it fixes itself so the next time it sees that same image, it will be closer to the correct answer.

In un-supervised learning the network will make its own assumptions about the input data, and over time possibly provide answers to a problem that had not even been thought of. A Self-Organizing Map is an example of an un-supervised network that will organize like data into groups.

Basics of a Multi-Layer Perceptron:

An ML-perceptron is an ANN that is composed of layers of neurons in which each neuron in a layer is connected to every neuron in the previous layer.

*Note: Dendrites are also referred to as Weights*

Figure: Example Multi-Layer Perceptron

In an ML-perceptron, there are at least three layers:

  • 1 Input
  • 1 - ∞ Hidden
  • 1 Output

In short:

The input layer receives data about something. Each input neuron represents a piece of data, such as a pixel of an image of a letter.

The hidden layers do some calculations.

The output layer provides an answer as a percentage. Each output neuron represents a different possibility.

In Long:

The input layer is the layer of neurons that will be receiving the data about something. For character recognition there would need to be one neuron for each pixel in the image of the character. Obviously there can be only one input layer. Input is usually normalized to a double between 0.0 and 1.0.
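As a minimal sketch of that normalization, assuming 8-bit grayscale pixels in the range 0 to 255 (the article does not say what the raw pixel format is, and `normalize_pixels` is a hypothetical helper name):

```python
def normalize_pixels(pixels, max_value=255):
    """Scale raw pixel intensities into the 0.0 to 1.0 range."""
    # max_value=255 is an assumption (8-bit grayscale), not from the article.
    return [p / max_value for p in pixels]


# One value per input neuron.
inputs = normalize_pixels([0, 51, 255])
```

Each element of `inputs` would then be fed to one input neuron.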

The hidden layer is a layer used for calculations. This is what makes up the "brain" of an ML-perceptron. There can be any number of hidden layers, each with any number of neurons. One hidden layer is usually enough, though to get the best performance (learning speed vs. correctness) different configurations could be tested.

The output layer holds the results of what the network has calculated. For a character recognition network that could only see the letters of the alphabet there would need to be 26 output neurons (one for each letter). A neuron's output value is a normalized double between 0.0 and 1.0, where 0 is false and 1 is true. It could also be thought of as a percentage of correctness for what that neuron represents. In the OCR network, if the neuron representing the capital letter T had a value of .85, that would mean the network thinks there is an 85% chance the input is a T; the neuron representing the letter F might at the same time be .64, meaning the network thinks there is a 64% chance the input is an F.
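One simple way to turn those output values into a single answer is to pick the neuron with the highest value. The article does not prescribe this step, so treat the sketch below as an assumption; the `E` entry is made up for illustration.

```python
# Output neuron values keyed by the letter each neuron represents.
# T and F come from the example above; E is a made-up extra entry.
outputs = {"T": 0.85, "F": 0.64, "E": 0.10}

# The network's best guess is the letter whose neuron has the highest value.
best_guess = max(outputs, key=outputs.get)
```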

The Example:

The following example will go through a network that is trained to solve the XOR problem.

XOR Problem: One and only one input can be true, but one has to be true.

A XOR B = C
-----------
1 XOR 1 = 0
1 XOR 0 = 1
0 XOR 1 = 1
0 XOR 0 = 0
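The same truth table can be written down as training data. This is only a sketch; the name `xor_samples` and the tuple layout are assumptions made for illustration.

```python
# Each entry is ((A, B), C), matching the rows of the table above.
xor_samples = [
    ((1, 1), 0),
    ((1, 0), 1),
    ((0, 1), 1),
    ((0, 0), 0),
]

# Python's ^ operator on 0/1 integers computes XOR, so it can check the table.
table_is_consistent = all((a ^ b) == c for (a, b), c in xor_samples)
```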

The neural network that can solve such a problem is shown below

Figure: XOR Multi-Layer Perceptron

Since the XOR problem has only two binary inputs, the network will have two neurons in the input layer. Also, since the XOR problem's output is just one answer, there will only be one output neuron.

How A Multi-Layer Perceptron Calculates An Answer:

The calculations happen inside each individual neuron. A neuron needs three pieces of information:

  • Output values of every neuron in the previous layer
  • Value of every dendrite (weight) connecting the current neuron to every corresponding previous neuron
  • Activation Function

The value of every dendrite (weight) in an ML-perceptron starts out as a randomized double between 0.0 and 1.0.
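A minimal sketch of that initialization, assuming the weights are stored as one list per neuron. The helper name `random_weights` and the choice of two hidden neurons for the XOR network are assumptions, since the original diagram is not available.

```python
import random


def random_weights(n_inputs, n_neurons, rng=random):
    """Return one list of n_inputs random weights for each neuron in a layer."""
    # random() yields floats in the half-open range [0.0, 1.0).
    return [[rng.random() for _ in range(n_inputs)] for _ in range(n_neurons)]


# Two inputs feeding two hidden neurons (the hidden size is assumed).
hidden_weights = random_weights(n_inputs=2, n_neurons=2)
```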

The weights of the network are what make up the magic. These are the main determiners of what the final answer is. Later, when the network is to be trained or taught, these dendrites (weights) are what will be altered.

The input layer does not have any calculations performed on it, therefore the input to the input layer neurons is also their output.

Starting with the first hidden layer and using the input layer's outputs, for each neuron in the layer, sum all of the previous layer's output values multiplied by their corresponding weights. Pass the sum through an activation function. Continue to the next layer. Do this for the output layer as well. The values of the neurons in the output layer are the answer from the network. This can be represented mathematically as follows:

v = Sigmoid( Σ( pv * pw ) )

Where

  • v = output value of the neuron
  • Σ = summation
  • pv = a previous neuron's output value
  • pw = the dendrite or weight value connecting to the previous neuron
  • Sigmoid = activation function
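The formula above can be sketched in a few lines. The function names are hypothetical, the example weights are arbitrary, and the bias input is left out here to mirror the formula exactly as written.

```python
import math


def sigmoid(x):
    """Logistic activation: squashes any real number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(prev_values, weights):
    """v = Sigmoid( Σ( pv * pw ) ) for a single neuron."""
    total = sum(pv * pw for pv, pw in zip(prev_values, weights))
    return sigmoid(total)


def layer_output(prev_values, layer_weights):
    """Compute the output of every neuron in a layer, one weight list per neuron."""
    return [neuron_output(prev_values, ws) for ws in layer_weights]


# One neuron with inputs (1, 0) and arbitrary example weights (0.5, 0.9).
v = neuron_output([1.0, 0.0], [0.5, 0.9])
```

Calling `layer_output` once per layer, feeding each layer's result into the next, gives the whole feed-forward pass.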

Figure: Activation of a Neuron

There is a problem with this structure though. If all the inputs to the network are zero, every weighted sum is zero no matter what the weights are, so when the network is training it will not be able to learn what all-zero inputs mean. For this reason there must be a pseudo input added to each layer, excluding the output layer. This bias acts as a neuron with a value that is always 1 or -1; it has a dendrite (weight) attached to it and is calculated in with the summation.
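A small sketch of why the bias matters, assuming a bias value of 1 and arbitrary example weights. The function name `neuron_output_with_bias` is hypothetical.

```python
import math


def neuron_output_with_bias(prev_values, weights, bias_weight, bias_value=1.0):
    """Weighted sum plus the bias pseudo-input, passed through the sigmoid."""
    total = sum(pv * pw for pv, pw in zip(prev_values, weights))
    total += bias_value * bias_weight  # the bias has its own weight
    return 1.0 / (1.0 + math.exp(-total))


# Without a bias, all-zero inputs always give sigmoid(0) = 0.5,
# regardless of the weights.
without_bias = 1.0 / (1.0 + math.exp(-0.0))

# With a bias weight, the same all-zero inputs can produce a different output.
with_bias = neuron_output_with_bias([0.0, 0.0], [0.3, 0.7], bias_weight=-2.0)
```

Because the bias weight is trained like any other weight, the network can now learn what output to give for all-zero inputs.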

The Activation:

This is probably the most important calculation in a neural network: the activation function. Simply put, this function decides if the neuron should "fire" and send a signal to receiving neurons. There are a few types of activation functions:

  • Threshold
  • Piecewise
  • Sigmoid
    • Logistic
    • Hyperbolic Tangent
    • Algebraic Sigmoid

Note: A sigmoid function is also referred to as a squashing function, since it normalizes its input.
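For comparison, here is a sketch of the activation types listed above. The article only names them, so the exact formulas below are common textbook forms chosen as assumptions. Note that the hyperbolic tangent and algebraic sigmoid squash into -1..1 rather than 0..1.

```python
import math


def threshold(x):
    """Step function: fires (1.0) at or above zero, otherwise 0.0."""
    return 1.0 if x >= 0 else 0.0


def piecewise_linear(x):
    """Linear between -0.5 and 0.5, clamped to 0.0 and 1.0 outside that range."""
    return min(1.0, max(0.0, x + 0.5))


def logistic(x):
    """Logistic sigmoid: output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def hyperbolic_tangent(x):
    """tanh sigmoid: output between -1 and 1."""
    return math.tanh(x)


def algebraic_sigmoid(x):
    """x / sqrt(1 + x^2): output between -1 and 1."""
    return x / math.sqrt(1.0 + x * x)
```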

Typically an ML-perceptron uses a sigmoid function as its activation.

The sigmoid function looks as follows:

threshold = 1

sum = Σ( pv * pw )

sigmoid = 1 / ( 1 + e ^ ( -sum / threshold ) )

Where

  • e = Mathematical constant e
  • sum = weighted sum of the previous neurons' outputs
  • threshold = Activation value
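Written as code, the sigmoid with its threshold parameter might look like the sketch below. The parameter name `total` stands in for `sum`, which is a built-in function name in Python.

```python
import math


def sigmoid(total, threshold=1.0):
    """sigmoid = 1 / ( 1 + e ^ ( -sum / threshold ) )"""
    return 1.0 / (1.0 + math.exp(-total / threshold))


middle = sigmoid(0.0)   # a weighted sum of zero lands exactly in the middle
high = sigmoid(10.0)    # large positive sums approach 1
low = sigmoid(-10.0)    # large negative sums approach 0
```

A larger threshold flattens the curve, so the same weighted sum produces an output closer to 0.5.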

How A Multi-Layer Perceptron Learns:

With randomized weights, it will be luck if the network spits out a correct answer. The answer will seemingly be randomized as well.

These weights are what will be altered to make the network spit out an answer closer to what is expected. However, to know how to change these weights, the current output of the neural network has to be known so that we can calculate the error rate. The error rate is how far off the neural network is from the correct answer.

The training process happens on a per-input basis. If the network is trained on only one kind of data, such as the letter "A", it will not be trained to recognize the letter "B" when it comes across it. All the neural network will know is how much "B" is like "A".

Training may have to be done hundreds or even thousands of times on each kind of input until the error rate of the network is within a satisfactory range, usually less than 1%.

The process is as follows:

  • Compute resulting output
  • Compute ERROR for neurons in output layer
  • Compute ERROR for all other neurons
  • Compute CHANGE for all weights

To calculate the error for the output layer:

Error is equal to the desired output minus the actual output.

β = d - o

Where:

  • β = Error
  • d = Desired output
  • o = Actual output
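As a tiny worked example, with made-up values for the desired and actual outputs:

```python
def output_error(desired, actual):
    """β = d - o for a neuron in the output layer."""
    return desired - actual


# The network answered 0.75 when the correct answer was 1.0.
beta = output_error(desired=1.0, actual=0.75)
```

A positive error means the output was too low, a negative error means it was too high.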

To calculate the error for all other neurons:

Take the weights, outputs, and errors of each neuron in the right-side layer.

Sum the product of them all. Also include a ( 1 - output ) factor for each right-side neuron, which together with its output gives the slope of the sigmoid there.

βj = Σ ( wk * ok ( 1 - ok ) βk )

Where:

  • βj = Error for current neuron
  • wk = Weight connecting the current neuron to neuron k in the right-side layer
  • ok = Output for neuron in right-side layer
  • βk = Error for neuron in right-side layer
  • j = Current Neuron
  • k = Neuron in right-side layer
  • Σ = summation
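A sketch of that sum for one hidden neuron connected to two right-side neurons. The function name and the example numbers are made up for illustration.

```python
def hidden_error(weights_to_right, outputs_right, errors_right):
    """βj = Σ ( wk * ok ( 1 - ok ) βk ) over the right-side layer."""
    return sum(
        w * o * (1.0 - o) * b
        for w, o, b in zip(weights_to_right, outputs_right, errors_right)
    )


# Current neuron j feeds two right-side neurons k.
beta_j = hidden_error(
    weights_to_right=[0.5, -1.0],  # wk: weights from j to each k
    outputs_right=[0.5, 0.5],      # ok: outputs of each k
    errors_right=[0.5, 0.125],     # βk: errors of each k
)
```

The first term contributes 0.0625 and the second -0.03125, so a right-side neuron reached through a negative weight pulls the hidden neuron's error in the opposite direction.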

To calculate the changes for the weights:

For a weight of a neuron:

  • Take the output of the neuron in the left-side layer that the weight is connected to
  • Multiply it by the slope and the error of the current neuron
  • Multiply in a learning rate (how fast the network will learn); .20-.25 seem to be optimal

Δwj = r * oi * oj ( 1 - oj ) βj

wj= wj+ Δwj

Where:

  • Δwj = Change in weight for current neuron
  • wj = Weight connecting neuron i to the current neuron
  • r = Learning rate of the network ( how fast the network learns ) .20-.25 are good.
  • oi = Output for neuron in the left-side layer
  • oj = Output for the current neuron
  • βj = Error for current neuron
  • j = Current Neuron
  • i = Neuron in left-side layer
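A worked example of a single weight update, using a learning rate of 0.25 from the range suggested above. The other numbers and the function name are made up for illustration.

```python
def weight_change(rate, out_left, out_current, error_current):
    """Δwj = r * oi * oj ( 1 - oj ) βj"""
    return rate * out_left * out_current * (1.0 - out_current) * error_current


weight = 0.5
delta = weight_change(rate=0.25, out_left=1.0, out_current=0.5, error_current=0.5)
weight = weight + delta  # wj = wj + Δwj
```

If the left-side neuron's output is 0, the change is 0: a weight is only adjusted when the neuron behind it actually contributed to the answer.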

.. figure:: /dl/posts/artificial_neural_networks_multi_layer_perceptron/backprop.jpg
    :alt: Neural Network With Many Layers
    :figwidth: 25%
    :width: 95%
    :align: center

    Neural Network With Many Layers

Putting it all together:

wj = wj + Δwj

Δwj = r * oi * oj ( 1 - oj ) βj

βj = Σ ( wk * ok ( 1 - ok ) βk )

βz = d - o ( where z is a neuron in the output layer )
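The four formulas can be combined into a small training loop for the XOR network. This is a sketch under several assumptions rather than the article's own implementation: two hidden neurons, a bias input fixed at 1 on each layer, a learning rate of 0.25, an arbitrary 5000 passes over the data, and hypothetical function names. Whether it actually converges depends on the random starting weights, so more passes or different starting weights may be needed to reach the 1% target.

```python
import math
import random


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def feed_forward(inputs, hidden_w, output_w):
    """Return (hidden outputs, network output). The trailing 1.0 is the bias."""
    hidden = [sigmoid(sum(v * w for v, w in zip(inputs + [1.0], ws)))
              for ws in hidden_w]
    output = sigmoid(sum(v * w for v, w in zip(hidden + [1.0], output_w)))
    return hidden, output


def train_step(inputs, desired, hidden_w, output_w, rate=0.25):
    """One back propagation pass for a single input, updating weights in place."""
    hidden, output = feed_forward(inputs, hidden_w, output_w)
    beta_z = desired - output                 # βz = d - o
    slope_z = output * (1.0 - output)         # ok ( 1 - ok ) for the output neuron
    # βj = Σ ( wk * ok ( 1 - ok ) βk ); only one right-side neuron here.
    beta_h = [output_w[j] * slope_z * beta_z for j in range(len(hidden))]
    # Δw = r * oi * oj ( 1 - oj ) βj for the output neuron's weights...
    for j, o_i in enumerate(hidden + [1.0]):
        output_w[j] += rate * o_i * slope_z * beta_z
    # ...and for each hidden neuron's weights.
    for j, o_j in enumerate(hidden):
        for i, o_i in enumerate(inputs + [1.0]):
            hidden_w[j][i] += rate * o_i * o_j * (1.0 - o_j) * beta_h[j]


rng = random.Random(0)  # fixed seed only so that runs are repeatable
hidden_w = [[rng.random() for _ in range(3)] for _ in range(2)]  # 2 inputs + bias
output_w = [rng.random() for _ in range(3)]                      # 2 hidden + bias
samples = [([1.0, 1.0], 0.0), ([1.0, 0.0], 1.0),
           ([0.0, 1.0], 1.0), ([0.0, 0.0], 0.0)]

for _ in range(5000):
    for inputs, desired in samples:
        train_step(inputs, desired, hidden_w, output_w)

predictions = [feed_forward(inputs, hidden_w, output_w)[1]
               for inputs, _ in samples]
```

After training, `predictions` holds the network's answer for each row of the XOR table, in the same order as `samples`.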

How A Multi-Layer Perceptron Can Be Used:

A few uses for a neural network include optical character recognition, facial recognition, texture analysis, data validation, sales forecasting, etc.

For a more extensive list visit Alyuda.com.

A tutorial on writing a neural network with C# will be coming soon!

The project will be a simple optical character recognition program.
