This is the simplest possible artificial neural network, explained and demonstrated.
Artificial neural networks are inspired by the brain: interconnected artificial neurons store patterns and communicate with each other. The simplest form of an artificial neuron has one or more inputs, each with a specific weight, and one output.
At its simplest, the output is the sum of each input multiplied by its weight.
The idea is to adjust the weights in such a way that the given inputs produce the desired output.
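This weighted sum can be sketched as a tiny function. The names and sample values below are illustrative, not taken from the article:

```python
def neuron_output(inputs, weights):
    # The neuron's output is simply the weighted sum of its inputs.
    return sum(x * w for x, w in zip(inputs, weights))

# Example: two inputs, each weighted by 0.5.
print(neuron_output([1.0, 2.0], [0.5, 0.5]))  # 1*0.5 + 2*0.5 = 1.5
```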
If we calculate the output of this network, we will get
One common way to measure the error (also referred to as the cost function) is to use the mean squared error:
If we had multiple associations of inputs and target values, then the error becomes the average of the errors over all associations.
We use the mean squared error to measure how far away the results are from our desired target. The squaring removes negative signs and gives more weight to bigger differences between output and target.
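The mean squared error described above can be sketched as follows; the sample outputs and targets are made up for illustration:

```python
def mean_squared_error(outputs, targets):
    # Square each difference (removing the sign and emphasising larger
    # differences), then average over all associations.
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)

print(mean_squared_error([1.0, 2.0], [0.0, 4.0]))  # ((1)^2 + (-2)^2) / 2 = 2.5
```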
However, to adjust the weights of our neural network for many different inputs and target values, we need a learning algorithm that does this for us automatically.
The idea is to use the error to understand how each weight should be adjusted so that the error is minimized, but first, we need to learn about gradients.
A gradient is essentially a vector pointing in the direction of the steepest ascent of a function. It is denoted with ∇ and is simply the partial derivative of the function with respect to each of its variables, expressed as a vector.
It looks like this for a two-variable function:

∇f(x, y) = [ ∂f/∂x, ∂f/∂y ]
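One way to make the gradient concrete is to approximate each partial derivative numerically with finite differences. This sketch (names and the sample function are my own, not the article's) checks the gradient of f(x, y) = x² + y²:

```python
def gradient(f, x, y, h=1e-6):
    # Approximate each partial derivative with a central difference.
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return [df_dx, df_dy]

# For f(x, y) = x^2 + y^2, the gradient at (1, 2) is approximately [2, 4].
print(gradient(lambda x, y: x**2 + y**2, 1.0, 2.0))
```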
The descent part simply means using the gradient to find the direction of steepest ascent of our function and then taking a small step in the opposite direction, many times, until we find the function's global (or sometimes local) minimum.
Once we have the gradient, we can update our weights by subtracting the calculated gradient times the learning rate.
And we repeat this process until the error is minimized and is close enough to zero.
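The whole procedure can be sketched end to end: a single neuron with two inputs trained by gradient descent on a squared error. The dataset, learning rate, and iteration count below are illustrative assumptions, not the article's own values:

```python
inputs, target = [2.0, 3.0], 1.0   # one association of inputs and target
weights = [0.0, 0.0]
learning_rate = 0.01

for _ in range(100):
    # Forward pass: the weighted sum of the inputs.
    output = sum(x * w for x, w in zip(inputs, weights))
    # Gradient of the squared error (output - target)^2 with respect to
    # each weight, via the chain rule: dE/dw_i = 2 * (output - target) * x_i.
    grad = [2 * (output - target) * x for x in inputs]
    # Update: subtract the gradient times the learning rate.
    weights = [w - learning_rate * g for w, g in zip(weights, grad)]

print(weights)  # the output is now close to the target
```

Each iteration shrinks the remaining error by a constant factor here, so the output converges toward the target after enough steps.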
The included example teaches the following dataset to a neural network with two inputs and one output using gradient descent:
```
docker build -t simplest-network .
docker run --rm simplest-network
```