• Learning through ANN. Source: Sharma. 2017b.
• Biological Vs Artificial Neural Network. Source: Roell. 2017.
• Structure of a Neuron and ANN. Source: Vieira. 2017.
• Perceptron example with weights. Source: Sharma. 2017.

# Artificial Neural Network

arvindpdmn

gurumoorthyP
Last updated by arvindpdmn
on 2020-08-17 05:50:31
on 2019-06-20 09:30:43

## Summary

Artificial Neural Network (ANN) belongs to the field of Machine Learning. It consists of computational models inspired by the human brain and biological neural networks. The goal is to simulate human intelligence, reasoning and memory to solve forecasting, pattern recognition and classification problems.

ANN is effective in scenarios where traditional ML methods such as regression, time series analysis or PCA cannot model or forecast accurately. This could be because of data bias, a mix of continuous and categorical data, or unclean or uncertain data.

Complex networks with multiple layers and many neurons are possible today, thanks to a dramatic increase in computing power, efficient GPUs and Big Data. A modern approach to ANN known as Deep Learning, which processes and transforms data cascading through hierarchical layers, has gained immense prominence. Computer vision, character and image recognition, speech detection and NLP are popular applications of ANN.

## Milestones

1943

Neurophysiologist Warren McCulloch and mathematician Walter Pitts write a paper on how neurons in the brain might work. To illustrate their ideas, they model a simple neural network using electrical circuits.

1949

Donald Hebb, in his book The Organization of Behavior, proposes a model of learning based on neural plasticity. Later called Hebbian Learning, it's often summarized by the phrase "cells that fire together, wire together".

1958

Frank Rosenblatt, a psychologist at Cornell, proposes the idea of a Perceptron, modeled on the McCulloch-Pitts neuron.

1960

Widrow and Hoff develop a learning procedure that examines weight values and determines the output of perceptrons accordingly.

1975

The first multilayered network is developed. It's an unsupervised network.

1975

A key trigger for renewed interest in neural networks and learning is Werbos's backpropagation algorithm that made the training of multi-layer networks feasible and efficient.

1985

The American Institute of Physics establishes an annual meeting on Neural Networks in Computing.

1987

The first International Conference on Neural Networks is organized by the Institute of Electrical and Electronics Engineers (IEEE).

2009

Recurrent neural networks and deep feedforward neural networks developed in Schmidhuber's research group win eight international competitions in pattern recognition and machine learning.

2010

Backpropagation training through max-pooling is accelerated by GPUs and shown to perform better than other pooling variants.

2012

Ng and Dean create a network that learns to recognize higher level concepts such as cats, only from watching unlabeled images taken from YouTube videos.

## Discussion

• In what ways does an ANN replicate the functioning of the human neural network?

In a biological neural network, nerve cells (neurons) are interconnected by signal transmitters (synapses) which pass on electrical/chemical signals to a target neuron. By summation of potentials, either a signal to excite (positive) or inhibit (negative) is transmitted. The human brain contains about 86 billion neurons on average.

When a neuron is stimulated by an electrical pulse, neurotransmitters are released. They cross into the synaptic gap between neurons and bind to chemical receptors in the receiving neuron. This affects the potential charge of the receiving neuron, and starts up a new electrical signal in the receiving neuron. The whole process takes less than 1/500th of a second. As a message moves from one neuron to another, it is converted from an electrical signal to a chemical signal and back in an ongoing chain of events which is the basis of all brain activity.

To draw parallels, in an ANN a network of data elements called artificial neurons receives input, and each neuron changes its internal state (activation) based on that input. Each neuron then produces a result depending on the input value and the activation function.

• What is the structure and function of an ANN?

The basic structure of an ANN involves a network of artificial neurons arranged in layers - one input layer, one output layer and one or more hidden layers in between.

Each neuron, with the exception of those in the input layer, receives and processes stimuli (inputs) from other neurons. The processed information is available at the output end of the neuron.

Every connection has a weight (positive or negative) attached to it. Positive weights activate the neuron while negative weights inhibit it. The figure shows a network structure with inputs (x1, x2, … xm) being connected to a neuron with weights (w1, w2, … wm) on each connection. The neuron sums all the signals it receives, with each signal being multiplied by the associated weight on its connection.

The weighted sum is then passed through a transfer (activation) function g, normally non-linear, to give the final output y. By comparing this output to the actual value we determine the error. The weights are then adjusted and the process is repeated over several iterations until the error is within acceptable limits.
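The computation of a single neuron described above can be sketched in a few lines of NumPy. This is only an illustration: the input values, weights, target value, the choice of sigmoid as the activation function, and the intermediate name `z` for the weighted sum are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w, b=0.0):
    """Weighted sum of the inputs followed by a non-linear activation."""
    z = np.dot(w, x) + b   # each input multiplied by its connection weight, then summed
    return sigmoid(z)      # activation function gives the neuron's final output

x = np.array([1.0, 0.5, -1.0])   # hypothetical inputs x1..x3
w = np.array([0.4, -0.6, 0.2])   # hypothetical weights w1..w3 (positive and negative)
y = neuron_output(x, w)

target = 1.0                     # assumed actual value
error = target - y               # gap that training would try to reduce
print(y, error)
```

Here the weighted sum is 0.4 - 0.3 - 0.2 = -0.1, so the neuron outputs a value slightly below 0.5. Training would adjust `w` to shrink `error`.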

• What are some key terms used in describing ANN?

We note the following terms:

• Perceptron - A linear binary classifier used in supervised learning. It acts as a single-node neural network. Larger networks, called multi-layer perceptrons (MLPs), are built by arranging such nodes in layers. A perceptron consists of 4 parts: inputs, weights & bias, weighted sum and activation function. All inputs x are multiplied with their weights w, then added to get a weighted sum. This sum is applied to the activation function, to get output y of the perceptron.
• Activation Function - A transfer function used to determine the output of a neuron. It maps the resulting values into ranges such as (0 to 1) or (-1 to 1). Activation functions may be linear or non-linear, the sigmoid function being a common non-linear example.
• Weights and Biases - The weight assigned to an input indicates its strength. The higher the weight, the greater the influence of that input on the outcome. The bias value shifts the activation function curve left or right along the input axis, changing the point at which the neuron activates.
• Loss Function - A way to evaluate the "goodness" of predictions. It quantifies the gap between predicted and actual values. Sum of Squares Error is an example loss function.
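The terms above can be tied together in a minimal sketch of a perceptron that computes the logical AND of two inputs. The weights and bias are hand-picked for illustration rather than learned, and the function names `step`, `perceptron` and `sum_of_squares_error` are hypothetical.

```python
import numpy as np

def step(z):
    """Threshold activation: outputs 1 when the weighted sum is non-negative, else 0."""
    return np.where(z >= 0, 1, 0)

def perceptron(X, w, b):
    """Weighted sum of inputs plus bias, passed through the activation function."""
    return step(X @ w + b)

def sum_of_squares_error(y_true, y_pred):
    """Loss: total squared gap between actual and predicted values."""
    return np.sum((y_true - y_pred) ** 2)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])   # all combinations of two binary inputs
y_and = np.array([0, 0, 0, 1])                   # actual values for logical AND

w = np.array([1.0, 1.0])   # hand-picked weights (assumption)
b = -1.5                   # hand-picked bias: fires only when both inputs are 1

pred = perceptron(X, w, b)
loss = sum_of_squares_error(y_and, pred)
print(pred, loss)

# Changing only the bias moves the activation threshold: the same weights now behave like OR
loss_shifted = sum_of_squares_error(y_and, perceptron(X, w, -0.5))
print(loss_shifted)
```

With bias -1.5 the predictions match AND exactly and the loss is zero. With bias -0.5 two of the four predictions are wrong, which the loss function reports as an error of 2.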

• What are the most common types of Artificial Neural Networks used in ML?

We note the following types of ANNs:

• Feed Forward Networks - Simplest form of ANN where data travels in one direction through a network of perceptrons, from the input layer towards the output layer. Error in prediction is fed back to update weights and biases using backpropagation. Over multiple iterations, weights and bias values are tuned so that the predicted values converge on the actuals. Activation functions commonly used are sigmoid, tanh or ReLU.
• Recurrent Neural Networks - Work on the principle of saving the output of a layer and feeding this back to the input to help in predicting the outcome of the layer. From one time-step to the next, each neuron remembers some information it had in the previous time-step. This makes each neuron act like a memory cell in performing computations. Long Short-Term Memory (LSTM) is a widely used variant built on this idea.
• Radial Basis Function Networks - Use functions that have a distance criterion with respect to a center. Such a network has an input layer, a radial basis hidden layer and an output layer. Commonly used activation functions are Gaussian and multiquadric. They are applied in Power Restoration Systems, where restoration happens from core priority areas to the periphery.
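To illustrate how a recurrent network carries information from one time-step to the next, here is a minimal sketch of a plain recurrent layer (not an LSTM). The layer sizes, the random weights, the tanh activation and the names `W_x`, `W_h` and `rnn_step` are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4

W_x = rng.normal(scale=0.5, size=(n_in, n_hidden))       # input-to-hidden weights
W_h = rng.normal(scale=0.5, size=(n_hidden, n_hidden))   # hidden-to-hidden (feedback) weights
b = np.zeros(n_hidden)

def rnn_step(x_t, h_prev):
    """New state depends on the current input and on the previous state."""
    return np.tanh(x_t @ W_x + h_prev @ W_h + b)

sequence = rng.normal(size=(5, n_in))   # five time-steps of hypothetical input
h = np.zeros(n_hidden)                  # initial state: nothing remembered yet
states = []
for x_t in sequence:
    h = rnn_step(x_t, h)                # the layer's output is fed back in at the next step
    states.append(h)
states = np.array(states)
print(states.shape)
```

Feeding the second input with an empty state gives a different result from `states[1]`, which shows that the layer's output depends on what it saw earlier in the sequence.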

• How does an ANN learn?

The ANN learns through an iterative training routine where weights and biases are continually adjusted to improve the accuracy of its predictions. Steps are:

• Identify input values, assign random initial values as weights to these inputs.
• For supervised learning, segregate training and test data sets.
• Finalise the hidden layers and nodes in them.
• Identify number of epochs and batch sizes. Training data set is generally sliced into batches. An epoch is one complete pass through the whole training set. Training typically requires several epochs.
• Pass the training set through the layers. Determine predicted outcome at the end of the iteration.
• Study the extent of miscalculation, then adjust the weights (W = W + ΔW) through backpropagation. The size of each change ΔW is scaled by a parameter called the Learning Rate. Large values of learning rate train the network faster, but may result in overshooting the optimal solution. Small values are more stable but slow. The direction of each change comes from the gradient descent method of optimising the loss/cost function.
• After several iterations, network achieves acceptable reliability in predicted outcome. Verify by applying the test data set, check for model accuracy.

Now the ANN is said to have ‘learned’ and is ready for deployment.
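The training steps above can be sketched as a small mini-batch gradient descent loop. Everything specific here is an assumption for illustration: the toy data set, the single hidden layer of 4 tanh nodes, the mean squared error loss, and the values chosen for epochs, batch size and learning rate.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy data (assumption): label is 1 when the two features sum to a positive number
X = rng.normal(size=(200, 2))
y = (X.sum(axis=1) > 0).astype(float).reshape(-1, 1)

# Segregate training and test data sets
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

# One hidden layer with 4 nodes; random initial weights, zero biases
W1 = rng.normal(scale=0.5, size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(inputs):
    """Pass data through the layers; returns hidden activations and predictions."""
    hidden = np.tanh(inputs @ W1 + b1)
    return hidden, sigmoid(hidden @ W2 + b2)

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

epochs, batch_size, learning_rate = 200, 16, 0.5
initial_loss = mse(y_train, forward(X_train)[1])

for epoch in range(epochs):                        # one epoch = one pass over the training set
    order = rng.permutation(len(X_train))
    for start in range(0, len(order), batch_size):  # training set sliced into batches
        idx = order[start:start + batch_size]
        xb, yb = X_train[idx], y_train[idx]
        hidden, out = forward(xb)
        # Backpropagation: gradient of the loss with respect to each layer's pre-activation
        d_out = 2 * (out - yb) * out * (1 - out) / len(xb)
        d_hidden = (d_out @ W2.T) * (1 - hidden ** 2)
        # W = W + ΔW, where ΔW = -learning_rate * gradient
        W2 -= learning_rate * hidden.T @ d_out
        b2 -= learning_rate * d_out.sum(axis=0, keepdims=True)
        W1 -= learning_rate * xb.T @ d_hidden
        b1 -= learning_rate * d_hidden.sum(axis=0, keepdims=True)

final_loss = mse(y_train, forward(X_train)[1])
test_accuracy = float(np.mean((forward(X_test)[1] > 0.5) == y_test))
print(initial_loss, final_loss, test_accuracy)
```

The final check on `X_test` corresponds to the last step: the model is judged on data it did not see during training.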

• What are the key differentiators between traditional ML techniques and ANN?

ANN and traditional ML techniques like logistic regression are algorithms meant to do the same thing - classification of data. However, while logistic regression is a statistical method, ANN is a heuristic method modelled on the human brain.

In many cases, simple neural network configurations yield the same solution as many traditional statistical applications. For example, a single-layer, feed forward neural network with linear activation for its output perceptron is equivalent to a general linear regression fit. With a sigmoid activation function on that output perceptron instead, the network behaves like logistic regression.
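The equivalence with linear regression can be checked numerically. The sketch below trains a single neuron with linear (identity) activation by gradient descent on a squared-error loss, then compares its weights with an ordinary least squares fit from `np.linalg.lstsq`. The synthetic data, learning rate and iteration count are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): a noisy linear relationship
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0 + rng.normal(scale=0.1, size=100)

# Treat the bias as the weight of an extra input fixed at 1
Xb = np.hstack([X, np.ones((100, 1))])

# Single neuron, linear activation, trained by gradient descent on mean squared error
w = np.zeros(3)
learning_rate = 0.1
for _ in range(1000):
    pred = Xb @ w                              # linear activation: output is the weighted sum
    grad = 2 * Xb.T @ (pred - y) / len(y)      # gradient of the mean squared error
    w -= learning_rate * grad

# Classical least squares fit on the same data
w_ols, *_ = np.linalg.lstsq(Xb, y, rcond=None)

print(w)
print(w_ols)
```

Both approaches minimise the same squared-error objective, so the trained weights and the least squares coefficients agree to within numerical precision.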

However, one of the unique aspects of an ANN is the presence of its hidden layers. Since the transformations in these layers are learned automatically, the intermediate steps cannot easily be expressed in statistical terms. Hence tracing data values through these intermediate steps and interpreting them is difficult. In regular ML algorithms, by contrast, the input to output transition can be traced entirely.

The passage of data through successive hidden layers allows hierarchical processing. This is the reason ANN forms the basis of Deep Learning, while other ML techniques are unsuitable.

• When to use and not to use Artificial Neural Networks?

When data is well structured, free of inconsistencies and somewhat linear in nature, traditional ML models such as linear regression, classification or PCA work remarkably well.

But in applications such as text validation or speech recognition, data tends to be non-linear and incomplete. It is also subject to human error and variations of language, dialect or handwriting. In such applications, ANN works with good accuracy.

Where data is large and complex, where only a small amount of time series data is available, or where large amounts of noise exist, standard ML approaches can become difficult, even impossible. They may require weeks of a statistician’s time to build, as the data clean up and pre-processing effort is huge.

Neural networks accommodate circumstances where the existing data has useful information to offer, but it might be clouded by factors mentioned above. Neural networks can also account for mixtures of continuous and categorical data.

To build a predictive model for a complex system, ANN does not require a statistician and domain expert to screen through every possible combination of variables. Thus, the neural network approach can dramatically reduce the time required to build a model.

## Sample Code

    # Python sample to build a 2 layer feed forward neural network with back propagation
    # Source: https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6
    # Accessed: 2019-08-04

    import numpy as np


    # Helper functions used by the class (assumed here; not part of the listing above)
    def sigmoid(x):
        # Logistic activation: maps any real value into (0, 1)
        return 1.0 / (1.0 + np.exp(-x))


    def sigmoid_derivative(x):
        # Derivative of the sigmoid, where x is already a sigmoid output
        return x * (1.0 - x)


    class NeuralNetwork:
        def __init__(self, x, y):
            self.input = x
            self.weights1 = np.random.rand(self.input.shape[1], 4)  # input -> hidden (4 nodes)
            self.weights2 = np.random.rand(4, 1)                    # hidden -> output
            self.y = y
            self.output = np.zeros(self.y.shape)

        def feedforward(self):
            self.layer1 = sigmoid(np.dot(self.input, self.weights1))
            self.output = sigmoid(np.dot(self.layer1, self.weights2))

        def backprop(self):
            # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
            d_output = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
            d_weights2 = np.dot(self.layer1.T, d_output)
            d_weights1 = np.dot(self.input.T,
                                np.dot(d_output, self.weights2.T) * sigmoid_derivative(self.layer1))

            # update the weights with the derivative (slope) of the loss function
            self.weights1 += d_weights1
            self.weights2 += d_weights2

## Cite As

Devopedia. 2020. "Artificial Neural Network." Version 9, August 17. Accessed 2020-09-22. https://devopedia.org/artificial-neural-network