Artificial Neural Network
Artificial Neural Network (ANN) belongs to the field of Machine Learning (ML). It consists of computational models inspired by the human brain and biological neural networks. The goal is to simulate human intelligence, reasoning and memory to solve forecasting, pattern recognition and classification problems.
ANN is effective in scenarios where traditional ML methods such as regression, time series analysis or PCA cannot model or forecast accurately. This could be because of data bias, a mix of continuous and categorical data, or unclean or uncertain data.
Complex networks with multiple layers and many neurons are possible today, thanks to a dramatic increase in computing power, super-efficient GPUs and Big Data. A modern approach to ANN known as Deep Learning, which processes and transforms data cascading through hierarchical layers, has gained immense prominence. Computer vision, character and image recognition, speech detection and NLP are popular applications of ANN.
Discussion
In what ways does an ANN replicate the functioning of the human neural network?
In a biological neural network, nerve cells (neurons) are interconnected by junctions (synapses) that pass electrical or chemical signals on to a target neuron. By summation of potentials, either a signal to excite (positive) or a signal to inhibit (negative) is transmitted. The human brain contains about 86 billion neurons on average.
When a neuron is stimulated by an electrical pulse, neurotransmitters are released. They cross the synaptic gap between neurons and bind to chemical receptors in the receiving neuron. This affects the potential charge of the receiving neuron and starts a new electrical signal in it. The whole process takes less than 1/500th of a second. As a message moves from one neuron to another, it is converted from an electrical signal to a chemical signal and back, in an ongoing chain of events that is the basis of all brain activity.
To draw a parallel, an ANN is a network of data elements called artificial neurons. Each neuron receives input and changes its internal state (activation) based on that input. It then produces a result that depends on the input value and the activation function.
What is the structure and function of an ANN?
The basic structure of an ANN is a network of artificial neurons arranged in layers: one input layer, one output layer and one or more hidden layers in between.
Each neuron, with the exception of those in the input layer, receives and processes stimuli (inputs) from other neurons. The processed information is available at the output end of the neuron.
Every connection has a weight (positive or negative) attached to it. Positive weights activate the neuron while negative weights inhibit it. The figure shows a network structure with inputs (x1, x2, … xm) connected to a neuron through weights (w1, w2, … wm), one per connection. The neuron sums all the signals it receives, with each signal multiplied by the weight of its connection.
This weighted sum is then passed through a transfer (activation) function g, normally nonlinear, to give the final output y. By comparing this output to the actual value we determine the error. The process is repeated over several iterations until the error is within acceptable limits.
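The weighted sum and activation step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the sigmoid is only one common choice for g, and the input values, weights and target value below are made-up numbers.

```python
import math

def sigmoid(z):
    """One common nonlinear activation; squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, activation=sigmoid):
    """Multiply each input by its connection weight, sum, then apply g."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)

# Hypothetical values chosen only for illustration.
inputs = [1.0, 0.5]     # x1, x2
weights = [0.4, -0.2]   # w1 is excitatory, w2 is inhibitory
actual = 1.0            # known target value

y = neuron_output(inputs, weights)
error = actual - y      # the gap that training tries to shrink
print(f"output={y:.4f} error={error:.4f}")
```

Here the weighted sum is 0.3, so the output sits a little above 0.5. Training would adjust the weights to move it closer to the target.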
What are some key terms used in describing ANN?
We note the following terms:
 Perceptron - A linear binary classifier used in supervised learning. It acts as a single-node neural network. Networks built by stacking layers of such nodes are called multilayer perceptrons (MLPs). A perceptron consists of four parts: inputs, weights and bias, weighted sum, and activation function. All inputs x are multiplied by their weights w, then added together with the bias to get a weighted sum. The activation function is applied to this sum to get the output y of the perceptron.
 Activation Function - A transfer function used to determine the output of a neuron. It maps the resulting values into a range such as 0 to 1 or -1 to 1. It may be linear or nonlinear, the sigmoid function being one example.
 Weights and Biases - The weight assigned to an input indicates its strength. The higher the weight, the greater the influence of that input on the outcome. The bias value shifts the activation function curve left or right along the input axis, changing the point at which the neuron activates.
 Loss Function - A way to evaluate the "goodness" of predictions. It quantifies the gap between predicted and actual values. Sum of Squares Error is an example of a loss function.
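To make these terms concrete, here is a small sketch of a perceptron with a step activation, scored with a sum of squares error. The weights and bias are hand-picked (an assumption for illustration, not learned values) so that the perceptron behaves like a logical AND.

```python
import math

def step(z):
    """Threshold activation: output 1 once the weighted sum reaches 0."""
    return 1 if z >= 0 else 0

def sigmoid(z):
    """Smooth activation with outputs between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias, activation=step):
    """Inputs -> weighted sum plus bias -> activation function -> output."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

def sum_of_squares_error(predicted, actual):
    """Loss: total squared gap between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

# Hand-picked parameters (hypothetical) that realise a logical AND.
AND_WEIGHTS = [1.0, 1.0]
AND_BIAS = -1.5

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
predictions = [perceptron(s, AND_WEIGHTS, AND_BIAS) for s in samples]
loss = sum_of_squares_error(predictions, targets)
print(predictions, loss)
```

Passing `activation=sigmoid` (or `math.tanh`) instead of `step` gives graded outputs between 0 and 1 (or -1 and 1) rather than a hard 0/1 decision.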
What are the most common types of Artificial Neural Networks used in ML?
We note the following types of ANNs:
 Feed Forward Networks - The simplest form of ANN, where data travels in one direction through a network of perceptrons, from the input layer towards the output layer. The error in prediction is fed back to update weights and biases using backpropagation. Over multiple iterations, weights and bias values are tuned so that the predicted values converge on the actuals. Commonly used activation functions are sigmoid, tanh and ReLU.
 Recurrent Neural Networks - These work on the principle of saving the output of a layer and feeding it back to the input to help predict the outcome of the layer. From one timestep to the next, each neuron remembers some information it had in the previous timestep, so each neuron acts like a memory cell when performing computations. Long Short-Term Memory (LSTM) networks build on this idea.
 Radial Basis Function Networks - These use functions that have a distance criterion with respect to a center. The network has an input layer, a radial basis hidden layer and an output layer. Commonly used activation functions are Gaussian and multiquadric. They are applied in power restoration systems, where restoration proceeds from core priority areas to the periphery.
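Two of the ideas above, a neuron that carries state between timesteps and an activation based on distance from a center, can be sketched as follows. The simple tanh recurrence shown here is a plain recurrent unit (not an LSTM), and all parameter values are assumptions picked for illustration.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """New state mixes the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(sequence, w_x=0.5, w_h=0.8, b=0.0):
    """Feed a sequence through one recurrent neuron, keeping its state."""
    h = 0.0
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

def gaussian_rbf(x, center, beta=1.0):
    """Gaussian radial basis: 1 at the center, decaying with distance."""
    squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-beta * squared_distance)

# A single pulse followed by silence: the state fades but does not vanish
# immediately, which is the "memory" of the recurrent neuron.
states = run_rnn([1.0, 0.0, 0.0])
near = gaussian_rbf([1.0, 2.0], center=[1.0, 2.1])
far = gaussian_rbf([1.0, 2.0], center=[4.0, 6.0])
print(states, near, far)
```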
How does an ANN learn?
The ANN learns through an iterative training routine in which weights and biases are continually adjusted to improve the accuracy of the prediction. The steps are:
 Identify the input values and assign random initial values as weights to these inputs.
 For supervised learning, split the data into training and test data sets.
 Finalise the number of hidden layers and the nodes in them.
 Decide on the number of epochs and the batch size. The training data set is generally sliced into batches. An epoch is one complete pass through the whole training set. Training typically requires several epochs.
 Pass the training set through the layers. Determine the predicted outcome at the end of the iteration.
 Measure the error, then adjust the weights (W = W + ΔW) through backpropagation. The learning rate controls how large the change ΔW is in each iteration. A large learning rate trains the network faster but may overshoot the optimal solution, while a small one converges slowly. Gradient descent computes ΔW from the gradient of the loss/cost function, scaled by the learning rate.
 After several iterations, the network achieves acceptable reliability in its predicted outcome. Verify this by applying the test data set and checking model accuracy.
Now the ANN is said to have "learned" and is ready for deployment.
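The training routine above can be sketched for the simplest possible case, a single sigmoid neuron learning the logical OR. This is a minimal sketch under stated assumptions: the data set, the seed, the learning rate, the epoch count and the batch size are all illustrative choices, the loss is half the sum of squares error, and the train/test split is omitted because the toy data set has only four rows.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def total_loss(weights, bias, data):
    """Half the sum of squares error over a data set."""
    return 0.5 * sum((predict(weights, bias, x) - t) ** 2 for x, t in data)

def train(data, epochs=2000, batch_size=2, learning_rate=0.5, seed=0):
    rng = random.Random(seed)                    # seeded for repeatability
    weights = [rng.uniform(-0.5, 0.5) for _ in data[0][0]]
    bias = 0.0
    history = []
    for _ in range(epochs):                      # one epoch = one full pass
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            grad_w = [0.0] * len(weights)
            grad_b = 0.0
            for x, t in batch:
                p = predict(weights, bias, x)
                delta = (p - t) * p * (1.0 - p)  # d(loss)/d(weighted sum)
                for i, xi in enumerate(x):
                    grad_w[i] += delta * xi
                grad_b += delta
            # W = W + dW, where dW = -learning_rate * average gradient
            for i in range(len(weights)):
                weights[i] -= learning_rate * grad_w[i] / len(batch)
            bias -= learning_rate * grad_b / len(batch)
        history.append(total_loss(weights, bias, data))
    return weights, bias, history

OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias, history = train(OR_DATA)
predictions = [round(predict(weights, bias, x)) for x, _ in OR_DATA]
print(history[0], history[-1], predictions)
```

With a single neuron, backpropagation reduces to the one-step chain rule shown in `delta`. In a multilayer network the same error signal is propagated backwards through each hidden layer in turn.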
What are the key differentiators between traditional ML techniques and ANN?
ANN and traditional ML techniques like logistic regression are algorithms meant to do the same thing: classification of data. However, while logistic regression is a statistical method, ANN is a heuristic method modelled on the human brain.
In many cases, simple neural network configurations yield the same solution as traditional statistical methods. For example, a single-layer, feed forward neural network with linear activation for its output perceptron is equivalent to a general linear regression fit. With a sigmoid activation function, the same network behaves like logistic regression.
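The equivalence with linear regression can be checked numerically. The sketch below trains a single neuron with a linear (identity) activation by gradient descent on mean squared error and compares it with the closed-form least-squares line. The data points, learning rate and epoch count are assumptions chosen for illustration.

```python
def fit_linear_neuron(xs, ys, epochs=2000, learning_rate=0.05):
    """Train y = w*x + b (identity activation) by gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def fit_least_squares(xs, ys):
    """Closed-form simple linear regression (ordinary least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up, slightly noisy points lying roughly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

neuron_w, neuron_b = fit_linear_neuron(xs, ys)
ols_slope, ols_intercept = fit_least_squares(xs, ys)
print(neuron_w, neuron_b, ols_slope, ols_intercept)
```

Both approaches minimise the same squared-error objective, so the neuron's weight and bias converge to the regression slope and intercept.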
However, one of the unique aspects of an ANN is the presence of its hidden layers. Since the movement of data between these layers is automatic, these steps are hard to express statistically. Hence interpreting or tracing data values through these intermediate steps is difficult, whereas in regular ML algorithms the input to output transition can be traced entirely.
The ability of data to pass through successive hidden layers allows hierarchical processing. This is the reason ANN forms the basis of Deep Learning, while other ML techniques are unsuitable.
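A small sketch shows what a hidden layer adds. XOR cannot be computed by a single perceptron, but two hidden neurons feeding one output neuron can compute it. The weights below are hand-picked rather than learned, purely as an assumption for illustration.

```python
def step(z):
    return 1 if z >= 0 else 0

def layer(inputs, weight_rows, biases):
    """Each row of weights plus a bias defines one neuron in the layer."""
    return [step(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weight_rows, biases)]

# Hand-picked (not learned) parameters, assumed for illustration only.
HIDDEN_WEIGHTS = [[1.0, 1.0],    # hidden neuron 1 behaves like OR
                  [1.0, 1.0]]    # hidden neuron 2 behaves like AND
HIDDEN_BIASES = [-0.5, -1.5]
OUTPUT_WEIGHTS = [[1.0, -1.0]]   # fires for "OR but not AND"
OUTPUT_BIASES = [-0.5]

def xor_network(x1, x2):
    hidden = layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES)
    return layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)[0]

results = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

The hidden layer first extracts two simple features from the raw inputs, and the output layer combines them. Deep networks repeat this pattern over many layers.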
When to use and not to use Artificial Neural Networks?
When data is well structured, free of inconsistencies and somewhat linear in nature, traditional ML models such as linear regression, classification or PCA work remarkably well.
But in applications such as text validation or speech recognition, data tends to be nonlinear and incomplete. It is also subject to human error and to variations of language, dialect or handwriting. In such applications, ANN works with good accuracy.
For large and complex problems where only a small amount of time series data is available, or where large amounts of noise exist, standard ML approaches can become difficult, even impossible. They may require weeks of a statistician's time to build, as the data clean-up and preprocessing effort is huge.
Neural networks accommodate circumstances where the existing data has useful information to offer but is clouded by the factors mentioned above. Neural networks can also account for mixtures of continuous and categorical data.
To build a predictive model for a complex system, ANN does not require a statistician and a domain expert to screen every possible combination of variables. Thus, the neural network approach can dramatically reduce the time required to build a model.
Milestones
Neurophysiologist Warren McCulloch and mathematician Walter Pitts write a paper on how neurons in the brain might work. To illustrate, they model a simple neural network using electrical circuits.
Donald Hebb in his book The Organization of Behavior proposes a model of learning based on neural plasticity. Later called Hebbian Learning, it's often summarized by the phrase "cells that fire together, wire together".
Frank Rosenblatt, a psychologist at Cornell, proposes the idea of a Perceptron, modeled on the McCulloch-Pitts neuron.
Widrow and Hoff develop a learning procedure that examines weight values and determines the output of perceptrons accordingly.
The first multilayered network is developed. It's an unsupervised network.
A key trigger for renewed interest in neural networks and learning is Werbos's backpropagation algorithm, which makes the training of multilayer networks feasible and efficient.
The American Institute of Physics establishes an annual meeting, Neural Networks in Computing.
The first International Conference on Neural Networks is organized by the Institute of Electrical and Electronics Engineers (IEEE).
Recurrent neural networks and deep feedforward neural networks developed in Schmidhuber's research group win eight international competitions in pattern recognition and machine learning.
Backpropagation training through max-pooling is accelerated by GPUs and shown to perform better than other pooling variants.
Ng and Dean create a network that learns to recognize higher-level concepts such as cats, only from watching unlabeled images taken from YouTube videos.
References
 Ahire, Jayesh Bapu. 2018. "The Artificial Neural Networks Handbook: Part 4." Accessed 2019-07-31.
 Chigali, Nikhil. 2018. "Simple Perceptron Training Algorithm: Explained." Accessed 2019-07-31.
 DACS. 2019. "3.0 History of Neural Networks." DoD DACS. Accessed 2019-06-20.
 Escontrela, Alejandro. 2018. "Convolutional Neural Networks from the ground up." Accessed 2019-06-20.
 Fukushima, Kunihiko. 1975. "Cognitron: A self-organizing multilayered neural network." Accessed 2019-06-20.
 Goltsman, Kirill. 2017. "Introduction to Artificial Neural Networks - A Whitepaper." Accessed 2019-06-20.
 Jiaconda. 2016. "A Concise History of Neural Networks." Accessed 2019-06-20.
 Jones, Edward. 2004. "An Introduction to Neural Networks - A White Paper." Accessed 2019-06-20.
 Juergen. 2019. "Juergen Schmidhuber's Home Page." Accessed 2019-06-20.
 Keysers, Christian, and Valeria Gazzola. 2014. "Hebbian learning and predictive mirror neurons for actions, sensations and emotions." National Center for Biotechnology Information, U.S. National Library of Medicine. Accessed 2019-06-20.
 Loiseau, Jean-Christophe B. 2019. "Rosenblatt’s perceptron, the first modern neural network." Accessed 2019-06-20.
 Loy, James. 2018. "How to build your own Neural Network from scratch in Python." Accessed 2019-07-31.
 Maladkar, Kishan. 2018. "6 Types of Artificial Neural Networks Currently Being Used in Machine Learning." Analytics India Magazine, January 15. Accessed 2020-08-17.
 Mastin, Luke. 2019. "Neurons & Synapses." The Human Memory. Accessed 2019-07-31.
 ML Glossary. 2020. "Neural Networks: Concepts." ML Glossary, January 25. Accessed 2020-08-17.
 Roell, Jason. 2017. "From Fiction to Reality: A Beginner’s Guide to Artificial Neural Networks." Accessed 2019-06-20.
 Sharma, Sagar. 2017. "What the Hell is Perceptron?" Accessed 2019-07-31.
 Sharma, Sagar. 2017b. "Activation Functions in Neural Networks." Accessed 2019-07-31.
 Stanford. 2019. "Neural Networks History: The 1940's to the 1970's." Accessed 2019-06-20.
 Vieira. 2017. "Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications." Neuroscience & Biobehavioral Reviews. Accessed 2019-07-31.
 Werbos, Paul J. 1990. "Backpropagation Through Time: What It Does and How to do it." Accessed 2019-06-20.
 Widrow, Bernard, and Michael A. Lehr. 2003. "Perceptrons, Adalines, and Backpropagation." MIT Press. Accessed 2019-06-20.
 Wikipedia. 2020. "Artificial neural network." Wikipedia, August 14. Accessed 2020-08-17.
 Zhang, QJ. 2000. "Neural Networks for RF and Microwave Design." Accessed 2019-07-31.
Further Reading
 Chiarandini, Marco. 2019. "Machine Learning: Linear Regression and Neural Networks." Accessed 2019-06-20.
 Goltsman, Kirill. 2017. "Introduction to Artificial Neural Networks - A Whitepaper." Accessed 2019-06-20.
 Jones, Edward. 2004. "An Introduction to Neural Networks - A White Paper." Accessed 2019-06-20.
 Hansen, Casper Bøgeskov. 2019. "Neural Networks: Feedforward and Backpropagation Explained & Optimization." Machine Learning From Scratch, August 05. Accessed 2019-08-20.
 Moawad, Assaad. 2018. "Neural networks and backpropagation explained in a simple way." Medium, February 01. Accessed 2019-08-20.
 Mehta, Anukrati. 2019. "A Complete Guide to Types of Neural Networks." Blog, Digital Vidya, January 25. Accessed 2019-08-20.