# Artificial Neural Network

## Summary

An Artificial Neural Network (ANN) belongs to the field of Machine Learning. It consists of computational models inspired by the human brain and biological neural networks. The goal is to simulate human intelligence, reasoning and memory to solve forecasting, pattern recognition and classification problems.

ANN is effective in scenarios where traditional ML methods such as regression, time series analysis or PCA cannot model or forecast accurately. This could be because of biased data, a mix of continuous and categorical data, or unclean or uncertain data.

Complex networks with multiple layers and many neurons are possible today, thanks to a dramatic increase in computing power, efficient GPUs and Big Data. A modern approach to ANN known as Deep Learning, which processes and transforms data as it cascades through hierarchical layers, has gained immense prominence. Computer vision, character and image recognition, speech detection and NLP are popular applications of ANN.

## Milestones

## Discussion

### In what ways does an ANN replicate the functioning of the human neural network?

In a biological neural network, nerve cells (neurons) are interconnected by signal transmitters (synapses) which pass on electrical/chemical signals to a target neuron. By summation of potentials, either a signal to excite (positive) or inhibit (negative) is transmitted. The human brain contains about 86 billion neurons on average.

When a neuron is stimulated by an electrical pulse, neurotransmitters are released. They cross into the synaptic gap between neurons and bind to chemical receptors in the receiving neuron. This affects the potential charge of the receiving neuron and starts a new electrical signal in it. The whole process takes less than 1/500th of a second. As a message moves from one neuron to another, it is converted from an electrical signal to a chemical signal and back, in an ongoing chain of events which is the basis of all brain activity.

To draw parallels in ANN, a network of data elements called artificial neurons receives input, and each neuron changes its internal state (activation) based on that input. Each neuron then produces a result depending on the input value and the activation function.

### What is the structure and function of an ANN?

The basic structure of an ANN involves a network of artificial neurons arranged in layers: one input layer, one output layer and one or more hidden layers in between.

Each neuron, with the exception of those in the input layer, receives and processes stimuli (inputs) from other neurons. The processed information is available at the output end of the neuron.

Every connection has a weight (positive or negative) attached to it. Positive weights activate the neuron while negative weights inhibit it. The figure shows a network structure with inputs (x1, x2, … xm) being connected to a neuron with weights (w1, w2, … wm) on each connection. The neuron sums all the signals it receives, with each signal being multiplied by the associated weight on its connection.

This weighted sum is then passed through a transfer (activation) function g, which is normally non-linear, to give the final output y. By comparing this output to the actual value we determine the error. The process is repeated over several iterations until the error is within acceptable limits.
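The computation performed by one neuron can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `neuron_output`, the sample inputs and weights, and the choice of sigmoid as the activation function are all assumptions made for the example.

```python
import math

def neuron_output(inputs, weights):
    """Weighted sum of the inputs, passed through a sigmoid activation."""
    # Multiply each input by the weight on its connection and add them up.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Sigmoid is one common non-linear choice for the activation function.
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Hypothetical inputs (x1, x2, x3) and weights (w1, w2, w3).
y = neuron_output([1.0, 2.0, 3.0], [0.5, -0.25, 0.1])

# Error against an assumed actual value of 1.0.
error = 1.0 - y
```

Here the weighted sum is 0.5 - 0.5 + 0.3 = 0.3, so the output is the sigmoid of 0.3, which lies a little above 0.5.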

### What are some key terms used in describing ANN?

We note the following terms:

- **Perceptron** - A linear binary classifier used in supervised learning. It acts as a single-node neural network. Stacking layers of such nodes gives a multi-layer perceptron (MLP). A perceptron consists of four parts: inputs, weights and bias, weighted sum, and activation function. All inputs x are multiplied by their weights w, then added to get a weighted sum. The activation function is applied to this sum to get the output y of the perceptron.
- **Activation Function** - A transfer function used to determine the output of a neuron. It maps the resulting values into a range such as (0 to 1) or (-1 to 1). Activation functions may be linear or non-linear, the sigmoid function being one of them.
- **Weights and Biases** - The weight assigned to an input indicates its strength. The higher the weight, the greater the influence of that input on the outcome. The bias value shifts the activation function curve left or right along the input axis, so that the neuron can activate at a different threshold.
- **Loss Function** - A way to evaluate the "goodness" of predictions. It quantifies the gap between predicted and actual values. *Sum Of Squares Error* is an example loss function.
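The terms above can be tied together in a short Python sketch. The weights `[1.0, 1.0]` and bias `-1.5` are hand-picked assumptions that make the perceptron behave like a logical AND gate, and the helper names are hypothetical.

```python
import math

def step(z):
    """Step activation: fires (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def sigmoid(z):
    """Sigmoid activation: maps any value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def perceptron(inputs, weights, bias, activation=step):
    """Inputs, weights and bias, weighted sum, then activation."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

def sum_of_squares_error(predicted, actual):
    """Loss: total squared gap between predictions and actual values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual))

# Hand-picked (assumed) weights and bias that implement logical AND.
and_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_targets = [0, 0, 0, 1]
and_outputs = [perceptron(x, [1.0, 1.0], -1.5) for x in and_inputs]
loss = sum_of_squares_error(and_outputs, and_targets)
```

With these values the bias of -1.5 means the neuron fires only when both inputs are 1, so the loss against the AND targets is zero.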

### What are the most common types of Artificial Neural Networks used in ML?

We note the following types of ANNs:

- **Feed Forward Networks** - The simplest form of ANN, where data travels in one direction through a network of perceptrons, from the input layer towards the output layer. During training, the error in prediction is fed back to update weights and biases using backpropagation. Over multiple iterations, weights and bias values are tuned so that the predicted values converge on the actuals. Commonly used activation functions are sigmoid, tanh and ReLU.
- **Recurrent Neural Networks** - Work on the principle of saving the output of a layer and feeding it back to the input to help in predicting the outcome of the layer. From one time-step to the next, each neuron remembers some information it had in the previous time-step. This makes each neuron act like a memory cell in performing computations. Long Short-Term Memory (LSTM) networks build on this idea.
- **Radial Basis Function Networks** - Use functions whose value depends on the distance of the input from a center. Such a network has an input layer, a radial basis hidden layer and an output layer. Commonly used activation functions are Gaussian and multiquadric. They are applied in power restoration systems, where restoration proceeds from core priority areas to the periphery.
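The core computation of each type can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the layer sizes, the fixed weights, the scalar recurrent state and the function names are not taken from any particular library.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases, activation):
    """One fully connected layer: each row of weights feeds one neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]

def feed_forward(inputs):
    """Feed forward: data flows input -> hidden (ReLU) -> output (sigmoid)."""
    hidden = dense_layer(inputs, [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu)
    return dense_layer(hidden, [[1.0, -2.0]], [0.0], sigmoid)

def rnn_step(x, h_prev, w_x=1.0, w_h=1.0, b=0.0):
    """Recurrent step: the new state mixes the input with the previous state."""
    return math.tanh(w_x * x + w_h * h_prev + b)

def gaussian_rbf(x, center, width=1.0):
    """Radial basis unit: response falls off with distance from the center."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))

output = feed_forward([1.0, 0.0])

# The second step receives zero input, yet its state is non-zero because
# it carries information forward from the first step.
h1 = rnn_step(1.0, 0.0)
h2 = rnn_step(0.0, h1)

near = gaussian_rbf([1.0, 2.0], [1.0, 2.0])  # input at the center
far = gaussian_rbf([3.0, 2.0], [1.0, 2.0])   # input two units away
```

In the recurrent case, `h2` depends only on `h1`, which is the "memory" described above. In the radial basis case, `near` is larger than `far` because the response decays with distance from the center.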

### How does an ANN learn?

The ANN learns through an iterative training routine in which weights and biases are continually adjusted to improve the accuracy of predictions. The steps are:

- Identify input values and assign random initial values as weights to these inputs.
- For supervised learning, split the data into training and test data sets.
- Decide the number of hidden layers and the number of nodes in each.
- Choose the number of epochs and the batch size. The training data set is generally sliced into **batches**. An **epoch** is one complete pass through the whole training set. Training typically requires several epochs.
- Pass the training set through the layers. Determine the predicted outcome at the end of the iteration.
- Measure the error in prediction, then adjust the weights (W = W + ΔW) through backpropagation. The **Learning Rate** scales the amount of change in W per iteration. Large values of learning rate train the network faster, but may result in overshooting the optimal solution. We need a balance. The direction of each change is determined by the gradient descent method of optimising the loss/cost function.
- After several iterations, the network achieves acceptable reliability in its predicted outcome. Verify this by applying the test data set and checking model accuracy.

Now the ANN is said to have 'learned' and is ready for deployment.
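The steps above can be sketched as a small training loop. To keep the example short it makes a strong simplification: the network has no hidden layer, only a single sigmoid neuron, so backpropagation reduces to one gradient computation per weight. The toy data set (points labelled 1 when x1 + x2 > 1), the hyperparameter values and all function names are assumptions chosen for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    return sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias)

def train(train_set, epochs=300, batch_size=10, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    n_inputs = len(train_set[0][0])
    # Assign small random initial weights.
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    bias = 0.0
    samples = list(train_set)
    for _ in range(epochs):  # one epoch = one full pass over the training set
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            grad_w = [0.0] * n_inputs
            grad_b = 0.0
            for x, target in batch:
                y = predict(x, weights, bias)
                # Gradient of the squared error 0.5 * (y - target)^2
                # with respect to the weighted sum.
                delta = (y - target) * y * (1.0 - y)
                for i, xi in enumerate(x):
                    grad_w[i] += delta * xi
                grad_b += delta
            # W = W + dW, where dW = -learning_rate * average gradient.
            for i in range(n_inputs):
                weights[i] += -learning_rate * grad_w[i] / len(batch)
            bias += -learning_rate * grad_b / len(batch)
    return weights, bias

# Toy data set (assumed): label is 1 when x1 + x2 > 1, else 0.
data_rng = random.Random(42)
points = [(data_rng.random(), data_rng.random()) for _ in range(200)]
data = [((x1, x2), 1.0 if x1 + x2 > 1.0 else 0.0) for x1, x2 in points]

# Split into training and test sets.
train_set, test_set = data[:160], data[160:]

weights, bias = train(train_set)

# Verify on the held-out test set.
correct = sum(
    (predict(x, weights, bias) > 0.5) == (target == 1.0)
    for x, target in test_set
)
accuracy = correct / len(test_set)
```

A network with hidden layers follows the same loop, except that the error term is propagated backwards layer by layer to obtain the gradient for every weight.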

### What are the key differentiators between traditional ML techniques and ANN?

ANN and traditional ML techniques like logistic regression are algorithms that can do the same thing, such as classification of data. However, while logistic regression is a statistical method, ANN is a heuristic method modelled on the human brain.

In many cases, simple neural network configurations yield the same solution as many traditional statistical applications. For example, a single-layer, feed-forward neural network with linear activation for its output perceptron is equivalent to a general linear regression fit. When the same network uses a sigmoid activation function instead, it behaves like logistic regression.
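This equivalence can be seen by writing out the output of a single neuron. The symbols z (weighted sum) and b (bias) are notation introduced here for illustration:

$$
\begin{aligned}
z &= \sum_{i=1}^{m} w_i x_i + b, \qquad y = g(z) \\
g(z) = z \;&\Rightarrow\; y = \sum_{i=1}^{m} w_i x_i + b \quad \text{(the linear regression model)} \\
g(z) = \frac{1}{1 + e^{-z}} \;&\Rightarrow\; y = \frac{1}{1 + e^{-\left(\sum_{i=1}^{m} w_i x_i + b\right)}} \quad \text{(the logistic regression model)}
\end{aligned}
$$

In both cases the neuron has the same functional form as the statistical model. Whether the fitted weights also match depends on using the corresponding loss function during training.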

However, one of the unique aspects of an ANN is the presence of its hidden layers. Since the transformations applied between these layers are learned automatically, the intermediate steps are hard to express or interpret statistically. Hence tracing how an input leads to an output through these intermediate steps is difficult. In regular ML algorithms, by contrast, the input to output transition can be traced entirely.

The ability of data to pass through successive hidden layers allows hierarchical processing. This is the reason ANN forms the basis of Deep Learning, while other ML techniques are unsuitable.

### When to use and not to use Artificial Neural Networks?

When data is well structured, free of inconsistencies and somewhat linear in nature, traditional ML models such as linear regression, classification or PCA work remarkably well.

But in applications such as text validation or speech recognition, data tends to be non-linear and incomplete. It is also subject to human error and variations of language, dialect or handwriting. In such applications, ANN works with good accuracy.

For large and complex problems where only a small amount of time series data is available, or where large amounts of noise exist, standard ML approaches can become difficult, even impossible. They may require weeks of a statistician's time to build, as the data clean-up and pre-processing effort is huge.

Neural networks accommodate circumstances where the existing data has useful information to offer, but it might be clouded by factors mentioned above. Neural networks can also account for mixtures of continuous and categorical data.

To build a predictive model for a complex system, ANN does not require a statistician and domain expert to screen through every possible combination of variables. Thus, the neural network approach can dramatically reduce the time required to build a model.

## See Also

- Machine Learning
- Artificial Intelligence
- Natural Language Processing
- Linear Regression
- Deep Learning
- TensorFlow

## Further Reading

- Chiarandini, Marco. 2019. "Machine Learning: Linear Regression and Neural Networks." Accessed 2019-06-20.
- Goltsman, Kirill. 2017. "Introduction to Artificial Neural Networks - A Whitepaper." Accessed 2019-06-20.
- Jones, Edward. 2004. "An Introduction to Neural Networks - A White Paper." Accessed 2019-06-20.
- Hansen, Casper Bøgeskov. 2019. "Neural Networks: Feedforward and Backpropagation Explained & Optimization." Machine Learning From Scratch, August 05. Accessed 2019-08-20.
- Moawad, Assaad. 2018. "Neural networks and back-propagation explained in a simple way." Medium, February 01. Accessed 2019-08-20.
- Mehta, Anukrati. 2019. "A Complete Guide to Types of Neural Networks." Blog, Digital Vidya, January 25. Accessed 2019-08-20.