Artificial Neural Network

Learning through ANN. Source: Sharma. 2017b.

Artificial Neural Network (ANN) belongs to the field of Machine Learning. It consists of computational models inspired by the human brain and biological neural networks. The goal is to simulate human intelligence, reasoning and memory to solve forecasting, pattern recognition and classification problems.

ANN is effective in scenarios where traditional ML methods such as regression, time series analysis or PCA cannot perform or forecast accurately. This could be because of data bias, a mix of continuous and categorical data, or unclean or uncertain data.

Complex networks with multiple layers, nodes and neurons are possible today, thanks to a dramatic increase in computing power, highly efficient GPUs and Big Data. Deep Learning, a modern approach to ANN that processes and transforms data by cascading it through hierarchical layers, has gained immense prominence. Computer vision, character and image recognition, speech detection and NLP are popular applications of ANN.


  • In what ways do ANNs replicate the functioning of the human neural network?
    Biological Vs Artificial Neural Network. Source: Roell. 2017.

    In a biological neural network, nerve cells (neurons) are interconnected by signal transmitters (synapses) which pass on electrical/chemical signals to a target neuron. By summation of potentials, either a signal to excite (positive) or inhibit (negative) is transmitted. The human brain contains about 86 billion neurons on average.

    When stimulated by an electrical pulse, neurotransmitters are released. They cross into the synaptic gap between neurons and bind to chemical receptors in the receiving neuron. This affects the potential charge of the receiving neuron, and starts up a new electrical signal in the receiving neuron. The whole process takes less than 1/500th of a second. As a message moves from one neuron to another, it is converted from an electrical signal to a chemical signal and back in an ongoing chain of events which is the basis of all brain activity.

    To draw parallels in ANN, a network of data elements called artificial neurons receive input and change their internal state (activation) based on that input. They then produce a result depending on the input value and the activation function.

  • What is the structure and function of an ANN?
    Structure of a Neuron and ANN. Source: Vieira. 2017.

    The basic structure of an ANN involves a network of artificial neurons arranged in layers - one input layer, one output layer and one or more hidden layers in between.

    Each neuron, with the exception of those in the input layer, receives and processes stimuli (inputs) from other neurons. The processed information is available at the output end of the neuron.

    Every connection has a weight (positive or negative) attached to it. Positive weights activate the neuron while negative weights inhibit it. The figure shows a network structure with inputs (x1, x2, … xm) connected to a neuron with weights (w1, w2, … wm) on each connection. The neuron sums all the signals it receives, with each signal multiplied by the weight on its connection.

    This weighted sum is then passed through a transfer (activation) function g, which is normally non-linear, to give the final output y. By comparing this output to the actual value we determine the error. The process is repeated over several iterations until the error is within acceptable limits.
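
    The computation described above can be sketched for a single neuron as follows. This is a minimal illustration, not the method of any cited source: the input values, weights, target value and the choice of a sigmoid as the activation function are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Non-linear activation that maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(x, w):
    """Weighted sum of the inputs, passed through the activation function."""
    weighted_sum = np.dot(w, x)
    return sigmoid(weighted_sum)

# Illustrative values only (assumed for this sketch).
x = np.array([0.5, -1.0, 2.0])    # inputs x1, x2, x3
w = np.array([0.4, 0.3, -0.2])    # weights w1, w2, w3 (positive and negative)
target = 1.0                      # actual value to compare against

y_hat = neuron_output(x, w)       # output of the neuron
error = target - y_hat            # gap between actual and predicted value
print(y_hat, error)
```

    Training consists of repeating this forward computation and adjusting the weights until the error is acceptably small.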

  • What are some key terms used in describing ANN?
    Perceptron example with weights. Source: Sharma. 2017.

    We note the following terms:

    • Perceptron - A linear binary classifier used in supervised learning. It acts as a single-node neural network. Perceptrons arranged in multiple layers form a multi-layer perceptron (MLP). A perceptron consists of 4 parts: inputs, weights and bias, weighted sum, and activation function. All inputs x are multiplied by their weights w, then added to get a weighted sum. This sum is passed to the activation function to get the output y of the perceptron.
    • Activation Function - A transfer function used to determine the output of a neuron or network. It maps the resulting values into ranges such as (0 to 1) or (-1 to 1). Activation functions may be linear or non-linear, the sigmoid function being a common non-linear example.
    • Weights and Biases - The weight assigned to an input indicates its strength. The higher the weight, the greater the influence of that input on the outcome. The bias value allows you to shift the activation function curve left or right along the input axis.
    • Loss Function - A way to evaluate "goodness" of predictions. It quantifies the gap between predicted and actual values. Sum Of Squares Error is an example loss function.
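
    The terms above can be tied together in a short sketch. It assumes a step activation and hand-picked weights (both chosen for illustration, not learned) so that the perceptron computes logical AND. It then shows that changing only the bias moves the decision threshold, and uses the Sum Of Squares Error to quantify the resulting gap.

```python
import numpy as np

def step(z):
    """Step activation: outputs 1 when the weighted sum is non-negative, else 0."""
    return np.where(z >= 0, 1, 0)

def perceptron(X, w, b):
    """Inputs times weights, plus bias, passed through the activation function."""
    return step(X @ w + b)

def sum_of_squares_error(y_true, y_pred):
    """Loss function: sum of squared gaps between actual and predicted values."""
    return np.sum((y_true - y_pred) ** 2)

# All four combinations of two binary inputs, with logical AND as the target.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_and = np.array([0, 0, 0, 1])

w = np.array([1.0, 1.0])          # hand-picked weights (assumption)

pred_and = perceptron(X, w, b=-1.5)       # fires only when both inputs are 1
sse_and = sum_of_squares_error(y_and, pred_and)

pred_shifted = perceptron(X, w, b=-0.5)   # same weights, different bias
sse_shifted = sum_of_squares_error(y_and, pred_shifted)

print(pred_and, sse_and)
print(pred_shifted, sse_shifted)
```

    With the second bias value the same weights behave like logical OR, so the loss against the AND targets is no longer zero.
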
  • What are the most common types of Artificial Neural Networks used in ML?

    We note the following types of ANNs:

    • Feed Forward Networks - Simplest form of ANN where data travels in one direction through a network of perceptrons, from input layer towards output layer. Error in prediction is fed back to update weights and biases using backpropagation. Over multiple iterations, weights and bias values are tuned so that the predicted values converge on the actuals. Activation functions commonly used are sigmoid, tanh or ReLU.
    • Recurrent Neural Networks - Work on the principle of saving the output of a layer and feeding it back to the input to help predict the outcome of the layer. From one time-step to the next, each neuron remembers some information from the previous time-step. This makes each neuron act like a memory cell in performing computations. Long Short-Term Memory (LSTM) is a well-known variant.
    • Radial Basis Function Networks - Use functions that depend on the distance of the input from a center. Such a network has an input layer, a radial basis hidden layer and an output layer. Commonly used activation functions are Gaussian and multiquadric. Applied in power restoration systems, where restoration proceeds from core priority areas to the periphery.
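
    Two of the ideas above can be sketched in a few lines: a Gaussian radial basis activation, whose value depends only on the distance between the input and a center, and a single recurrent step, where the previous state is fed back alongside the new input. The function names, centers, weights and the use of tanh in the recurrent step are assumptions made for this illustration.

```python
import numpy as np

def gaussian_rbf(x, centers, width=1.0):
    """Activation of each hidden unit: exp(-||x - c||^2 / (2 * width^2))."""
    dists_sq = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-dists_sq / (2.0 * width ** 2))

def recurrent_step(x_t, h_prev, W_x, W_h):
    """New state depends on the current input and the previous state."""
    return np.tanh(W_x @ x_t + W_h @ h_prev)

# Radial basis layer with two centers; the input sits exactly on the first one.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
h = gaussian_rbf(np.array([0.0, 0.0]), centers)

# Recurrent unit fed the same input twice; the second output differs
# because the unit remembers its state from the previous time-step.
W_x = np.array([[1.0]])
W_h = np.array([[0.5]])
x_t = np.array([1.0])
first_state = recurrent_step(x_t, np.zeros(1), W_x, W_h)
second_state = recurrent_step(x_t, first_state, W_x, W_h)

print(h)
print(first_state, second_state)
```

    The radial basis unit centered on the input responds most strongly, and the response falls off with distance for the other unit.
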
  • How does an ANN learn?

    The ANN learns through an iterative training routine where weights and biases are continually adjusted to improve strength of the prediction. Steps are:

    • Identify input values, assign random initial values as weights to these inputs.
    • For supervised learning, segregate training and test data sets.
    • Finalise the hidden layers and nodes in them.
    • Identify number of epochs and batch sizes. Training data set is generally sliced into batches. An epoch is one complete pass through the whole training set. Training typically requires several epochs.
    • Pass the training set through the layers. Determine predicted outcome at the end of the iteration.
    • Measure the prediction error, then adjust the weights (W = W + ΔW) through backpropagation. The learning rate scales the size of the change ΔW in each iteration. A large learning rate trains the network faster but may overshoot the optimal solution, while a small one converges slowly. The weight updates follow the gradient descent method of optimising the loss/cost function.
    • After several iterations, network achieves acceptable reliability in predicted outcome. Verify by applying the test data set, check for model accuracy.

    Now the ANN is said to have ‘learned’ and is ready for deployment.
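
    The training routine can be sketched end to end on a small synthetic problem. Everything specific here is an assumption made for illustration: the generated data set, the 160/40 split, one hidden layer of 4 nodes, sigmoid activations, a cross-entropy loss, and the chosen values for epochs, batch size and learning rate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Synthetic data (illustrative): the label is 1 when x1 + x2 > 0.
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# Segregate training and test data sets.
X_train, y_train = X[:160], y[:160]
X_test, y_test = X[160:], y[160:]

# One hidden layer with 4 nodes; weights start from random values.
W1 = rng.normal(scale=0.5, size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=0.5, size=(4, 1))
b2 = np.zeros((1, 1))

def forward(inputs):
    """Pass inputs through the hidden layer and the output layer."""
    hidden = sigmoid(inputs @ W1 + b1)
    return hidden, sigmoid(hidden @ W2 + b2)

epochs, batch_size, learning_rate = 200, 16, 0.5

for epoch in range(epochs):
    order = rng.permutation(len(X_train))          # reshuffle every epoch
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X_train[idx], y_train[idx]
        hidden, out = forward(xb)
        # Backpropagation: gradient of the mean cross-entropy loss
        # with respect to the pre-activation of each layer.
        d_out = (out - yb) / len(xb)
        d_hidden = (d_out @ W2.T) * hidden * (1.0 - hidden)
        # W = W + dW, where dW = -learning_rate * gradient.
        W2 += -learning_rate * (hidden.T @ d_out)
        b2 += -learning_rate * d_out.sum(axis=0, keepdims=True)
        W1 += -learning_rate * (xb.T @ d_hidden)
        b1 += -learning_rate * d_hidden.sum(axis=0, keepdims=True)

# Verify on the held-out test set.
_, test_out = forward(X_test)
test_accuracy = np.mean((test_out > 0.5) == (y_test == 1))
print(f"test accuracy: {test_accuracy:.2f}")
```

    Raising the learning rate speeds up early progress but makes the updates less stable, which is the trade-off noted in the steps above.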

  • What are the key differentiators between traditional ML techniques and ANN?

    ANN and traditional ML techniques like logistic regression are algorithms meant to do the same thing - classification of data. However, while logistic regression is a statistical method, ANN is a heuristic method modelled on the human brain.

    In many cases, simple neural network configurations yield the same solution as many traditional statistical applications. For example, a single-layer, feed forward neural network with linear activation for its output perceptron is equivalent to a general linear regression fit. When you use sigmoid activation function in ANN, it behaves like logistic regression.
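
    The equivalence with linear regression can be checked numerically. The sketch below, built on assumed synthetic data, trains a single neuron with linear activation by gradient descent on the mean squared error and compares its weights with an ordinary least-squares fit from `np.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (illustrative): y = 3*x1 - 2*x2 + 1 plus a little noise.
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0 + rng.normal(scale=0.1, size=100)

# Treat the bias as a weight on a constant input of 1.
Xb = np.hstack([X, np.ones((100, 1))])

# Single neuron, linear activation, trained by gradient descent.
w = np.zeros(3)
learning_rate = 0.1
for _ in range(2000):
    prediction = Xb @ w                                  # linear activation
    gradient = 2.0 * Xb.T @ (prediction - y) / len(y)    # gradient of the MSE
    w -= learning_rate * gradient

# Ordinary least-squares fit on the same data.
w_ls, *_ = np.linalg.lstsq(Xb, y, rcond=None)

print(w)
print(w_ls)
```

    Both approaches minimise the same squared-error objective, so the two weight vectors agree to within numerical tolerance. Replacing the linear activation with a sigmoid and the squared error with cross-entropy gives the logistic regression case in the same way.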

    However, one of the unique aspects of an ANN is the presence of its hidden layers. Since movement of data between these layers is automatic, these steps are not easily expressed in statistical terms. Hence tracing or interpreting data values through these intermediate steps is difficult. In many traditional ML algorithms, by contrast, the input to output transition can be traced entirely.

    The ability of data to pass through successive hidden layers allows hierarchical processing. This is the reason ANN forms the basis of Deep Learning, while other ML techniques are unsuitable.

  • When to use and not to use Artificial Neural Networks?

    When data is well structured, free of inconsistencies and somewhat linear in nature, traditional ML models such as linear regression, classification or PCA work remarkably well.

    But in applications such as text validation or speech recognition, data tends to be non-linear and incomplete. It is also subject to human error and variations of language, dialect or handwriting. In such applications, ANN works with good accuracy.

    For large and complex data where a small amount of time series data is available or where large amounts of noise exist, standard ML approaches can become difficult, even impossible. They may require weeks of a statistician’s time to build, as the data clean up and pre-processing effort is huge.

    Neural networks accommodate circumstances where the existing data has useful information to offer, but it might be clouded by factors mentioned above. Neural networks can also account for mixtures of continuous and categorical data.

    To build a predictive model for a complex system, ANN does not require a statistician and domain expert to screen through every possible combination of variables. Thus, the neural network approach can dramatically reduce the time required to build a model.



Neurophysiologist Warren McCulloch and mathematician Walter Pitts write a paper on how neurons might work. In order to describe how neurons in the brain might work, they model a simple neural network using electrical circuits.


Donald Hebb, in his book The Organization of Behaviour, proposes a model of learning based on neural plasticity. Later called Hebbian Learning, it's often summarized by the phrase "cells that fire together, wire together".


Frank Rosenblatt, a psychologist at Cornell, proposes the idea of a Perceptron, modeled on the McCulloch-Pitts neuron.


Widrow and Hoff develop a learning procedure that examines weight values and determines the output of perceptrons accordingly.


The first multilayered network is developed. It's an unsupervised network.


A key trigger for renewed interest in neural networks and learning is Werbos's backpropagation algorithm, which makes the training of multi-layer networks feasible and efficient.


The American Institute of Physics establishes a Neural Networks in Computing annual meeting.


The first International Conference on Neural Networks is organized by the Institute of Electrical and Electronics Engineers (IEEE).


Recurrent neural networks and deep feedforward neural networks developed in Schmidhuber's research group win eight international competitions in pattern recognition and machine learning.


Backpropagation training through max-pooling is accelerated by GPUs and shown to perform better than other pooling variants.


Ng and Dean create a network that learns to recognize higher level concepts such as cats, only from watching unlabeled images taken from YouTube videos.

Sample Code

  • # Python sample to build a 2 layer feed forward neural network with back propagation
    # Source:
    # Accessed: 2019-08-04
    import numpy as np

    def sigmoid(x):
        # logistic activation: squashes any real value into the range (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_derivative(s):
        # derivative of the sigmoid, written in terms of its output s = sigmoid(x)
        return s * (1.0 - s)

    class NeuralNetwork:
        def __init__(self, x, y):
            self.input      = x
            self.weights1   = np.random.rand(self.input.shape[1], 4)  # input -> hidden (4 nodes)
            self.weights2   = np.random.rand(4, 1)                    # hidden -> output
            self.y          = y
            self.output     = np.zeros(self.y.shape)

        def feedforward(self):
            self.layer1 = sigmoid(np.dot(self.input, self.weights1))
            self.output = sigmoid(np.dot(self.layer1, self.weights2))

        def backprop(self):
            # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
            d_weights2 = np.dot(self.layer1.T,
                                (2*(self.y - self.output) * sigmoid_derivative(self.output)))
            d_weights1 = np.dot(self.input.T,
                                (np.dot(2*(self.y - self.output) * sigmoid_derivative(self.output), self.weights2.T)
                                 * sigmoid_derivative(self.layer1)))
            # update the weights with the derivative (slope) of the loss function
            self.weights1 += d_weights1
            self.weights2 += d_weights2


  1. Ahire, Jayesh Bapu. 2018. "The Artificial Neural Networks Handbook: Part 4." Accessed 2019-07-31.
  2. Chigali, Nikhil. 2018. "Simple Perceptron Training Algorithm:Explained." Accessed 2019-07-31.
  3. DACS. 2019. "3.0 History of Neural Networks." DoD DACS. Accessed 2019-06-20.
  4. Escontrela, Alejandro. 2018. "Convolutional Neural Networks from the ground up." Accessed 2019-06-20.
  5. Fukushima, Kunihiko. 1975. "Cognitron: A self-organizing multilayered neural network." Accessed 2019-06-20.
  6. Goltsman, Kirill. 2017. "Introduction to Artificial Neural Networks - A Whitepaper." Accessed 2019-06-20.
  7. Jiaconda. 2016. "A Concise History of Neural Networks." Accessed 2019-06-20.
  8. Jones, Edward. 2004. "An Introduction to Neural Networks - A White Paper." Accessed 2019-06-20.
  9. Juergen. 2019. "Juergen Schmidhuber's Home Page." Accessed 2019-06-20.
  10. Keysers, Christian, and Valeria Gazzola. 2014. "Hebbian learning and predictive mirror neurons for actions, sensations and emotions." National Center for Biotechnology Information, U.S. National Library of Medicine. Accessed 2019-06-20.
  11. Loiseau, Jean-Christophe B. 2019. "Rosenblatt’s perceptron, the first modern neural network." Accessed 2019-06-20.
  12. Loy, James. 2018. "How to build your own Neural Network from scratch in Python." Accessed 2019-07-31.
  13. Maladkar, Kishan. 2018. "6 Types of Artificial Neural Networks Currently Being Used in Machine Learning." Analytics India Magazine, January 15. Accessed 2020-08-17.
  14. Mastin, Luke. 2019. "Neurons & Synapses." The Human Memory. Accessed 2019-07-31.
  15. ML Glossary. 2020. "Neural Networks: Concepts." ML Glossary, January 25. Accessed 2020-08-17.
  16. Roell, Jason. 2017. "From Fiction to Reality: A Beginner’s Guide to Artificial Neural Networks." Accessed 2019-06-20.
  17. Sharma, Sagar. 2017. "What the Hell is Perceptron?" Accessed 2019-07-31.
  18. Sharma, Sagar. 2017b. "Activation Functions in Neural Networks." Accessed 2019-07-31.
  19. Stanford. 2019. "Neural Networks History: The 1940's to the 1970's." Accessed 2019-06-20.
  20. Vieira. 2017. "Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications." Neuroscience & Biobehavioral Reviews. Accessed 2019-07-31.
  21. Werbos, Paul J. 1990. "Backpropagation Through Time: What It Does and How to do it." Accessed 2019-06-20.
  22. Widrow, Bernard, and Michael A. Lehr. 2003. "Perceptrons, Adalines, and Backpropagation." MIT Press. Accessed 2019-06-20.
  23. Wikipedia. 2020. "Artificial neural network." Wikipedia, August 14. Accessed 2020-08-17.
  24. Zhang, QJ. 2000. "Neural Networks for RF and Microwave Design." Accessed 2019-07-31.

Further Reading

  1. Chiarandini, Marco. 2019. "Machine Learning: Linear Regression and Neural Networks." Accessed 2019-06-20.
  2. Goltsman, Kirill. 2017. "Introduction to Artificial Neural Networks - A Whitepaper." Accessed 2019-06-20.
  3. Jones, Edward. 2004. "An Introduction to Neural Networks - A White Paper." Accessed 2019-06-20.
  4. Hansen, Casper Bøgeskov. 2019. "Neural Networks: Feedforward and Backpropagation Explained & Optimization." Machine Learning From Scratch, August 05. Accessed 2019-08-20.
  5. Moawad, Assaad. 2018. "Neural networks and back-propagation explained in a simple way." Medium, February 01. Accessed 2019-08-20.
  6. Mehta, Anukrati. 2019. "A Complete Guide to Types of Neural Networks." Blog, Digital Vidya, January 25. Accessed 2019-08-20.

Cite As

Devopedia. 2020. "Artificial Neural Network." Version 9, August 17. Accessed 2020-11-24.
Contributed by
3 authors

Last updated on
2020-08-17 05:50:31