Artificial Neural Network

Article Info

Contributed by
3 authors

Last updated on
2020-08-17 05:50:31

Learning through ANN. Source: Sharma. 2017b.

Artificial Neural Network (ANN) belongs to the field of Machine Learning. It consists of computational models inspired by the human brain and biological neural networks. The goal is to simulate human intelligence, reasoning and memory to solve forecasting, pattern recognition and classification problems.

ANN is effective in scenarios where traditional ML methods such as regression, time series analysis or PCA cannot model or forecast accurately. This could be because of data bias, a mix of continuous and categorical data, or unclean or uncertain data.

Complex networks with multiple layers and many neurons are possible today, thanks to a dramatic increase in computing power, highly efficient GPUs and Big Data. Deep Learning, a modern approach to ANN that processes and transforms data as it cascades through hierarchical layers, has gained immense prominence. Computer vision, character and image recognition, speech detection and NLP are popular applications of ANN.

Discussion

In what ways does an ANN replicate the functioning of the human neural network?
Biological Vs Artificial Neural Network. Source: Roell. 2017.
In a biological neural network, nerve cells (neurons) are interconnected by signal transmitters (synapses) which pass on electrical/chemical signals to a target neuron. By summation of potentials, a signal either to excite (positive) or to inhibit (negative) is transmitted. The human brain contains about 86 billion neurons on average.
When stimulated by an electrical pulse, neurotransmitters are released. They cross into the synaptic gap between neurons and bind to chemical receptors in the receiving neuron. This affects the potential charge of the receiving neuron, and starts up a new electrical signal in the receiving neuron. The whole process takes less than 1/500th of a second. As a message moves from one neuron to another, it is converted from an electrical signal to a chemical signal and back in an ongoing chain of events which is the basis of all brain activity.
To draw parallels in ANN, a network of data elements called artificial neurons receives input, and each neuron changes its internal state (activation) based on that input. Each neuron then produces a result that depends on the input value and the activation function.
What is the structure and function of an ANN?
Structure of a Neuron and ANN. Source: Vieira. 2017.
The basic structure of an ANN involves a network of artificial neurons arranged in layers - one input layer, one output layer and one or more hidden layers in between.
Each neuron with the exception of those in the input layer, receives and processes stimuli (inputs) from other neurons. The processed information is available at the output end of the neuron.
Every connection has a weight (positive or negative) attached to it. Positive weights activate the neuron while negative weights inhibit it. The figure shows a network structure with inputs (x1, x2, … xm) being connected to a neuron with weights (w1, w2, … wm) on each connection. The neuron sums all the signals it receives, with each signal being multiplied by the weight on its connection.
This weighted sum is then passed through a transfer (activation) function g, which is normally non-linear, to give the final output y. By comparing this output to the actual value we determine the error. The process is repeated over several iterations until the error is within acceptable limits.
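The computation of a single neuron described above can be sketched in a few lines. This is only an illustration in plain Python: the function names, the optional bias term and the input and weight values are assumptions made for the example, and sigmoid is used as just one possible choice for g.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias=0.0, activation=sigmoid):
    """Weighted sum of the inputs (plus bias), passed through the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)

# Illustrative values only: three inputs, one of them on an inhibiting (negative) weight
x = [1.0, 0.5, -1.5]
w = [0.8, -0.4, 0.2]
y = neuron_output(x, w)
print(round(y, 4))
```

Swapping `activation` for another function (for example `math.tanh`) changes the range of the output without changing the weighted-sum step.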
What are some key terms used in describing ANN?
Perceptron example with weights. Source: Sharma. 2017.
We note the following terms:
- Perceptron - A linear binary classifier used in supervised learning. It acts as a single-node neural network. Larger networks built by stacking layers of such nodes are called multi-layer perceptrons (MLPs). A perceptron consists of four parts: inputs, weights and bias, weighted sum, and activation function. All inputs x are multiplied by their weights w, then added together with the bias to get a weighted sum. This sum is passed to the activation function to get the output y of the perceptron.
- Activation Function - A transfer function used to determine the output of a neuron. It maps the resulting values into a range such as (0 to 1) or (-1 to 1). Activation functions may be linear or non-linear, the sigmoid function being one non-linear example.
- Weights and Biases - The weight assigned to an input indicates its strength. The higher the weight, the greater the influence of that input on the outcome. The bias value shifts the activation function curve along the input axis, which moves the threshold at which the neuron activates.
- Loss Function - A way to evaluate "goodness" of predictions. It quantifies the gap between predicted and actual values. Sum Of Squares Error is an example loss function.
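To make these terms concrete, here is a minimal sketch of a perceptron together with a Sum Of Squares Error loss. The step activation, the function names and the hand-picked (not learned) weights and bias are assumptions chosen for illustration; they make the perceptron behave like a logical AND gate.

```python
def step(z):
    """Step activation: outputs 1 when the weighted sum reaches the threshold, else 0."""
    return 1 if z >= 0 else 0

def perceptron(inputs, weights, bias):
    """Inputs times weights, plus bias, passed through the activation function."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return step(weighted_sum)

def sum_of_squares_error(predicted, actual):
    """Loss function: total squared gap between predicted and actual values."""
    return sum((a - p) ** 2 for p, a in zip(predicted, actual))

# Hand-picked values for illustration; a trained perceptron would learn these
weights, bias = [1.0, 1.0], -1.5
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]  # logical AND

predictions = [perceptron(s, weights, bias) for s in samples]
loss = sum_of_squares_error(predictions, targets)
print(predictions, loss)
```

With these values the bias of -1.5 sets the threshold so that only the input (1, 1) produces a non-negative weighted sum.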
What are the most common types of Artificial Neural Networks used in ML?
We note the following types of ANNs:
- Feed Forward Networks - Simplest form of ANN where data travels in one direction through a network of perceptrons, from the input layer towards the output layer. Error in prediction is fed back to update weights and biases using backpropagation. Over multiple iterations, weights and bias values are tuned so that the predicted values converge on the actuals. Activation functions commonly used are sigmoid, tanh or ReLU.
- Recurrent Neural Networks - Work on the principle of saving the output of a layer and feeding it back to the input to help in predicting the outcome of the layer. From one time-step to the next, each neuron remembers some information it had in the previous time-step. This makes each neuron act like a memory cell in performing computations. Long Short-Term Memory (LSTM) networks are one variant built on this idea.
- Radial Basis Function Networks - Use functions whose value depends on the distance of the input from a center. The network has an input layer, a radial basis hidden layer and an output layer. Commonly used activation functions are Gaussian and multiquadric. Applied in Power Restoration Systems, where restoration happens from core priority areas to the periphery.
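The "memory" behaviour of a recurrent neuron can be sketched as follows. This is a simplified single-neuron illustration, not an LSTM: the function names, the tanh activation and the weight values are assumptions chosen for the example.

```python
import math

def rnn_step(x_t, h_prev, w_input, w_recurrent, bias):
    """One time-step: the new state depends on the current input and the previous state."""
    return math.tanh(w_input * x_t + w_recurrent * h_prev + bias)

def run_sequence(sequence, w_input=0.5, w_recurrent=0.9, bias=0.0):
    """Feed a sequence through the neuron, carrying its state from step to step."""
    h = 0.0  # the neuron starts with no remembered information
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_input, w_recurrent, bias)
        states.append(h)
    return states

# A single input pulse followed by silence: the state fades but does not vanish at once
states = run_sequence([1.0, 0.0, 0.0])
print([round(h, 3) for h in states])
```

Even though the second and third inputs are zero, the state stays non-zero because part of the previous state is fed back at each step.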
How does an ANN learn?
The ANN learns through an iterative training routine where weights and biases are continually adjusted to improve strength of the prediction. Steps are:
- Identify input values and assign random initial values as weights to these inputs.
- For supervised learning, segregate training and test data sets.
- Decide on the number of hidden layers and the number of nodes in each.
- Identify number of epochs and batch sizes. Training data set is generally sliced into batches. An epoch is one complete pass through the whole training set. Training typically requires several epochs.
- Pass the training set through the layers. Determine the predicted outcome at the end of the iteration.
- Measure the prediction error, then adjust the weights (W = W + ΔW) through backpropagation. The learning rate controls how large the change ΔW is in each iteration. A large learning rate trains the network faster but may overshoot the optimal solution, while a small one converges slowly. Gradient descent, the method used to optimise the loss/cost function, uses the learning rate as its step size.
- After several iterations, the network achieves acceptable reliability in its predicted outcome. Verify by applying the test data set and checking for model accuracy.
Now the ANN is said to have ‘learned’ and is ready for deployment.
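The training routine above can be sketched for the smallest possible network: a single sigmoid neuron trained with mini-batch gradient descent on a sum-of-squares loss. The function names, the logical OR data set and the hyperparameter values (epochs, batch size, learning rate, seed) are assumptions made for this illustration. Here ΔW is minus the learning rate times the gradient of the loss.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def total_loss(weights, bias, data):
    """Sum of squares error over the whole data set."""
    return sum((t - predict(weights, bias, x)) ** 2 for x, t in data)

def train(data, epochs=2000, batch_size=2, learning_rate=0.5, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in data[0][0]]  # random initial weights
    bias = 0.0
    for _ in range(epochs):                  # one epoch = one full pass over the data
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            grad_w = [0.0] * len(weights)
            grad_b = 0.0
            for x, t in batch:
                p = predict(weights, bias, x)
                delta = 2 * (p - t) * p * (1 - p)  # chain rule through loss and sigmoid
                for i, xi in enumerate(x):
                    grad_w[i] += delta * xi
                grad_b += delta
            # W = W + ΔW, with ΔW = -learning_rate * gradient
            weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
            bias -= learning_rate * grad_b
    return weights, bias

# Logical OR as a toy training set: (inputs, target)
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(data)
predictions = [round(predict(weights, bias, x)) for x, _ in data]
print(predictions, round(total_loss(weights, bias, data), 4))
```

A real workflow would also hold out a separate test set, as described in the steps above; the toy data here is too small for that.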
What are the key differentiators between traditional ML techniques and ANN?
ANN and traditional ML techniques like logistic regression are algorithms meant to do the same thing - classification of data. However, while logistic regression is a statistical method, ANN is a heuristic method modelled on the human brain.
In many cases, simple neural network configurations yield the same solution as many traditional statistical applications. For example, a single-layer, feed forward neural network with linear activation for its output perceptron is equivalent to a general linear regression fit. When the same single-layer network uses a sigmoid activation function instead, it behaves like logistic regression.
However, one of the unique aspects of an ANN is the presence of its hidden layers. Since the transformations in these layers are learned automatically, they are not easily expressed in statistical terms. Hence interpreting or debugging the data values at these intermediate steps is difficult, whereas in regular ML algorithms the input-to-output transition can be traced entirely.
The ability of data to pass through successive hidden layers allows hierarchical processing. This is the reason ANN forms the basis of Deep Learning, while other ML techniques are unsuitable.
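The equivalence with linear regression can be checked numerically. The sketch below trains a single neuron with linear (identity) activation by gradient descent and compares the result with the closed-form least-squares fit. The function names, data points and hyperparameters are made up for illustration.

```python
def fit_linear_neuron(xs, ys, learning_rate=0.05, steps=5000):
    """Single neuron, identity activation, trained on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def least_squares(xs, ys):
    """Closed-form simple linear regression (slope and intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Made-up, slightly noisy points lying roughly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1]

w, b = fit_linear_neuron(xs, ys)
slope, intercept = least_squares(xs, ys)
print(round(w, 4), round(b, 4), round(slope, 4), round(intercept, 4))
```

Both approaches minimise the same squared-error objective, so the neuron's weight and bias settle on the regression slope and intercept.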
When to use and not to use Artificial Neural Networks?
When data is well structured, free of inconsistencies and somewhat linear in nature, traditional ML models such as linear regression, classification or PCA work remarkably well.
But in applications such as text validation or speech recognition, data tends to be non-linear and incomplete. It is also subject to human error and variations of language, dialect or handwriting. In such applications, ANN works with good accuracy.
For large and complex problems where only a small amount of time series data is available, or where the data contains large amounts of noise, standard ML approaches can become difficult, even impossible. They may require weeks of a statistician’s time to build, as the data clean-up and pre-processing effort is huge.
Neural networks accommodate circumstances where the existing data has useful information to offer, but it might be clouded by factors mentioned above. Neural networks can also account for mixtures of continuous and categorical data.
To build a predictive model for a complex system, ANN does not require a statistician and domain expert to screen through every possible combination of variables. Thus, the neural network approach can dramatically reduce the time required to build a model.

Milestones

1943

Neurophysiologist Warren McCulloch and mathematician Walter Pitts write a paper on how neurons might work. In order to describe how neurons in the brain might work, they model a simple neural network using electrical circuits.

1949

Donald Hebb, in his book The Organization of Behavior, proposes a model of learning based on neural plasticity. Later called Hebbian Learning, it's often summarized by the phrase "cells that fire together, wire together".

1958

Frank Rosenblatt, a psychologist at Cornell, proposes the idea of a Perceptron, modeled on the McCulloch-Pitts neuron.

1960

Widrow and Hoff develop a learning procedure that examines weight values and determines the output of perceptrons accordingly.

1975

The first multilayered network is developed. It's an unsupervised network.

1975

A key trigger for renewed interest in neural networks and learning is Werbos's backpropagation algorithm, which makes the training of multi-layer networks feasible and efficient.

1985

The American Institute of Physics establishes an annual meeting, Neural Networks in Computing.

1987

The first International Conference on Neural Networks is organized by the Institute of Electrical and Electronics Engineers (IEEE).

2009

Recurrent neural networks and deep feedforward neural networks developed in Schmidhuber's research group win eight international competitions in pattern recognition and machine learning.

2010

Backpropagation training through max-pooling is accelerated by GPUs and shown to perform better than other pooling variants.

2012

Ng and Dean create a network that learns to recognize higher level concepts such as cats, only from watching unlabeled images taken from YouTube videos.

Sample Code

Python

# Python sample to build a 2-layer feed-forward neural network with backpropagation
# Source: https://towardsdatascience.com/how-to-build-your-own-neural-network-from-scratch-in-python-68998a08e4f6
# Accessed: 2019-08-04

import numpy as np


# Helper functions that the class relies on (not shown in the original excerpt)
def sigmoid(x):
    # Logistic activation: maps any real value into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))


def sigmoid_derivative(s):
    # Derivative of the sigmoid, expressed in terms of the sigmoid's output s
    return s * (1.0 - s)


class NeuralNetwork:
    def __init__(self, x, y):
        self.input      = x
        self.weights1   = np.random.rand(self.input.shape[1], 4)  # input layer -> hidden layer (4 nodes)
        self.weights2   = np.random.rand(4, 1)                    # hidden layer -> output layer
        self.y          = y
        self.output     = np.zeros(self.y.shape)

    def feedforward(self):
        # propagate the inputs through the hidden layer to the output layer
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))

    def backprop(self):
        # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
        d_output = 2 * (self.y - self.output) * sigmoid_derivative(self.output)
        d_weights2 = np.dot(self.layer1.T, d_output)
        d_weights1 = np.dot(self.input.T,
                            np.dot(d_output, self.weights2.T) * sigmoid_derivative(self.layer1))

        # update the weights with the derivative (slope) of the loss function
        self.weights1 += d_weights1
        self.weights2 += d_weights2

Cite As

Devopedia. 2020. "Artificial Neural Network." Version 9, August 17. Accessed 2023-11-12. https://devopedia.org/artificial-neural-network
