• A selection of Deep Learning frameworks, some with corporate backing. Source: den Bakker 2017.
• Mention of DL frameworks in arXiv research papers from 2012-2018. Source: Neuromation 2018.
• A chart from 2018 showing some popular DL frameworks. Source: Hale 2018.
• A 2017 comparison of some DL frameworks. Source: Rubashkin 2017.
• Mobile libraries for DL inference. Source: Koul 2017, slide 25.

# Deep Learning Frameworks

Created by gops75 on 2018-10-07. Last updated by arvindpdmn on 2019-09-23.

## Summary

Deep Learning (DL) is a neural network approach to Machine Learning (ML). While it's possible to build DL solutions from scratch, DL frameworks are a convenient way to build them quickly. Such frameworks provide different neural network architectures out of the box in popular languages so that developers can use them across multiple platforms.

Choosing a framework for your problem depends on a number of factors. Therefore, it's not possible to name just one framework that should be preferred over another. Many frameworks are open source. Cloud providers also provide easy ways to deploy and execute a framework on their infrastructure.

Sometimes the term "framework" is used interchangeably with the terms "toolkit" and "library".

## Milestones

1965

While the theory of neural networks inspired by the human brain was first formulated in 1943, Alexey Ivakhnenko and his team create the first working Deep Learning network in 1965. In 1971, they succeed in building an 8-layer network.

2000

The term Deep Learning (DL) is used for the first time to mean a neural network with many layers. By 2017, a DL network is as much as 1000 layers deep.

2007

At the Montreal Institute for Learning Algorithms (MILA), Theano is developed in Python for efficient math operations that can run on either CPU or GPU architectures. It enables developers to run rapid experiments in Deep Learning. In later years, Theano goes on to inspire other frameworks. Ten years later (in 2017), it's announced that Theano will no longer be actively maintained.

2013

Created by the Berkeley Vision and Learning Center (BVLC) at UC Berkeley, the Caffe framework is released. In 2017, Facebook open sources an evolution of Caffe called Caffe2.

2015

This year may be considered the turning point for DL frameworks. A number of DL frameworks are released: Chainer, Keras, Apache MXNet and TensorFlow.

Jan
2016

Microsoft releases and open sources Computational Network Toolkit (CNTK), a DL toolkit it's been using internally for speech and image recognition. CNTK supports multiple GPUs across multiple machines.

Apr
2016

Python-based open source library Keras 1.0 is released. It's a major re-write of Keras while being backward compatible. It's been said that Keras has an "API designed for human beings, not machines".

2017

Among the frameworks to come out in 2017 is Caffe2 from Facebook. Deeplearning4j for the Java and Scala communities becomes part of the Eclipse Foundation, although its origins can be traced to 2014. PyTorch is open sourced by Facebook and starts to become popular when the fast.ai DL course adopts it.

Dec
2017

With so many DL frameworks, the landscape can look very fragmented for developers. Open Neural Network Exchange (ONNX) is released as an open format that enables developers to export/import models from/to frameworks. For example, you can build a PyTorch model, export it in ONNX format, and import it into MXNet where it can be used for inference.

Feb
2018

An analysis of 45,000 arXiv ML papers shows the popularity of TensorFlow, which overtook Caffe in 2017. Adoption of Keras and PyTorch also appears to be growing.

May
2018

Caffe2 and PyTorch, both out of Facebook, plan to come together into a single platform. The intent is to "combine the flexible user experience of the PyTorch frontend with scaling, deployment and embedding capabilities of the Caffe2 backend".

Oct
2018

Based on PyTorch, v1.0 of fastai is released as a free open source library for DL.

## Discussion

• Could you mention some popular DL frameworks?

Among the popular open source DL frameworks are TensorFlow, Caffe, Keras, PyTorch, Caffe2, CNTK, MXNet, Deeplearning4j (DL4J), and many more. Many of these frameworks support Python as the programming language of choice. DL4J is for Java programmers, but models written in Keras can be imported into DL4J. Those supporting C++ include Caffe, DL4J, CNTK, MXNet and TensorFlow. Torch was written in Lua and C, and PyTorch extends and improves on it with Python support. Paddle is a framework from Baidu.

These frameworks are not all at the same level of abstraction. For example, Keras provides a simpler API for developers and sits on top of TensorFlow, Theano or CNTK. Likewise, Gluon is an API that can work with MXNet and CNTK. Gluon can be seen as competition to Keras.

Among those not open sourced are Intel Math Kernel Library, MATLAB Neural Network Toolbox, Neural Designer, and Wolfram Mathematica NeuralNetworks.

A curated list of ML frameworks is available online.

• How should I go about selecting a suitable DL framework?

Some obvious factors to consider are licensing, documentation, active community, adoption, programming language, modularity, ease of use, and performance. Keras and PyTorch are said to be easy to use but you can also consider TensorFlow for its popularity.

More specifically, you should check the following:

• Style: Imperative or symbolic.
• Core Development Environment: Programming language, intuitive API, fast compile times, tools, debugger support, abstracting the computational graph, graph visualization (TensorBoard), etc.
• Neural Network Architecture: Support for Deep Autoencoders, Restricted Boltzmann Machines (RBMs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTMs), Generative Adversarial Networks (GANs), etc.
• Targeted Application Areas: Image recognition, video detection, voice/audio recognition, text analytics, Natural Language Processing (NLP), timeseries forecasting, etc.
• Hardware Extensibility: Support for multiple CPUs, GPUs, GPGPUs or TPUs across multiple machines or clusters.
• Optimized for Hardware: Execute in optimized low-level code by supporting CUDA, BLAS, etc.
• Deployment: Framework should be easy to deploy in production (TensorFlow Serving).

• Which DL frameworks are symbolic and which ones are imperative?

Let's briefly understand the difference between the two. Imperative programs perform computations as they are encountered along the program flow. Symbolic programs define symbols and how they should be combined. They result in what we call a computational graph. Symbols themselves might not have initial values. Symbols acquire values after the graph is compiled and invoked with particular values.

Torch, Chainer and Minerva are examples of imperative-style DL frameworks. Symbolic-style DL frameworks include TensorFlow, Theano and CGT. Also in symbolic style are CXXNet and Caffe, which define the graph in configuration files.

Imperative frameworks are more flexible since you're closer to the language. In symbolic frameworks, there's less flexibility since you write in a domain-specific language. However, symbolic frameworks tend to be more efficient, both in terms of memory and speed. At times, it might make sense to use a mix of both framework styles. For example, parameter updates are done imperatively and gradient calculations are done symbolically. MXNet allows a mix of both styles. Gluon uses an imperative style for easier model development while also supporting dynamic graphs.
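As a rough sketch of this mixed style in plain Python (no real framework API is used; `compile_gradient` is a hypothetical stand-in for a symbolic compilation step), the gradient is defined once, graph-style, while parameter updates run imperatively:

```python
# Sketch: symbolic-style gradient definition, imperative-style updates.
# Plain Python stand-in, not an actual framework API.

def compile_gradient():
    """Return a function computing d/dw of the loss (w*x - y)**2.
    In a symbolic framework this derivative would be derived
    automatically from the computational graph; here it's hand-written."""
    def grad(w, x, y):
        return 2.0 * (w * x - y) * x
    return grad

grad_fn = compile_gradient()   # "compile" once, like a symbolic graph

# Imperative training loop: parameter updates happen step by step
w, lr = 0.0, 0.1
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x
for _ in range(100):
    for x, y in data:
        w -= lr * grad_fn(w, x, y)

print(round(w, 3))  # converges close to 2.0
```

The split mirrors the MXNet idea described above: the expensive, optimizable part (gradient computation) is fixed up front, while the control flow around it stays flexible.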

• From a mathematical perspective, what are the main features expected of a DL framework?

All data is represented as tensors (multi-dimensional arrays). A DL framework must therefore support tensors and operations on them.
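In NumPy terms (used here only as a stand-in for a framework's tensor type), the expected tensor support looks like:

```python
import numpy as np

# A rank-3 tensor: e.g. a batch of 2 grayscale images of size 3x4
t = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

print(t.ndim)    # 3 (rank of the tensor)
print(t.shape)   # (2, 3, 4)

# Typical tensor operations a DL framework must provide:
s = t + 1.0                  # elementwise op (scalar broadcast)
m = t.mean(axis=(1, 2))      # reduction over spatial axes -> shape (2,)
u = t.transpose(0, 2, 1)     # axis permutation -> shape (2, 4, 3)
f = t.reshape(2, -1)         # flatten each image -> shape (2, 12)

print(m.shape, u.shape, f.shape)
```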

The ability to define dynamic computational graphs is desired by developers. To be dynamic means that graph nodes can be added or removed at runtime. With PyTorch and Chainer, graphs can be defined dynamically. With TensorFlow, you have to define the entire computation graph before you can run it. More recently, TensorFlow has added TensorFlow Fold for dynamic graphs and eager execution for immediate evaluation.
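A toy illustration of why define-by-run is convenient (plain NumPy, not a framework API): the depth of the "graph" below depends on the data itself, something a static graph must anticipate up front:

```python
import numpy as np

def dynamic_forward(x, weights):
    """Apply layers until the activation norm falls below a threshold.
    The 'graph' is simply whatever code actually ran for this input,
    so its depth can differ from one input to the next."""
    h = x
    for w in weights:
        h = np.tanh(w @ h)
        if np.linalg.norm(h) < 0.1:   # data-dependent early exit
            break
    return h

rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 4)) for _ in range(5)]
x = rng.standard_normal(4)
out = dynamic_forward(x, weights)
print(out.shape)  # (4,)
```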

An essential operation in a DL network during learning is function differentiation. Automatic Differentiation is a feature that must be supported by a DL framework. This is straightforward when the framework uses a computational graph.
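The idea can be sketched in a few lines of plain Python. This is an illustration of reverse-mode automatic differentiation over a tiny computational graph, not any framework's actual implementation:

```python
# Minimal reverse-mode automatic differentiation sketch. Each Var
# records how it was computed (its parents and local gradients), so
# backward() can propagate gradients through the graph via chain rule.

class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (parent_var, local_gradient) pairs
        self.grad = 0.0

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        return Var(self.value + other.value,
                   [(self, 1.0), (other, 1.0)])

    def backward(self, grad=1.0):
        self.grad += grad        # accumulate over all paths to this node
        for parent, local in self.parents:
            parent.backward(grad * local)

# f(x, y) = x*y + x  ->  df/dx = y + 1, df/dy = x
x, y = Var(3.0), Var(4.0)
f = x * y + x
f.backward()
print(f.value, x.grad, y.grad)  # 15.0 5.0 3.0
```

Because every operation also records its local derivative, differentiation falls out of the graph structure, which is why frameworks that build a computational graph get this feature almost for free.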

• In terms of hardware support for DL frameworks, what should I look for?

Deep Learning involves training and inference. Training is often done on cloud clusters. Inference can at times happen on constrained IoT devices, embedded systems or smartphones. Thus, trained models should be able to run on ARM-based hardware. For example, Caffe2 is suitable for smartphones while TensorFlow is for research and server-side deployment. However, there's TensorFlow Lite for inference on constrained devices.

When running on NVIDIA GPUs, you should check if there's support for CUDA that enables most efficient use of GPUs. An alternative to CUDA is OpenCL but most DL frameworks don't support it. OpenMP is more widely supported and it enables multiplatform shared memory multiprocessing.

In addition, hardware vendors provide a number of libraries and optimizers so that DL frameworks make the best use of their hardware. Among these are NVIDIA's cuDNN, cuBLAS, cuSPARSE, TensorRT, DeepStream, and many more. Intel offers Math Kernel Library for Deep Neural Networks (MKL-DNN), Data Analytics Acceleration Library (DAAL), OpenVINO, nGraph, and others.

• How do the popular DL frameworks compare in terms of performance?

Since DL frameworks are always improving, what's mentioned here should be only a starting point for your study.

One study from early 2018 compared some DL frameworks on three datasets. Nvidia GPUs were used along with cuDNN, an Nvidia library tuned for common DL computations. Among the faster frameworks are TensorFlow, PyTorch, MXNet, CNTK and Julia-Knet. It was seen that CNTK and TensorFlow are much faster than Keras-CNTK and Keras-TF respectively. While Keras simplifies development, the tradeoff is performance.

In other studies, TensorFlow is said to be slower than CNTK and MXNet. MXNet stands out in terms of scalability and performance.

Ultimately, the framework itself may not matter much when cuDNN is used to provide acceleration on NVIDIA's GPUs. Any difference in performance will come from how frameworks scale to multiple GPUs and machines.

## Sample Code

# Source: https://mxnet.incubator.apache.org/versions/master/architecture/program_model.html
# Accessed: 2019-01-21

# Imperative style in NumPy: each line executes immediately
import numpy as np
a = np.ones(10)
b = np.ones(10) * 2
c = b * a
d = c + 1

# Symbolic style (pseudocode): expressions such as B * A only define
# the computational graph; it's executed after being compiled and
# invoked with concrete values
A = Variable('A')
B = Variable('B')
C = B * A
D = C + Constant(1)
f = compile(D)
d = f(A=np.ones(10), B=np.ones(10)*2)



## Cite As

Devopedia. 2019. "Deep Learning Frameworks." Version 10, September 23. Accessed 2019-10-17. https://devopedia.org/deep-learning-frameworks