
Deep Learning Frameworks

Summary

A selection of Deep Learning frameworks, some with corporate backing. Source: den Bakker 2017.

Deep Learning (DL) is a neural network approach to Machine Learning (ML). While it's possible to build DL solutions from scratch, DL frameworks are a convenient way to build them quickly. Such frameworks provide different neural network architectures out of the box in popular languages so that developers can use them across multiple platforms.

Choosing a framework for your problem depends on a number of factors. Therefore, it's not possible to name just one framework that should be preferred over another. Many frameworks are open source. Cloud providers also provide easy ways to deploy and execute a framework on their infrastructure.

Sometimes the term "framework" is used interchangeably with the terms "toolkit" or "library".

Milestones

1965

While the theory of neural networks inspired by the human brain was first formulated in 1943, Alexey Ivakhnenko and his team create the first working Deep Learning network in 1965. In 1971, they succeed in building an 8-layer network.

2000

The term Deep Learning (DL) is used for the first time to mean a neural network with many layers. By 2017, a DL network is as much as 1000 layers deep.

2007

At the Montreal Institute for Learning Algorithms (MILA), Theano is developed in Python for efficient math operations that can run on either CPU or GPU architectures. It enables developers to run rapid experiments in Deep Learning. In later years, Theano goes on to inspire other frameworks. Ten years later (in 2017), it's announced that Theano will no longer be actively maintained.

2013

Created by the Berkeley Vision and Learning Center (BVLC) at UC Berkeley, the Caffe framework is released. In 2017, Facebook open sources an evolution of Caffe called Caffe2.

2015

This year may be considered the turning point for DL frameworks. A number of DL frameworks are released: Chainer, Keras, Apache MXNet and TensorFlow.

Jan
2016

Microsoft releases and open sources Computational Network Toolkit (CNTK), a DL toolkit it's been using internally for speech and image recognition. CNTK supports multiple GPUs across multiple machines.

Apr
2016

Python-based open source library Keras 1.0 is released. It's a major rewrite of Keras while remaining backward compatible. It's been said that Keras has an "API designed for human beings, not machines".

2017

Among the frameworks to come out in 2017 is Caffe2 from Facebook. Deeplearning4j for the Java and Scala communities becomes part of the Eclipse Foundation, although its origins can be traced to 2014. PyTorch is open sourced by Facebook and starts to become popular when the DL course fast.ai adopts it.

Dec
2017

With so many DL frameworks, the landscape can look very fragmented for developers. Open Neural Network Exchange (ONNX) is released as an open format that enables developers to export/import models from/to frameworks. For example, you can build a PyTorch model, export it in ONNX format, and import into MXNet where it can be used for inference.

Feb
2018
Mention of DL frameworks in arXiv research papers from 2012-2018. Source: Neuromation 2018.

An analysis of 45,000 arXiv ML papers shows the popularity of TensorFlow, which overtook Caffe in 2017. Adoption of Keras and PyTorch also appears to be growing.

May
2018

Caffe2 and PyTorch, both out of Facebook, plan to come together into a single platform. The intent is to "combine the flexible user experience of the PyTorch frontend with scaling, deployment and embedding capabilities of the Caffe2 backend".

Oct
2018

Based on PyTorch, v1.0 of fastai is released as a free open source library for DL.

Discussion

  • Could you mention some popular DL frameworks?
    A chart from 2018 showing some popular DL frameworks. Source: Hale 2018.

    Among the popular open source DL frameworks are TensorFlow, Caffe, Keras, PyTorch, Caffe2, CNTK, MXNet, Deeplearning4j (DL4J), and many more. Many of these frameworks support Python as the programming language of choice. DL4J is for Java programmers, but models written in Keras can be imported into DL4J. Frameworks supporting C++ include Caffe, DL4J, CNTK, MXNet and TensorFlow. Torch was written in Lua and C; PyTorch extends and improves on it with Python support. Paddle is a framework from Baidu.

    These frameworks are not all at the same level of abstraction. For example, Keras provides a simpler API for developers and sits on top of TensorFlow, Theano or CNTK. Likewise, Gluon is an API that can work with MXNet and CNTK. Gluon can be seen as competition to Keras.

    Among those not open sourced are Intel Math Kernel Library, MATLAB Neural Network Toolbox, Neural Designer, and Wolfram Mathematica NeuralNetworks.

    A curated list of ML frameworks is available online.

  • How should I go about selecting a suitable DL framework?
    A 2017 comparison of some DL frameworks. Source: Rubashkin 2017.

    Some obvious factors to consider are licensing, documentation, active community, adoption, programming language, modularity, ease of use, and performance. Keras and PyTorch are said to be easy to use but you can also consider TensorFlow for its popularity.

    More specifically, you should check the following:

    • Style: Imperative or symbolic.
    • Core Development Environment: Programming language, intuitive API, fast compile times, tools, debugger support, abstracting the computational graph, graph visualization (TensorBoard), etc.
    • Neural Network Architecture: Support for Deep Autoencoders, Restricted Boltzmann Machines (RBMs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTMs), Generative Adversarial Networks (GANs), etc.
    • Optimization Algorithms: Gradient Descent (GD), Momentum-based GD, AdaGrad, RMSProp and Adam, etc.
    • Targeted Application Areas: Image recognition, video detection, voice/audio recognition, text analytics, Natural Language Processing (NLP), timeseries forecasting, etc.
    • Hardware Extensibility: Support for multiple CPUs, GPUs, GPGPUs or TPUs across multiple machines or clusters.
    • Optimized for Hardware: Execute in optimized low-level code by supporting CUDA, BLAS, etc.
    • Deployment: Framework should be easy to deploy in production (TensorFlow Serving).
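    As a toy illustration of the optimization algorithms listed above, here's a minimal sketch (not tied to any framework; the function and learning rates are made up for illustration) of plain and momentum-based gradient descent minimizing a one-dimensional quadratic:

```python
# Minimize f(w) = (w - 4)^2, whose gradient is 2*(w - 4).
def grad(w):
    return 2.0 * (w - 4.0)

# Plain gradient descent: step against the gradient
w = 0.0
for _ in range(100):
    w -= 0.1 * grad(w)

# Momentum-based gradient descent: velocity accumulates past gradients
w2, velocity = 0.0, 0.0
for _ in range(100):
    velocity = 0.9 * velocity + grad(w2)
    w2 -= 0.05 * velocity

print(w, w2)  # both values approach the minimum at 4.0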
  • Which DL frameworks are symbolic and which ones are imperative?

    Let's briefly understand the difference between the two. Imperative programs perform computations as they are encountered along the program flow. Symbolic programs define symbols and how they should be combined. They result in what we call a computational graph. Symbols themselves might not have initial values. Symbols acquire values after the graph is compiled and invoked with particular values.

    Torch, Chainer and Minerva are examples of imperative-style DL frameworks. Symbolic-style DL frameworks include TensorFlow, Theano and CGT. Also in symbolic style are CXXNet and Caffe that define the graph in configuration files.

    Imperative frameworks are more flexible since you're closer to the host language. Symbolic frameworks are less flexible since you write in a domain-specific language; however, they tend to be more efficient, both in terms of memory and speed. At times, it makes sense to mix both styles: for example, parameter updates are done imperatively while gradient calculations are done symbolically. MXNet allows a mix of both styles. Gluon uses an imperative style for easier model development while also supporting dynamic graphs.

  • From a mathematical perspective, what are the main features expected of a DL framework?

    All data is represented as tensors (multi-dimensional arrays). A DL framework must therefore support tensors and operations on them.
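    As a concrete illustration (using NumPy, whose arrays many framework tensor APIs resemble), a batch of colour images is naturally a 4-dimensional tensor, and typical framework operations act on whole tensors at once:

```python
import numpy as np

# A batch of 32 RGB images, each 64x64 pixels: a 4-D tensor
batch = np.zeros((32, 64, 64, 3))

# Element-wise operations and reductions are tensor operations
normalized = (batch - batch.mean()) / (batch.std() + 1e-8)
per_image_mean = batch.mean(axis=(1, 2, 3))  # reduce over all but the batch axis

print(batch.ndim, normalized.shape, per_image_mean.shape)
```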

    The ability to define dynamic computational graphs is desired by developers. To be dynamic means that graph nodes can be added or removed at runtime. With PyTorch and Chainer, graphs can be defined dynamically. With TensorFlow, you have to define the entire computational graph before you can run it, although more recently it offers TensorFlow Fold for dynamic graphs and eager execution for immediate evaluation.

    An essential operation in a DL network during learning is function differentiation. Automatic Differentiation is a feature that must be supported by a DL framework. This is straightforward when the framework uses a computational graph.
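    As a toy illustration (not how any real framework implements it), forward-mode automatic differentiation can be sketched with dual numbers: each value carries its derivative, so gradients fall out of ordinary arithmetic:

```python
# Each Dual holds a value and its derivative with respect to the input.
class Dual:
    def __init__(self, val, grad=0.0):
        self.val, self.grad = val, grad
    def __add__(self, other):
        # sum rule: (u + v)' = u' + v'
        return Dual(self.val + other.val, self.grad + other.grad)
    def __mul__(self, other):
        # product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.grad * other.val + self.val * other.grad)

x = Dual(3.0, grad=1.0)   # seed dx/dx = 1
y = x * x + x             # f(x) = x^2 + x
print(y.val, y.grad)      # f(3) = 12.0, f'(3) = 2*3 + 1 = 7.0
```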

  • In terms of hardware support for DL frameworks, what should I look for?
    Mobile libraries for DL inference. Source: Koul 2017, slide 25.

    Deep Learning involves training and inference. Training is often done on cloud clusters. Inference can at times happen on constrained IoT devices, embedded systems or smartphones. Thus, trained models should be able to run on ARM-based hardware. For example, Caffe2 is suitable for smartphones while TensorFlow is for research and server-side deployment. However, there's TensorFlow Lite for inference on constrained devices.

    When running on NVIDIA GPUs, you should check if there's support for CUDA, which enables the most efficient use of the GPUs. An alternative to CUDA is OpenCL, but most DL frameworks don't support it. OpenMP is more widely supported; it enables multi-platform shared-memory multiprocessing.

    In addition, hardware vendors provide a number of libraries and optimizers so that DL frameworks make the best use of their hardware. Among these are NVIDIA's cuDNN, cuBLAS, cuSPARSE, TensorRT, DeepStream, and many more. Intel offers Math Kernel Library for Deep Neural Networks (MKL-DNN), Data Analytics Acceleration Library (DAAL), OpenVINO, nGraph, and others.

  • How do the popular DL frameworks compare in terms of performance?

    Since DL frameworks are always improving, what's mentioned here should be only a starting point for your study.

    One study from early 2018 compared some DL frameworks on three datasets. NVIDIA GPUs were used along with cuDNN, an NVIDIA library tuned for common DL computations. Among the faster frameworks are TensorFlow, PyTorch, MXNet, CNTK and Julia-Knet. It was seen that CNTK and TensorFlow are much faster than Keras-CNTK and Keras-TF respectively. While Keras simplifies development, the tradeoff is performance.

    In other studies, TensorFlow is said to be slower than CNTK and MXNet. MXNet stands out in terms of scalability and performance.

    Ultimately, the framework itself may not matter much when cuDNN is used to provide acceleration on NVIDIA's GPUs. Any difference in performance will come from how frameworks scale to multiple GPUs and machines.

Sample Code

  • # Source: https://mxnet.incubator.apache.org/versions/master/architecture/program_model.html
    # Accessed: 2019-01-21
     
    # Imperative style in NumPy
    import numpy as np
    a = np.ones(10)
    b = np.ones(10) * 2
    c = b * a
    d = c + 1
     
    # Symbolic style (illustrative pseudocode: Variable, Constant and
    # compile are placeholders, not real functions)
    # Expressions such as B * A define the computational graph
    # but they are executed only after the graph is compiled and invoked
    A = Variable('A')
    B = Variable('B')
    C = B * A
    D = C + Constant(1)
    f = compile(D)
    d = f(A=np.ones(10), B=np.ones(10)*2)
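    The symbolic half above is pseudocode. The same deferred-evaluation idea can be mimicked in plain Python (a rough sketch using closures as graph nodes; the function names are illustrative, not from any framework):

```python
import numpy as np

# Each "symbol" is a function from an environment of values to a result;
# composing them builds the graph, calling the root evaluates it.
def variable(name):
    return lambda env: env[name]

def mul(f, g):
    return lambda env: f(env) * g(env)

def add_const(f, c):
    return lambda env: f(env) + c

A = variable('A')
B = variable('B')
D = add_const(mul(B, A), 1)                 # graph defined, nothing computed yet
d = D({'A': np.ones(10), 'B': np.ones(10) * 2})
print(d)                                    # array of 3.0s, as in the imperative version
```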

References

  1. Bhatia, Richa. 2018. "TensorFlow Vs Caffe: Which Machine Learning Framework Should You Opt For?" Analytics India Magazine, August 07. Accessed 2019-01-21.
  2. Caffe2. 2018. "Caffe2 and PyTorch join forces to create a Research + Production platform PyTorch 1.0." Caffe2 Blog, May 02. Accessed 2019-01-21.
  3. Chollet, Francois. 2016. "Introducing Keras 1.0." The Keras Blog, April 11. Accessed 2019-01-20.
  4. Fogg, Andrew. 2018. "A History of Deep Learning." Import.io, May 30. Accessed 2019-01-20.
  5. Hale, Jeff. 2018. "Deep Learning Framework Power Scores 2018." Towards Data Science, September 20. Accessed 2019-01-20.
  6. Heller, Martin. 2018. "What is CUDA? Parallel programming for GPUs." InfoWorld, August 30. Accessed 2019-01-21.
  7. Howard, Jeremy. 2018. "fastai v1 for PyTorch: Fast and accurate neural nets using modern best practices." fast.ai, October 02. Accessed 2019-01-21.
  8. Intel AI Academy. 2019. "Tools, Libraries, and SDKs." Intel AI Academy, Intel Software. Accessed 2019-01-21.
  9. Koul, Anirudh. 2017. "Squeezing Deep Learning Into Mobile Phones." SlideShare, March 15. Accessed 2019-01-21.
  10. Linn, Allison. 2016. "Microsoft releases CNTK, its open source deep learning toolkit, on GitHub." The AI Blog, Microsoft, January 25. Accessed 2019-01-20.
  11. MXNet. 2019a. "Deep Learning Programming Style." Accessed 2019-01-21.
  12. MXNet. 2019b. "About Gluon." Accessed 2019-01-21.
  13. Makadia, Mitul. 2018. "Top 8 Deep Learning Frameworks." DZone, March 29. Accessed 2019-01-20.
  14. Maladkar, Kishan. 2018. "Evaluation Of Major Deep Learning Frameworks." Analytics India Magazine, April 17. Accessed 2019-01-20.
  15. Mannes, John. 2017. "Facebook open sources Caffe2, its flexible deep learning framework of choice." TechCrunch, April 18. Accessed 2019-01-20.
  16. Mwiti, Derrick. 2018. "Introduction to PyTorch for Deep Learning." Hearbeat, October 05. Accessed 2019-01-21.
  17. NVIDIA Developer. 2016. "Deep Learning Frameworks." April 05. Accessed 2019-01-20.
  18. Neuromation. 2018. "NeuroNuggets: An Overview of Deep Learning Frameworks." Neuromation, May 24. Accessed 2019-01-20.
  19. Peng, Tony. 2017. "RIP Theano." Synced, September 29. Accessed 2019-01-20.
  20. Rubashkin, Matthew. 2017. "Getting Started with Deep Learning." KDnuggets, March. Accessed 2019-01-20.
  21. Santhanam, Gokula Krishnan. 2017. "The Anatomy of Deep Learning Frameworks." KDnuggets, February. Accessed 2019-01-20.
  22. Skymind Wiki. 2019a. "Deeplearning4j." AI Wiki, Skymind. Accessed 2019-01-20.
23. Skymind Wiki. 2019b. "Comparison of AI Frameworks." AI Wiki, Skymind. Accessed 2019-01-20.
  24. Wikipedia. 2019a. "Comparison of deep learning software." Wikipedia, January 17. Accessed 2019-01-20.
  25. Wikipedia. 2019b. "Torch (machine learning)." Wikipedia, January 05. Accessed 2019-01-20.
  26. Wikipedia. 2019c. "Theano (software)." Wikipedia, January 17. Accessed 2019-01-20.
  27. den Bakker, Indra. 2017. "Battle of the Deep Learning frameworks — Part I: 2017, even more frameworks and interfaces." Towards Data Science, December 19. Accessed 2019-01-20.



Further Reading

1. Skymind Wiki. 2019b. "Comparison of AI Frameworks." AI Wiki, Skymind. Accessed 2019-01-20.
  2. Hale, Jeff. 2018. "Deep Learning Framework Power Scores 2018." Towards Data Science, September 20. Accessed 2019-01-20.
  3. Fogg, Andrew. 2018. "A History of Deep Learning." Import.io, May 30. Accessed 2019-01-20.


Cite As

Devopedia. 2019. "Deep Learning Frameworks." Version 9, January 22. Accessed 2019-02-22. https://devopedia.org/deep-learning-frameworks