Deep Learning Frameworks
Deep Learning (DL) is a neural network approach to Machine Learning (ML). While it's possible to build DL solutions from scratch, DL frameworks are a convenient way to build them quickly. Such frameworks provide different neural network architectures out of the box in popular languages so that developers can use them across multiple platforms.
Choosing a framework for your problem depends on a number of factors. Therefore, it's not possible to name just one framework that should be preferred over another. Many frameworks are open source. Cloud providers also provide easy ways to deploy and execute a framework on their infrastructure.
Sometimes the term "framework" is used interchangeably with the terms "toolkit" or "library".
Could you mention some popular DL frameworks?
Among the popular open source DL frameworks are TensorFlow, Caffe, Keras, PyTorch, Caffe2, CNTK, MXNet, Deeplearning4j (DL4J), and many more. Many of these frameworks support Python as the programming language of choice. DL4J is for Java programmers, but models written in Keras can be imported into DL4J. Frameworks supporting C++ include Caffe, DL4J, CNTK, MXNet and TensorFlow. Torch was written in Lua and C, and PyTorch extends and improves on it with Python support. Paddle is a framework from Baidu.
These frameworks are not all at the same level of abstraction. For example, Keras provides a simpler API for developers and sits on top of TensorFlow, Theano or CNTK. Likewise, Gluon is an API that can work with MXNet and CNTK. Gluon can be seen as competition to Keras.
A curated list of ML frameworks is available online.
How should I go about selecting a suitable DL framework?
Some obvious factors to consider are licensing, documentation, active community, adoption, programming language, modularity, ease of use, and performance. Keras and PyTorch are said to be easy to use, but you can also consider TensorFlow for its popularity. More specific criteria include:
- Style: Imperative or symbolic.
- Core Development Environment: Programming language, intuitive API, fast compile times, tools, debugger support, abstracting the computational graph, graph visualization (TensorBoard), etc.
- Neural Network Architecture: Support for Deep Autoencoders, Restricted Boltzmann Machines (RBMs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTMs), Generative Adversarial Networks (GANs), etc.
- Optimization Algorithms: Gradient Descent (GD), Momentum-based GD, AdaGrad, RMSProp and Adam, etc.
- Targeted Application Areas: Image recognition, video detection, voice/audio recognition, text analytics, Natural Language Processing (NLP), timeseries forecasting, etc.
- Hardware Extensibility: Support for multiple CPUs, GPUs, GPGPUs or TPUs across multiple machines or clusters.
- Optimized for Hardware: Execute in optimized low-level code by supporting CUDA, BLAS, etc.
- Deployment: Framework should be easy to deploy in production (e.g., TensorFlow Serving).
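As an illustration of the optimization algorithms bullet above, here's vanilla Gradient Descent next to Momentum-based GD in plain Python, minimizing the toy objective f(x) = (x - 3)². The learning rate and momentum coefficient are arbitrary choices for this sketch, not values any framework prescribes.

```python
def grad(x):
    """Derivative of the toy objective f(x) = (x - 3)**2."""
    return 2.0 * (x - 3.0)

# Vanilla Gradient Descent: step directly against the gradient.
x = 0.0
for _ in range(100):
    x -= 0.1 * grad(x)

# Momentum-based GD: a velocity term accumulates past gradients,
# smoothing the trajectory and often speeding up convergence.
m, velocity = 0.0, 0.0
for _ in range(100):
    velocity = 0.9 * velocity + 0.1 * grad(m)
    m -= velocity

# Both end up close to the minimum at x = 3.
```

AdaGrad, RMSProp and Adam follow the same update loop but additionally rescale the step per parameter using running statistics of past gradients.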
Which DL frameworks are symbolic and which ones are imperative?
Let's briefly understand the difference between the two. Imperative programs perform computations as they are encountered along the program flow. Symbolic programs define symbols and how they should be combined, resulting in what we call a computational graph. Symbols themselves might not have initial values; they acquire values only after the graph is compiled and invoked with particular values.
Torch, Chainer and Minerva are examples of imperative-style DL frameworks. Symbolic-style DL frameworks include TensorFlow, Theano and CGT. CXXNet and Caffe are also symbolic, defining the graph in configuration files.
Imperative frameworks are more flexible since you're closer to the language. In symbolic frameworks, there's less flexibility since you write in a domain-specific language. However, symbolic frameworks tend to be more efficient, both in terms of memory and speed. At times, it might make sense to use a mix of both framework styles. For example, parameter updates are done imperatively and gradient calculations are done symbolically. MXNet allows a mix of both styles. Gluon uses an imperative style for easier model development while also supporting dynamic graphs.
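The contrast between the two styles can be sketched in plain Python. This is a toy illustration, not any framework's actual API: the `Sym` class and its `run` method are invented for this example.

```python
# Imperative style: each statement computes a value as soon as it's reached.
a = 1.0
c = a * 2 + 1          # c is 3.0 right away

# Symbolic style: first build a graph of symbols with no values...
class Sym:
    """A toy symbolic node: knows how to compute itself, given inputs."""
    def __init__(self, fn):
        self.fn = fn                                  # deferred computation
    def __mul__(self, k):
        return Sym(lambda env: self.fn(env) * k)      # add a node to the graph
    def __add__(self, k):
        return Sym(lambda env: self.fn(env) + k)
    def run(self, env):
        return self.fn(env)                           # evaluate the whole graph

A = Sym(lambda env: env["A"])   # A is a placeholder with no value yet
C = A * 2 + 1                   # builds the graph; nothing is computed here

# ...then invoke the graph with particular values.
result = C.run({"A": 1.0})      # only now does C evaluate to 3.0
```

Because the symbolic program sees the whole graph before running it, a framework can optimize it (fuse operations, reuse memory) in ways an imperative program's line-by-line execution does not allow.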
From a mathematical perspective, what are the main features expected of a DL framework?
The ability to define dynamic computational graphs is desired by developers. To be dynamic means that graph nodes can be added or removed at runtime. With PyTorch and Chainer, graphs can be defined dynamically. With TensorFlow, you have to define the entire computation graph before you can run it, although TensorFlow has since added TensorFlow Fold for dynamic graphs and eager execution for immediate evaluation.
An essential operation in a DL network during learning is function differentiation. Automatic Differentiation is a feature that must be supported by a DL framework. This is straightforward when the framework uses a computational graph.
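Automatic differentiation over a computational graph can itself be sketched in a few lines: each node records its parents along with the local gradient of its output with respect to each parent, and a backward pass applies the chain rule, accumulating gradients. This is a minimal reverse-mode illustration of the idea, not any framework's actual implementation.

```python
class Var:
    """A toy graph node that supports reverse-mode automatic differentiation."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # pairs of (parent_node, local_gradient)
        self.grad = 0.0

    def __mul__(self, other):
        # d(x*y)/dx = y, d(x*y)/dy = x
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def __add__(self, other):
        # d(x+y)/dx = 1, d(x+y)/dy = 1
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def backward(self, seed=1.0):
        # Chain rule: accumulate, then push the gradient to parents.
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x = Var(3.0)
y = Var(4.0)
z = x * y + x        # builds the graph for z = x*y + x
z.backward()         # dz/dx = y + 1 = 5, dz/dy = x = 3
```

Note how `x.grad` correctly accumulates contributions from both places `x` appears in the expression, which is exactly what a DL framework does when a weight feeds multiple layers.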
In terms of hardware support for DL frameworks, what should I look for?
Deep Learning involves training and inference. Training is often done on cloud clusters. Inference can at times happen on constrained IoT devices, embedded systems or smartphones. Thus, trained models should be able to run on ARM-based hardware. For example, Caffe2 is suitable for smartphones while TensorFlow is for research and server-side deployment. However, there's TensorFlow Lite for inference on constrained devices.
When running on NVIDIA GPUs, you should check if there's support for CUDA that enables most efficient use of GPUs. An alternative to CUDA is OpenCL but most DL frameworks don't support it. OpenMP is more widely supported and it enables multiplatform shared memory multiprocessing.
In addition, hardware vendors provide a number of libraries and optimizers so that DL frameworks make the best use of their hardware. Among these are NVIDIA's cuDNN, cuBLAS, cuSPARSE, TensorRT, DeepStream, and many more. Intel offers Math Kernel Library for Deep Neural Networks (MKL-DNN), Data Analytics Acceleration Library (DAAL), OpenVINO, nGraph, and others.
How do the popular DL frameworks compare in terms of performance?
Since DL frameworks are always improving, what's mentioned here should be only a starting point for your study.
One study from early 2018 compared some DL frameworks on three datasets. NVIDIA GPUs were used along with cuDNN, an NVIDIA library tuned for common DL computations. Among the faster frameworks are TensorFlow, PyTorch, MXNet, CNTK and Julia-Knet. It was seen that CNTK and TensorFlow are much faster than Keras-CNTK and Keras-TF respectively. While Keras simplifies development, the tradeoff is performance.
Ultimately, the framework itself may not matter much when cuDNN is used to provide acceleration on NVIDIA's GPUs. Any difference in performance will come from how frameworks scale to multiple GPUs and machines.
Theano, developed in Python at the Montreal Institute for Learning Algorithms (MILA), provides efficient math operations that can run on either CPU or GPU architectures. It enabled developers to run rapid experiments in Deep Learning and went on to inspire other frameworks. In 2017, about ten years after its initial release, it was announced that Theano would no longer be actively maintained.
With so many DL frameworks, the landscape can look very fragmented for developers. Open Neural Network Exchange (ONNX) was released as an open format that enables developers to export models from one framework and import them into another. For example, you can build a PyTorch model, export it in ONNX format, and import it into MXNet for inference.