• A typical PyTorch workflow showing important PyTorch modules. Source: Shrimali 2019.
• Illustrating the use of modules in Torch. Source: Collobert et al. 2002, fig. 1.
• TorchScript and PyTorch Intermediate Representation (IR) for optimized code. Source: He 2019b.
• Research papers at conferences feature more of PyTorch than TensorFlow. Source: He 2019b.
• Some PyTorch packages. Source: El Aidouni 2019.
• Sample PyTorch construction of a neural network model. Source: Paszke et al. 2019, listing 1.

# PyTorch

Created by arvindpdmn on 2020-03-07 15:15:39
Last updated by arvindpdmn on 2020-03-07 15:39:28

## Summary

Building an artificial neural network from scratch for machine learning models is not a trivial task. One approach is to use a library that simplifies many of the common tasks. PyTorch is a Python-based library for machine learning.

PyTorch was designed to be both user friendly and performant. Python programmers will find it easy to learn PyTorch since the programming style is pythonic. While PyTorch provides many ready-to-use packages and modules, developers can also customize them. It's been said that,

"Every aspect of PyTorch is a regular Python program under the full control of its user."

PyTorch is open source with an active community developing it. It's a popular choice for research work, as seen in its growing adoption in 2019.

## Milestones

Oct
2002

As a modular machine learning library, Torch is released under a free BSD license. With new ML algorithms being proposed and presented at conferences, a tool such as Torch can help in implementing and comparing them. Torch is implemented in C++. It's modular because simple modules can be combined to create complex models. For example, a multi-layer perceptron can be realized with two linear modules (for input and output) with a non-linear hidden layer (such as tanh) in between.

2011

Torch7 is released as a framework for numeric computing and machine learning. It's based on Lua, which is a fast interpreted language with a good C API. Since Lua is written in ANSI C, it can be compiled for various target platforms. Torch7 comes with 8 built-in packages. It also supports parallelization with OpenMP and CUDA.

Mar
2015

Version 1.0 of autograd is released on GitHub. First commit of this package is from November 2014. Maclaurin, in his doctoral thesis, describes autograd in detail. This is a Python implementation of Automatic Differentiation (AD). It can compute gradients on any NumPy code. He also notes that autograd has become quite popular within the machine learning community. A 2017 survey paper credits autograd, Chainer and PyTorch for popularizing AD.

Sep
2016

A week after alpha-0, alpha-1 release of PyTorch appears on GitHub. This includes torch.nn, torch.autograd, torch.optim, torch.load and torch.save. On MNIST dataset, PyTorch runs as fast as Torch-Lua. PyTorch uses 1500MB of system memory whereas Torch-Lua uses 2300MB.

Jan
2017

PyTorch gets its first public beta release. Its early development is guided by Soumith Chintala, a core developer of Torch. It starts as a fork of Chainer that has dynamic graphs and an interpretable development environment. Though Torch is popular and even accepted by organizations such as Facebook AI Research, Lua is not as mainstream as Python. Other reasons to move to Python are easy debugging and an imperative-style framework.

Apr
2018

Caffe2 is merged into PyTorch codebase. Caffe2 and PyTorch are both open source ML frameworks from Facebook.

Dec
2018

Version 1.0.0 of PyTorch is released. This comes with a JIT compiler and TorchScript, which is a subset of Python. Due to PyTorch's Intermediate Representation (IR), models can be optimized and deployed in non-Python environments. This release also brings a new torch.distributed package and Torch Hub, a repository for pre-trained models.

2019

PyTorch adoption grows considerably within the research community. Among the many deep learning frameworks, PyTorch and TensorFlow seem to matter most. This year also sees regular PyTorch releases, from v1.0.1 (February) to v1.3.1 (November). By now, while PyTorch is growing, the Torch community is defunct.

Jan
2020

PyTorch 1.4 is released. This allows customized builds for PyTorch Mobile. As an experimental feature, there's an RPC framework for distributed model parallel training. Also experimental are Java bindings. In February, NVIDIA releases a container image for this version of PyTorch.

## Discussion

• Why should I learn PyTorch and what are its benefits?

An essential Python package for numerical and scientific computing is NumPy. However, NumPy is unable to use the power of GPUs. PyTorch allows us to use GPUs while also giving a number of useful machine learning architectures and utilities out of the box.

For Python programmers, reading and writing PyTorch code feels much like working with ordinary Python code. PyTorch also integrates easily with other Python packages. Its API is intuitive, which makes it easy to learn and to experiment with. There's no need to spend hours reading its documentation.

PyTorch combines the best of usability and speed. It's imperative, Pythonic and easy to debug. It offers GPU acceleration with automatic differentiation. Complexity of ML modelling is hidden behind intuitive APIs. For performance, most of it is written in C++. Via YAML metadata files, new language bindings can be quickly created.

• How does PyTorch compare with other deep learning libraries?

An alternative to PyTorch is TensorFlow, although PyTorch remains a popular choice for research. It's been said that TensorFlow is better for production. With eager execution (inspired by Chainer) and Keras integration, TensorFlow is becoming acceptable for research as well. In other words, with respect to ease of use and debugging, TensorFlow is becoming as good as PyTorch.

Some production environments may not have a Python runtime. Sometimes ML models have to be embedded in constrained devices such as smartphones. Updates to models should be seamless without any downtime. TensorFlow has addressed all these considerations. PyTorch is addressing these production considerations via a subset of Python called TorchScript. TorchScript captures the structure of PyTorch programs whereas a JIT compiler uses that structure to optimize. In addition, PyTorch has announced experimental support for quantization and mobile.
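
As a minimal sketch of how TorchScript captures program structure, the snippet below scripts a small module with data-dependent control flow. The `Gate` module and its logic are hypothetical and exist only for illustration; `torch.jit.script` is the scripting entry point, and we assume a PyTorch 1.x or later installation.

```python
import torch


class Gate(torch.nn.Module):
    """Hypothetical module whose output depends on the sign of the input sum."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Scripting keeps this branch in the compiled program.
        if x.sum() > 0:
            result = x * 2
        else:
            result = -x
        return result


# Compile the module to TorchScript.
scripted_gate = torch.jit.script(Gate())

# The intermediate representation of forward() can be inspected as text.
graph_text = str(scripted_gate.graph)

positive_out = scripted_gate(torch.ones(3))
negative_out = scripted_gate(-torch.ones(3))
```

A scripted module can be written to disk with its `save()` method, which is what makes it usable from a runtime that has no Python interpreter.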

• What are the main features of PyTorch?

We note these features of PyTorch:

• Usability and Speed: With eager mode, PyTorch provides flexibility for research. Via TorchScript, models can be converted to graph mode for speed and optimization. For efficiency, most of PyTorch is implemented in C++.
• Automatic Differentiation: Gradient computations are easy to perform. Via operator overloading, PyTorch builds up a representation of the computed function every time it is executed.
• Distributed Training: There's native support for asynchronous execution and peer-to-peer communication.
• Mobile Deployment: ML models can be deployed in mobile applications.
• Interoperability: PyTorch can export models in ONNX format that can then be imported into ONNX-compatible platforms, runtimes and visualizers. Moreover, it's easy to convert between PyTorch tensors and NumPy arrays.
• Extensibility: It's easy to add custom behaviour. For example, automatic differentiation can be customized by deriving from torch.autograd.Function and implementing forward() and backward() methods.
• C++ Frontend: For bare metal C++ applications, this provides an alternative to Python frontend. It enables higher performance and lower latency.
• Quantization: Store and manipulate tensors at lower bit-widths instead of floating-point precision. It's done mainly to improve performance during inference.
• Cloud Support: Popular cloud platforms support PyTorch including AWS, GCP and Azure.
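
To illustrate the extensibility point above, here is a minimal sketch of a custom differentiable operation built by deriving from torch.autograd.Function. The `Square` class is a hypothetical example, not something PyTorch ships; it simply shows where forward() and backward() fit.

```python
import torch


class Square(torch.autograd.Function):
    """Hypothetical custom op computing x * x with a hand-written gradient."""

    @staticmethod
    def forward(ctx, x):
        # Keep the input around; backward() needs it.
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Chain rule: d(x^2)/dx = 2x, scaled by the incoming gradient.
        return 2 * x * grad_output


values = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
total = Square.apply(values).sum()
total.backward()  # fills values.grad using our backward()
```

Custom functions are invoked through `apply()` rather than by calling forward() directly, so that autograd can record the operation.
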

• Which are the main PyTorch packages and modules?

The main package is called torch. It contains torch.Tensor, which is the main data structure to store multi-dimensional tensors. Operations on tensors are also defined in this package.

The package torch.nn is useful for building neural networks. To create a NN model, we need to subclass torch.nn.Module. Available layers and loss functions include torch.nn.Conv1d, torch.nn.MaxPool1d, torch.nn.Sigmoid, torch.nn.BatchNorm1d, torch.nn.LSTM, torch.nn.Linear, torch.nn.Dropout, torch.nn.MSELoss, and many more. Functional equivalents of these are in torch.nn.functional.

Neural network weights and biases are adjusted via optimizers to minimize the loss. The package torch.optim provides many optimizers including torch.optim.SGD and torch.optim.Adam. Learning rates can be adjusted via the module torch.optim.lr_scheduler.
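
The following is a small sketch of an optimizer and a learning-rate scheduler working together. The toy data (a straight line), the choice of `StepLR` and all hyperparameters are assumptions made for illustration only.

```python
import torch

torch.manual_seed(0)

# Toy regression data: targets follow y = 3x + 1.
inputs = torch.linspace(-1, 1, 32).unsqueeze(1)
targets = 3 * inputs + 1

model = torch.nn.Linear(1, 1)
loss_fn = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
# Halve the learning rate every 50 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.5)

losses = []
for epoch in range(100):
    optimizer.zero_grad()              # clear gradients from the previous step
    loss = loss_fn(model(inputs), targets)
    loss.backward()                    # compute gradients
    optimizer.step()                   # update weights and bias
    scheduler.step()                   # update the learning rate
    losses.append(loss.item())

final_lr = optimizer.param_groups[0]["lr"]
```

Swapping in another optimizer such as torch.optim.Adam only changes the line that constructs `optimizer`; the loop stays the same.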

Another useful module is torch.utils.data in which Dataset and DataLoader are important classes. Among other things, they're useful for batching data and distributing them to multiple workers.
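
As a rough sketch of how Dataset and DataLoader cooperate, the example below defines a toy dataset and iterates over it in batches. `SquaresDataset` is a hypothetical class invented for this illustration.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i * i)."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        value = torch.tensor(float(index))
        return value, value * value


# num_workers=0 loads data in the main process; a larger value
# hands the loading work to that many worker processes.
loader = DataLoader(SquaresDataset(10), batch_size=4, shuffle=False, num_workers=0)

batch_sizes = [features.shape[0] for features, labels in loader]
```

With 10 items and a batch size of 4, the last batch is smaller than the others unless `drop_last=True` is passed.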

Three application-specific packages are torchvision, torchaudio and torchtext. For example, torchvision.datasets gives easy access to image datasets and torchvision.transforms encapsulates useful transforms.

• Could you give an example of a neural network model in PyTorch?

The example in the figure shows how to build a NN model by creating a subclass of the torch.nn.Module class. The model is initialized with a convolutional layer and a linear layer. While PyTorch has torch.nn.Linear, this example shows how easy it is to build a custom linear layer. Essentially, the model is implemented as a class whose members are the model's layers.

The model itself is evaluated on an input activation by calling the forward() method. We specify in this method how the layers are connected. In this example, there's a non-linear ReLU activation between the two layers. There's a softmax layer at the end. The backward pass happens in backward() method but this is supplied automatically by PyTorch's autograd module.

Loss is computed by the caller based on the output of forward(). Gradients are calculated next during the backward pass. Finally, optimizer is invoked to adjust the model's parameters.
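
The steps described above can be sketched as follows. This is not the listing from the cited figure but a small variant written for illustration; the class names `CustomLinear` and `SmallNet`, the layer sizes and the random data are all assumptions.

```python
import torch
import torch.nn.functional as F
from torch import nn


class CustomLinear(nn.Module):
    """Hand-written linear layer, shown instead of torch.nn.Linear."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_features, out_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, activations):
        return activations @ self.weight + self.bias


class SmallNet(nn.Module):
    """Convolution, ReLU, custom linear layer, then softmax."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 4, kernel_size=3)  # 1x8x8 input -> 4x6x6
        self.fc = CustomLinear(4 * 6 * 6, 10)

    def forward(self, x):
        hidden = F.relu(self.conv(x))
        logits = self.fc(hidden.flatten(start_dim=1))
        return F.softmax(logits, dim=1)


torch.manual_seed(0)
model = SmallNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

images = torch.randn(16, 1, 8, 8)        # random stand-in for real data
labels = torch.randint(0, 10, (16,))

probabilities = model(images)             # calling the model runs forward()
loss = F.nll_loss(torch.log(probabilities), labels)  # loss computed by the caller
optimizer.zero_grad()
loss.backward()                           # backward pass supplied by autograd
optimizer.step()                          # adjust the parameters
```

Taking the log of a softmax output keeps the sketch close to the description. In practice, `F.log_softmax` or `nn.CrossEntropyLoss` on the raw logits is the numerically safer choice.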

• How do I use GPUs with PyTorch?

To keep track of and use GPUs, torch.cuda is the module to use. We can check if GPUs are available using torch.cuda.is_available(). If so, call torch.device('cuda') to select a GPU.

When tensors are allocated to a GPU, operations and their results will be on the same GPU device. Cross-GPU operations are not allowed by default but this is possible when peer-to-peer memory access is enabled. Data can be copied or moved from one GPU to another. To spread training across multiple GPUs, consider using torch.nn.DataParallel.
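
A minimal sketch of device selection is shown below. It falls back to the CPU when no GPU is present, so the same code runs on either kind of machine; the layer and batch sizes are arbitrary.

```python
import torch

# Use a GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

layer = torch.nn.Linear(4, 2).to(device)   # move the parameters to the device
if torch.cuda.device_count() > 1:
    # Replicate the layer and split each batch across the visible GPUs.
    layer = torch.nn.DataParallel(layer)

batch = torch.ones(3, 4, device=device)    # allocate the input on the same device
output = layer(batch)                       # the result lives on that device too
```

Keeping the device in a single variable avoids hard-coding 'cuda' throughout the program.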

GPU operations are asynchronous. Operations are queued up for a particular device. CPU can continue running and queueing operations to other GPUs. All of this is transparent to the programmer. It's possible to force synchronous GPU calls, which can be useful to debug errors.

A linear sequence of execution on a particular device is called a CUDA Stream. Every device has a default stream but we can create new streams. Operations within a stream are executed in the order they were queued. Since streams execute concurrently, operations across streams can be in any order. It's possible to synchronize across streams using for example synchronize() or wait_stream().
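
The sketch below shows one way to queue work on a non-default stream and synchronize afterwards. The helper `double_on_side_stream` is hypothetical, and the CPU fallback is included only so that the snippet also runs on a machine without a GPU.

```python
import torch


def double_on_side_stream(values):
    """Double `values` on a non-default CUDA stream when a GPU is present."""
    if not torch.cuda.is_available():
        return values * 2  # CPU fallback: no streams involved

    data = values.to(torch.device("cuda"))  # queued on the current stream
    side_stream = torch.cuda.Stream()

    # The side stream must not start before the copy above has been done.
    side_stream.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side_stream):
        doubled = data * 2  # queued on side_stream

    # The current stream waits for the side stream before using the result.
    torch.cuda.current_stream().wait_stream(side_stream)
    torch.cuda.synchronize()  # block the CPU until queued GPU work is finished
    return doubled.cpu()


result = double_on_side_stream(torch.tensor([1.0, 2.0]))
```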

• What are some useful resources to get started with PyTorch?

The official PyTorch documentation is an essential reference. Beginners can start with step-by-step tutorials. The official PyTorch Resources page includes links to discussion forums, Slack channels, and guides on getting started with PyTorch on cloud platforms.

Beginners may also refer to a handy cheatsheet of PyTorch modules and how to use them.

A curated list of applications and models written in PyTorch can be useful for developers who like to learn by examples.

TechRepublic has published a useful list of books on PyTorch.

NVIDIA provides PyTorch container images that include NVIDIA CUDA, NVIDIA cuDNN, TensorBoard, TensorRT and optimized examples of well-known models.

## Cite As

Devopedia. 2020. "PyTorch." Version 3, March 7. Accessed 2020-09-18. https://devopedia.org/pytorch