A typical PyTorch workflow showing important PyTorch modules. Source: Shrimali 2019.

Building an artificial neural network from scratch for machine learning is not a trivial task. One approach is to use a library that simplifies many of the common tasks. PyTorch is a Python-based library for machine learning.

PyTorch was designed to be both user friendly and performant. Python programmers will find it easy to learn PyTorch since the programming style is pythonic. While PyTorch provides many ready-to-use packages and modules, developers can also customize them. It's been said that,

Every aspect of PyTorch is a regular Python program under the full control of its user.

PyTorch is open source with an active community developing it. It's a popular choice for research work, as seen in its growing adoption in 2019.


  • Why should I learn PyTorch and what are its benefits?

    An essential Python package for numerical and scientific computing is NumPy. However, NumPy is unable to use the power of GPUs. PyTorch allows us to use GPUs while also giving a number of useful machine learning architectures and utilities out of the box.

    For Python programmers, PyTorch code reads and writes much like ordinary Python code. It also integrates easily with other Python packages. The PyTorch API is intuitive and easy to learn, which makes it easy to experiment. There's no need to spend hours reading its documentation.

    PyTorch combines the best of usability and speed. It's imperative, Pythonic and easy to debug. It offers GPU acceleration with automatic differentiation. Complexity of ML modelling is hidden behind intuitive APIs. For performance, most of it is written in C++. Via YAML metadata files, new language bindings can be quickly created.

  • How does PyTorch compare with other deep learning libraries?

    An alternative to PyTorch is TensorFlow, although PyTorch remains a popular choice for research. It's been said that TensorFlow is better for production. With eager execution (inspired by Chainer) and Keras integration, TensorFlow is becoming acceptable for research as well. In other words, with respect to ease of use and debugging, TensorFlow is becoming as good as PyTorch.

    Some production environments may not have a Python runtime. Sometimes ML models have to be embedded in constrained devices such as smartphones. Updates to models should be seamless without any downtime. TensorFlow has addressed all these considerations. PyTorch is addressing these production considerations via a subset of Python called TorchScript. TorchScript captures the structure of PyTorch programs whereas a JIT compiler uses that structure to optimize. In addition, PyTorch has announced experimental support for quantization and mobile.
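
    As a rough illustration of how TorchScript captures program structure, the sketch below compiles a small function with torch.jit.script. The function name scale_until and its inputs are hypothetical, chosen only for this example.

```python
import torch


@torch.jit.script
def scale_until(x: torch.Tensor, limit: float) -> torch.Tensor:
    # The data-dependent loop is captured as part of the compiled graph,
    # so it no longer needs the Python interpreter to run.
    while bool(x.sum() < limit):
        x = x * 2
    return x


x = torch.tensor([1.0, 2.0, 3.0])
result = scale_until(x, 20.0)

# Textual form of the intermediate representation that the JIT works on.
graph_text = str(scale_until.graph)
```

    Printing graph_text shows the captured structure, including the loop. A scripted function or module can also be serialized for use outside Python.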

  • What are the main features of PyTorch?

    We note these features of PyTorch:

    • Usability and Speed: With eager mode, PyTorch provides flexibility for research. Via TorchScript, models can be converted to graph mode for speed and optimization. For efficiency, most of PyTorch is implemented in C++.
    • Automatic Differentiation: Gradient computations are easy to perform. Via operator overloading, PyTorch builds up a representation of the computed function every time it is executed.
    • Distributed Training: There's native support for asynchronous execution and peer-to-peer communication.
    • Mobile Deployment: ML models can be deployed in mobile applications.
    • Interoperability: PyTorch can export models in ONNX format that can then be imported into ONNX-compatible platforms, runtimes and visualizers. Moreover, it's easy to convert between PyTorch tensors and NumPy arrays.
    • Extensibility: It's easy to add custom behaviour. For example, automatic differentiation can be customized by deriving from torch.autograd.Function and implementing forward() and backward() methods.
    • C++ Frontend: For bare metal C++ applications, this provides an alternative to Python frontend. It enables higher performance and lower latency.
    • Quantization: Store and manipulate tensors at lower bit-widths instead of floating-point precision. It's done mainly to improve performance during inference.
    • Cloud Support: Popular cloud platforms support PyTorch including AWS, GCP and Azure.
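
    Three of the features above (automatic differentiation, extensibility and NumPy interoperability) can be sketched in a few lines. This is a minimal illustration; the Cube class is a hypothetical custom operation, not something shipped with PyTorch.

```python
import numpy as np
import torch

# Automatic differentiation: y = sum(x^2), so dy/dx = 2x.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()
y.backward()
grad_builtin = x.grad.clone()


class Cube(torch.autograd.Function):
    """Hypothetical custom op computing x^3 with a hand-written gradient."""

    @staticmethod
    def forward(ctx, inp):
        ctx.save_for_backward(inp)  # keep the input for the backward pass
        return inp ** 3

    @staticmethod
    def backward(ctx, grad_output):
        (inp,) = ctx.saved_tensors
        return grad_output * 3 * inp ** 2  # chain rule: d(x^3)/dx = 3x^2


w = torch.tensor([1.0, 2.0], requires_grad=True)
Cube.apply(w).sum().backward()
grad_custom = w.grad.clone()

# NumPy interoperability: from_numpy() shares memory with the array.
arr = np.ones(3, dtype=np.float32)
t = torch.from_numpy(arr)
t.mul_(2)         # the in-place change is visible through arr as well
back = t.numpy()  # convert back to a NumPy array
```
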
  • Which are the main PyTorch packages and modules?
    Some PyTorch packages. Source: El Aidouni 2019.

    The main package is called torch. It contains torch.Tensor, which is the main data structure to store multi-dimensional tensors. Operations on tensors are also defined in this package.
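
    A minimal sketch of creating tensors and applying a few operations from the torch package. The values are arbitrary and chosen only for illustration.

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])  # 2x2 tensor from nested lists
b = torch.ones(2, 2)                        # 2x2 tensor filled with ones

elementwise = a + b                         # element-wise addition
product = a @ b                             # matrix multiplication
reshaped = torch.arange(6).reshape(2, 3)    # 0..5 arranged as 2 rows, 3 columns
```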

    The package torch.nn is useful for building neural networks. To create a NN model, we need to subclass torch.nn.Module. Layers and other building blocks that are available include torch.nn.Conv1d, torch.nn.MaxPool1d, torch.nn.Sigmoid, torch.nn.BatchNorm1d, torch.nn.LSTM, torch.nn.Linear, torch.nn.Dropout, torch.nn.MSELoss, and many more. Functional equivalents of these are in torch.nn.functional.
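
    To show how a module relates to its functional equivalent, the sketch below applies the same linear transform and ReLU both ways. Layer sizes and the random seed are arbitrary assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

layer = nn.Linear(4, 2)   # module form: owns its weight and bias
x = torch.randn(3, 4)     # batch of 3 samples with 4 features each

out_module = layer(x)
# Functional form: the same computation with parameters passed explicitly.
out_functional = F.linear(x, layer.weight, layer.bias)

relu_module = nn.ReLU()(out_module)
relu_functional = F.relu(out_module)
```

    The module form is convenient when parameters must be tracked and trained. The functional form suits stateless operations such as activations.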

    Neural network weights and biases are adjusted via optimizers to minimize the loss. The package torch.optim provides many optimizers including torch.optim.SGD and torch.optim.Adam. Learning rates can be adjusted via the module torch.optim.lr_scheduler.
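
    The following sketch pairs an optimizer with a learning rate scheduler on a toy problem, minimizing w squared. The starting value, learning rate and schedule are assumptions made for illustration.

```python
import torch

w = torch.tensor([5.0], requires_grad=True)  # single parameter to optimize
optimizer = torch.optim.SGD([w], lr=0.1)
# Halve the learning rate after every 10 scheduler steps.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for step in range(20):
    optimizer.zero_grad()   # clear gradients from the previous step
    loss = (w ** 2).sum()   # toy loss with its minimum at w = 0
    loss.backward()         # compute d(loss)/dw
    optimizer.step()        # update w using the current learning rate
    scheduler.step()        # advance the learning rate schedule

final_lr = optimizer.param_groups[0]["lr"]
```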

    Another useful module is torch.utils.data, in which Dataset and DataLoader are important classes. Among other things, they're useful for batching data and distributing them to multiple workers.
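
    A minimal sketch of the Dataset and DataLoader classes, imported here from torch.utils.data. The SquaresDataset class is hypothetical and exists only to show the interface a custom dataset implements.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class SquaresDataset(Dataset):
    """Hypothetical dataset pairing each number with its square."""

    def __init__(self, n):
        self.x = torch.arange(n, dtype=torch.float32)

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.x[idx] ** 2


# num_workers=0 loads data in the main process; a larger value
# would spread loading across worker processes.
loader = DataLoader(SquaresDataset(10), batch_size=4, shuffle=False, num_workers=0)
batch_shapes = [tuple(xb.shape) for xb, yb in loader]
```

    With ten samples and a batch size of four, the loader yields two full batches and a final batch of two.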

    Three application-specific packages are torchvision, torchaudio and torchtext. For example, torchvision.datasets gives easy access to image datasets and torchvision.transforms encapsulates useful transforms.

  • Could you give an example of a neural network model in PyTorch?
    Sample PyTorch construction of a neural network model. Source: Paszke et al. 2019, listing 1.

    The example in the figure shows how to build a NN model by creating a subclass of the torch.nn.Module class. The model is initialized with a convolutional layer and a linear layer. While PyTorch has torch.nn.Linear, this example shows how easy it is to build a custom linear layer. Essentially, the model is implemented as a class whose members are the model's layers.

    The model itself is evaluated on an input activation by calling the forward() method. We specify in this method how the layers are connected. In this example, there's a non-linear ReLU activation between the two layers. There's a softmax layer at the end. The backward pass happens in the backward() method but this is supplied automatically by PyTorch's autograd module.
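
    The sketch below is written in the spirit of the description above and is not a copy of the cited listing. The class names, layer sizes and the 28x28 single-channel input are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearLayer(nn.Module):
    """Custom linear layer, shown instead of the built-in torch.nn.Linear."""

    def __init__(self, in_sz, out_sz):
        super().__init__()
        self.w = nn.Parameter(torch.randn(in_sz, out_sz))
        self.b = nn.Parameter(torch.randn(out_sz))

    def forward(self, activations):
        return torch.mm(activations, self.w) + self.b


class SmallModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Layers are ordinary members of the class.
        self.conv = nn.Conv2d(1, 8, kernel_size=3)
        # A 3x3 convolution without padding turns 28x28 into 26x26.
        self.fc = LinearLayer(8 * 26 * 26, 10)

    def forward(self, x):
        t1 = F.relu(self.conv(x))              # ReLU between the two layers
        t2 = self.fc(t1.flatten(start_dim=1))  # flatten all but the batch dim
        return F.softmax(t2, dim=1)            # softmax at the end


model = SmallModel()
probs = model(torch.randn(2, 1, 28, 28))  # calling the model runs forward()
```

    No backward() method is written here since autograd derives it from the operations used in forward().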

    Loss is computed by the caller based on the output of forward(). Gradients are calculated next during the backward pass. Finally, the optimizer is invoked to adjust the model's parameters.
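
    The sequence of forward pass, loss, backward pass and optimizer update can be sketched as a short training loop. The model, the synthetic data and the hyperparameters below are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Linear(3, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Synthetic regression data: the target is the sum of the input features.
inputs = torch.randn(16, 3)
targets = inputs.sum(dim=1, keepdim=True)

losses = []
for _ in range(50):
    optimizer.zero_grad()             # reset gradients from the last step
    outputs = model(inputs)           # forward pass
    loss = loss_fn(outputs, targets)  # loss computed by the caller
    loss.backward()                   # backward pass computes gradients
    optimizer.step()                  # optimizer adjusts the parameters
    losses.append(loss.item())
```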

  • How do I use GPUs with PyTorch?

    To keep track of and use GPUs, torch.cuda is the module to use. We can check if GPUs are available using torch.cuda.is_available(). If so, create a device object with torch.device('cuda') and allocate tensors or move models to it.
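
    A common pattern, sketched here, is to pick the device once and fall back to the CPU when no GPU is present. The fallback is an addition of this example so that it runs on any machine.

```python
import torch

# Use a GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(2, 3, device=device)     # allocate directly on the device
gpu_count = torch.cuda.device_count()   # 0 on a machine without CUDA
```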

    When tensors are allocated to a GPU, operations and their results will be on the same GPU device. Cross-GPU operations are not allowed by default but this is possible when peer-to-peer memory access is enabled. Data can be copied or moved from one GPU to another. To use multiple GPUs for training, consider using torch.nn.DataParallel.
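
    The sketch below moves tensors and a model to a device and wraps the model in DataParallel only when more than one GPU is detected. The tensor sizes are arbitrary, and the CPU fallback is an assumption added so that the code runs anywhere.

```python
import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

a = torch.randn(4, 4).to(device)       # create on the CPU, then copy over
b = torch.randn(4, 4, device=device)   # allocate directly on the device
c = a @ b                              # the result lives on the same device
c_cpu = c.cpu()                        # copy the result back to host memory

model = nn.Linear(4, 2)
if torch.cuda.device_count() > 1:
    # Replicates the model and splits each input batch across the GPUs.
    model = nn.DataParallel(model)
model = model.to(device)
out = model(a)
```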

    GPU operations are asynchronous. Operations are queued up for a particular device. The CPU can continue running and queueing operations to other GPUs. All of this is transparent to the programmer. It's possible to force synchronous GPU calls, which can be useful to debug errors.
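
    One consequence of asynchronous execution is that timing GPU code needs an explicit wait. The sketch below uses torch.cuda.synchronize() for this purpose. The helper timed_matmul is hypothetical. For debugging, synchronous execution can be forced by setting the environment variable CUDA_LAUNCH_BLOCKING=1.

```python
import time

import torch


def timed_matmul(n: int = 256) -> float:
    """Hypothetical helper: time one matrix multiplication in seconds."""
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    a = torch.randn(n, n, device=device)
    b = torch.randn(n, n, device=device)

    start = time.perf_counter()
    c = a @ b  # on a GPU this call only queues the work and returns
    if device.type == "cuda":
        # Block until all queued GPU work is done before stopping the clock.
        torch.cuda.synchronize()
    return time.perf_counter() - start


elapsed = timed_matmul()
```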

    A linear sequence of execution on a particular device is called a CUDA Stream. Every device has a default stream but we can create new streams. Operations within a stream are executed in the order they were queued. Since streams execute concurrently, operations across streams can be in any order. It's possible to synchronize across streams using for example synchronize() or wait_stream().
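
    As a rough sketch of stream usage, the function below runs work on a second stream and uses wait_stream() in both directions to keep the ordering correct. The function name is hypothetical, and the CPU branch is an assumption added so that the example also runs without a GPU.

```python
import torch


def run_on_side_stream(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: double x on a non-default CUDA stream."""
    if not torch.cuda.is_available():
        return x * 2  # no streams on the CPU; compute directly

    x = x.cuda()
    side = torch.cuda.Stream()  # a new stream on the current device
    # Make the side stream wait until x has been copied to the GPU.
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        y = x * 2               # queued on the side stream
    # Make the default stream wait for the side stream before reading y.
    torch.cuda.current_stream().wait_stream(side)
    return y.cpu()


result = run_on_side_stream(torch.ones(3))
```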

  • What are some useful resources to get started with PyTorch?

    The official PyTorch documentation is an essential reference. Beginners can start with step-by-step tutorials. The official PyTorch Resources page includes links to discussion forums, Slack channels, and guides on how to get started with PyTorch on cloud platforms.

    Beginners may also refer to a handy cheatsheet of PyTorch modules and how to use them.

    A curated list of applications and models written in PyTorch can be useful for developers who like to learn by examples.

    TechRepublic has published a useful list of books on PyTorch.

    NVIDIA provides PyTorch container images that include NVIDIA CUDA, NVIDIA cuDNN, TensorBoard, TensorRT and optimized examples of well-known models.


Illustrating the use of modules in Torch. Source: Collobert et al. 2002, fig. 1.

As a modular machine learning library, Torch is released under free BSD license. With new ML algorithms being proposed and presented at conferences, a tool such as Torch can help in implementing and comparing them. Torch is implemented in C++. It's modular because simple modules can be combined to create complex models. For example, a multi-layer perceptron can be realized with two linear modules (for input and output) with a non-linear hidden layer (such as tanh) in between.


Torch7 is released as a framework for numeric computing and machine learning. It's based on Lua, which is a fast interpreted language with a good C API. Since Lua is written in ANSI C, it can be compiled for various target platforms. Torch7 comes with 8 built-in packages. It also supports parallelization with OpenMP and CUDA.


Version 1.0 of autograd is released on GitHub. First commit of this package is from November 2014. Maclaurin, in his doctoral thesis, describes autograd in detail. This is a Python implementation of Automatic Differentiation (AD). It can compute gradients on any NumPy code. He also notes that autograd has become quite popular within the machine learning community. A 2017 survey paper credits autograd, Chainer and PyTorch for popularizing AD.


A week after alpha-0, the alpha-1 release of PyTorch appears on GitHub. It includes torch.nn, torch.autograd, torch.optim and torch.load, among other modules. On the MNIST dataset, PyTorch runs as fast as Torch-Lua. PyTorch uses 1500MB of system memory whereas Torch-Lua uses 2300MB.


PyTorch gets its first public beta release. Its early development is guided by Soumith Chintala, a core developer of Torch. It starts as a fork of Chainer that has dynamic graphs and interpretable development environment. Though Torch is popular and even accepted by organizations such as Facebook AI Research, Lua is not as mainstream as Python. Other reasons to move to Python are easy debugging and an imperative-style framework.


Caffe2 is merged into PyTorch codebase. Caffe2 and PyTorch are both open source ML frameworks from Facebook.

TorchScript and PyTorch Intermediate Representation (IR) for optimized code. Source: He 2019b.

Version 1.0.0 of PyTorch is released. This comes with a JIT compiler and TorchScript, which is a subset of Python. Due to PyTorch's Intermediate Representation (IR), models can be optimized and deployed in non-Python environments. This release introduces a new torch.distributed package. It also includes Torch Hub, a repository for pre-trained models.

Research papers at conferences feature more of PyTorch than TensorFlow. Source: He 2019b.

PyTorch adoption grows significantly within the research community. Among the many deep learning frameworks, PyTorch and TensorFlow are the two that matter most. This year also sees a number of regular PyTorch releases from v1.0.1 (February) to v1.3.1 (November). By now, while PyTorch is growing, the Torch community is defunct.


PyTorch 1.4 is released. This allows customized builds for PyTorch Mobile. As an experimental feature, there's an RPC framework for distributed model parallel training. Also experimental are Java bindings. In February, NVIDIA releases a container image for this version of PyTorch.


  1. Baydin, Atılım Günes, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. 2017. "Automatic differentiation in machine learning: a survey." The Journal of Machine Learning Research, vol. 18, no. 1, pp. 1-43, January. Accessed 2020-03-05.
  2. Bharath GS. 2020. "bharathgs/Awesome-pytorch-list." GitHub, March 2. Accessed 2020-03-05.
  3. Collobert, Ronan, Samy Bengio, and Johnny Mariéthoz. 2002. "Torch: a modular machine learning software library." IDIAP Research Report, RR 02-46, IDIAP Publications, October 30. Accessed 2020-03-05.
  4. Collobert, Ronan, Koray Kavukcuoglu, and Clément Farabet. 2011. "Torch7: A Matlab-like Environment for Machine Learning." NIPS Workshop. Accessed 2020-03-05.
  5. DeVito, Zachary. 2019. "Torchscript: Optimized Execution of PyTorch Programs." NeurIPS, December 14. Accessed 2020-03-07.
  6. Doshi, Sanket. 2019. "Various Optimization Algorithms For Training Neural Network." Medium, January 13. Accessed 2020-03-07.
  7. Eckerle, Natalie. 2020. "PyTorch: A resources guide for developers." TechRepublic, February 20. Accessed 2020-03-05.
  8. El Aidouni, Manal. 2019. "Pytorch guide 101." May 26. Accessed 2020-03-05.
  9. Exxact Corporation. 2020. "PyTorch vs TensorFlow in 2020: What You Should Know About These Frameworks." Blog, Exxact Corporation, January 23. Accessed 2020-03-05.
  10. He, Hecate. 2019a. "PyTorch Deep Learning Framework: Speed + Usability." Synced, December 16. Accessed 2020-03-05.
  11. He, Horace. 2019b. "The State of Machine Learning Frameworks in 2019." The Gradient, October 10. Accessed 2020-03-05.
  12. HIPS GitHub. 2019. "HIPS/autograd." November 18. Accessed 2020-03-07.
  13. Maclaurin, Dougal. 2016. "Modeling, Inference and Optimization with Composable Differentiable Procedures." PhD thesis, Graduate School of Arts & Sciences, Harvard University, April. Accessed 2020-03-05.
  14. NVIDIA Docs. 2020. "PyTorch Release 20.02." Deep Learning Frameworks Documentation, February 24. Accessed 2020-03-05.
  15. Paszke, Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. "PyTorch: An Imperative Style, High-Performance Deep Learning Library." arXiv, v1, December 3. Accessed 2020-03-05.
  16. Peng, Tony. 2018. "Caffe2 Merges With PyTorch." Synced, on Medium, April 3. Accessed 2020-03-07.
  17. PyTorch. 2020a. "PyTorch 1.4 released, domain libraries updated." Blog, PyTorch, January 15. Accessed 2020-03-05.
  18. PyTorch. 2020b. "Features." Accessed 2020-03-07.
  19. PyTorch Docs. 2020a. "torch." PyTorch v1.4.0. Accessed 2020-03-05.
  20. PyTorch Docs. 2020b. "torch.Tensor." PyTorch v1.4.0. Accessed 2020-03-05.
  21. PyTorch Docs. 2020c. "torch.nn." PyTorch v1.4.0. Accessed 2020-03-05.
  22. PyTorch Docs. 2020d. "" PyTorch v1.4.0. Accessed 2020-03-05.
  23. PyTorch Docs. 2020e. "torch.optim." PyTorch v1.4.0. Accessed 2020-03-05.
  24. PyTorch Docs. 2020f. "CUDA Semantics." PyTorch v1.4.0. Accessed 2020-03-05.
  25. PyTorch Docs. 2020g. "Quantization." PyTorch v1.4.0. Accessed 2020-03-05.
  26. PyTorch Docs. 2020h. "torchvision." PyTorch v1.4.0. Accessed 2020-03-05.
  27. PyTorch Docs. 2020i. "torch.nn.functional." PyTorch v1.4.0. Accessed 2020-03-05.
  28. PyTorch GitHub. 2020. "Releases." pytorch/pytorch, on GitHub. Accessed 2020-03-05.
  29. Shrimali, Vishwesh. 2019. "PyTorch for Beginners: Basics." Learn OpenCV, May 31. Accessed 2020-03-05.
  30. Thomas, Sherin, and Sudhanshu Passi. 2019. "PyTorch Deep Learning Hands-On." Packt Publishing Limited, April. Accessed 2020-03-05.
  31. Torch GitHub. 2019. "torch/torch7." April 18. Accessed 2020-03-05.

Further Reading

  1. Shrimali, Vishwesh. 2019. "PyTorch for Beginners: Basics." Learn OpenCV, May 31. Accessed 2020-03-05.
  2. He, Horace. 2019. "The State of Machine Learning Frameworks in 2019." The Gradient, October 10. Accessed 2020-03-05.
  3. Amidi, Afshine and Shervine Amidi. 2018. "A detailed example of how to generate your data in parallel with PyTorch." Blog, March. Accessed 2020-03-05.
  4. DeVito, Zachary. 2019. "Torchscript: Optimized Execution of PyTorch Programs." NeurIPS, December 14. Accessed 2020-03-07.

Cite As

Devopedia. 2020. "PyTorch." Version 3, March 7. Accessed 2020-11-25.
Contributed by
1 author

Last updated on
2020-03-07 15:39:28