PyTorch
- Summary
- Discussion
- Why should I learn PyTorch and what are its benefits?
- How does PyTorch compare with other deep learning libraries?
- What are main features of PyTorch?
- Which are main PyTorch packages and modules?
- Could you give an example of a neural network model in PyTorch?
- How do I use GPUs with PyTorch?
- What are some useful resources to get started with PyTorch?
- Milestones
- References
- Further Reading

Building an artificial neural network from scratch for machine learning models is not a trivial task. One approach is to use a library that simplifies many of the common tasks. PyTorch is one such Python-based library for machine learning.
PyTorch was designed to be both user-friendly and performant. Python programmers will find it easy to learn PyTorch since its programming style is pythonic. While PyTorch provides many ready-to-use packages and modules, developers can also customize them. It's been said that,
Every aspect of PyTorch is a regular Python program under the full control of its user.
PyTorch is open source with an active community developing it. It's a popular choice for research work, as seen in its growing adoption in 2019.
Discussion
Why should I learn PyTorch and what are its benefits?
An essential Python package for numerical and scientific computing is NumPy. However, NumPy cannot use the power of GPUs. PyTorch allows us to use GPUs while also providing a number of useful machine learning architectures and utilities out of the box.
For Python programmers, writing and reading PyTorch code is very much like writing and reading Python code. It also integrates easily with other Python packages. The PyTorch API is intuitive and easy to learn. It's easy to experiment and learn with PyTorch without spending hours reading its documentation.
PyTorch combines the best of usability and speed. It's imperative, Pythonic and easy to debug. It offers GPU acceleration with automatic differentiation. The complexity of ML modelling is hidden behind intuitive APIs. For performance, most of PyTorch is written in C++. Via YAML metadata files, new language bindings can be quickly created.
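A minimal sketch of these benefits, assuming a standard PyTorch install: a tensor computation whose gradient is derived automatically, something plain NumPy cannot do.

```python
import torch

# Tensors behave much like NumPy arrays but support GPUs and autograd
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()   # y = x1^2 + x2^2 + x3^2

y.backward()         # automatic differentiation
print(x.grad)        # tensor([2., 4., 6.]), i.e. dy/dx = 2x
```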
How does PyTorch compare with other deep learning libraries?
An alternative to PyTorch is TensorFlow, although PyTorch remains a popular choice for research. It's been said that TensorFlow is better for production. With eager execution (inspired by Chainer) and Keras integration, TensorFlow is becoming acceptable for research as well. In other words, with respect to ease of use and debugging, TensorFlow is becoming as good as PyTorch.
Some production environments may not have a Python runtime. Sometimes ML models have to be embedded in constrained devices such as smartphones. Updates to models should be seamless, without any downtime. TensorFlow has addressed all these considerations. PyTorch is addressing these production considerations via a subset of Python called TorchScript: TorchScript captures the structure of PyTorch programs, and a JIT compiler uses that structure to optimize them. In addition, PyTorch has announced experimental support for quantization and mobile deployment.
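As a small illustrative sketch of TorchScript (the scaled_relu function is a made-up example; torch.jit.script is the real API):

```python
import torch

@torch.jit.script
def scaled_relu(x: torch.Tensor, scale: float) -> torch.Tensor:
    # The JIT compiles this Python function into an optimizable graph
    return torch.relu(x) * scale

print(scaled_relu(torch.ones(3), 2.0))  # tensor([2., 2., 2.])
print(scaled_relu.graph)                # inspect the captured structure (IR)
```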
What are the main features of PyTorch?
We note these features of PyTorch:
- Usability and Speed: With eager mode, PyTorch provides flexibility for research. Via TorchScript, models can be converted to graph mode for speed and optimization. For efficiency, most of PyTorch is implemented in C++.
- Automatic Differentiation: Gradient computations are easy to perform. Via operator overloading, PyTorch builds up a representation of the computed function every time it is executed.
- Distributed Training: There's native support for asynchronous execution and peer-to-peer communication.
- Mobile Deployment: ML models can be deployed in mobile applications.
- Interoperability: PyTorch can export models in ONNX format that can then be imported into ONNX-compatible platforms, runtimes and visualizers. Moreover, it's easy to convert between PyTorch tensors and NumPy arrays.
- Extensibility: It's easy to add custom behaviour. For example, automatic differentiation can be customized by deriving from torch.autograd.Function and implementing the forward() and backward() methods (see the sketch after this list).
- C++ Frontend: For bare-metal C++ applications, this provides an alternative to the Python frontend. It enables higher performance and lower latency.
- Quantization: Store and manipulate tensors at lower bit-widths instead of full floating-point precision. This is done mainly to improve performance during inference.
- Cloud Support: Popular cloud platforms support PyTorch including AWS, GCP and Azure.
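To illustrate the extensibility point, here's a minimal sketch of a custom autograd function. The Square class is a made-up example; torch.autograd.Function, ctx.save_for_backward() and apply() are the actual API:

```python
import torch

class Square(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)    # stash inputs needed by backward()
        return x * x

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_output  # d(x^2)/dx = 2x, chained with upstream grad

x = torch.tensor(3.0, requires_grad=True)
y = Square.apply(x)   # custom Functions are invoked via .apply()
y.backward()
print(x.grad)         # tensor(6.)
```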
Which are the main PyTorch packages and modules?
The main package is called torch. It contains torch.Tensor, which is the main data structure to store multi-dimensional tensors. Operations on tensors are also defined in this package.
The package torch.nn is useful for building neural networks. To create a NN model, we need to subclass torch.nn.Module. NN layers that are available include torch.nn.Conv1d, torch.nn.MaxPool1d, torch.nn.Sigmoid, torch.nn.BatchNorm1d, torch.nn.LSTM, torch.nn.Linear, nn.Dropout, nn.MSELoss, and many more. Functional equivalents of these are in torch.nn.functional.
Neural network weights and biases are adjusted via optimizers to minimize the loss. The package torch.optim provides many optimizers, including torch.optim.SGD and torch.optim.Adam. Learning rates can be adjusted via the module torch.optim.lr_scheduler.
Another useful module is torch.utils.data, in which Dataset and DataLoader are important classes. Among other things, they're useful for batching data and distributing the batches to multiple workers.
Three application-specific packages are torchvision, torchaudio and torchtext. For example, torchvision.datasets gives easy access to image datasets and torchvision.transforms encapsulates useful transforms.
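A short sketch tying these packages together; the toy dataset, dimensions and hyperparameters below are assumptions made for illustration:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset: 100 samples, 4 features, 2 classes (made-up data)
X = torch.randn(100, 4)
y = torch.randint(0, 2, (100,))
loader = DataLoader(TensorDataset(X, y), batch_size=16, shuffle=True)

model = torch.nn.Linear(4, 2)                             # from torch.nn
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)   # from torch.optim
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5)

for xb, yb in loader:             # DataLoader handles batching and shuffling
    optimizer.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    optimizer.step()
scheduler.step()                  # adjust the learning rate per schedule
```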
Could you give an example of a neural network model in PyTorch?
The example in the figure shows how to build a NN model by creating a subclass of torch.nn.Module. The model is initialized with a convolutional layer and a linear layer. While PyTorch has torch.nn.Linear, this example shows how easy it is to build a custom linear layer. Essentially, the model is implemented as a class whose members are the model's layers.
The model itself is evaluated on an input activation by calling the forward() method. We specify in this method how the layers are connected. In this example, there's a non-linear ReLU activation between the two layers. There's a softmax layer at the end. The backward pass happens in the backward() method, but this is supplied automatically by PyTorch's autograd module.
Loss is computed by the caller based on the output of forward(). Gradients are calculated next during the backward pass. Finally, the optimizer is invoked to adjust the model's parameters.
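A sketch along the lines of this description. The class names, layer sizes and input shape are assumptions for illustration; nn.Module, nn.Parameter and the autograd-supplied backward pass are real PyTorch mechanics:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CustomLinear(nn.Module):
    """A hand-rolled stand-in for torch.nn.Linear."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # nn.Parameter registers tensors so optimizers can find them
        self.weight = nn.Parameter(torch.randn(in_features, out_features))
        self.bias = nn.Parameter(torch.randn(out_features))

    def forward(self, x):
        return x @ self.weight + self.bias

class BasicModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 16, kernel_size=3)   # convolutional layer
        self.fc = CustomLinear(16 * 26 * 26, 10)      # custom linear layer

    def forward(self, x):
        x = F.relu(self.conv(x))              # ReLU between the two layers
        x = x.flatten(start_dim=1)
        return F.softmax(self.fc(x), dim=1)   # softmax at the end

# The caller computes the loss, runs the backward pass, steps the optimizer
model = BasicModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs = torch.randn(8, 1, 28, 28)               # batch of 8 fake 28x28 images
targets = torch.randint(0, 10, (8,))
loss = F.nll_loss(model(inputs).log(), targets)  # loss from forward() output
optimizer.zero_grad()
loss.backward()      # gradients via autograd; no hand-written backward()
optimizer.step()     # adjust the model's parameters
```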
How do I use GPUs with PyTorch?
To keep track of and use GPUs, torch.cuda is the module to use. We can check if GPUs are available using torch.cuda.is_available(). If so, call torch.device('cuda') to select a GPU.
When tensors are allocated to a GPU, operations and their results stay on the same GPU device. Cross-GPU operations are not allowed by default, but they become possible when peer-to-peer memory access is enabled. Data can be copied or moved from one GPU to another. To use multiple GPUs for distributed training, consider using nn.DataParallel.
GPU operations are asynchronous: operations are queued up for a particular device while the CPU continues running and queueing operations to other GPUs. All of this is transparent to the programmer. It's possible to force synchronous GPU calls, which can be useful for debugging errors.
A linear sequence of execution on a particular device is called a CUDA stream. Every device has a default stream, but we can create new streams. Operations within a stream are executed in the order they were queued. Since streams execute concurrently, operations across streams can complete in any order. It's possible to synchronize across streams using, for example, synchronize() or wait_stream().
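A condensed sketch of these device mechanics; a CUDA-capable GPU is assumed for the stream part, and the code falls back to CPU otherwise:

```python
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.randn(1000, 1000, device=device)   # allocate directly on the device
y = torch.randn(1000, 1000).to(device)       # or move an existing tensor
z = x @ y                                    # runs on the same device; queued asynchronously on GPU

if torch.cuda.is_available():
    s = torch.cuda.Stream()                  # a new stream besides the default
    with torch.cuda.stream(s):
        w = (z * 2).sum()                    # queued on stream s
    torch.cuda.synchronize()                 # block until all queued GPU work is done
print(z.device)
```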
What are some useful resources to get started with PyTorch?
The official PyTorch documentation is an essential reference. Beginners can start with step-by-step tutorials. The official PyTorch Resources page includes links to discussion forums, Slack channels, and guidance on getting started with PyTorch on cloud platforms.
Beginners may also refer to a handy cheatsheet of PyTorch modules and how to use them.
A curated list of applications and models written in PyTorch can be useful for developers who like to learn by examples.
TechRepublic has published a useful list of books on PyTorch.
NVIDIA provides PyTorch container images that include NVIDIA CUDA, NVIDIA cuDNN, TensorBoard, TensorRT and optimized examples of well-known models.
Milestones
2002

As a modular machine learning library, Torch is released under a free BSD license. With new ML algorithms being proposed and presented at conferences, a tool such as Torch can help in implementing and comparing them. Torch is implemented in C++. It's modular because simple modules can be combined to create complex models. For example, a multi-layer perceptron can be realized with two linear modules (for input and output) with a non-linear hidden layer (such as tanh) in between.
2011
Torch7 is released as a framework for numeric computing and machine learning. It's based on Lua, which is a fast interpreted language with a good C API. Since Lua is written in ANSI C, it can be compiled for various target platforms. Torch7 comes with 8 built-in packages. It also supports parallelization with OpenMP and CUDA.
2015
Version 1.0 of autograd is released on GitHub. First commit of this package is from November 2014. Maclaurin, in his doctoral thesis, describes autograd in detail. This is a Python implementation of Automatic Differentiation (AD). It can compute gradients on any NumPy code. He also notes that autograd has become quite popular within the machine learning community. A 2017 survey paper credits autograd, Chainer and PyTorch for popularizing AD.
2017
PyTorch gets its first public beta release. Its early development is guided by Soumith Chintala, a core developer of Torch. It starts as a fork of Chainer, which has dynamic graphs and an interpretable development environment. Though Torch is popular and even accepted by organizations such as Facebook AI Research, Lua is not as mainstream as Python. Other reasons to move to Python are easy debugging and an imperative-style framework.
2018

Version 1.0.0 of PyTorch is released. This comes with a JIT compiler and TorchScript, which is a subset of Python. Due to PyTorch's Intermediate Representation (IR), models can be optimized and deployed in non-Python environments. This release introduces a new torch.distributed package. This release also includes Torch Hub, a repository for pre-trained models.

2019
PyTorch adoption grows considerably within the research community. Among the many deep learning frameworks, only PyTorch and TensorFlow seem to matter most. This year also sees a number of regular PyTorch releases, from v1.0.1 (February) to v1.3.1 (November). By now, while PyTorch is growing, the Torch community is defunct.
2020
Version 1.4 of PyTorch is released, along with updates to the domain libraries torchvision, torchaudio and torchtext.
2022
Meta announces the formation of the PyTorch Foundation, under the Linux Foundation. Cloud companies (AWS, Google Cloud) and hardware vendors (AMD, NVIDIA) are among the other members of the Foundation. The Foundation will market PyTorch and the ecosystem around it. The Foundation's motto is "to drive adoption of AI tooling by fostering and sustaining an ecosystem of open source, vendor-neutral projects with PyTorch."
References
- Alford, A. 2022. "PyTorch Becomes Linux Foundation Top-Level Project." InfoQ, October 18. Accessed 2022-12-08.
- Baydin, Atılım Günes, Barak A. Pearlmutter, Alexey Andreyevich Radul, and Jeffrey Mark Siskind. 2017. "Automatic differentiation in machine learning: a survey." The Journal of Machine Learning Research, vol. 18, no. 1, pp. 1-43, January. Accessed 2020-03-05.
- Bharath GS. 2020. "bharathgs/Awesome-pytorch-list." GitHub, March 2. Accessed 2020-03-05.
- Collobert, Ronan, Samy Bengio, and Johnny Mariéthoz. 2002. "Torch: a modular machine learning software library." IDIAP Research Report, RR 02-46, IDIAP Publications, October 30. Accessed 2020-03-05.
- Collobert, Ronan, Koray Kavukcuoglu, and Clément Farabet. 2011. "Torch7: A Matlab-like Environment for Machine Learning." NIPS Workshop. Accessed 2020-03-05.
- DeVito, Zachary. 2019. "Torchscript: Optimized Execution of PyTorch Programs." NeurIPS, December 14. Accessed 2020-03-07.
- Doshi, Sanket. 2019. "Various Optimization Algorithms For Training Neural Network." Medium, January 13. Accessed 2020-03-07.
- Eckerle, Natalie. 2020. "PyTorch: A resources guide for developers." TechRepublic, February 20. Accessed 2020-03-05.
- El Aidouni, Manal. 2019. "Pytorch guide 101." May 26. Accessed 2020-03-05.
- Exxact Corporation. 2020. "PyTorch vs TensorFlow in 2020: What You Should Know About These Frameworks." Blog, Exxact Corporation, January 23. Accessed 2020-03-05.
- HIPS GitHub. 2019. "HIPS/autograd." November 18. Accessed 2020-03-07.
- He, Hecate. 2019a. "PyTorch Deep Learning Framework: Speed + Usability." Synced, December 16. Accessed 2020-03-05.
- He, Horace. 2019b. "The State of Machine Learning Frameworks in 2019." The Gradient, October 10. Accessed 2020-03-05.
- Maclaurin, Dougal. 2016. "Modeling, Inference and Optimization with Composable Differentiable Procedures." PhD thesis, Graduate School of Arts & Sciences, Harvard University, April. Accessed 2020-03-05.
- Meta. 2022. "Announcing the PyTorch Foundation: A new era for the cutting-edge AI framework." MetaAI Research, September 12. Accessed 2022-12-08.
- NVIDIA Docs. 2020. "PyTorch Release 20.02." Deep Learning Frameworks Documentation, February 24. Accessed 2020-03-05.
- Paszke, Adam, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. "PyTorch: An Imperative Style, High-Performance Deep Learning Library." arXiv, v1, December 3. Accessed 2020-03-05.
- Peng, Tony. 2018. "Caffe2 Merges With PyTorch." Synced, on Medium, April 3. Accessed 2020-03-07.
- PyTorch. 2020. "PyTorch 1.4 released, domain libraries updated." Blog, PyTorch, January 15. Accessed 2020-03-05.
- PyTorch. 2020b. "Features." Accessed 2020-03-07.
- PyTorch Docs. 2020a. "torch." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020b. "torch.Tensor." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020c. "torch.nn." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020d. "torch.utils.data." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020e. "torch.optim." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020f. "CUDA Semantics." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020g. "Quantization." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020h. "torchvision." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch Docs. 2020i. "torch.nn.functional." PyTorch v1.4.0. Accessed 2020-03-05.
- PyTorch GitHub. 2020. "Releases." pytorch/pytorch, on GitHub. Accessed 2020-03-05.
- Shrimali, Vishwesh. 2019. "PyTorch for Beginners: Basics." Learn OpenCV, May 31. Accessed 2020-03-05.
- Thomas, Sherin, and Sudhanshu Passi. 2019. "PyTorch Deep Learning Hands-On." Packt Publishing Limited, April. Accessed 2020-03-05.
- Torch GitHub. 2019. "torch/torch7." April 18. Accessed 2020-03-05.
Further Reading
- Shrimali, Vishwesh. 2019. "PyTorch for Beginners: Basics." Learn OpenCV, May 31. Accessed 2020-03-05.
- He, Horace. 2019. "The State of Machine Learning Frameworks in 2019." The Gradient, October 10. Accessed 2020-03-05.
- Amidi, Afshine and Shervine Amidi. 2018. "A detailed example of how to generate your data in parallel with PyTorch." Blog, March. Accessed 2020-03-05.
- DeVito, Zachary. 2019. "Torchscript: Optimized Execution of PyTorch Programs." NeurIPS, December 14. Accessed 2020-03-07.
See Also
- PyTorch Tensor
- PyTorch Data Handling
- TorchScript
- Deep Learning Frameworks
- Python