NumPy
NumPy is an open source Python library that enables efficient manipulation of multidimensional numerical data structures, which NumPy calls arrays. NumPy is an alternative to Interactive Data Language (IDL) and MATLAB.
Since its release in 2005, NumPy has become a fundamental package for numerical and scientific computing in Python. In addition to efficient data structures and operations on them, it provides many high-level mathematical functions that aid scientific computation. Pandas, SciPy, Matplotlib, scikit-learn and scikit-image are just a few popular scientific packages that make use of NumPy.
Discussion

What does NumPy do differently from core Python? Python is slower than compiled languages such as C, but it's easy to learn and well suited to rapid prototyping and iterative development.
While Python's list data type can be used to construct multidimensional data structures (lists containing lists), NumPy is faster and provides a better API for developers. Python's lists are general purpose. They can contain data of different types. This means that types are also stored, type-dispatching code is invoked at runtime and types are checked. Lists are processed using loops or comprehensions and can't be vectorized to support element-wise operations. NumPy sacrifices some of Python's flexibility to improve performance. Specifically, NumPy is better in these aspects:
 Size: NumPy data structures take up less space. Each Python integer object takes 28 bytes whereas in NumPy an integer is just 8 bytes. A Python list of n items requires 64 + 8n + 28n bytes whereas in NumPy it's 96 + 8n bytes.
 Performance: NumPy code runs faster than Python code, particularly for large input data.
 Functionality: NumPy provides lots of functions and methods to simplify operations. High-level operations such as linear algebra are also included.
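The size difference can be checked with a short sketch like the one below. It assumes CPython on a 64-bit platform; the exact header sizes vary by Python and NumPy version, so the printed numbers may not match the figures quoted above. The variable names are illustrative only.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(1000, 1000 + n))       # n separate Python int objects
arr = np.array(py_list, dtype=np.int64)     # n packed 8-byte integers

# List cost: the list container (header plus n pointers)
# plus one full Python object per element.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# Array cost: nbytes is the raw data buffer only;
# sys.getsizeof also counts the array header for an array that owns its data.
array_data_bytes = arr.nbytes
array_total_bytes = sys.getsizeof(arr)

print("list total bytes: ", list_bytes)
print("array data bytes: ", array_data_bytes)
print("array total bytes:", array_total_bytes)
```

The gap grows linearly with n, since every list element carries its own object overhead while the array pays its header cost only once.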

What are some of the main features of NumPy? NumPy arrays are homogeneous, meaning that array elements are of the same type. Hence, no type checking is required at runtime. All elements of an array take up the same amount of space.
The spacing between elements along an axis is also constant. This is called striding. It's useful because the same data in memory can be used to create a new array without copying. Different arrays can therefore be different views into the same memory. Thus, it's easier to modify data subsets in memory.
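As a small illustration of strides and views, the sketch below slices an array and confirms that the slice shares memory with the original. The array contents and variable names are arbitrary choices for the example.

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)
print(a.strides)                 # (32, 8): bytes to step along rows, columns

# Slicing with a step creates a view with different strides, not a copy.
every_other_col = a[:, ::2]
print(every_other_col.strides)   # (32, 16)
print(np.shares_memory(a, every_other_col))   # True

# Writing through the view changes the original array.
every_other_col[0, 0] = -1
print(a[0, 0])                   # -1
```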
Operations are vectorized, which means that an operation is applied to many array elements at once in compiled code, where it may also be executed in parallel. This speeds up computation. Developers need not write for loops.
NumPy provides APIs for easy manipulation of arrays. Some of these are indexing, slicing, reshaping, stacking and splitting. Broadcasting is a feature that allows operations between arrays and scalars, or between arrays of different but compatible shapes.
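The operations named above can be sketched in a few lines. The arrays here are toy values chosen for illustration.

```python
import numpy as np

x = np.arange(6, dtype=np.float64)       # [0. 1. 2. 3. 4. 5.]

# Vectorized arithmetic: no explicit for loop over elements.
squared_plus_one = x * x + 1.0

# Indexing and slicing.
last = x[-1]
evens = x[::2]

# Reshaping the same six values into 2 rows and 3 columns.
grid = x.reshape(2, 3)

# Stacking two arrays vertically, then splitting the result in half.
stacked = np.vstack([grid, grid])        # shape (4, 3)
top, bottom = np.split(stacked, 2)       # two arrays of shape (2, 3)

# Broadcasting: a scalar, then a length-3 vector, applied to a (2, 3) array.
scaled = grid * 10
offsets = np.array([100.0, 200.0, 300.0])
shifted = grid + offsets                 # offsets is added to every row

print(squared_plus_one)
print(shifted)
```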
NumPy integrates easily with C/C++ or Fortran code that may provide optimized implementations. Useful functions covering linear algebra, Fourier transform, and random numbers are provided.
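A brief sketch of these three function families follows. The matrix, signal and seed are made-up inputs for illustration.

```python
import numpy as np

# Linear algebra: solve the system A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([9.0, 8.0])
x = np.linalg.solve(A, b)                # approximately [2., 3.]

# Fourier transform: a sine with 5 cycles over 64 samples peaks at bin 5.
t = np.arange(64)
signal = np.sin(2 * np.pi * 5 * t / 64)
spectrum = np.abs(np.fft.rfft(signal))
peak_bin = int(np.argmax(spectrum))

# Random numbers: a seeded generator gives reproducible samples.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0.0, scale=1.0, size=1000)

print(x, peak_bin, samples.mean())
```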

Could you share some performance numbers comparing NumPy versus Python implementations? For a simple computation of mean and standard deviation of a million floating-point numbers, NumPy was 30X faster than a pure Python implementation. However, optimized Cython and C implementations were even faster. Another study showed that if the input is small (less than 200 numbers), pure Python did better than NumPy. For inputs greater than about 15,000 numbers, NumPy outperformed C++.
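Readers who want to try a comparison of this kind could start from a sketch like the following. It is not the benchmark from the cited studies: the input size, repeat count and function names are assumptions, and the measured ratio will depend on the machine and library versions.

```python
import math
import timeit
import numpy as np

n = 100_000
rng = np.random.default_rng(seed=42)
data_np = rng.random(n)          # NumPy array of floats
data_py = data_np.tolist()       # the same values as a Python list

def mean_std_python(values):
    """Population mean and standard deviation using plain Python loops."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)

def mean_std_numpy(values):
    """Same quantities using NumPy's vectorized methods (ddof=0 by default)."""
    return values.mean(), values.std()

t_python = timeit.timeit(lambda: mean_std_python(data_py), number=5)
t_numpy = timeit.timeit(lambda: mean_std_numpy(data_np), number=5)
print(f"pure Python: {t_python:.4f} s, NumPy: {t_numpy:.4f} s")
```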
One experiment in Machine Learning compared pure Python, NumPy and TensorFlow (on CPU) implementations of gradient descent. Runtimes were 18.65, 0.32 and 1.20 seconds respectively. NumPy was more than 50X faster than pure Python. For more complex ML problems deployed on multiple GPUs, TensorFlow is likely to outperform NumPy.
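To give a sense of what a NumPy implementation of gradient descent looks like, here is a minimal sketch that fits a straight line to noisy data. The problem (linear regression), the synthetic data, the learning rate and the iteration count are all assumptions for illustration and are not taken from the cited experiment.

```python
import numpy as np

# Synthetic data: y is roughly 2x + 1 with a little noise (assumed values).
rng = np.random.default_rng(seed=0)
x = rng.uniform(0.0, 1.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.05, size=200)

w, b = 0.0, 0.0                  # slope and intercept to be learned
learning_rate = 0.5

for _ in range(2000):
    error = w * x + b - y                    # vectorized over all samples
    grad_w = 2.0 * np.mean(error * x)        # d(MSE)/dw
    grad_b = 2.0 * np.mean(error)            # d(MSE)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w = {w:.3f}, b = {b:.3f}")
```

Each iteration touches all 200 samples through array expressions, which is where NumPy gains over a pure Python loop.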
When evaluating NumPy performance, the underlying library for vector/matrix computations matters. NumPy comes with default BLAS and LAPACK implementations. Depending on the distribution, alternatives may be included: OpenBLAS, Intel MKL, ATLAS, etc. In general, these alternatives are faster than the default library. For example, SVD is 10X faster on Intel MKL.
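One way to see which BLAS and LAPACK libraries a given NumPy installation was built against is to print its build configuration, as in the short sketch below. The output format differs between NumPy versions and distributions.

```python
import numpy as np

print("NumPy version:", np.__version__)

# Prints build information, including the BLAS/LAPACK libraries in use.
np.show_config()
```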
Hardware platforms may provide further acceleration. For example, Intel AVX2 provides at least 20% improvement on top of OpenBLAS.

Does NumPy automatically make use of GPU hardware? NumPy doesn't natively support GPUs. However, there are tools and libraries to run NumPy on GPUs.
Numba is a Python compiler that can compile Python code to run on multicore CPUs and CUDA-enabled GPUs. Numba also understands NumPy and generates optimized compiled code. Developers specify type signatures for Python functions. Numba uses them for just-in-time (JIT) compilation. The Numba team also provides pyculib, which is a Python interface to CUDA libraries such as cuBLAS, cuFFT and cuRAND.
Grumpy has been proposed as a framework to seamlessly target multicore CPUs and GPUs. It does a mix of JIT compilation and offloading to optimized libraries such as cuBLAS or LAPACK.
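As a rough sketch of how a type signature is given to Numba, the example below compiles a simple loop over a NumPy array for the CPU. It assumes Numba is installed; if it isn't, a no-op stand-in decorator (an assumption made only so that the sketch still runs) leaves the function as plain Python. Targeting a GPU needs Numba's CUDA support and is not shown here.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: Numba may be missing. This stand-in does no compilation.
    def njit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator

# The string is a type signature: takes a 1-D float64 array, returns float64.
@njit("float64(float64[:])")
def sum_of_squares(values):
    total = 0.0
    for i in range(values.shape[0]):
        total += values[i] * values[i]
    return total

data = np.arange(5, dtype=np.float64)    # [0. 1. 2. 3. 4.]
result = sum_of_squares(data)
print(result)                            # 30.0
```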
CuPy is a Python library that implements NumPy arrays for CUDA-enabled GPUs and leverages CUDA GPU acceleration libraries. CuPy code is mostly a drop-in replacement for NumPy code since the APIs are very similar. PyCUDA is a related library, featured on NVIDIA's developer site, that gives Python access to NVIDIA's CUDA API.
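Because the CuPy and NumPy APIs are so similar, the same array code can often run on either. The sketch below assumes that CuPy may or may not be installed and falls back to NumPy when CuPy or a usable CUDA device is unavailable; the alias name xp is a common convention, not a requirement.

```python
import numpy as np

try:
    import cupy as xp
    # Assumption: this call raises if no usable CUDA device is present.
    xp.cuda.runtime.getDeviceCount()
except Exception:
    xp = np                              # fall back to NumPy on the CPU

# The code below is identical for both back ends.
a = xp.arange(6, dtype=xp.float32).reshape(2, 3)
b = xp.ones((3, 2), dtype=xp.float32)
product = a @ b                          # matrix product, shape (2, 2)
total = float(product.sum())             # copies the scalar back to the host

print(xp.__name__, total)
```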
MinPy is similar to CuPy and is meant to be a NumPy interface above MXNet for building artificial neural networks. It includes automatic differentiation in addition to transparent CPU/GPU acceleration.

What are some essential resources to learn NumPy? The main NumPy website is the definitive resource to consult. Beginners can start by reading its Quickstart tutorial or the absolute beginner's guide. The latter includes the basics of installing NumPy.
Rougier's book titled From Python to Numpy focuses on Python programmers who wish to learn NumPy and its vectorization. Perhaps a classic is the book titled Guide to NumPy, by Travis E. Oliphant who created NumPy.
MATLAB users might want to read NumPy for Matlab users. It maps MATLAB operations to NumPy equivalents.
The DataCamp blog has shared a handy NumPy cheatsheet.
Those who wish to contribute to the NumPy project or study its source code can head to NumPy's GitHub repository.
Milestones
1995
Numeric is released to enable numerical computations. It's designed to provide homogeneous numeric arrays, that is, arrays whose elements all belong to the same data type and are therefore easier and faster to process.
2005
NumPy is released based on an older library named Numeric. It also combines features of another library named Numarray. NumPy is initially named SciPy Core but is renamed to NumPy in January 2006.
2006
NumPy v1.0 is released.
2009
NumPy v1.3.0 is released. This release includes experimental Windows 64-bit support. Support for 64-bit OpenBLAS comes a decade later in December 2019.
2010
NumPy v1.5.0 is released. This is the first release to support Python 3.
2019
GitHub publishes a study of Machine Learning (ML) projects hosted on its platform. The study spans contributions from Jan-Dec 2018. It finds that 74% of ML Python projects import NumPy. This is followed by SciPy and Pandas.
2019
NumPy v1.17.0 is released. This release supports Python 3.5-3.7 but drops support for Python 2.7. In fact, NumPy v1.16.x is the last series to support Python 2.7 but, being a long-term release, v1.16.x will be maintained till 2020. NumPy v1.16.6 is released in December 2019.
2020
Following the end of life of Python 2 in January 2020, the number of downloads for older NumPy releases based on Python 2 falls sharply. By April 2020, 80% of NumPy downloads are based on Python 3.
References
 Candido, Renato. 2018. "Pure Python vs NumPy vs TensorFlow Performance Comparison." Real Python, May 7. Updated 2018-07-05. Accessed 2020-04-27.
 Cohen, Ori. 2019. "Is your Numpy optimized for speed?" Towards Data Science, on Medium, September 27. Accessed 2020-04-27.
 Cournapeau, David. 2018. "File:NumPy logo.svg." Wikipedia, August 29. Accessed 2020-04-27.
 Elliott, Thomas. 2019. "The State of the Octoverse: machine learning." Blog, GitHub, January 24. Accessed 2020-04-27.
 Fowler, Matt. 2016. "Speeding up Python and NumPy: C++ing the Way." Medium, March 20. Accessed 2020-04-27.
 Harris, Mark. 2013. "Numba: High-Performance Python with CUDA Acceleration." NVIDIA Developer Blog, September 19. Updated 2017-09-19. Accessed 2020-04-27.
 Jimenez, Athenas. 2016. "Improving Python performance for scientific tools and libraries." 01.org, Blog, Intel Open Source, Intel Corporation, May 13. Accessed 2020-04-27.
 Konrad, Markus. 2018. "Vectorization and parallelization in Python with NumPy and Pandas." WZB Data Science Blog, February 02. Accessed 2020-04-27.
 MinPy Docs. 2016. "NumPy under MinPy, with GPU." Distributed (Deep) Machine Learning Community, on Read the Docs, November 11. Accessed 2020-04-27.
 NVIDIA Developer. 2011. "PyCUDA." NVIDIA, October 02. Updated 2018-10-11. Accessed 2020-04-27.
 NumPy. 2020a. "Older Array Packages." Accessed 2020-04-27.
 NumPy. 2020b. "Homepage." NumPy. Accessed 2020-04-27.
 NumPy DevDocs. 2020. "NumPy: the absolute basics for beginners." April 26. Accessed 2020-04-27.
 NumPy Docs. 2020a. "Release Notes." NumPy, February 5. Accessed 2020-04-27.
 NumPy Docs. 2020b. "NumPy 1.17.0 Release Notes." NumPy, February 5. Accessed 2020-04-27.
 NumPy Docs. 2020c. "NumPy 1.16.0 Release Notes." NumPy, February 5. Accessed 2020-04-27.
 NumPy Docs. 2020d. "NumPy 1.5.0 Release Notes." NumPy, February 5. Accessed 2020-04-27.
 NumPy Docs. 2020e. "NumPy for Matlab users." NumPy, February 5. Accessed 2020-04-27.
 PyPI. 2020. "Release history." numpy, 1.18.3, April 20. Accessed 2020-04-27.
 PyPI Stats. 2020. "numpy." PyPI Stats, April 27. Accessed 2020-04-27.
 Ravishankar, Mahesh, and Vinod Grover. 2019. "Automatic acceleration of Numpy applications on GPUs and multicore CPUs." arXiv, v1, January 11. Accessed 2020-04-27.
 Ross, Paul. 2014. "The Performance of Python, Cython and C on a Vector." Notes on Cython, October 6. Accessed 2020-04-27.
 Rougier, Nicolas P. 2017. "From Python to Numpy." May. Accessed 2020-04-27.
 SciPy. 2020. "Frequently Asked Questions." SciPy. Accessed 2020-04-27.
 SciPy GitHub. 2020. "SciPy: History_of_SciPy." Accessed 2020-04-27.
 Seif, George. 2019. "Here’s How to Use CuPy to Make Numpy Over 10X Faster." Towards Data Science, on Medium, August 22. Accessed 2020-04-27.
 UCF. 2020. "Python Lists vs. Numpy Arrays - What is the difference?" webcourses@UCF, IST Advanced Topics Primer, Univ. of Central Florida. Accessed 2020-04-27.
 Waters, John K. 2020. "Python 2 Officially Hits End of Life, Final Few Fixes Coming April 2020." ADTMag, 1105 Media Inc., January 09. Accessed 2020-04-27.
Further Reading
 NumPy DevDocs. 2020. "NumPy: the absolute basics for beginners." April 26. Accessed 2020-04-27.
 Harris, Mark. 2013. "Numba: High-Performance Python with CUDA Acceleration." NVIDIA Developer Blog, September 19. Updated 2017-09-19. Accessed 2020-04-27.
 Zelenka, Scott. 2018. "How to shrink NumPy, SciPy, Pandas, and Matplotlib for your data product." Towards Data Science, on Medium, September 25. Accessed 2020-04-27.
See Also
 NumPy Data Types
 NumPy Array Operations
 Python for Scientific Computing
 SciPy
 Pandas
 PyCUDA