Profiling Python Code
- Summary
- Discussion
- What are the Python packages or modules that help with profiling?
- Are there visualization tools to simplify the analysis of profiler output?
- What are the types of profilers?
- What are the IPython magic commands that help with profiling?
- How to interpret the output of cProfile?
- What should I look for when analyzing any profiling output?
- Could you share some tips for optimizing profiled code?
- Are there Python IDEs that come with profilers?
- What's the decorator approach to profiling Python code?
- What's Pyflame and why may I want to use it?
- Milestones
- Sample Code
- Further Reading
Python documentation defines a profile as "a set of statistics that describes how often and for how long various parts of the program executed." In addition to measuring time, profiling can also tell us about memory usage.
The idea of profiling code is to identify bottlenecks in performance. It may be tempting to guess where the bottlenecks could be but profiling is more objective and quantitative. Profiling is a necessary step before attempting to optimize any program. Profiling can lead to restructuring code, implementing better algorithms, releasing unused memory, caching computational results, improving data handling, etc.
Discussion
- What are the Python packages or modules that help with profiling? Python's standard distribution includes two profiling modules by default: profile and cProfile. profile is a pure Python module that adds significant overhead, so cProfile, which is implemented in C, is preferred. Results from either module can be formatted into reports using the pstats module.
cProfile adds a reasonable overhead. Nonetheless, it should be used only for profiling and not for benchmarking. Benchmarking compares performance between different implementations, possibly on different platforms. Because of the overhead introduced by cProfile, to benchmark C code against Python code, use the %time magic command in IPython instead. Alternatively, use the timeit module.
To trace memory leaks, tracemalloc can be used.
Other useful modules to install include line_profiler, memory_profiler and psutil. The module pprofile takes longer to profile but gives more detailed information than line_profiler. For memory profiling, heapy and meliae may also help. PyCounters can be useful in production.
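As a minimal sketch of how cProfile and pstats fit together, the following profiles a deliberately slow function (the naive fib here is just an illustrative workload, not from the article):

```python
import cProfile
import io
import pstats

def fib(n):
    # Naive recursion: a deliberately slow example worth profiling.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

profiler = cProfile.Profile()
profiler.enable()
result = fib(20)
profiler.disable()

# Format the raw profile into a readable report with pstats.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)  # top 5 entries by cumulative time
print(stream.getvalue())
```

The report lists each function with its call count, total time and cumulative time, which is the starting point for the analysis questions below.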
- Are there visualization tools to simplify the analysis of profiler output? vprof is a profiler that displays its output as a webpage. It launches a Node.js server, which implies that Node.js must be installed. On the webpage, you can hover over elements to get more information interactively.
The output of cProfile can be saved to a file and then converted using pyprof2calltree. The converted file can be opened with KCachegrind (for Linux) or QCachegrind (for Mac and Windows). Likewise, gprof2dot and RunSnakeRun are alternative graphical viewers.
PyCallGraph profiles a program and outputs the statistics in a format that can be opened by Graphviz, a graph visualization software.
nylas-perftools adds instrumentation around code, profiles it and exports the results in JSON format. These can be imported into Chrome Developer Tools to visualize the timeline of execution. This can also be used in production since the app stack is only sampled periodically.
- What are the types of profilers? Three types of profilers exist:
- Event-based: These collect data when an event occurs. Such events may be entry/exit of functions, load/unload of classes, allocation/release of memory, exceptions, etc. cProfile is an example.
- Instrumented: The application profiles itself, either by modifying the code or with built-in support from the compiler. Using decorators to profile code is an example of this. However, instrumenting code is not required in Python because the interpreter provides hooks for profiling.
- Statistical: Data is collected periodically, so what the profiler sees is a sample. While less accurate, this has low overhead. Examples include Intel® VTune™ Amplifier, Nylas Perftools and Pyflame. By contrast, cProfile, line_profiler and memory_profiler are deterministic profilers.
In general, deterministic profiling adds significant overhead whereas statistical profiling has low overhead since profiling is done by sampling. For this reason, the latter can also be used in production. We should note that the profiling overhead in Python is mostly due to the interpreter: the overhead due to deterministic profiling itself is not that expensive.
We can also classify profilers as function profilers, line profilers or memory profilers.
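The interpreter hooks mentioned above are exposed through sys.setprofile. A toy event-based (deterministic) profiler can be sketched with it; this version only counts 'call' events, and the work/helper functions are invented for the example:

```python
import sys
from collections import Counter

call_counts = Counter()

def tracer(frame, event, arg):
    # The interpreter invokes this hook on profiling events such as
    # 'call', 'return', 'c_call' and 'c_return'.
    if event == 'call':
        call_counts[frame.f_code.co_name] += 1

def helper(i):
    return i * i

def work():
    return sum(helper(i) for i in range(10))

sys.setprofile(tracer)   # install the deterministic profiling hook
total = work()
sys.setprofile(None)     # always remove the hook when done

print(call_counts['helper'])  # helper was called 10 times
```

Real deterministic profilers like cProfile use the equivalent C-level hook, which is why their overhead is modest compared to the interpreter's own cost.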
- What are the IPython magic commands that help with profiling? IPython has the following magic commands for profiling:
- %time: Shows how long one or more lines of code take to run.
- %timeit: Like %time but gives an average from multiple runs. The -n option can be used to specify the number of runs. Depending on how long the program takes, the number of runs is limited automatically. This is unlike the timeit module.
- %prun: Shows the time taken by each function.
- %lprun: Shows the time taken line by line. Functions to profile can be specified with the -f option.
- %mprun: Shows how much memory is used.
- %memit: Like %mprun but gives an average from multiple runs, which can be specified with the -r option.
Commands %time and %timeit are available by default. Commands %lprun, %mprun and %memit are available via the modules line_profiler, memory_profiler and psutil. But to use them within IPython as magic commands, they must be mapped via IPython extension files, or loaded with the %load_ext magic command.
When timing multiple lines of code, use %%timeit instead of %timeit. This cell form is available for %prun as well.
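Outside IPython, the standard timeit module offers functionality similar to %timeit. A rough script-level equivalent, with the statement and repeat counts chosen arbitrarily for illustration:

```python
import timeit

# Roughly equivalent to "%timeit sum(data)" in IPython:
# run the statement 10,000 times, repeat the measurement 3 times,
# and keep the lowest total as the least-noisy estimate.
setup = "data = range(1000)"
times = timeit.repeat("sum(data)", setup=setup, number=10_000, repeat=3)
best = min(times)
print(f"best of 3: {best / 10_000 * 1e6:.2f} µs per loop")
```

Unlike %timeit, the timeit module does not pick the number of runs automatically; you must choose number and repeat yourself.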
- How to interpret the output of cProfile? The output of cProfile can be ordered using the -s option. In the accompanying image, we see that the results are ordered by cumulative time in descending order. Everything starts at the module level, from where the main() function is called. We can also see that most of this time is spent in api.py:request(), sessions.py:request() and sessions.py:send(). However, we are unable to tell if these are part of the same call flow or parallel flows. A graphical viewer would be more useful here.
The column "tottime" is the time spent within a function, excluding time spent in the functions it calls. We also see that the readline() function from socket.py is called many times. It accounts for 1.571 seconds of the 4.481 seconds of cumulative time. However, the total time within this function is 0. This means that we need to look at the functions called by readline(); we suspect the read operation of the SSL socket. The entry "4/3" indicates that sessions.py:send() is called 3 times plus 1 recursive call.
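To experiment with this kind of output, stats can also be saved to a file and reordered programmatically with pstats. A small sketch, where outer and inner are hypothetical functions chosen to show the tottime vs cumulative distinction:

```python
import cProfile
import pstats

def inner():
    return sum(i * i for i in range(50_000))

def outer():
    # outer's cumulative time includes inner, but its own
    # "tottime" stays near zero -- the distinction discussed above.
    return inner()

pr = cProfile.Profile()
result = pr.runcall(outer)
pr.dump_stats('profile.out')          # save raw stats to a file

stats = pstats.Stats('profile.out')
stats.sort_stats('cumulative').print_stats('inner')  # only rows matching 'inner'
```

The same 'profile.out' file is what tools like pyprof2calltree consume to produce graphical views.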
- What should I look for when analyzing any profiling output? Identify which functions (or lines of code) take most of the execution time, and which functions are called most often. Look for call stacks that you didn't expect. Know the difference between total time spent in a function and cumulative time: the latter enables direct comparison of recursive implementations against iterative ones.
With respect to memory usage, check whether there's a memory leak.
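For the memory side, tracemalloc can compare snapshots taken before and after a suspect operation. The leaky function below is a contrived example invented for illustration:

```python
import tracemalloc

leak = []

def leaky():
    # Simulate a leak: objects accumulate in a global list
    # and are never released.
    leak.extend(bytearray(1000) for _ in range(100))

tracemalloc.start()
before = tracemalloc.take_snapshot()
leaky()
after = tracemalloc.take_snapshot()

# Source lines that allocated the most new memory between snapshots.
top = after.compare_to(before, 'lineno')
for stat in top[:3]:
    print(stat)
tracemalloc.stop()
```

A genuine leak shows up as a line whose size_diff keeps growing across repeated snapshot comparisons.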
- Could you share some tips for optimizing profiled code? Keep in mind that speeding up a function 100 times is irrelevant if that function accounts for only a few percent of the program's total execution time. This is the essence of profiling: optimize only what matters.
If profiling shows that I/O is a bottleneck, threading can help. If your code uses a lot of regex, try to replace these with plain string operations, which run faster. To avoid repeated computations, earlier results can be cached; this is called memoization, and where applicable it can improve performance considerably. If there are no obvious places to optimize the code, you can also consider an alternative runtime such as PyPy, or move critical parts of the code into Cython or C and call them from Python. Likewise, vectorize operations using NumPy. Hoist invariant work out of inner loops. Remove logging calls or make them conditional. If a particular function is called many times, the code could be restructured.
Whether during profiling or after optimizing your code, don't ignore your unit tests. They must continue to pass. Also, keep in mind that optimizing code can compromise readability and maintainability.
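Memoization, for instance, is a one-line change with functools.lru_cache from the standard library:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    # Results are cached, so each distinct n is computed only once,
    # turning the exponential naive recursion into linear time.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(100))          # instant; the uncached version would never finish
print(fib.cache_info())  # hits/misses show the cache doing the work
```

Profiling before and after such a change is the honest way to confirm the speedup actually matters for your workload.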
- Are there Python IDEs that come with profilers? With IPython, Jupyter and Anaconda's Spyder, %time and %timeit can be used by default. For line profiling or memory profiling, the necessary modules need to be installed before invoking them.
PyCharm Community Edition does not support profiling but the Professional Edition does. In fact, three profilers are supported: cProfile, yappi and vmprof. vmprof is a statistical profiler.
Visual Studio comes with a built-in profiler. However, this works only for CPython and not for IronPython, for which the .NET profiler should be used.
- What's the decorator approach to profiling Python code? It's possible to decorate a function for profiling purposes. We can timestamp the entries and exits, and thereby calculate the time spent within the function. The advantage of this approach is that there's no dependence on any profiling tool and overhead can be kept low. We can choose to profile only specific parts of our program and ignore core modules and third-party modules. We'll also have better control over how results are exported.
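One possible sketch of such a decorator; the profiled name and the attributes it records are our own choices for this example, not a standard API:

```python
import functools
import time

def profiled(func):
    # Accumulate wall-clock time and call count for the decorated function.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            wrapper.total_time += time.perf_counter() - start
            wrapper.calls += 1
    wrapper.total_time = 0.0
    wrapper.calls = 0
    return wrapper

@profiled
def slow_sum(n):
    return sum(range(n))

for _ in range(5):
    slow_sum(100_000)

print(f"{slow_sum.calls} calls, {slow_sum.total_time:.6f}s total")
```

The try/finally ensures timing is recorded even if the function raises, and functools.wraps preserves the wrapped function's name and docstring.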
- What's Pyflame and why may I want to use it? Pyflame is a profiler for Python that takes snapshots of the Python call stack and generates flame graphs from them. Pyflame is based on Linux's ptrace system call, so it can't be used on Windows systems. Pyflame overcomes the limitations of cProfile, which adds overhead and doesn't give a full stack trace. Pyflame also works with code that's not instrumented for profiling. Because Pyflame is statistical in nature, meaning that it doesn't profile every single function call, it can also be used in production.
Milestones
2008
Sample Code
Further Reading
- Klitzke, Evan. 2016. "Pyflame: Uber Engineering's Ptracing Profiler for Python." Uber Engineering Blog, September 27. Accessed 2018-04-20.
- Danjou, Julien. 2015. "Profiling Python using cProfile: a concrete case." Julien Danjou Blog, November 16. Accessed 2018-04-20.
- Toubhans, Antoine. 2017. "Surgical Time Tracking in Python." Sicara Blog, September 12. Accessed 2018-05-06.
See Also
- Code Profiling
- Python
- Python for Scientific Computing
- Algorithmic Complexity
- Static Code Analysis
- Software Engineering