R Plotting Systems
Data visualization is an important task for a statistician, data analyst or data scientist. It's an essential part of exploratory data analysis. R has more than one plotting system to do this visualization. Each has a different syntax. R programmers prefer to master one of them and at least become familiar with the rest.
Default R packages focus on 2D plots. However, third-party packages are available for 3D and interactive plots. Plots produced can be exposed via web interfaces. R also integrates well with many third-party visualization and analysis tools.
Which are the plotting systems in R?
- Base: We start with a blank canvas and start adding elements to it one by one. We can create the main plot and then add labels, axes, lines, and so on. Base plots are said to be intuitive since the process of creating them closely mirrors the thought process. Once something's on the plot, we can't go back and correct it.
- Lattice: A plot is created with a single function call. Margins and spacing are set automatically since the entire plot is known when the function is called. Lattice is good for multivariate plots since it's easy to create many subplots.
- ggplot2: This is a cross between Base and Lattice systems. Like Lattice, many things are automatically set but like Base it allows us to add to the plot after it's created. Lots of customizations are possible.
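As a rough illustration of the three systems, the sketch below draws the same scatterplot in each one. The built-in mtcars dataset and the chosen variables are our own assumptions for illustration, and the lattice and ggplot2 packages are assumed to be installed.

```r
library(lattice)
library(ggplot2)

# Base: draw the main plot, then add elements to the canvas one at a time
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(lm(mpg ~ wt, data = mtcars), col = "red")  # regression line added afterwards
title(main = "Base")

# Lattice: the whole plot (points plus regression line) is described in a single call
p_lattice <- xyplot(mpg ~ wt, data = mtcars, type = c("p", "r"),
                    xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
                    main = "Lattice")
print(p_lattice)

# ggplot2: create a plot object, then keep adding layers to it
p_gg <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p_gg <- p_gg +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  ggtitle("ggplot2")
print(p_gg)
```

Note that the Lattice and ggplot2 calls return objects that are drawn only when printed, whereas each Base call draws immediately.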
Which of the R plotting systems should I learn?
Users on Quora have commented that Base plots are good for exploratory data analysis. The idea is to plot quickly without thinking about neatness. But if you need to create plots for publications, ggplot2 is preferred. Lattice plots are not that popular. Nathan Yau has compared both Base and ggplot2. He uses only Base.
Jeff Leek has echoed a similar sentiment that he prefers using Base for exploratory data analysis. Defaults available in ggplot2 can produce great plots with minimal code but can fool students into thinking that they're production ready.
Arguing against Base, David Robinson says that ggplot2's pretty plots should be preferred over Base's ugly ones, even for exploratory data analysis. Creating legends, grouped lines and facets is cumbersome in the Base system. With ggplot2, we don't need loops, grid statements or if statements because, in his words,
Base plotting is imperative, it’s about what you do. ggplot2 plotting is declarative, it’s about what your graph is.
We should note that the qplot command of ggplot2 offers a simplified syntax that's similar to the Base system. Hence, learning only the ggplot2 system thoroughly may be enough.
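To make the imperative versus declarative distinction concrete, here is a minimal sketch that draws grouped lines with a legend in both systems. The dataset (mtcars), the grouping variable and the colour choices are assumptions made purely for illustration.

```r
library(ggplot2)

# Base (imperative): set up an empty plot, loop over groups, build the legend by hand
groups <- sort(unique(mtcars$cyl))
cols <- c("steelblue", "darkorange", "forestgreen")
plot(mtcars$wt, mtcars$mpg, type = "n", xlab = "wt", ylab = "mpg")
for (i in seq_along(groups)) {
  sub <- mtcars[mtcars$cyl == groups[i], ]
  sub <- sub[order(sub$wt), ]               # sort so the line runs left to right
  lines(sub$wt, sub$mpg, col = cols[i])
  points(sub$wt, sub$mpg, col = cols[i], pch = 19)
}
legend("topright", legend = groups, col = cols, lty = 1, pch = 19, title = "cyl")

# ggplot2 (declarative): state the mappings; grouping and legend follow from them
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_line() +
  geom_point() +
  labs(colour = "cyl")
print(p)
```

In the Base version the loop and the legend must be kept in sync manually; in the ggplot2 version both are derived from the colour mapping.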
How can I make my R plots interactive?
Interactive plots enable users to zoom into areas of interest, highlight important data points or hide irrelevant data points. Extra information can be shown via tooltips when users hover the mouse on specific data points.
With the plotly package, we can make ggplot2 plots interactive. This becomes an easy learning path for those already familiar with ggplot2. However, plotly can also be used on its own without ggplot2. Alternatives to plotly also exist.
Shiny from RStudio enables interaction via a web interface. It supports both Base and ggplot2 systems. The resulting applications, called Shiny apps, can be enhanced with additional packages. For example, the crosstalk package can be used to link interactions across widgets.
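The two ways of using plotly mentioned above can be sketched as follows. This assumes the ggplot2 and plotly packages are installed; the dataset and variable choices are ours.

```r
library(ggplot2)
library(plotly)

# Start from an ordinary static ggplot2 plot
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# ggplotly() converts it into an interactive htmlwidget (zoom, hover tooltips)
w1 <- ggplotly(p)

# plotly used on its own, without ggplot2
w2 <- plot_ly(mtcars, x = ~wt, y = ~mpg, color = ~factor(cyl),
              type = "scatter", mode = "markers")

# In an interactive session, printing w1 or w2 shows the widget in a viewer or browser
```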
What packages enable 3D plots in R?
The Base system has the function persp that draws perspective views of a surface over the x-y plane. The command demo(persp) will show what's possible. Other R packages for 3D visualization include plot3D and plotly, among others.
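A minimal persp sketch, using only Base R, is shown below. The surface function and the viewing angles are arbitrary choices made for illustration.

```r
# Evaluate z = sin(r) / r on a grid over the x-y plane
x <- seq(-10, 10, length.out = 40)
y <- seq(-10, 10, length.out = 40)
f <- function(x, y) {
  r <- sqrt(x^2 + y^2)
  ifelse(r == 0, 1, sin(r) / r)  # use the limit value 1 at the origin
}
z <- outer(x, y, f)

# theta and phi set the viewing angles; expand scales the z axis
vt <- persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
            col = "lightblue", xlab = "x", ylab = "y", zlab = "z")
```

The value returned by persp is the viewing transformation matrix, which can be used to project further points or lines onto the same view.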
Which third-party data visualization and analysis software integrate well with R?
There are plenty of data visualization and analysis tools, and many of these are now able to integrate with R. Plotly integrates well with ggplot2 and Shiny but can also do plots without either of them. Highcharts integration is available via highcharter, which uses htmlwidgets and works well with Shiny. Microsoft's Power BI can integrate with R, run R scripts and display R plots within its Power BI Desktop software.
Could you list some useful plot commands in the Base system?
You can obtain a complete list by typing library(help = "graphics") in the R console. Here we give a selection based on R version 3.5.0:
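As a minimal sketch, the following uses a few widely used functions from the graphics package. The selection is our own and is not the article's original list; the dataset is again mtcars.

```r
# Arrange four plots in a 2 x 2 grid; par() returns the previous settings
old_par <- par(mfrow = c(2, 2))

# High-level functions each start a new plot
plot(mtcars$wt, mtcars$mpg, main = "plot")            # scatterplot
h <- hist(mtcars$mpg, main = "hist")                  # histogram
boxplot(mpg ~ cyl, data = mtcars, main = "boxplot")   # box-and-whisker plot
barplot(table(mtcars$gear), main = "barplot")         # bar chart of counts

# Low-level functions add to the current plot (here, the bar chart)
abline(h = 10, lty = 2)
legend("topright", legend = "h = 10", lty = 2)
# Other commonly used low-level functions: lines(), points(), text(), axis(), title()

par(old_par)  # restore the previous layout
```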
Could you list some useful plot commands in the Lattice system?
You can obtain a complete list by typing library(help = "lattice") in the R console. Here we give a selection based on version v0.20-35. Bivariate plots can be generated using functions such as xyplot and bwplot. For 3D scatterplots and wireframes, use cloud and wireframe respectively. For histograms and density plots, use histogram and densityplot respectively. For level plots and contour plots, use levelplot and contourplot respectively.
In any of the Lattice plots, panels can be created to handle multivariate data. For example, a scatterplot comparing height vs age can be done in separate panels for males and females. Many functions enable panels and these are typically named with the prefix panel. (panel followed by a dot). These panel functions are implicitly called via the formula syntax y ~ x | a * b, where a and b are the variables by which panels are made. For example, xyplot(mpg~wt|cyl*gear, data = mtcars) will give scatterplots of mpg vs wt, with one panel for each combination of cyl and gear values.
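The functions and the conditioning syntax described above can be sketched as follows. This assumes the lattice package is installed (it normally ships with R); the conversion to factor and the custom panel function are our own illustrative choices.

```r
library(lattice)

# One scatterplot panel per combination of cyl and gear
p_xy <- xyplot(mpg ~ wt | factor(cyl) * factor(gear), data = mtcars)

# Box-and-whisker plot of mpg for each number of cylinders
p_bw <- bwplot(mpg ~ factor(cyl), data = mtcars)

# Univariate plots use a one-sided formula
p_hist <- histogram(~ mpg, data = mtcars)
p_dens <- densityplot(~ mpg | factor(am), data = mtcars)

# A custom panel function: points plus a regression line fitted within each panel
p_panel <- xyplot(mpg ~ wt | factor(am), data = mtcars,
                  panel = function(x, y, ...) {
                    panel.xyplot(x, y, ...)
                    panel.lmline(x, y, col = "red")
                  })

# Lattice functions return "trellis" objects; printing draws them
print(p_xy)
print(p_panel)
```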
Could you list some useful plot commands in the ggplot2 system?
- ggplot: This creates a new blank plot that must be completed by calling other helper functions.
- qplot: Also called Quick Plot, this offers a simplified syntax compared to ggplot. This is an ideal starting point for those familiar with R's Base plots. For complex plots, ggplot may be required.
- geom_*: These functions specify what type of geometric objects should be plotted. Examples include geom_boxplot and many more. Data, if specified here, will override data specified in ggplot.
- aes: This specifies the aesthetics, that is, the mapping of variables to visual properties such as the x and y axes. For data points, we can also select shape, colour and size. This can be done when calling ggplot or geom_* functions. Aesthetics specified in individual geom_* calls will override those specified in ggplot.
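The override behaviour of data and aesthetics can be sketched like this. The heavy subset is a hypothetical example introduced only to show a per-layer data override; ggplot2 is assumed to be installed.

```r
library(ggplot2)

# ggplot() sets the default data and aesthetic mappings; on its own it draws no data
p <- ggplot(mtcars, aes(x = wt, y = mpg))

# Hypothetical subset, used only to demonstrate overriding data in one layer
heavy <- subset(mtcars, wt > 4)

p <- p +
  # aes() given in a geom applies to that layer, on top of the defaults
  geom_point(aes(colour = factor(cyl)), size = 2) +
  # data given in a geom replaces the default data for that layer only
  geom_point(data = heavy, shape = 1, size = 5)

print(p)
```

The second layer draws open circles around the heavy cars while still inheriting the x and y mappings from the ggplot call.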
What's the technique of creating a plot with ggplot2?
ggplot2 is an implementation of a modified Grammar of Graphics. The original grammar was first proposed by Leland Wilkinson in 1999 and later revised in 2005. ggplot2 was created by Hadley Wickham, who calls his version the Layered Grammar of Graphics.
The concept of layering is used; that is, ggplot2 combines multiple layers of visualizations to make a single plot. For example, ggplot will create the plot while each call to a geom_* function creates a layer of geometric objects. Coordinates and facets are then specified. Further calls can set the theme, add annotations, adjust the scale, and so on. When all these are combined, we get the complete plot. In the layered grammar, a plot is made up of these components:
- Default dataset and mappings from variables to aesthetics.
- Layers to specify geometric objects, statistical transformations and positions.
- Scale for each aesthetic mapping.
- Coordinate system.
- Facet specification.
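The five components listed above can be spelled out explicitly in code. This is a sketch under the assumption that ggplot2 is installed; in everyday use, geom_point() would replace the explicit layer() call and the defaults would supply the scales and coordinate system.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +           # default dataset and mappings
  layer(geom = "point", stat = "identity",
        position = "identity",
        params = list(na.rm = FALSE)) +               # layer: geom, stat, position
  scale_x_continuous(name = "Weight (1000 lbs)") +    # scale for the x aesthetic
  scale_y_continuous(name = "Miles per gallon") +     # scale for the y aesthetic
  coord_cartesian() +                                 # coordinate system
  facet_wrap(~ cyl)                                   # facet specification

print(p)
```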
What customizations can I do with ggplot2?
Without being exhaustive, the following customizations in ggplot2 are possible:
- Annotations: With annotate, we can add text, shaded rectangles, lines, labels, etc.
- Coordinates: With coord_* functions, we can select coordinates (Cartesian vs Polar), transform coordinates, flip x and y axes, and so on.
- Facets: These allow visualization of multivariate data. Functions with the prefix facet_ enable this. The syntax a ~ . places the panels vertically; . ~ a places the panels horizontally, side by side.
- Themes: Themes control colours, sizes, positions, borders and margins of background, panels, axes titles, axes ticks, axes labels, and so on. Built-in themes include the default theme_gray() and theme_bw() (sets background to white). You can also create your own custom themes.
- Scale: Scales for the axes can be customized using many functions, including the scale_* family (multiple functions).
- Position: The position_* functions adjust the position of geoms.
- Statistics: These are functions that compute statistical summaries before geoms are generated.
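Several of these customizations can be combined in one plot, as in the sketch below. The specific functions chosen for each category (for example position_jitter and stat_summary) and all values are our own illustrative picks; ggplot2 is assumed to be installed.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  # Position: jitter points horizontally to reduce overplotting
  geom_point(position = position_jitter(width = 0.15, height = 0),
             colour = "grey40") +
  # Statistics: compute mean and standard error per group, drawn as a point range
  stat_summary(fun.data = mean_se, colour = "red") +
  # Annotations: add free text at the third x position (8 cylinders)
  annotate("text", x = 3, y = 30, label = "8 cylinders") +
  # Scale: set the y-axis name and limits
  scale_y_continuous(name = "Miles per gallon", limits = c(10, 35)) +
  # Facets: am ~ . stacks one panel per value of am vertically
  facet_grid(am ~ .) +
  # Coordinates: swap the x and y axes
  coord_flip() +
  # Themes: white background instead of the default grey
  theme_bw()

print(p)
```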
What are ggplot2 extensions?
Third-party packages add extra functionality to the ggplot2 plotting system. These are called ggplot2 extensions and they are tracked at ggplot2-exts.org. In May 2018, this site listed a gallery of 40 extensions. As a sample, these include radar charts, animated charts, time series charts, alluvial diagrams, directed acyclic graphs, and more. Notably, ggedit allows users to interactively edit the layers, scales and themes.
Deepayan Sarkar starts working on the Lattice system. He's inspired by Trellis Graphics, which was first proposed by Bill Cleveland in 1993 and subsequently implemented in the S/S+ languages. However, its equivalent was missing in R. Sarkar uses the grid add-on package of Paul Murrell (2002) to develop Lattice.
Hadley Wickham releases ggplot2 version 0.5. In February 2014, the package goes into maintenance mode (no new features). Version 2.2.1 of the package is released in December 2016. In the five-year span 2012-2017, the package is downloaded 10 million times. In May 2017 alone, it gets 400,000 downloads.
- Barter, Rebecca. 2017. "Interactive visualization in R." Blog, April 20. Accessed 2018-05-09.
- Bauer, Brian. 2013. "QlikView and R Integration for Predictive Analytics Example." Qlik Community, April 2. Accessed 2018-05-10.
- Casillas, Joseph V. 2018. "Plotting in R: Intro to base, lattice and ggplot2." January 23. Accessed 2018-05-08.
- Cetinkaya-Rundel, Mine. 2016. "Creating Interactive Plots with R and Highcharts." R Views, RStudio. October 19. Accessed 2018-05-10.
- CRAN ggplot2 Archive. 2018. "Index of /src/contrib/Archive/ggplot2." Accessed 2018-05-10.
- CRAN plot3D Archive. 2018. "Index of /src/contrib/Archive/plot3D." Accessed 2018-05-10.
- ggplot1 GitHub. 2016. "Before there was ggplot2." June 8. Accessed 2018-05-10.
- ggplot2 extensions. 2018. "ggplot2 extensions - gallery." Accessed 2018-05-09.
- Ginolhac, A., E. Koncina, and R. Krause. 2017. "Data plotting: ggplot2." Lecture 7, R tidyverse workshop, University of Luxembourg, May 3. Accessed 2018-05-08.
- Johnston, Susane. 2013. "R Base Graphics: An Idiot's Guide." August 30. Accessed 2018-05-09.
- Kabacoff, Robert I. 2017. "Advanced Graphs." Quick-R. Accessed 2018-05-09.
- Kopf, Dan. 2017. "The program that brought data visualization to the masses." Quartz, June 18. Accessed 2018-05-10.
- Lattice GitHub. 2017. "Trellis Graphics for R." March 23. Accessed 2018-05-10.
- Leek, Jeff. 2016. "Why I don't use ggplot2." Simply Statistics, February 11. Accessed 2018-05-08.
- Microsoft Docs. 2018. "Create Power BI visuals using R." February 5. Accessed 2018-05-10.
- MicroStrategy. 2018. "Overview of the R Integration Pack." Accessed 2018-05-10.
- Misra, Marlon. 2014. "How do R programmers choose among plotting systems (base, lattice, ggplot2, etc.)?" Quora, October 19. Accessed 2018-05-09.
- Peng, Roger D. 2016a. "Exploratory Data Analysis: Plotting Systems in R." The Johns Hopkins Data Science Lab on YouTube, January 14. Accessed 2018-05-08.
- Peng, Roger D. 2016b. "Exploratory Data Analysis with R." Bookdown, September 14. Accessed 2018-05-08.
- Plotly. 2018a. "3D Surface Plots in R." Plotly. Accessed 2018-05-08.
- Plotly. 2018b. "Plotly ggplot2 Library." Plotly. Accessed 2018-05-10.
- R-core. 2018. "The R Graphics Package." v3.5.0, RDocumentation. Accessed 2018-05-08.
- Rickert, Joseph. 2014. "3D Plots in R." R-bloggers, February 13. Accessed 2018-05-08.
- Robinson, David. 2016. "Why I use ggplot2." Variance Explained, February 12. Accessed 2018-05-08.
- RStudio Shiny GitHub. 2018. "RStudio Shiny on GitHub." May 2. Accessed 2018-05-11.
- Sape Research Group. 2018. "ggplot2 Quick Reference." Software and Programmer Efficiency Research Group, Faculty of Informatics, University of Lugano, Switzerland. Accessed 2018-05-08.
- Sarkar, Deepayan. 2002. "Lattice: An Implementation of Trellis Graphics in R." R News, vol. 2, no. 2, pp. 19–23, June. Accessed 2018-05-10.
- Sarkar, Deepayan. 2003. "Some Notes on lattice." Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20–22, Vienna, Austria. Accessed 2018-05-10.
- Sarkar, Deepayan. 2011. "Lattice: trellis graphics for R." May 8. Accessed 2018-05-10.
- Sarkar, Deepayan. 2017. "Trellis Graphics for R." v0.20-35, RDocumentation, March 25. Accessed 2018-05-08.
- Skill Gaze. 2017. "Understanding different visualization layers of ggplot." October 31. Accessed 2018-05-08.
- Smith, David. 2016. "Over 16 years of R Project history." Revolution Analytics, March 4. Accessed 2018-05-11.
- STHDA. 2018a. "ggplot2 facet : split a plot into a matrix of panels." Statistical tools for high-throughput data analysis. Accessed 2018-05-08.
- STHDA. 2018b. "Impressive package for 3D and 4D graph - R software and data visualization." Statistical tools for high-throughput data analysis. Accessed 2018-05-08.
- Tableau. 2018. "R for Statistical Computing & Analysis." Accessed 2018-05-10.
- Wickham, Hadley. 2010. "A layered grammar of graphics." Journal of Computational and Graphical Statistics, vol. 19, no. 1, pp. 3–28. Accessed 2018-05-08.
- Wickham, Hadley. 2014. "ggplot2 development." ggplot2 Google Group, February 25. Accessed 2018-05-10.
- Wickham, Hadley. 2016. "Create Elegant Data Visualisations Using the Grammar of Graphics." v2.2.1, RDocumentation, December 30. Accessed 2018-05-08.
- Yau, Nathan. 2016. "Comparing ggplot2 and R Base Graphics." FlowingData, March 22. Accessed 2018-05-08.
- Plotting in R: Intro to base, lattice and ggplot2
- The grammar of graphics
- IQSS. 2017. "Introduction to R graphics with ggplot2." Data Science Services, Institute for Quantitative Social Science, Harvard. Accessed 2018-05-08.
- Data Visualization with ggplot2: Cheat Sheet (from RStudio)
- Winston Chang's "R Graphics Cookbook"
- R Base Graphics: An Idiot's Guide