R Plotting Systems
Data visualization is an important task for a statistician, data analyst or data scientist, and an essential part of exploratory data analysis. R has more than one plotting system for this purpose, each with a different syntax. R programmers typically master one of them and become at least familiar with the rest.
The packages that ship with R mainly produce static 2D plots. Third-party packages are available for 3D and interactive plots, and plots can be exposed via web interfaces. R also integrates well with many third-party visualization and analysis tools.
Discussion
Which are the plotting systems in R?
- Base: We start with a blank canvas and start adding elements to it one by one. We can create the main plot and then add labels, axes, lines, and so on. Base plots are said to be intuitive since the process of creating them closely mirrors the thought process. Once something's on the plot, we can't go back and correct it.
- Lattice: A plot is created with a single function call. Margins and spacing are set automatically since the entire plot is known when the function is called. Lattice is good for multivariate plots since it's easy to create many subplots.
- ggplot2: This is a cross between Base and Lattice systems. Like Lattice, many things are automatically set but like Base it allows us to add to the plot after it's created. Lots of customizations are possible.
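To make the contrast concrete, here is a minimal sketch that draws the same scatterplot with a fitted line in each of the three systems. The built-in `mtcars` dataset, the axis labels and the object names `p_lattice` and `p_gg` are choices made for this illustration only.

```r
library(lattice)
library(ggplot2)

# Base: draw the main plot first, then add elements one at a time.
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(lm(mpg ~ wt, data = mtcars), col = "red")  # add a fitted line
title(main = "Base")                              # add a title afterwards

# Lattice: a single call describes the entire plot.
# type = c("p", "r") asks for points plus a regression line.
p_lattice <- xyplot(mpg ~ wt, data = mtcars, type = c("p", "r"),
                    xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
                    main = "Lattice")
print(p_lattice)

# ggplot2: create a plot object, then add layers to it with `+`.
p_gg <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", title = "ggplot2")
print(p_gg)
```

Lattice and ggplot2 return plot objects that are drawn when printed, whereas each Base call draws on the current device immediately.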
Which of the R plotting systems should I learn?
Users on Quora have commented that Base plots are good for exploratory data analysis, where the idea is to plot quickly without thinking about neatness. For publication-quality plots, ggplot2 is preferred. Lattice plots are less popular. Nathan Yau has compared Base and ggplot2; he uses only Base.
Jeff Leek has echoed a similar sentiment: he prefers Base for exploratory data analysis. Defaults available in ggplot2 can produce great plots with minimal code, but they can fool students into thinking that such plots are production ready.
Arguing against Base, David Robinson holds that ggplot2's pretty plots should be preferred over Base's ugly ones, even for exploratory data analysis. Creating legends, grouped lines and facets is cumbersome in the Base system. With ggplot2, we don't need loops, grid statements or if statements because, in his words:
"Base plotting is imperative, it’s about what you do. ggplot2 plotting is declarative, it’s about what your graph is."
We should note that the `qplot` command of ggplot2 offers a simplified syntax that's similar to the Base system. Hence, learning only the ggplot2 system thoroughly may be enough.
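The imperative versus declarative distinction can be sketched with a grouped scatterplot. This is only an illustration on the built-in `mtcars` dataset; the colour choices and variable names are assumptions, and neither version is taken from the authors quoted above.

```r
library(ggplot2)

# Base (imperative): set up an empty plot, loop over the groups,
# then build the legend by hand.
cyl_levels <- sort(unique(mtcars$cyl))
cols <- c("black", "red", "blue")
plot(mtcars$wt, mtcars$mpg, type = "n",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
for (i in seq_along(cyl_levels)) {
  grp <- mtcars[mtcars$cyl == cyl_levels[i], ]
  points(grp$wt, grp$mpg, col = cols[i], pch = 19)
}
legend("topright", legend = cyl_levels, col = cols, pch = 19,
       title = "Cylinders")

# ggplot2 (declarative): state that colour maps to the group;
# the grouping and the legend follow from that mapping.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")
print(p)
```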
How can I make my R plots interactive?
Interactive plots enable users to zoom into areas of interest, highlight important data points or hide irrelevant data points. Extra information can be shown via tooltips when users hover the mouse over specific data points.
With the `plotly` package, we can make ggplot2 plots interactive. This becomes an easy learning path for those already familiar with ggplot2. However, `plotly` can also be used on its own without ggplot2. An alternative is the `highcharter` package, which wraps the Highcharts JavaScript library.
Shiny from RStudio enables interaction via a web interface. It supports both Base and ggplot2 systems. Called Shiny apps, they can be enhanced with `shinythemes`, `htmlwidgets` and JavaScript. To interact across widgets, the add-on `crosstalk` can be used.
D3 is an influential charting library from the JavaScript and web world. Similar plots can be created in R without using any JavaScript. Examples of this include rCharts, d3scatter and networkD3.
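As a small sketch of the two `plotly` routes mentioned above, the code below first converts an existing ggplot2 plot and then builds a comparable plot directly. It assumes the `plotly` and `ggplot2` packages are installed; the dataset and object names are illustrative.

```r
library(ggplot2)
library(plotly)

# Route 1: build a regular ggplot2 plot, then convert it.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()
w_gg <- ggplotly(p)   # adds zooming, hover tooltips and legend toggling

# Route 2: use plotly on its own, without ggplot2.
w_direct <- plot_ly(mtcars, x = ~wt, y = ~mpg,
                    type = "scatter", mode = "markers")

# Both objects are htmlwidgets. Typing w_gg or w_direct at the console
# displays them in a viewer or browser.
```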
What packages enable 3D plots in R?
The Base system has the function `persp` that draws perspective views of a surface over the x-y plane. The command `demo(persp)` will show what's possible. Other options for 3D visualization include the packages `plot3D`, `scatterplot3d` and `rgl`, and the function `scatter3d` from the `car` package.
`rgl` and `scatter3d` are interactive whereas `scatterplot3d` is non-interactive. There's also an extension of `plot3D` called `plot3Drgl`, which is based on `rgl`.
Plotly's R package, `plotly`, can also do interactive 3D plots.
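A minimal sketch of `persp` using only the Base system follows. The surface function `f`, the grid size and the viewing angles are arbitrary choices for illustration.

```r
# Evaluate a surface z = f(x, y) on a regular grid.
x <- seq(-10, 10, length.out = 40)
y <- x
f <- function(x, y) {
  r <- sqrt(x^2 + y^2)
  ifelse(r == 0, 1, sin(r) / r)   # guard against division by zero
}
z <- outer(x, y, f)

# theta and phi set the viewing direction; expand scales the z axis.
# persp() invisibly returns the 4 x 4 viewing transformation matrix.
pmat <- persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
              col = "lightblue", xlab = "x", ylab = "y", zlab = "z")
```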
Which third-party data visualization and analysis software integrates well with R?
There are many data visualization and analysis tools, and many of them can now integrate with R. Plotly integrates well with ggplot2 and Shiny but can also do plots without either of them. Highcharts integration is available via `highcharter`, which uses `htmlwidgets` and works well with Shiny. Microsoft's Power BI can integrate with R, run R scripts and display R plots within its Power BI Desktop software.
MicroStrategy has its own visualizations but it can integrate with R for scripting and data analysis. Something similar can be done with Tableau and QlikView.
Could you list some useful plot commands in the Base system?
You can obtain a complete list by typing `library(help = "graphics")` in the R console. Here we give a selection based on R version 3.5.0: `assocplot`, `barplot`, `boxplot`, `cdplot`, `coplot`, `dotchart`, `fourfoldplot`, `hist`, `matplot`, `mosaicplot`, `pie`, `plot`, `spineplot`, `stem`, `sunflowerplot`.
Once the main plot is generated, other functions can be called to annotate and customize: `abline`, `axis`, `box`, `grid`, `legend`, `lines`, `mtext`, `points`, `rug`, `text`, `title`.
To generate a plot containing subplots, `par` and `layout` can be used. To customize colours, lines, background, axes orientation and margins, `par` is useful.
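The sketch below combines a few of these functions: `par` to lay out two subplots, `hist` and `plot` for the main plots, and `rug`, `abline`, `text`, `legend` and `title` for annotation. The dataset, colours and label positions are illustrative choices.

```r
# Two subplots side by side; par() returns the old settings for later restore.
op <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))

# Left: histogram with a rug of the individual observations.
h <- hist(mtcars$mpg, main = "Histogram of mpg",
          xlab = "Miles per gallon", col = "grey80")
rug(mtcars$mpg)

# Right: scatterplot, annotated after the main plot is drawn.
plot(mtcars$wt, mtcars$mpg, pch = 19,
     col = ifelse(mtcars$am == 1, "blue", "black"),
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
abline(h = mean(mtcars$mpg), lty = 2)                  # reference line
text(x = 4.5, y = mean(mtcars$mpg) + 1, labels = "mean mpg")
legend("topright", legend = c("automatic", "manual"),
       col = c("black", "blue"), pch = 19)
title(main = "mpg vs weight")

par(op)  # restore the previous graphical parameters
```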
Could you list some useful plot commands in the Lattice system?
You can obtain a complete list by typing `library(help = "lattice")` in the R console. Here we give a selection based on version 0.20-35. Bivariate plots can be generated using `xyplot`, `dotplot`, `barchart`, `stripplot` and `bwplot`. For 3D scatterplots and wireframe surfaces, use `cloud` and `wireframe` respectively. For histograms and density plots, use `histogram` and `densityplot` respectively. For level plots and contour plots, use `levelplot` and `contourplot` respectively.
In any of the Lattice plots, panels can be created to handle multivariate data. For example, a scatterplot comparing height vs age can be drawn in separate panels for males and females. Many functions draw the contents of panels, and these are typically named with the prefix `panel.`. They are called implicitly when the formula syntax `y ~ x | a * b` is used, where `a` and `b` are the conditioning variables by which panels are made. For example, `xyplot(mpg ~ wt | cyl * gear, data = mtcars)` gives a scatterplot of `mpg` against `wt` with one panel for each combination of `cyl` and `gear`.
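Here is a hedged sketch of conditioning in Lattice, with an explicit panel function to show where the `panel.` functions come in. The data frame name `mt` and the conversion of `cyl` and `gear` to factors are choices made for this example.

```r
library(lattice)

# Factors give each panel strip a readable level label.
mt <- transform(mtcars, cyl = factor(cyl), gear = factor(gear))

# One panel per combination of cyl and gear. The custom panel function
# calls the same panel.* helpers that Lattice would otherwise call itself.
p <- xyplot(mpg ~ wt | cyl * gear, data = mt,
            panel = function(x, y, ...) {
              panel.grid(h = -1, v = -1)  # reference grid aligned with ticks
              panel.xyplot(x, y, ...)     # the points themselves
            })
print(p)

# Box-and-whisker plots of mpg by cyl, one panel per gear value.
p_bw <- bwplot(mpg ~ cyl | gear, data = mt, layout = c(3, 1))
print(p_bw)
```

Combinations with no observations still get a panel, which is simply left empty.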
Could you list some useful plot commands in the ggplot2 system?
You can obtain a complete list by typing `library(help = "ggplot2")` in the R console. Here we give a selection based on version 2.2.1. There are two main plotting functions:
- `ggplot`: This creates a new blank plot that must be completed by calling other helper functions.
- `qplot`: Also called Quick Plot, this offers a simplified syntax compared to `ggplot`. This is an ideal starting point for those familiar with R's Base plots. For complex plots, `ggplot` may be required. Note that `qplot` has been deprecated since ggplot2 version 3.4.0.
When using `ggplot`, the following functions are needed to complete the plot:
- `geom_*`: These functions specify what type of geometric objects should be plotted. Examples include `geom_point`, `geom_path`, `geom_bar`, `geom_boxplot`, and many more. Data, if specified here, will override data specified in `ggplot`.
- `aes`: This specifies the aesthetics, that is, the mapping of variables to visual properties such as the x and y positions. For data points, we can also select shape, colour and size. This can be done when calling `ggplot` or `geom_*` functions. Aesthetics specified in individual `geom_*` calls will override those specified in `ggplot`.
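The override rules for data and aesthetics can be sketched as follows. The subset `efficient` and the styling values are hypothetical and exist only for this illustration.

```r
library(ggplot2)

# Hypothetical subset to highlight in a second layer.
efficient <- subset(mtcars, mpg > 30)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +         # default data and mapping
  # This layer inherits x and y, and adds its own colour mapping.
  geom_point(aes(colour = factor(cyl)), size = 3) +
  # This layer overrides the default data; colour, shape and size are
  # fixed values here rather than mappings.
  geom_point(data = efficient, colour = "red", shape = 1, size = 6)
print(p)
```

`layer_data(p, 2)` returns the data actually drawn by the second layer, which is a convenient way to confirm that it uses the subset rather than the full dataset.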
What's the technique of creating a plot with ggplot2?
ggplot2 is an implementation of a modified Grammar of Graphics. The original grammar was first proposed by Leland Wilkinson in 1999 and later revised in 2005. ggplot2 was created by Hadley Wickham, who calls his version the Layered Grammar of Graphics.
The concept of layering is used; that is, ggplot2 combines multiple layers of visualizations to make a single plot. For example, `ggplot` will create the plot while each call to `geom_*` creates a layer of geometric objects. Coordinates and facets are then specified. Further calls can set the theme, add annotations, adjust the scale, and so on. When all these are combined, we get the complete plot.
To generalize the concept, Wickham mentions the following components for a typical plot:
- Default dataset and mappings from variables to aesthetics.
- Layers to specify geometric objects, statistical transformations and positions.
- Scale for each aesthetic mapping.
- Coordinate system.
- Facet specification.
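One way to see these components together is to write a plot with one line per component. This is a sketch on the built-in `mtcars` dataset; the particular scale, limits and faceting variable are arbitrary.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +          # default dataset and mappings
  geom_point() +                                     # layer: points, no transformation
  geom_smooth(method = "lm", formula = y ~ x) +      # layer: statistical transformation (linear fit)
  scale_x_continuous(name = "Weight (1000 lbs)") +   # scale for the x aesthetic
  coord_cartesian(ylim = c(5, 40)) +                 # coordinate system
  facet_wrap(~ am, labeller = label_both) +          # facet specification
  theme_bw()                                         # theme, outside the grammar proper
print(p)
```

Any component that is left out falls back to a default, which is why a two-line `ggplot() + geom_point()` call is already a complete plot.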
What customizations can I do with ggplot2?
Without being exhaustive, the following customizations in ggplot2 are possible:
- Annotations: With `annotate`, text, shaded rectangles, lines, labels, etc. can be added.
- Coordinates: With `coord_*` functions, we can select coordinates (Cartesian vs polar), transform coordinates, flip x and y axes, and so on.
- Facets: These allow visualization of multivariate data. Functions with the prefix `facet_` enable this. In `facet_grid`, the syntax `a ~ .` places the panels vertically, one below the other; `. ~ a` places the panels horizontally, side by side.
- Themes: Themes control colours, sizes, positions, borders and margins of background, panels, axes titles, axes ticks, axes labels, and so on. Built-in themes include `theme_grey()` (default) and `theme_bw()` (sets background to white). You can also create your own custom themes.
- Scale: Scales for the axes and other aesthetics can be customized using many functions: `discrete_scale`, `continuous_scale`, `guides`, `lims`, `scale_*` (multiple functions), and so on.
- Position: Functions `position_*` adjust the position of geoms.
- Statistics: Functions with the prefix `stat_` produce statistical summaries before generating geoms.
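Several of these customizations can be sketched in two short plots. The shaded region, axis breaks and labels below are arbitrary values chosen for illustration.

```r
library(ggplot2)

# Annotation, vertical facets, a customized x scale and a white theme.
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  annotate("rect", xmin = 3, xmax = 4, ymin = -Inf, ymax = Inf,
           alpha = 0.2) +                        # shaded band on every panel
  facet_grid(cyl ~ .) +                          # panels stacked vertically
  scale_x_continuous(breaks = 1:6, limits = c(1, 6)) +
  theme_bw()
print(p1)

# Position adjustment and flipped coordinates.
p2 <- ggplot(mtcars, aes(x = factor(cyl), fill = factor(gear))) +
  geom_bar(position = "dodge") +                 # bars side by side, not stacked
  coord_flip() +                                 # swap the x and y axes
  labs(x = "Cylinders", fill = "Gears")
print(p2)
```

Replacing `facet_grid(cyl ~ .)` with `facet_grid(. ~ cyl)` would arrange the same panels side by side.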
What are ggplot2 extensions?
Third-party packages add extra functionality to the ggplot2 plotting system. These are called ggplot2 extensions and they are tracked at ggplot2-exts.org. In May 2018, this site listed a gallery of 40 extensions. As a sample, these include radar charts, animated charts, time series charts, alluvial diagrams, directed acyclic graphs, and more. Notably, `ggedit` allows users to interactively edit the layers, scales and themes.
Incidentally, `latticeExtra` extends the capabilities of the Lattice system.
Milestones
2000
Deepayan Sarkar starts working on the Lattice system. He's inspired by Trellis Graphics, which was first proposed by Bill Cleveland in 1993 and subsequently implemented in the S/S+ languages. However, its equivalent was missing in R. Sarkar uses the `grid` add-on package of Paul Murrell (2002) to develop Lattice.
2007
Hadley Wickham releases `ggplot2` version 0.5. In February 2014, the package goes into maintenance mode (no new features). Version 2.2.1 of the package is released in December 2016. In the five-year span 2012-2017, the package is downloaded 10 million times. In May 2017 alone, it gets 400,000 downloads.
2012
References
- Barter, Rebecca. 2017. "Interactive visualization in R." Blog, April 20. Accessed 2018-05-09.
- Bauer, Brian. 2013. "QlikView and R Integration for Predictive Analytics Example." Qlik Community, April 2. Accessed 2018-05-10.
- CRAN ggplot2 Archive. 2018. "Index of /src/contrib/Archive/ggplot2." Accessed 2018-05-10.
- CRAN plot3D Archive. 2018. "Index of /src/contrib/Archive/plot3D." Accessed 2018-05-10.
- Casillas, Joseph V. 2018. "Plotting in R: Intro to base, lattice and ggplot2." January 23. Accessed 2018-05-08.
- Cetinkaya-Rundel, Mine. 2016. "Creating Interactive Plots with R and Highcharts." R Views, RStudio. October 19. Accessed 2018-05-10.
- Ginolhac, A., E. Koncina, and R. Krause. 2017. "Data plotting: ggplot2." Lecture 7, R tidyverse workshop, University of Luxembourg, May 3. Accessed 2018-05-08.
- Johnston, Susane. 2013. "R Base Graphics: An Idiot's Guide." August 30. Accessed 2018-05-09.
- Kabacoff, Robert I. 2017. "Advanced Graphs." Quick-R. Accessed 2018-05-09.
- Kopf, Dan. 2017. "The program that brought data visualization to the masses." Quartz, June 18. Accessed 2018-05-10.
- Lattice GitHub. 2017. "Trellis Graphics for R." March 23. Accessed 2018-05-10.
- Leek, Jeff. 2016. "Why I don't use ggplot2." Simply Statistics, February 11. Accessed 2018-05-08.
- MicroStrategy. 2018. "Overview of the R Integration Pack." Accessed 2018-05-10.
- Microsoft Docs. 2018. "Create Power BI visuals using R." February 5. Accessed 2018-05-10.
- Misra, Marlon. 2014. "How do R programmers choose among plotting systems (base, lattice, ggplot2, etc.)?" Quora, October 19. Accessed 2018-05-09.
- Peng, Roger D. 2016a. "Exploratory Data Analysis: Plotting Systems in R." The Johns Hopkins Data Science Lab on YouTube, January 14. Accessed 2018-05-08.
- Peng, Roger D. 2016b. "Exploratory Data Analysis with R." Bookdown, September 14. Accessed 2018-05-08.
- Plotly. 2018a. "3D Surface Plots in R." Plotly. Accessed 2018-05-08.
- Plotly. 2018b. "Plotly ggplot2 Library." Plotly. Accessed 2018-05-10.
- R-core. 2018. "The R Graphics Package." v3.5.0, RDocumentation. Accessed 2018-05-08.
- RStudio Shiny GitHub. 2018. "RStudio Shiny on GitHub." May 2. Accessed 2018-05-11.
- Rickert, Joseph. 2014. "3D Plots in R." R-bloggers, February 13. Accessed 2018-05-08.
- Robinson, David. 2016. "Why I use ggplot2." Variance Explained, February 12. Accessed 2018-05-08.
- STHDA. 2018a. "ggplot2 facet : split a plot into a matrix of panels." Statistical tools for high-throughput data analysis. Accessed 2018-05-08.
- STHDA. 2018b. "Impressive package for 3D and 4D graph - R software and data visualization." Statistical tools for high-throughput data analysis. Accessed 2018-05-08.
- Sape Research Group. 2018. "ggplot2 Quick Reference." Software and Programmer Efficiency Research Group, Faculty of Informatics, University of Lugano, Switzerland. Accessed 2018-05-08.
- Sarkar, Deepayan. 2002. "Lattice: An Implementation of Trellis Graphics in R." R News, vol. 2, no. 2, pp. 19–23, June. Accessed 2018-05-10.
- Sarkar, Deepayan. 2003. "Some Notes on lattice." Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20–22, Vienna, Austria. Accessed 2018-05-10.
- Sarkar, Deepayan. 2011. "Lattice: trellis graphics for R." May 8. Accessed 2018-05-10.
- Sarkar, Deepayan. 2017. "Trellis Graphics for R." v0.20-35, RDocumentation, March 25. Accessed 2018-05-08.
- Skill Gaze. 2017. "Understanding different visualization layers of ggplot." October 31. Accessed 2018-05-08.
- Smith, David. 2016. "Over 16 years of R Project history." Revolution Analytics, March 4. Accessed 2018-05-11.
- Tableau. 2018. "R for Statistical Computing & Analysis." Accessed 2018-05-10.
- Wickham, Hadley. 2010. "A layered grammar of graphics." Journal of Computational and Graphical Statistics, vol. 19, no. 1, pp. 3-28. Accessed 2018-05-08.
- Wickham, Hadley. 2014. "ggplot2 development." ggplot2 Google Group, February 25. Accessed 2018-05-10.
- Wickham, Hadley. 2016. "Create Elegant Data Visualisations Using the Grammar of Graphics." v2.2.1, RDocumentation, December 30. Accessed 2018-05-08.
- Yau, Nathan. 2016. "Comparing ggplot2 and R Base Graphics." FlowingData, March 22. Accessed 2018-05-08.
- ggplot1 GitHub. 2016. "Before there was ggplot2." June 8. Accessed 2018-05-10.
- ggplot2 extensions. 2018. "ggplot2 extensions - gallery." Accessed 2018-05-09.
Further Reading
- Plotting in R: Intro to base, lattice and ggplot2
- The grammar of graphics
- IQSS. 2017. "Introduction to R graphics with ggplot2." Data Science Services, Institute for Quantitative Social Science, Harvard. Accessed 2018-05-08.
- Data Visualization with ggplot2: Cheat Sheet (from RStudio)
- Winston Chang's "R Graphics Cookbook"
- R Base Graphics: An Idiot's Guide
See Also
- Grammar of Graphics
- Shiny
- Statistical Graphics
- R (Language)
- R Data Structures
- Vectorization in R