R Plotting Systems

Article Info

Contributed by
2 authors

Last updated on
2022-02-15 11:49:50

Data visualization is an important task for a statistician, data analyst or data scientist. It's an essential part of exploratory data analysis. R has more than one plotting system to do this visualization. Each has a different syntax. R programmers prefer to master one of them and at least become familiar with the rest.

Default R packages support only 2D plots. However, third-party packages are available for 3D and interactive plots. Plots produced can be exposed via web interfaces. R also integrates well with many third-party visualization and analysis tools.

Discussion

Which are the plotting systems in R?
Roger D. Peng explains the different plotting systems in R. Source: Peng 2016a.
R has three plotting systems:
- Base: We start with a blank canvas and start adding elements to it one by one. We can create the main plot and then add labels, axes, lines, and so on. Base plots are said to be intuitive since the process of creating them closely mirrors the thought process. Once something's on the plot, we can't go back and correct it.
- Lattice: A plot is created with a single function call. Margins and spacing are set automatically since the entire plot is known when the function is called. Lattice is good for multivariate plots since it's easy to create many subplots.
- ggplot2: This is a cross between Base and Lattice systems. Like Lattice, many things are automatically set but like Base it allows us to add to the plot after it's created. Lots of customizations are possible.
Which of the R plotting systems should I learn?
Users on Quora have commented that Base plots are good for exploratory data analysis. The idea is to plot quickly without thinking about neatness. But if you need to create plots for publications, ggplot2 is preferred. Lattice plots are not that popular. Nathan Yau has compared both Base and ggplot2. He uses only Base.
Jeff Leek has echoed a similar sentiment that he prefers using Base for exploratory data analysis. Defaults available in ggplot2 can produce great plots with minimal code but can fool students into thinking that they're production ready.
Arguing against Base, David Robinson holds that ggplot2's pretty plots should be preferred over Base's ugly ones, even for exploratory data analysis. Creating legends, grouped lines and facets is cumbersome in the Base system. With ggplot2, we don't need loops, grid statements or if statements because, in his words,
"Base plotting is imperative, it's about what you do. ggplot2 plotting is declarative, it's about what your graph is."
We should note that the qplot command of ggplot2 offers a simplified syntax that's similar to the Base system. Hence, learning only the ggplot2 system thoroughly may be enough.
How can I make my R plots interactive?
Interactive plots enable users to zoom into areas of interest, highlight important data points or hide irrelevant data points. Extra information can be shown via tooltips when users hover the mouse on specific data points.
With the plotly package, we can make ggplot2 plots interactive. This is an easy learning path for those already familiar with ggplot2. However, plotly can also be used on its own without ggplot2. An alternative is the highcharter package, which wraps the Highcharts JavaScript library.
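As a minimal sketch of the ggplot2-to-plotly path described above, assuming the ggplot2 and plotly packages are installed: the choice of the mtcars dataset and of the plotted variables is purely illustrative.

```r
library(ggplot2)
library(plotly)

# Build an ordinary static ggplot2 scatterplot first.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon", colour = "Cylinders")

# ggplotly() converts the ggplot object into an interactive htmlwidget
# that supports zooming and hover tooltips.
interactive_plot <- ggplotly(p)

# plotly can also be used on its own, without ggplot2.
standalone_plot <- plot_ly(mtcars, x = ~wt, y = ~mpg,
                           type = "scatter", mode = "markers")
```

In an interactive session, printing either object displays it in the viewer or browser.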
Shiny from RStudio enables interaction via a web interface. It supports both Base and ggplot2 systems. The resulting web applications, called Shiny apps, can be enhanced with shinythemes, htmlwidgets and JavaScript. To interact across widgets, the crosstalk add-on can be used.
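To illustrate, here is a small sketch of a Shiny app that wraps a Base plot, assuming the shiny package is installed. The input and output names ("bins", "hist") and the use of the faithful dataset are illustrative choices, not something prescribed by Shiny.

```r
library(shiny)

# UI: a slider that controls the number of histogram bins, plus a plot area.
ui <- fluidPage(
  titlePanel("Old Faithful eruptions"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: redraw a Base histogram whenever the slider value changes.
server <- function(input, output) {
  output$hist <- renderPlot({
    breaks <- seq(min(faithful$eruptions), max(faithful$eruptions),
                  length.out = input$bins + 1)
    hist(faithful$eruptions, breaks = breaks,
         main = "Eruption duration", xlab = "Minutes")
  })
}

# Create the app object; it is not launched until it is run or printed.
app <- shinyApp(ui, server)
```

Calling runApp(app) in an interactive session launches the app in a browser.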
D3 is an influential charting library from the JavaScript and web world. Similar plots can be created in R without using any JavaScript. Examples of this include rCharts, d3scatter and networkD3.
What packages enable 3D plots in R?
The Base system has the function persp that draws perspective views of a surface over the x-y plane. The command demo(persp) will show what's possible. Other options for 3D visualization include the packages plot3D, scatterplot3d and rgl, and the function scatter3d from the car package.
The rgl package and scatter3d are interactive whereas scatterplot3d is non-interactive. There's also an extension of plot3D called plot3Drgl, which is based on rgl.
Plotly's R package called plotly can do interactive 3D plots.
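The persp function mentioned above needs only Base R. The following sketch draws a surface and then marks one point on it; the sinc-like function and the viewing angles are arbitrary choices made for illustration.

```r
# Evaluate a surface z = f(x, y) on a regular grid.
x <- seq(-10, 10, length.out = 40)
y <- seq(-10, 10, length.out = 40)
sinc <- function(x, y) {
  r <- sqrt(x^2 + y^2)
  ifelse(r == 0, 1, sin(r) / r)
}
z <- outer(x, y, sinc)

# theta and phi set the viewing angles. persp() invisibly returns the
# viewing transformation matrix, which trans3d() uses to project 3D
# coordinates onto the 2D plot.
vt <- persp(x, y, z, theta = 30, phi = 30, expand = 0.5,
            col = "lightblue", xlab = "x", ylab = "y", zlab = "z")
peak <- trans3d(0, 0, 1, pmat = vt)
points(peak, col = "red", pch = 19)
```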
Which third-party data visualization and analysis software integrate well with R?
There is plenty of data visualization and analysis software, and much of it can now integrate with R. Plotly integrates well with ggplot2 and Shiny but can also do plots without either of them. Highcharts integration is available via highcharter, which uses htmlwidgets and works well with Shiny. Microsoft's Power BI can integrate with R, run R scripts and display R plots within its Power BI Desktop software.
MicroStrategy has its own visualizations but it can integrate with R for scripting and data analysis. Something similar can be done with Tableau and QlikView.
Could you list some useful plot commands in the Base system?
A selection of Base plots. Source: Johnston 2013.
You can obtain a complete list by typing library(help = "graphics") in the R console. Here we give a selection based on R version 3.5.0: assocplot, barplot, boxplot, cdplot, coplot, dotchart, fourfoldplot, hist, matplot, mosaicplot, pie, plot, spineplot, stem, sunflowerplot.
Once the main plot is generated, other functions can be called to annotate and customize: abline, axis, box, grid, legend, lines, mtext, points, rug, text, title.
To generate a plot containing subplots, par and layout can be used. To customize colours, lines, background, axes orientation and margins, par is useful.
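As a brief sketch of how these pieces combine, the following uses par to lay out two subplots and then annotates each with some of the functions listed above. The airquality dataset and the particular colours and margins are illustrative choices.

```r
# Set up a 1 x 2 grid of subplots and tighter margins.
# par() returns the previous settings so they can be restored later.
old_par <- par(mfrow = c(1, 2), mar = c(4, 4, 2, 1))

# Left subplot: histogram with a vertical line at the mean and a legend.
hist(airquality$Temp, main = "Temperature", xlab = "Temp (F)", col = "grey80")
abline(v = mean(airquality$Temp), col = "red", lwd = 2)
legend("topleft", legend = "Mean", col = "red", lwd = 2, bty = "n")

# Right subplot: boxplots by month with horizontal grid lines and margin text.
boxplot(Ozone ~ Month, data = airquality, main = "Ozone by month",
        xlab = "Month", ylab = "Ozone (ppb)")
grid(nx = NA, ny = NULL)
mtext("airquality dataset", side = 3, line = 0.2, adj = 1, cex = 0.7)

# Restore the previous graphical parameters.
par(old_par)
```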
Could you list some useful plot commands in the Lattice system?
You can obtain a complete list by typing library(help = "lattice") in the R console. Here we give a selection based on version v0.20-35. Bivariate plots can be generated using xyplot, dotplot, barchart, stripplot, bwplot. For 3D scatterplots and wireframe surfaces, use cloud and wireframe respectively. For histograms and density plots, use histogram and densityplot respectively. For level plots and contour plots, use levelplot and contourplot respectively.
In any of the Lattice plots, panels can be created to handle multivariate data. For example, a scatterplot comparing height vs age can be drawn in separate panels for males and females. Many functions enable panels, and these are typically named with the prefix panel. (for example, panel.xyplot). These panel functions are called implicitly via the formula syntax y ~ x | a * b, where a and b are the conditioning variables by which panels are made. For example, xyplot(mpg ~ wt | cyl * gear, data = mtcars) gives a scatterplot of mpg vs wt with one panel for each combination of cyl and gear values.
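The conditioning formula and panel functions can be sketched as follows. This assumes the lattice package, which ships with standard R installations; the data frame name mt and the factor labels are hypothetical names introduced only for this example.

```r
library(lattice)

# Convert the conditioning variables to factors so that the panel strips
# show readable level names.
mt <- transform(mtcars,
                cyl = factor(cyl, labels = paste(c(4, 6, 8), "cyl")),
                gear = factor(gear, labels = paste(c(3, 4, 5), "gears")))

# A custom panel function is called once per panel with that panel's data.
p <- xyplot(mpg ~ wt | cyl * gear, data = mt,
            xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
            panel = function(x, y, ...) {
              panel.grid(h = -1, v = -1)   # reference grid aligned with ticks
              panel.xyplot(x, y, ...)      # the default scatterplot
              panel.lmline(x, y, lty = 2)  # least-squares fit line
            })

# Lattice functions return an object; the plot is drawn when it is printed.
print(p)
```

With three cylinder levels and three gear levels, the layout has nine panels, including any combination for which the data has no rows.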
Could you list some useful plot commands in the ggplot2 system?
A selection of geom_* commands used in ggplot2. Source: Ginolhac et al. 2017, slide 24.
You can obtain a complete list by typing library(help = "ggplot2") in the R console. Here we give a selection based on version 2.2.1. There are two main plotting functions:
- ggplot: This creates a new blank plot that must be completed by calling other helper functions.
- qplot: Also called Quick Plot, this offers a simplified syntax compared to ggplot. This is an ideal starting point for those familiar with R's Base plots. For complex plots, ggplot may be required.
When using ggplot, the following functions are needed in completing the plot:
- geom_*: These functions specify what type of geometric objects should be plotted. Examples include geom_point, geom_path, geom_bar, geom_boxplot, and many more. Data, if specified here, will override data specified in ggplot.
- aes: This specifies the aesthetics, the mapping of variables to x and y axes. For data points, we can select shape, colour and size. This can be done when calling ggplot or geom_* functions. Aesthetics specified in individual geom_* calls will override those specified in ggplot.
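A short sketch of how data and aesthetics are inherited and overridden across layers, assuming ggplot2 is installed. The mtcars dataset and the 20 mpg threshold are arbitrary choices for illustration.

```r
library(ggplot2)

# Data and aesthetics given in ggplot() are defaults inherited by every layer.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  # This layer adds its own colour mapping on top of the inherited x and y.
  geom_point(aes(colour = factor(cyl)), size = 2) +
  # This layer overrides the default data with a subset, drawing a hollow
  # circle around cars that exceed 20 miles per gallon.
  geom_point(data = subset(mtcars, mpg > 20), shape = 1, size = 5)

print(p)
```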
What's the technique of creating a plot with ggplot2?
A plot in ggplot2 is created as a combination of layers. Source: Skill Gaze 2017.
ggplot2 is an implementation of a modified Grammar of Graphics. The original grammar was first proposed by Leland Wilkinson in 1999 and revised in 2005. ggplot2 was created by Hadley Wickham, who calls his modified version the Layered Grammar of Graphics.
The concept of layering is used; that is, ggplot2 combines multiple layers of visualizations to make a single plot. For example, ggplot will create the plot while each call to geom_* creates a layer of geometric objects. Coordinates and facets are specified. Further calls can set the theme, add annotations, adjust the scale, and so on. When all these are combined, we get the complete plot.
To generalize the concept, Wickham mentions the following components for a typical plot:
- Default dataset and mappings from variables to aesthetics.
- Layers to specify geometric objects, statistical transformations and positions.
- Scale for each aesthetic mapping.
- Coordinate system.
- Facet specification.
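The components listed above can be made explicit in code. The sketch below spells out each one using ggplot2's lower-level layer function instead of the usual geom_* shortcuts; in everyday use most of these are left to their defaults. The dataset and variables are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +          # default dataset and mappings
  layer(geom = "point", stat = "identity",           # layer 1: raw points
        position = "identity") +
  layer(geom = "line", stat = "smooth",              # layer 2: fitted line from a
        position = "identity",                       # statistical transformation
        params = list(method = "lm", formula = y ~ x)) +
  scale_x_continuous(name = "Weight (1000 lbs)") +   # scale for the x aesthetic
  scale_y_continuous(name = "Miles per gallon") +    # scale for the y aesthetic
  coord_cartesian() +                                # coordinate system
  facet_wrap(~ am)                                   # facet specification

print(p)
```

Here geom_point() and geom_smooth() would be the more common way to write the two layers; layer() is used only to show the geom, stat and position of each layer side by side.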
What customizations can I do with ggplot2?
Anatomy of a plot in ggplot2. Source: Sape Research Group 2018.
Without being exhaustive, the following customizations in ggplot2 are possible:
- Annotations: With annotate, text, shaded rectangles, lines, labels, etc. can be added.
- Coordinates: With coord_* functions, we can select coordinates (Cartesian vs Polar), transform coordinates, flip x and y axes, and so on.
- Facets: These allow visualization of multivariate data. Functions with the prefix facet_ enable this. In facet_grid, the syntax a ~ . places the panels vertically; . ~ a places the panels horizontally, side by side.
- Themes: Themes control colours, sizes, positions, borders and margins of background, panels, axes titles, axes ticks, axes labels, and so on. Several built-in themes are available, including theme_grey() (default) and theme_bw() (sets background to white). You can also create your own custom themes.
- Scale: Scale for the axes can be customized using many functions: discrete_scale, continuous_scale, guides, lims, scale_* (multiple functions), and so on.
- Position: Functions position_* adjust the position of geoms.
- Statistics: Functions stat_* produce statistical summaries before generating geoms.
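As a rough sketch, one plot can combine several of these customizations. It assumes a reasonably recent ggplot2 (the seed argument of position_jitter is not available in old releases) and the RColorBrewer palettes that ggplot2 depends on. All coordinates, labels and colours are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  # Position: jitter points slightly along x to reduce overplotting.
  geom_point(position = position_jitter(width = 0.05, height = 0, seed = 1)) +
  # Statistics: fit a linear model per group before drawing the lines.
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Facets: am ~ . stacks one panel per transmission type vertically.
  facet_grid(am ~ .) +
  # Scale: choose the colour palette and the legend title.
  scale_colour_brewer(name = "Cylinders", palette = "Dark2") +
  # Coordinates: zoom the y axis without dropping data.
  coord_cartesian(ylim = c(10, 35)) +
  # Annotations: a shaded rectangle and a text label.
  annotate("rect", xmin = 3, xmax = 4, ymin = 10, ymax = 35, alpha = 0.1) +
  annotate("text", x = 5, y = 33, label = "heavier cars") +
  # Themes: white background, legend moved below the plot.
  theme_bw() +
  theme(legend.position = "bottom")

print(p)
```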
What are ggplot2 extensions?
Third-party packages add extra functionality to the ggplot2 plotting system. These are called ggplot2 extensions and they are tracked at ggplot2-exts.org. In May 2018, this site listed a gallery of 40 extensions. As a sample, these include radar charts, animated charts, time series charts, alluvial diagrams, directed acyclic graphs, and more. Notably, ggedit allows users to interactively edit the layers, scales and themes.
Incidentally, latticeExtra extends the capabilities of the Lattice system.

Milestones

Feb
2000

Although R started in 1993, the first public release 1.0.0 is made in February 2000. The Base plotting system is part of this release.

2001

Deepayan Sarkar starts working on the Lattice system. He's inspired by Trellis Graphics, which was first proposed by Bill Cleveland in 1993 and subsequently implemented in the S/S+ languages. However, its equivalent was missing in R. Sarkar uses Paul Murrell's grid add-on package (2002) to develop Lattice.

2005

Leland Wilkinson publishes the second edition of his book titled The Grammar of Graphics. This was first published in 1999. This book inspires Hadley Wickham to create ggplot and ggplot2.

2006

Hadley Wickham releases ggplot. The final release 0.4.2 of this package is made in October 2008. Subsequently, it's replaced with ggplot2.

Jun
2007

Hadley Wickham releases ggplot2 version 0.5. In February 2014, the package goes into maintenance mode (no new features). Version 2.2.1 of the package is released in December 2016. In the five-year span 2012-2017, the package is downloaded 10 million times. In May 2017 alone, it gets 400,000 downloads.

Aug
2012

For interactive web applications with R, including plots, Shiny v0.1.2 is released. Version v1.0.0 is released in 2017.

2013

Version 1.0 of plot3D package is released.

Sample Code

# Base plotting: start with a plot, then annotate it step by step
with(airquality, {
  plot(Temp, Ozone)
  lines(loess.smooth(Temp, Ozone))
})
title("Ozone vs Temperature")


# Lattice plotting: everything is plotted in a single call
# Margins and spacing are calculated automatically
# Units in the axis labels are taken from the dataset documentation (?airquality)
library(lattice)
xyplot(Ozone ~ Wind, data = airquality,
       xlab = "Wind Speed (mph)", ylab = "Ozone Level (ppb)",
       main = "Are Ozone Levels Correlated With Wind Speed?")


# ggplot2: deals with margins and spacing automatically like Lattice
# but also allows you to annotate after the plot is created
library(ggplot2)
qplot(Wind, Ozone, data = airquality) +
  ggtitle("Are Ozone Levels Correlated With Wind Speed?")

Cite As

Devopedia. 2022. "R Plotting Systems." Version 6, February 15. Accessed 2023-11-12. https://devopedia.org/r-plotting-systems
