This course provides a comprehensive introduction to plotting data with R’s default graphics system, base graphics.
After an introduction to base graphics, we look at a number of R plotting examples, from simple graphs such as scatterplots to plots of correlation matrices. The course finishes with exercises in plot customization, including using color effectively and creating and saving complex plots in R.
Base Graphics Background
R supports four different graphics systems: base graphics, grid graphics, lattice graphics, and ggplot2. Base graphics is the default graphics system in R and the easiest of the four to learn. It provides a wide variety of useful tools, especially for exploratory graphics, where we wish to learn what is in an unfamiliar dataset.
A quick introduction to base R graphics
This chapter gives a brief overview of some of the things you can do with base graphics in R. This graphics system is one of four available in R, and it forms the basis for this course because it is both the easiest to learn and extremely useful, both in preparing exploratory data visualizations to help you see what's in a dataset and in preparing explanatory data visualizations to help others see what you have found.
Lessons:
- The world of data visualization
- Creating an exploratory plot array
- Creating an explanatory scatterplot
- The plot() function is generic
- A preview of some more and less useful techniques
- Adding details to a plot using point shapes, color, and reference lines
- Creating multiple plot arrays
- Avoid pie charts
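As a rough illustration of the ideas in this chapter, the sketch below shows the generic behavior of plot() and an explanatory scatterplot with a chosen point shape, color, and reference line. It is not taken from the course exercises: the built-in mtcars dataset and the variable name mean_mpg are assumptions made for this example.

```r
# Assumption: the built-in mtcars data stands in for the course datasets.
data(mtcars)

# plot() is generic: given a data frame with several numeric columns it
# draws a scatterplot matrix, which is handy for a first exploratory look.
plot(mtcars[, c("mpg", "wt", "hp")])

# Given two numeric vectors, plot() draws a single scatterplot.
# pch sets the point shape and col sets the point color.
plot(mtcars$wt, mtcars$mpg,
     pch = 17, col = "blue",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy versus weight")

# Add a dashed horizontal reference line at the mean fuel economy.
mean_mpg <- mean(mtcars$mpg)
abline(h = mean_mpg, lty = 2, col = "red")
```

The first call is the exploratory view; the second adds labels, a title, and a reference line so that the plot can explain a finding to someone else.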
Different plot types
This chapter introduces several plot types supported by base R that are particularly useful for visualizing important features in a dataset. We start with simple tools like histograms and density plots for characterizing one variable at a time, move on to scatter plots and other useful tools for showing how two variables relate, and finally introduce some tools for visualizing more complex relationships in our dataset.
Lessons:
- Characterizing a single variable
- The hist() and truehist() functions
- Density plots as smoothed histograms
- Using the qqPlot() function to see many details in data
- Visualizing relations between two variables
- The sunflowerplot() function for repeated numerical data
- Useful options for the boxplot() function
- Using the mosaicplot() function
- Showing more complex relations between variables
- Using the bagplot() function
- Plotting correlation matrices with the corrplot() function
- Building and plotting rpart() models
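The following sketch walks through a few of these plot types using only functions that ship with a standard R installation, so the add-on functions named in the lesson titles are not shown. The mtcars dataset and the variable names are assumptions chosen for illustration, not the course's own examples.

```r
# Assumption: mtcars is used purely as a convenient built-in dataset.
data(mtcars)
x <- mtcars$mpg

# One variable: histogram on the density scale, with a density estimate
# drawn on top as a smoothed version of the histogram.
h <- hist(x, freq = FALSE,
          main = "Miles per gallon", xlab = "mpg")
lines(density(x), lwd = 2)

# Two variables with many repeated values: sunflowerplot() draws one
# "petal" per repeated point instead of overplotting them.
sunflowerplot(mtcars$cyl, mtcars$gear,
              xlab = "Cylinders", ylab = "Gears")

# Numeric versus categorical: varwidth = TRUE makes each box's width
# reflect the number of observations in that group.
boxplot(mpg ~ cyl, data = mtcars, varwidth = TRUE,
        xlab = "Cylinders", ylab = "Miles per gallon")

# Two categorical variables: a mosaic plot of their contingency table.
mosaicplot(table(mtcars$cyl, mtcars$gear),
           main = "Cylinders by gears",
           xlab = "Cylinders", ylab = "Gears")
```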
Adding details to plots
Most base R graphics functions support many optional arguments and parameters that allow us to customize our plots to get exactly what we want. In this chapter, we will learn how to modify point shapes and sizes, line types and widths, add points and lines to plots, add explanatory text and generate multiple plot arrays.
Lessons:
- The plot() function and its options
- Introduction to the par() function
- Exploring the type option
- The surprising utility of the type "n" option
- Adding lines and points to plots
- The lines() function and line types
- The points() function and point types
- Adding trend lines from linear regression models
- Adding text to plots
- Using the text() function to label plot features
- Adjusting text position, size, and font
- Rotating text with the srt argument
- Adding or modifying other plot details
- Using the legend() function
- Adding custom axes with the axis() function
- Using the supsmu() function to add smooth trend curves
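One way these pieces can fit together is sketched below: an empty plot is set up with type = "n", and points, a regression line, text, and a legend are then added one layer at a time. The dataset (mtcars), the grouping by transmission type, and all variable names are assumptions made for this example.

```r
# Assumption: mtcars and the automatic/manual split are illustrative only.
data(mtcars)

# par() returns the previous settings, so they can be restored later.
old_par <- par(mar = c(4, 4, 2, 1))

# type = "n" sets up the axes and labels but draws no points.
plot(mtcars$wt, mtcars$mpg, type = "n",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Add the two groups separately, each with its own point shape.
auto <- mtcars$am == 0  # in mtcars, am = 0 means automatic transmission
points(mtcars$wt[auto], mtcars$mpg[auto], pch = 16)
points(mtcars$wt[!auto], mtcars$mpg[!auto], pch = 2)

# Add a trend line from a linear regression model.
fit <- lm(mpg ~ wt, data = mtcars)
abline(fit, lty = 2, lwd = 2)

# Add explanatory text (left-aligned, slightly smaller) and a legend.
text(x = 1.6, y = 12, labels = "Dashed line: linear fit",
     adj = 0, cex = 0.8)
legend("topright", legend = c("Automatic", "Manual"), pch = c(16, 2))

# Restore the graphics parameters that were in effect before.
par(old_par)
```

Starting from an empty plot keeps full control over what is drawn and in which order, which is where the type "n" option earns its usefulness.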
How much is too much?
As we have seen, base R graphics provides tremendous flexibility in creating plots with multiple lines, points of different shapes and sizes, and added text, along with arrays of multiple plots. If we attempt to add too many details to a plot or too many plots to an array, however, the result can become too complicated to be useful. This chapter focuses on how to manage this visual complexity so the results remain useful to ourselves and to others.
Lessons:
- Managing visual complexity
- Too much is too much
- Deciding how many scatterplots is too many
- How many words is too many?
- Creating plot arrays with the mfrow parameter
- The Anscombe quartet
- The utility of common scaling and individual titles
- Using multiple plots to give multiple views of a dataset
- Creating plot arrays with the layout() function
- Constructing and displaying layout matrices
- Creating a triangular array of plots
- Creating arrays with different sized plots
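The two ways of building plot arrays covered in this chapter can be sketched as follows. The first part draws the Anscombe quartet in a 2 x 2 array with common axis limits and individual titles; the second uses a layout matrix to give plots different sizes. The axis limits, the layout matrix, and the choice of mtcars for the second part are assumptions made for this example.

```r
# Part 1: a 2 x 2 plot array using the mfrow parameter.
data(anscombe)
old_par <- par(mfrow = c(2, 2))
for (i in 1:4) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  # Common xlim and ylim make the four panels directly comparable.
  plot(x, y, xlim = c(0, 20), ylim = c(0, 15),
       main = paste("Anscombe dataset", i))
}
par(old_par)

# Part 2: an array with different sized plots using layout().
# Each entry names the plot that occupies that cell, so plot 1
# spans the whole top row and plots 2 and 3 share the bottom row.
layout_matrix <- matrix(c(1, 1,
                          2, 3), nrow = 2, byrow = TRUE)
n_plots <- layout(layout_matrix)  # layout() returns the number of figures

data(mtcars)
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
plot(mtcars$wt, mtcars$mpg, xlab = "Weight", ylab = "mpg")
boxplot(mpg ~ cyl, data = mtcars, xlab = "Cylinders", ylab = "mpg")

# Return to a single plot per page.
layout(1)
```

Three or four panels with shared scales are usually easy to read; the same code with many more panels quickly runs into the complexity problem this chapter describes.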
Advanced plot customization and beyond
This final chapter introduces a number of important topics, including the use of numerical plot details returned invisibly by functions like barplot() to enhance our plots, and saving plots to external files so they don't vanish when we end our current R session. This chapter also offers some guidelines for using color effectively in data visualizations, and it concludes with a brief introduction to the other three graphics systems in R.
Lessons:
- Creating and saving more complex plots
- Some plot functions also return useful information
- Using the symbols() function to display relations between more than two variables
- Saving plot results as files
- Using color effectively
- Iliinsky and Steele's 12 recommended colors
- Using color to enhance a bubbleplot
- Using color to enhance stacked barplots
- Other graphics systems in R
- The tabplot package and grid graphics
- A lattice graphics example
- A ggplot2 graphics example
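A minimal sketch of two of these ideas is shown below: using the bar midpoints that barplot() returns invisibly to place labels, and writing the result to a file. The pdf() device, the temporary file path, and the mtcars data are assumptions chosen for this example; other file devices work in the same open, draw, close pattern.

```r
# Assumption: mtcars cylinder counts serve as example data.
data(mtcars)
counts <- table(mtcars$cyl)

# Open a file device; everything drawn until dev.off() goes to the file.
# A temporary file is used here so the example leaves no clutter behind.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 6, height = 4)

# barplot() invisibly returns the x-coordinates of the bar midpoints.
mids <- barplot(counts, col = "lightblue",
                xlab = "Cylinders", ylab = "Number of cars")

# Use those midpoints to center a count label inside each bar.
text(x = mids, y = as.vector(counts) / 2, labels = as.vector(counts))

# Close the device so the file is completely written.
dev.off()
```

Capturing the return value is what makes the labels line up: the bar positions are not simply 1, 2, 3, so guessing them would misplace the text.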
Ron has been actively involved in data analysis and predictive modeling in a variety of technical positions, both academic and commercial, at organizations including the DuPont Company, the Swiss Federal Institute of Technology (ETH Zurich), the Tampere University of Technology in Tampere, Finland, the Travelers Companies, and DataRobot. He holds a PhD in Electrical Engineering and Computer Science from M.I.T. and has written or co-written five books, including Exploring Data in Engineering, the Sciences, and Medicine (Oxford University Press, 2011) and Nonlinear Digital Filtering with Python (CRC Press, 2016, with Moncef Gabbouj). Ron is the author and maintainer of the GoodmanKruskal R package, and one of the authors of the datarobot R package.