Luiz Felipe Pereira Figueiredo has completed
Introduction to the Tidyverse
4 hours
4,150 XP
Course Description
This is an introduction to the programming language R, focused on a powerful set of tools known as the Tidyverse. You'll learn the intertwined processes of data manipulation and visualization using the tools dplyr and ggplot2. You'll learn to manipulate data by filtering, sorting, and summarizing a real dataset of historical country data in order to answer exploratory questions. You'll then learn to turn this processed data into informative line plots, bar plots, histograms, and more with the ggplot2 package. You’ll get a taste of the value of exploratory data analysis and the power of Tidyverse tools. This is a suitable introduction for those who have no previous experience in R and are interested in performing data analysis.
For Business
Training 2 or more people?
Get your team access to the full DataCamp library, with centralized reporting, assignments, projects and more.
1. Data wrangling
In this chapter, you'll learn to do three things with a table: filter for particular observations, arrange the observations in a desired order, and mutate to add or change a column. You'll see how each of these steps allows you to answer questions about your data.
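As a minimal sketch of how the three verbs might be chained, assuming the gapminder and dplyr packages are installed: the column name `lifeExpMonths` and the choice of the year 2007 are illustrative assumptions, not part of the course description.

```r
library(gapminder)
library(dplyr)

# Keep only the 2007 observations, add a derived column,
# then sort so the highest values come first.
wrangled <- gapminder %>%
  filter(year == 2007) %>%                      # filter: pick particular observations
  mutate(lifeExpMonths = 12 * lifeExp) %>%      # mutate: add a column (hypothetical name)
  arrange(desc(lifeExpMonths))                  # arrange: put rows in a desired order

head(wrangled)
```

Each verb takes a data frame and returns a data frame, which is why the steps can be piped together in any order that makes sense for the question being asked.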
The gapminder dataset (50 XP)
Loading the gapminder and dplyr packages (100 XP)
Understanding a data frame (50 XP)
The filter verb (50 XP)
Filtering for one year (100 XP)
Filtering for one country and one year (100 XP)
The arrange verb (50 XP)
Arranging observations by life expectancy (100 XP)
Filtering and arranging (100 XP)
The mutate verb (50 XP)
Using mutate to change or create a column (100 XP)
Combining filter, mutate, and arrange (100 XP)
2. Data visualization
Often a better way to understand and present data is as a graph. In this chapter, you'll learn the essential skills of data visualization using the ggplot2 package, and you'll see how the dplyr and ggplot2 packages work closely together to create informative graphs.
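A rough sketch of how dplyr and ggplot2 might work together on the gapminder data, assuming all three packages are installed: the variable name `gapminder_2007` and the particular aesthetic mappings are illustrative choices.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Use dplyr to prepare the data, then hand the result to ggplot2.
gapminder_2007 <- gapminder %>%
  filter(year == 2007)

# Scatter plot with a log-scaled x-axis; color and size are extra aesthetics.
p <- ggplot(gapminder_2007,
            aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point() +
  scale_x_log10()

# Faceting splits the same plot into one subgraph per continent.
p_faceted <- p + facet_wrap(~ continent)

print(p)
print(p_faceted)
```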
Visualizing with ggplot2 (50 XP)
Variable assignment (100 XP)
Comparing population and GDP per capita (100 XP)
Comparing population and life expectancy (100 XP)
Log scales (50 XP)
Putting the x-axis on a log scale (100 XP)
Putting the x- and y- axes on a log scale (100 XP)
Additional aesthetics (50 XP)
Adding color to a scatter plot (100 XP)
Adding size and color to a plot (100 XP)
Faceting (50 XP)
Creating a subgraph for each continent (100 XP)
Faceting by year (100 XP)
3. Grouping and summarizing
So far you've been answering questions about individual country-year pairs, but you may be interested in aggregations of the data, such as the average life expectancy of all countries within each year. Here you'll learn to use the group_by and summarize verbs, which collapse large datasets into manageable summaries.
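The aggregation described above could look roughly like this, assuming the gapminder and dplyr packages are installed: the summary column names (`medianLifeExp`, `medianGdpPercap`) are illustrative.

```r
library(gapminder)
library(dplyr)

# One row per year: the median life expectancy across all countries.
by_year <- gapminder %>%
  group_by(year) %>%
  summarize(medianLifeExp = median(lifeExp))

# Grouping by two variables gives one row per continent-year pair.
# .groups = "drop" returns an ungrouped result.
by_year_continent <- gapminder %>%
  group_by(continent, year) %>%
  summarize(medianLifeExp = median(lifeExp),
            medianGdpPercap = median(gdpPercap),
            .groups = "drop")

by_year
```

Without `group_by`, `summarize` collapses the whole table into a single row; with it, the table collapses to one row per group.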
The summarize verb (50 XP)
Summarizing the median life expectancy (100 XP)
Summarizing the median life expectancy in 1957 (100 XP)
Summarizing multiple variables in 1957 (100 XP)
The group_by verb (50 XP)
Summarizing by year (100 XP)
Summarizing by continent (100 XP)
Summarizing by continent and year (100 XP)
Visualizing summarized data (50 XP)
Visualizing median life expectancy over time (100 XP)
Visualizing median GDP per capita per continent over time (100 XP)
Comparing median life expectancy and median GDP per continent in 2007 (100 XP)
4. Types of visualizations
In this chapter, you'll learn how to create line plots, bar plots, histograms, and boxplots. You'll see how each plot requires different methods of data manipulation and preparation, and you’ll understand how each of these plot types plays a different role in data analysis.
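To illustrate how each plot type calls for differently prepared data, here is a hedged sketch assuming the gapminder, dplyr, and ggplot2 packages are installed: the object names, the bin count, and the plot title are illustrative assumptions.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gapminder_2007 <- gapminder %>% filter(year == 2007)

# Line plot: needs one summarized value per year.
by_year <- gapminder %>%
  group_by(year) %>%
  summarize(medianGdpPercap = median(gdpPercap))
line_plot <- ggplot(by_year, aes(x = year, y = medianGdpPercap)) +
  geom_line()

# Bar plot: needs one summarized value per category.
by_continent <- gapminder_2007 %>%
  group_by(continent) %>%
  summarize(medianGdpPercap = median(gdpPercap))
bar_plot <- ggplot(by_continent, aes(x = continent, y = medianGdpPercap)) +
  geom_col()

# Histogram: works on the raw observations of a single variable.
hist_plot <- ggplot(gapminder_2007, aes(x = pop)) +
  geom_histogram(bins = 30) +
  scale_x_log10()

# Boxplot: compares the raw distribution across categories.
box_plot <- ggplot(gapminder_2007, aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  scale_y_log10() +
  ggtitle("GDP per capita across continents, 2007")

print(line_plot)
print(bar_plot)
print(hist_plot)
print(box_plot)
```

Line and bar plots here start from summarized tables, while the histogram and boxplot take the unsummarized observations and compute their own statistics.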
Line plots (50 XP)
Visualizing median GDP per capita over time (100 XP)
Visualizing median GDP per capita by continent over time (100 XP)
Bar plots (50 XP)
Visualizing median GDP per capita by continent (100 XP)
Visualizing GDP per capita by country in Oceania (100 XP)
Histograms (50 XP)
Visualizing population (100 XP)
Visualizing population with x-axis on a log scale (100 XP)
Boxplots (50 XP)
Comparing GDP per capita across continents (100 XP)
Adding a title to your graph (100 XP)
Conclusion (50 XP)
David Robinson
Principal Data Scientist at Heap
Dave is the Principal Data Scientist at Heap. He has worked as a data scientist at DataCamp and Stack Overflow, and received his PhD in Quantitative and Computational Biology from Princeton University. Follow him at @drob on Twitter or on his blog, Variance Explained.