In this course, you'll learn to work with data using tools from the tidyverse in R. By data, we mean your own data, other people's data, messy data, big data, small data: any data with rows and columns that comes your way! By work, we mean most of the things that sound hard to do in R and that need to happen before you can analyze or visualize your data. That doesn't mean the work isn't fun: you will see why so many people love working in the tidyverse as you learn how to explore, tame, tidy, and transform your data. Throughout this course, you'll work with data from a popular television baking competition called "The Great British Bake Off."
You will start this course by learning how to read data into R. We'll begin with the readr package and use it to read in data files organized in rows and columns. In the rest of the chapter, you'll learn how to explore your data using tools that help you view, summarize, and count values effectively. You'll see how each of these steps gives you more insight into your data.
In this chapter, you will learn some basics of data taming, like how to tame your variable types, names, and values.
Now that your data has been tamed, it is time to get tidy. In this chapter, you will get hands-on experience tidying data and chaining multiple tidying functions together with the pipe operator.
In this chapter, you will learn how to tame specific types of variables that are known to be tricky to work with, such as dates, strings, and factors.
Professor and Data Scientist
Alison is an Associate Professor of Pediatrics at Oregon Health & Science University (OHSU) in Portland, Oregon, and the Assistant Director of OHSU’s Center for Spoken Language Understanding, home to the Computer Science graduate education program. She has studied health-related applications of Natural Language Processing-based methods, with a focus on pediatric populations with developmental disabilities like Autism Spectrum Disorders.
Alison is also an experienced educator, with peer- and student-nominated awards for teaching. She teaches graduate-level data science courses on Statistics and Data Visualization using R.