Tidy Data in Python Mini-Course
Course Description
Most of the world's data is not stored in a clean, organized form, nor is it easy to process. As a data scientist, you need to know what the standards for tidy data are and how to create tidy datasets from messy ones. This mini-course will prepare you for these tasks.
Chapter 1: Tidy Data in Python
It is often said that data scientists spend only 20% of their time analyzing data and 80% of their time cleaning it. Keeping a dataset tidy and easy to use is therefore crucial in our age of big data. In the paper "Tidy Data", veteran statistician Hadley Wickham defines tidy and messy data so that all data scientists can keep their work organized. In this mini-course, you'll learn to transform messy datasets into tidy datasets using the pandas package in Python. Let's get started!
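As a first taste of what such a transformation can look like, here is a minimal sketch using pandas. The table, its column names, and its values are made up for illustration and do not come from the course materials. It assumes a messy layout in which the years are stored as column headers, and reshapes it with pd.melt so that each variable is a column and each observation is a row.

```python
import pandas as pd

# Hypothetical messy table: the year, which is really a variable,
# is stored in the column headers.
messy = pd.DataFrame({
    "country": ["A", "B"],
    "2019": [10, 20],
    "2020": [11, 21],
})

# Reshape to one row per (country, year) observation.
tidy = pd.melt(messy, id_vars="country", var_name="year", value_name="cases")

# The former column headers are strings; convert them to integers.
tidy["year"] = tidy["year"].astype(int)

print(tidy)
```

The result has three columns (country, year, cases) and four rows, one for each country and year pair. Whether melt is the right tool depends on how a given dataset is messy; this is only one common case.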