pandas DataFrames are the most widely used in-memory representation of complex data collections within Python. Whether you work in finance, a scientific field, or data science, familiarity with pandas is essential. This course teaches you to work with real-world datasets containing both string and numeric data, often structured around time series, and introduces powerful techniques for analysis, selection, and visualization.
Data ingestion & inspection
In this chapter, you will be introduced to pandas DataFrames. You will use pandas to import and inspect a variety of datasets, ranging from population data obtained from the World Bank to monthly stock data obtained via Yahoo Finance. You will also practice building DataFrames from scratch and become familiar with the intrinsic data visualization capabilities of pandas.
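As a rough illustration of what importing, inspecting, and building DataFrames can look like, here is a minimal sketch. The CSV text, column names, and values are made up for the example and are not taken from the World Bank or Yahoo Finance datasets used in the course.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a downloaded file (values are invented).
csv_text = """year,region,population
1960,A,100.0
1970,A,120.5
1960,B,80.0
1970,B,95.5
"""

# read_csv accepts a file path, a URL, or a file-like object such as StringIO.
df = pd.read_csv(io.StringIO(csv_text))

# Common first-look inspection tools.
print(df.head())    # first rows
print(df.shape)     # (number of rows, number of columns)
print(df.dtypes)    # inferred type of each column
df.info()           # index, columns, non-null counts, memory usage

# Building a DataFrame from scratch out of a dictionary of columns.
scratch = pd.DataFrame({"city": ["Austin", "Dallas"], "visitors": [139, 237]})
print(scratch)
```

Reading from a string keeps the sketch self-contained; with a real file, the same `read_csv` call would take its path instead.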
Exploratory data analysis
Now that you’ve learned how to ingest and inspect your data, you will next learn how to explore it visually and quantitatively. This process, known as exploratory data analysis (EDA), is a crucial component of any data science project. pandas has powerful methods that help with statistical and visual EDA. In this chapter, you will learn how and when to apply these techniques.
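The quantitative side of EDA might start with something like the sketch below. The tiny dataset and its column names are hypothetical, chosen only to make the results easy to check by hand; plotting is left out so the example runs without extra dependencies.

```python
import pandas as pd

# Hypothetical measurements (invented values) for two groups.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "length": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Summary statistics: count, mean, std, min, quartiles, max.
summary = df["length"].describe()

# Individual statistics are available as methods too.
median = df["length"].median()

# How often each category occurs.
counts = df["species"].value_counts()

# Compare a statistic across groups.
by_species = df.groupby("species")["length"].mean()

print(summary)
print(median)
print(counts)
print(by_species)
```

For the visual side, `DataFrame.plot` offers a quick way to draw the same columns, provided matplotlib is installed.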
Time series in pandas
In this chapter, you will learn how to manipulate and visualize time series data using pandas. You will become familiar with concepts such as upsampling, downsampling, and interpolation. You will practice using method chaining to efficiently filter your data and perform time series analyses. From stock prices to flight timings, time series data can be found in a wide variety of domains, and being able to work with it effectively is an invaluable skill.
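To make upsampling, downsampling, interpolation, and method chaining concrete, here is a small sketch on an invented series sampled every two days. The dates and values are assumptions chosen so the results are easy to verify.

```python
import pandas as pd

# Hypothetical series: one observation every two days (invented values).
idx = pd.date_range("2020-01-01", periods=4, freq="2D")
ts = pd.Series([0.0, 2.0, 4.0, 6.0], index=idx)

# Downsampling: aggregate into coarser 4-day bins using the mean.
down = ts.resample("4D").mean()

# Upsampling: move to a finer daily grid, which introduces gaps (NaN),
# then fill those gaps by linear interpolation.
up = ts.resample("D").asfreq().interpolate(method="linear")

# Method chaining: filter by date, upsample, and interpolate in one expression.
chained = (
    ts.loc["2020-01-03":"2020-01-07"]
    .resample("D")
    .asfreq()
    .interpolate(method="linear")
)

print(down)
print(up)
print(chained)
```

Downsampling always needs an aggregation such as `mean`, while upsampling needs a filling strategy; interpolation is only one option, and forward-filling with `ffill` is another.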
Case Study - Sunlight in Austin
Working with real-world weather and climate data, this chapter will allow you to apply all of the skills you have acquired in this course. You will use pandas to manipulate the data into a usable form for analysis and systematically explore it using the techniques you’ve learned.
Data Science Training
This course was created in collaboration with Anaconda. With over 6 million users, the open source Anaconda Distribution is the fastest and easiest way to do Python data science and machine learning. It's the industry standard for developing, testing, and training on a single machine.