Exploratory Data Analysis in Python

Use this workspace to take notes, store code snippets, or build your own interactive cheatsheet! The datasets used in this course are available in the `datasets` folder.

```python
# Import any packages you want to use here
import seaborn as sns
import numpy as np
import matplotlib.pyplot as plt
```

### Take Notes

Add notes here about the concepts you've learned and code cells with code you want to keep.

```python
import pandas as pd
```

### Global unemployment in 2021

• Import the required visualization libraries.

• Create a histogram of the distribution of 2021 unemployment percentages across all countries in `unemployment`, with each bin spanning one full percentage point.

```python
# binwidth=1 makes each bin cover one percentage point
sns.histplot(x='2021', data=unemployment, binwidth=1)
plt.show()
```

### Detecting data types

• Update the data type of the 2019 column of unemployment to float.

• Print the dtypes of the unemployment DataFrame again to check that the data type has been updated!

```python
# Convert dtype of '2019' from object to float
unemployment['2019'] = unemployment['2019'].astype(float)

print(unemployment.dtypes)
```
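
The same conversion can be tried without the course data. The sketch below is only an illustration: `toy` is a hypothetical DataFrame with invented codes and values, in which the `2019` column holds numbers stored as strings.

```python
import pandas as pd

# Hypothetical stand-in for the course dataset; codes and values are invented
toy = pd.DataFrame({
    "country_code": ["AAA", "BBB", "CCC"],
    "2019": ["4.5", "11.2", "7.0"],  # numbers stored as strings
})

dtype_before = toy["2019"].dtype

# Convert the string column to float, as in the exercise above
toy["2019"] = toy["2019"].astype(float)
dtype_after = toy["2019"].dtype

print(dtype_before, "->", dtype_after)
```

Note that `astype(float)` raises an error if any value cannot be parsed as a number, so it is worth inspecting the column first.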

### Validating continents

• Define a Series of Booleans indicating whether each country's continent is something other than Oceania; call this Series `not_oceania`.
• Use Boolean indexing to print the `unemployment` DataFrame without any of the rows for countries in Oceania.

```python
# ~ inverts the mask, so True marks rows that are NOT in Oceania
not_oceania = ~unemployment['continent'].isin(['Oceania'])

print(unemployment[not_oceania])
```
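
To see what the mask looks like on its own, here is a minimal sketch on a hypothetical four-row DataFrame (`toy`; the country names and rates are invented for illustration).

```python
import pandas as pd

# Hypothetical rows for illustration only; the numbers are invented
toy = pd.DataFrame({
    "country_name": ["Country A", "Country B", "Country C", "Country D"],
    "continent": ["Oceania", "Europe", "Asia", "Oceania"],
    "2021": [5.1, 3.6, 4.2, 8.0],
})

# isin() is True where the continent appears in the list; ~ flips each value
not_oceania = ~toy["continent"].isin(["Oceania"])

# Boolean indexing keeps only the rows where the mask is True
filtered = toy[not_oceania]
print(filtered)
```

Because `isin` takes a list, the same pattern extends to excluding several continents at once.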

### Validating range

• Print the minimum and maximum unemployment rates, in that order, during 2021.
• Create a boxplot of 2021 unemployment rates, broken down by continent.

```python
# The Series methods skip missing values, unlike the built-in min()/max()
print(unemployment['2021'].min(), unemployment['2021'].max())

sns.boxplot(data=unemployment, x='2021', y='continent')
plt.show()
```

### Visualizing categorical summaries

Create a bar plot showing continents on the x-axis and their respective average 2021 unemployment rates on the y-axis.

```python
# barplot shows the mean of y for each x category by default
sns.barplot(x='continent', y='2021', data=unemployment)
plt.show()
```

### Dealing with Missing Values

Print the number of missing values in each column of the DataFrame.
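
One common way to do this is to chain `isna()` and `sum()`. The sketch below uses a hypothetical DataFrame (`toy`, with invented values and deliberately missing cells) so it runs on its own; applied to the course data, the equivalent call would be `unemployment.isna().sum()`.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a few deliberately missing cells
toy = pd.DataFrame({
    "continent": ["Europe", None, "Asia"],
    "2020": [6.1, np.nan, 4.4],
    "2021": [5.9, 7.3, 4.0],
})

# isna() marks each missing cell as True; sum() counts the Trues per column
missing_counts = toy.isna().sum()
print(missing_counts)
```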