Data Manipulation with pandas
Run the hidden code cell below to import the data used in this course.
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt  # needed for the plt.* calls below
# Highest weekly sales per department, top five in descending order
highest_sales_department = (
    walmart.groupby('department')['weekly_sales']
    .max()
    .sort_values(ascending=False)
    .head(5)
)
print(highest_sales_department)
# Total organic avocados sold in 2017
total_2017_organic_avo = avocado[
    (avocado['year'] == 2017) &
    (avocado['type'] == 'organic')
]['nb_sold'].sum().round()
print(f"Total number of organic avocados sold in 2017: {total_2017_organic_avo}")
# Total homeless by region
homeless_by_region = (
    homelessness.groupby('region')['individuals']
    .sum()
    .sort_values(ascending=False)
)
# Vertical bar chart
homeless_by_region.plot(kind='bar', title='Total Homeless People by Region')
plt.ylabel('Total Homeless People')
plt.show()
# Horizontal bar chart (Bonus)
homeless_by_region.plot(kind='barh', title='Total Homeless People by Region')
plt.xlabel('Total Homeless People')
plt.show()
# Filter for the desired cities
filtered_temperatures = temperatures[temperatures['city'].isin(['Toronto', 'Rome'])]
# Create the line plot
plt.figure(figsize=(12, 6))
sns.lineplot(
    data=filtered_temperatures,
    x='date',
    y='avg_temp_c',
    hue='city',
    style='city',
    markers=True,
    dashes=False
)
# Add title and labels
plt.title('Average Temperatures in Toronto and Rome Over Time', fontsize=16)
plt.xlabel('Date', fontsize=12)
plt.ylabel('Average Temperature (°C)', fontsize=12)
# Add grid and legend, then save the figure
plt.grid(True)
plt.legend(title='Cities')
plt.savefig('temperatures_toronto_rome.png', dpi=300, bbox_inches='tight')
plt.show()
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- Print the highest weekly sales for each department in the walmart DataFrame. Limit your results to the top five departments, in descending order. If you're stuck, try reviewing this video.
- What was the total nb_sold of organic avocados in 2017 in the avocado DataFrame? If you're stuck, try reviewing this video.
- Create a bar plot of the total number of homeless people by region in the homelessness DataFrame. Order the bars in descending order. Bonus: create a horizontal bar chart. If you're stuck, try reviewing this video.
- Create a line plot with two lines representing the temperatures in Toronto and Rome. Make sure to properly label your plot. Bonus: add a legend for the two lines. If you're stuck, try reviewing this video.
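
The course DataFrames come from the hidden cell, so the first two exercises cannot be checked without it. As a rough sketch, the same group-then-sort and filter-then-sum patterns can be tried on small stand-in tables. The names toy_sales and toy_avocado and all of their values are made up for illustration and are not the course data.

```python
import pandas as pd

# Hypothetical stand-in for the walmart DataFrame (values are invented)
toy_sales = pd.DataFrame({
    "department": [1, 1, 2, 2, 3, 3],
    "weekly_sales": [100.0, 250.0, 80.0, 90.0, 300.0, 120.0],
})

# Highest weekly sales per department, largest first, top two only
top_departments = (
    toy_sales.groupby("department")["weekly_sales"]
    .max()
    .sort_values(ascending=False)
    .head(2)
)
print(top_departments)

# Hypothetical stand-in for the avocado DataFrame (values are invented)
toy_avocado = pd.DataFrame({
    "year": [2016, 2017, 2017, 2017],
    "type": ["organic", "organic", "conventional", "organic"],
    "nb_sold": [10.0, 20.5, 99.0, 30.25],
})

# Combine two conditions with &; each comparison needs its own parentheses
mask = (toy_avocado["year"] == 2017) & (toy_avocado["type"] == "organic")
total_2017_organic = toy_avocado.loc[mask, "nb_sold"].sum()
print(total_2017_organic)
```

Using .loc with a boolean mask selects the rows and the column in one step, which is one common alternative to chaining two sets of square brackets.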