Data Manipulation with pandas

    Run the hidden code cell below to import the data used in this course.

    # Import the course packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    
    # Import the four datasets
    avocado = pd.read_csv("datasets/avocado.csv")
    homelessness = pd.read_csv("datasets/homelessness.csv")
    temperatures = pd.read_csv("datasets/temperatures.csv")
    walmart = pd.read_csv("datasets/walmart.csv")

    Take Notes

    Add notes about the concepts you've learned and code cells with code you want to keep.


    # Import matplotlib.pyplot with alias plt
    import matplotlib.pyplot as plt
    
    # Look at the first few rows of data
    print(avocado.head())
    
    # Get the total number of avocados sold of each size
    nb_sold_by_size = avocado.groupby("size")["nb_sold"].sum()
    
    # Create a bar plot of the number of avocados sold by size
    nb_sold_by_size.plot(kind="bar")
    
    # Show the plot
    plt.show()
    # Add your code snippets here
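
Getting "the total number of avocados sold of each size" is an aggregation, not a column selection. The sketch below contrasts the two on a small made-up table; the size labels and quantities are invented for illustration and are not taken from `avocado.csv`.

```python
import pandas as pd

# Made-up stand-in for the avocado data (values are illustrative only)
avocado_demo = pd.DataFrame({
    "size": ["small", "large", "small", "extra_large", "large"],
    "nb_sold": [10.0, 4.0, 5.0, 1.0, 6.0],
})

# Selecting two columns keeps one row per original record: nothing is totalled
selected = avocado_demo[["nb_sold", "size"]]

# Grouping by size and summing collapses the rows to one total per size
nb_sold_by_size_demo = avocado_demo.groupby("size")["nb_sold"].sum()

print(selected.shape)          # five rows, two columns
print(nb_sold_by_size_demo)    # one entry per distinct size
```

Because the grouped result is a Series indexed by size, calling `.plot(kind="bar")` on it draws one bar per size rather than one bar per row.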

    Explore Datasets

    Use the DataFrames imported in the first cell to explore the data and practice your skills!

    • Print the highest weekly sales for each department in the walmart DataFrame. Limit your results to the top five departments, in descending order.
    • What was the total nb_sold of organic avocados in 2017 in the avocado DataFrame?
    • Create a bar plot of the total number of homeless people by region in the homelessness DataFrame. Order the bars in descending order. Bonus: create a horizontal bar chart.
    • Create a line plot with two lines representing the temperatures in Toronto and Rome. Make sure to properly label your plot. Bonus: add a legend for the two lines.
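
One possible way to approach the data-preparation part of the four exercises above is sketched here on tiny made-up tables, so that it runs without the course files. Every column name except `nb_sold` is an assumption about the datasets (`department`, `weekly_sales`, `type`, `year`, `region`, `individuals`, `family_members`, `date`, `city`, `avg_temp_c`), as is the choice to count homeless people as individuals plus family members. Check `df.columns` on the real DataFrames before reusing any of it.

```python
import pandas as pd

# Toy stand-ins for the course datasets; column names are assumptions
walmart_demo = pd.DataFrame({
    "department": [1, 1, 2, 2, 3, 4, 5, 6],
    "weekly_sales": [10.0, 30.0, 20.0, 5.0, 50.0, 40.0, 15.0, 25.0],
})
avocado_demo = pd.DataFrame({
    "year": [2016, 2017, 2017, 2017],
    "type": ["organic", "organic", "conventional", "organic"],
    "nb_sold": [100.0, 200.0, 999.0, 50.0],
})
homelessness_demo = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain"],
    "individuals": [10, 20, 5],
    "family_members": [1, 2, 3],
})
temperatures_demo = pd.DataFrame({
    "date": pd.to_datetime(
        ["2010-01-01", "2010-01-01", "2010-02-01", "2010-02-01", "2010-01-01"]
    ),
    "city": ["Toronto", "Rome", "Toronto", "Rome", "Paris"],
    "avg_temp_c": [-5.0, 8.0, -3.0, 9.0, 4.0],
})

# 1. Highest weekly sales per department, top five, in descending order
top_departments = (
    walmart_demo.groupby("department")["weekly_sales"]
    .max()
    .sort_values(ascending=False)
    .head(5)
)

# 2. Total nb_sold of organic avocados in 2017
is_organic_2017 = (avocado_demo["type"] == "organic") & (avocado_demo["year"] == 2017)
organic_2017_total = avocado_demo.loc[is_organic_2017, "nb_sold"].sum()

# 3. Total homeless people per region, largest first
#    (assumes total = individuals + family_members)
homelessness_demo["total"] = (
    homelessness_demo["individuals"] + homelessness_demo["family_members"]
)
homeless_by_region = (
    homelessness_demo.groupby("region")["total"].sum().sort_values(ascending=False)
)

# 4. One column per city, indexed by date, ready for a two-line plot
toronto_rome = temperatures_demo[
    temperatures_demo["city"].isin(["Toronto", "Rome"])
].pivot_table(index="date", columns="city", values="avg_temp_c")

print(top_departments)
print(organic_2017_total)
print(homeless_by_region)
print(toronto_rome)
```

For the plotting parts, `homeless_by_region.plot(kind="bar")` (or `kind="barh"` for the horizontal bonus) and `toronto_rome.plot(title=..., xlabel=..., ylabel=...)` followed by `plt.show()` should draw the charts. A DataFrame with one column per city gets a legend entry per line by default.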