Intermediate Python
Run the hidden code cell below to import the data used in this course.
# Import the course packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Import the two datasets
gapminder = pd.read_csv("datasets/gapminder.csv")
brics = pd.read_csv("datasets/brics.csv")
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
pandas = high-level data manipulation library built on NumPy; its main data structure is the DataFrame
Series = labeled 1D array; each DataFrame column is a Series, and several Series can be combined into a DataFrame
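To make the Series/DataFrame relationship concrete, here is a minimal sketch. The labels and numbers are made up for illustration and are not from the course datasets.

```python
import pandas as pd

# Hypothetical values, invented for this example only
labels = ["a", "b", "c"]
height = pd.Series([1.2, 3.4, 5.6], index=labels, name="height")
width = pd.Series([10, 20, 30], index=labels, name="width")

# Series that share an index line up row by row inside a DataFrame
df = pd.DataFrame({"height": height, "width": width})

print(df)
print(type(df["height"]).__name__)  # each column comes back as a Series
```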
# import pandas as pd
# data_frame = pd.read_csv('xxx.csv', index_col=0)  # read_csv already returns a DataFrame
print(brics)
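### LOC (label-based access in pandas)
# Row access (look up rows by index label; assumes the row labels are country codes, e.g. read with index_col=0)
# brics.loc[["RU", "CH"]]
# Column access (all rows, look up by column label; the labels go in a list)
# brics.loc[:, ['country', 'capital']]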
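The commented lookups above can be tried on a small stand-in frame. This is a sketch, assuming the row labels are country codes (as they would be after reading with index_col=0); `brics_demo` is a hypothetical name, not the course DataFrame.

```python
import pandas as pd

# Stand-in for brics read with index_col=0 (assumption: country codes as row labels)
brics_demo = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

rows = brics_demo.loc[["RU", "CH"]]               # two rows, all columns
cols = brics_demo.loc[:, ["country", "capital"]]  # all rows, two columns
cell = brics_demo.loc["IN", "capital"]            # one row label, one column label

print(rows)
print(cell)
```

Passing the column labels as separate arguments instead of a list (`.loc[:, 'country', 'capital']`) raises an error, because `loc` accepts at most one indexer per axis.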
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- Create a loop that iterates through the brics DataFrame and prints "The population of {country} is {population} million!".
- Create a histogram of the life expectancies for countries in Africa in the gapminder DataFrame. Make sure your plot has a title, axis labels, and has an appropriate number of bins.
- Simulate 10 rolls of two six-sided dice. If the two dice add up to 7 or 11, print "A win!". If the two dice add up to 2, 3, or 12, print "A loss!". If the two dice add up to any other number, print "Roll again!".
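One possible approach to the loop and dice tasks is sketched below. `brics_demo` is a hypothetical two-row stand-in with rounded populations (the course file may differ), and the helper names `population_lines` and `judge_roll` are invented for this example. The seed is arbitrary.

```python
import numpy as np
import pandas as pd

# Illustrative stand-in for brics; populations are rounded, in millions
brics_demo = pd.DataFrame(
    {"country": ["Brazil", "India"], "population": [200, 1250]},
    index=["BR", "IN"],
)

def population_lines(df):
    """Build one sentence per row using iterrows()."""
    return [
        f"The population of {row['country']} is {row['population']} million!"
        for _, row in df.iterrows()
    ]

def judge_roll(total):
    """Map the sum of two dice to the message from the exercise."""
    if total in (7, 11):
        return "A win!"
    if total in (2, 3, 12):
        return "A loss!"
    return "Roll again!"

for line in population_lines(brics_demo):
    print(line)

rng = np.random.default_rng(seed=123)  # arbitrary seed, only for repeatability
for _ in range(10):
    die1, die2 = rng.integers(1, 7, size=2)  # upper bound 7 is exclusive
    print(judge_roll(die1 + die2))
```

Keeping the win/loss rule in its own function makes it easy to check every possible total from 2 to 12 without relying on random draws.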
### COMPARISON OF CREATING A PANDAS SERIES TO A PANDAS DATAFRAME
## SERIES
print(brics["country"])
## DATAFRAME
print(brics[["country"]])
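The difference between the two selections shows up in the type and shape of the result. A minimal sketch on a hypothetical two-row frame (`demo` is not one of the course datasets):

```python
import pandas as pd

demo = pd.DataFrame(
    {"country": ["Brazil", "Russia"], "capital": ["Brasilia", "Moscow"]}
)

as_series = demo["country"]   # single brackets with one label: a Series
as_frame = demo[["country"]]  # a list of labels: a one-column DataFrame

print(type(as_series).__name__, as_series.shape)  # Series (2,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (2, 1)
```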
loc and iloc also allow you to select both rows and columns from a DataFrame. To experiment, try out the following commands in the IPython Shell. Again, paired commands produce the same result.
cars.loc['IN', 'cars_per_cap']
cars.iloc[3, 0]
cars.loc[['IN', 'RU'], 'cars_per_cap']
cars.iloc[[3, 4], 0]
cars.loc[['IN', 'RU'], ['cars_per_cap', 'country']]
cars.iloc[[3, 4], [0, 1]]
It's also possible to select only columns with loc and iloc. In both cases, you simply put a slice going from beginning to end in front of the comma:
cars.loc[:, 'country']
cars.iloc[:, 1]
cars.loc[:, ['country','drives_right']]
cars.iloc[:, [1, 2]]
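The paired commands can be checked against each other with a small stand-in frame. This is a sketch: `cars_demo` and its values are illustrative and only mimic the layout the commands assume (row labels in the order shown, columns in the order cars_per_cap, country, drives_right).

```python
import pandas as pd

# Illustrative stand-in; values are placeholders, not the course data
cars_demo = pd.DataFrame(
    {
        "cars_per_cap": [809, 731, 588, 18, 200],
        "country": ["United States", "Australia", "Japan", "India", "Russia"],
        "drives_right": [True, False, False, False, True],
    },
    index=["US", "AUS", "JAP", "IN", "RU"],
)

# Single value: label-based vs position-based
same_cell = cars_demo.loc["IN", "cars_per_cap"] == cars_demo.iloc[3, 0]

# Rows and columns together
same_block = cars_demo.loc[["IN", "RU"], ["cars_per_cap", "country"]].equals(
    cars_demo.iloc[[3, 4], [0, 1]]
)

# Columns only: a full slice in front of the comma keeps every row
same_cols = cars_demo.loc[:, ["country", "drives_right"]].equals(
    cars_demo.iloc[:, [1, 2]]
)

print(same_cell, same_block, same_cols)
```

The positions passed to `iloc` only match the labels passed to `loc` while the row and column order stays as above; after sorting or reordering, the label-based form is the one that keeps pointing at the same data.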