Intermediate Python
Run the hidden code cell below to import the data used in this course.
# Import the course packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Import the two datasets
gapminder = pd.read_csv("datasets/gapminder.csv")
brics = pd.read_csv("datasets/brics.csv")
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
To check whether a key is in a dictionary, use key in dictionary, which returns a Boolean. To get the value, subset the dictionary with dictionary[key]. To delete an entry, use del(dictionary[key]).
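'sealand' in world      # True if the key exists, False otherwise
world['sealand']        # value stored under the key
del(world['sealand'])   # remove the key and its value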
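The three dictionary operations above can be tried end to end with a small sketch. The contents of world here are hypothetical illustrative values (country name mapped to population in millions), not data from the course files, and the .get() call is an extra shown only as a safe alternative to subsetting.

```python
# Hypothetical data: country -> population in millions (illustrative values only)
world = {"afghanistan": 30.55, "albania": 2.77, "sealand": 0.000027}

has_sealand = "sealand" in world   # membership test looks at the keys
sealand_pop = world["sealand"]     # subsetting; raises KeyError if the key is absent

del world["sealand"]               # remove the key-value pair
still_there = "sealand" in world   # the key is gone now

# .get() returns a default instead of raising KeyError for a missing key
missing = world.get("sealand", None)

print(has_sealand, sealand_pop, still_there, missing)
```

Note that the key must be quoted: a bare sealand would be treated as a variable name rather than a string.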
pandas: create a DataFrame from a dictionary in which each key is a column name and each value is the data for that column. To read in a CSV file that already has an index column built in (like the question in examsoft), use:
# index_col is the column that holds the row labels, normally 0
pd.read_csv("path", index_col=0)
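To see what index_col changes, here is a minimal sketch that reads the same CSV text twice. The CSV content is hypothetical and is held in memory with io.StringIO so the example does not depend on a file on disk.

```python
import io
import pandas as pd

# Hypothetical CSV text; the first (unnamed) column holds the row labels
csv_text = """,country,population
BR,Brazil,200.4
RU,Russia,143.5
"""

# Without index_col, the labels become an ordinary column named "Unnamed: 0"
plain = pd.read_csv(io.StringIO(csv_text))

# With index_col=0, the first column becomes the row index
indexed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(plain.columns.tolist())
print(indexed.index.tolist())
```

With the index set this way, rows can later be selected by label, for example indexed.loc["BR"].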
Selecting columns in a DataFrame: single brackets [] return a pandas Series; double brackets [[]] return a DataFrame.
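A short sketch of the difference, using a DataFrame built from a dictionary whose keys become the column names. The values and row labels are made up for illustration.

```python
import pandas as pd

# Hypothetical data; dictionary keys become column names
data = {
    "country": ["Brazil", "Russia", "India"],
    "population": [200.4, 143.5, 1252.0],
}
df = pd.DataFrame(data, index=["BR", "RU", "IN"])

as_series = df["country"]     # single brackets -> pandas Series
as_frame = df[["country"]]    # double brackets -> one-column DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)
```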
Selecting rows in a DataFrame: use a slice, for example df[0:5]. The loc function returns a single row as a Series, for example df.loc['row label']. To get a DataFrame instead, use double brackets: df.loc[['row label']].
Select rows AND columns with the loc function: df.loc[[rows], [columns]]
iloc works just like loc, but uses integer positions rather than the row and column labels.
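The row and column selections described above can be compared side by side. This is a sketch on a small hypothetical DataFrame, not on the course datasets.

```python
import pandas as pd

# Hypothetical data with string row labels
df = pd.DataFrame(
    {"country": ["Brazil", "Russia", "India"], "population": [200.4, 143.5, 1252.0]},
    index=["BR", "RU", "IN"],
)

first_two = df[0:2]                              # slice -> rows at positions 0 and 1
row_series = df.loc["RU"]                        # single label -> Series
row_frame = df.loc[["RU"]]                       # list of labels -> DataFrame
by_label = df.loc[["BR", "IN"], ["population"]]  # rows and columns by label
by_position = df.iloc[[0, 2], [1]]               # same selection by integer position

print(by_label)
```

Here by_label and by_position hold the same data, because "BR" and "IN" sit at positions 0 and 2 and "population" is column 1.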
data = [60, 63, 64, 66, 68, 69, 71, 71.5, 72, 72.5, 73, 73.5, 74, 74.5, 76, 76.2, 76.5, 77]
# Named data_range so it does not shadow the built-in range()
data_range = max(data) - min(data)
data_range
Way to filter pandas DataFrames:
- Select the column as a Series (not a DataFrame).
- Apply a comparison operator to get a Boolean Series.
- Use that Boolean Series to select the rows where it is True.
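The three filtering steps can be sketched as follows. The DataFrame and the threshold of 150 are hypothetical and chosen only to make the result easy to check.

```python
import pandas as pd

# Hypothetical data: population in millions
df = pd.DataFrame(
    {"country": ["Brazil", "Russia", "India"], "population": [200.4, 143.5, 1252.0]},
    index=["BR", "RU", "IN"],
)

population = df["population"]   # step 1: select the column as a Series
is_large = population > 150     # step 2: the comparison gives a Boolean Series
large = df[is_large]            # step 3: keep the rows where it is True

print(large)
```

The three steps are often written in one line, as in df[df["population"] > 150].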
Random generator information.
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- Create a loop that iterates through the brics DataFrame and prints "The population of {country} is {population} million!".
- Create a histogram of the life expectancies for countries in Africa in the gapminder DataFrame. Make sure your plot has a title, axis labels, and has an appropriate number of bins.
- Simulate 10 rolls of two six-sided dice. If the two dice add up to 7 or 11, print "A win!". If the two dice add up to 2, 3, or 12, print "A loss!". If the two dice add up to any other number, print "Roll again!".
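One possible approach to the dice exercise is sketched below. It is only a sketch: the classify helper and the seed value 123 are assumptions made for illustration, and it uses NumPy's Generator interface, although np.random.randint(1, 7) would serve the same purpose.

```python
import numpy as np

# The seed is arbitrary; it only makes the sequence of rolls repeatable
rng = np.random.default_rng(123)


def classify(total):
    """Map the sum of two dice to the message the exercise asks for."""
    if total in (7, 11):
        return "A win!"
    if total in (2, 3, 12):
        return "A loss!"
    return "Roll again!"


results = []
for _ in range(10):
    dice = rng.integers(1, 7, size=2)   # upper bound is exclusive, so values are 1 to 6
    total = int(dice.sum())
    outcome = classify(total)
    results.append((total, outcome))
    print(outcome)
```

Keeping the rule in its own function makes it easy to check each case separately from the random rolls.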