Intermediate Python

Run the hidden code cell below to import the data used in this course.


Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

  • Create a loop that iterates through the brics DataFrame and prints "The population of {country} is {population} million!".
  • Create a histogram of the life expectancies for countries in Africa in the gapminder DataFrame. Make sure your plot has a title, axis labels, and has an appropriate number of bins.
  • Simulate 10 rolls of two six-sided dice. If the two dice add up to 7 or 11, print "A win!". If the two dice add up to 2, 3, or 12, print "A loss!". If the two dice add up to any other number, print "Roll again!".
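
One possible approach to the dice exercise is sketched below. It uses the standard library's random module rather than NumPy. The helper name classify_roll and the seed value are assumptions made for illustration and are not part of the course material.

```python
import random


def classify_roll(total):
    """Map the sum of two dice to the message from the exercise."""
    if total in (7, 11):
        return "A win!"
    if total in (2, 3, 12):
        return "A loss!"
    return "Roll again!"


# Fixed seed (an arbitrary choice) so repeated runs give the same rolls
rng = random.Random(123)

results = []
for _ in range(10):
    die_1 = rng.randint(1, 6)  # randint includes both endpoints
    die_2 = rng.randint(1, 6)
    message = classify_roll(die_1 + die_2)
    results.append(message)
    print(f"{die_1} + {die_2} = {die_1 + die_2}: {message}")
```

Keeping the win/loss rule in its own function makes it easy to check each case separately from the random rolls.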

pandas:

In the brics table, the columns hold different data types: the country and capital are strings, for example. Your datasets will typically comprise different data types, so we need a tool that's better suited to the job than a NumPy array, which is designed to hold a single data type. The pandas package handles this kind of data easily and efficiently. pandas is an open-source, high-level data manipulation library developed by Wes McKinney and built on the NumPy package. It provides high-performance, easy-to-use data structures and data analysis tools for Python, and because it works at a higher level than NumPy it is widely used by data scientists. In pandas, tabular data such as the brics table is stored in an object called a DataFrame.
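
As a minimal sketch of what such a DataFrame might look like, the block below builds a small brics table from a dictionary and loops over its rows. The course loads brics from a hidden cell, so the values here are approximate placeholders chosen for illustration, and the exact column set is an assumption.

```python
import pandas as pd

# Illustrative data only: the real brics DataFrame comes from the course's
# hidden cell, and its values and columns may differ.
data = {
    "country": ["Brazil", "Russia", "India", "China", "South Africa"],
    "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
    "population": [200.4, 143.5, 1252.0, 1357.0, 52.98],  # millions, approximate
}

# Row labels (the index) are two-letter codes, as used in the recap notes
brics = pd.DataFrame(data, index=["BR", "RU", "IN", "CH", "SA"])

# iterrows() yields (label, row) pairs, where row is a pandas Series
messages = []
for label, row in brics.iterrows():
    messages.append(
        f"The population of {row['country']} is {row['population']} million!"
    )

for message in messages:
    print(message)
```

Each column keeps its own data type (strings for country and capital, floats for population), which is the mixed-type case the paragraph above describes.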

import pandas as pd
import matplotlib.pyplot as plt

# Load the survey responses from the CSV export
df = pd.read_csv('datasets/User Experience and System Usability Evaluation Survey (Responses) - Form Responses 1 (1).csv')
# Select the SUS scores; change 'SUS Score' if the column has a different name in your dataset
sus_scores = df['SUS Score']
# Create a histogram of SUS scores
plt.hist(sus_scores, bins=10, alpha=0.7, color='blue', edgecolor='black')

plt.title('Histogram of SUS Scores')
plt.xlabel('SUS Score')
plt.ylabel('Frequency')

plt.show()

Recap:

  • Square brackets: limited functionality
  • Ideally, selection would work as it does for 2D NumPy arrays: my_array[rows, columns]
  • pandas provides this through loc (label-based) and iloc (integer position-based)

Square brackets:

  • Column access: brics[["country", "capital"]]
  • Row access, only through slicing: brics[1:4]

loc (label-based):

  • Row access: brics.loc[["RU", "IN", "CH"]]
  • Column access: brics.loc[:, ["country", "capital"]]
  • Row and column access: brics.loc[["RU", "IN", "CH"], ["country", "capital"]]
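
The selections in this recap can be tried on a small stand-in table, as sketched below. The brics values are approximate placeholders rather than the course data, and the iloc lines are an added illustration of the integer position-based counterpart to loc.

```python
import pandas as pd

# Stand-in for the course's brics DataFrame (placeholder values)
brics = pd.DataFrame(
    {
        "country": ["Brazil", "Russia", "India", "China", "South Africa"],
        "capital": ["Brasilia", "Moscow", "New Delhi", "Beijing", "Pretoria"],
        "population": [200.4, 143.5, 1252.0, 1357.0, 52.98],
    },
    index=["BR", "RU", "IN", "CH", "SA"],
)

# Square brackets: columns by name, rows only through slicing
cols = brics[["country", "capital"]]
rows = brics[1:4]  # positions 1, 2 and 3 (the end of the slice is excluded)

# loc: select by row and column labels
by_label = brics.loc[["RU", "IN", "CH"], ["country", "capital"]]

# iloc: the same selection by integer position
by_position = brics.iloc[[1, 2, 3], [0, 1]]

print(by_label)
print(by_label.equals(by_position))
```

Both loc and iloc accept a rows part and a columns part separated by a comma, which mirrors the my_array[rows, columns] pattern from NumPy.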