Introduction to Python
Run the hidden code cell below to import the data used in this course.
# Importing course packages; you can add more too!
import numpy as np
import math
import pandas as pd
import matplotlib.pyplot as plt
# Import columns as numpy arrays
baseball_names = np.genfromtxt(
    fname="baseball.csv",  # This is the filename
    delimiter=",",  # The file is comma-separated
    usecols=0,  # Use the first column
    skip_header=1,  # Skip the first line
    dtype=str,  # This column contains strings
)
baseball_heights = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=3, skip_header=1
)
baseball_weights = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=4, skip_header=1
)
baseball_ages = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=5, skip_header=1
)
soccer_names = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=1,
    skip_header=1,
    dtype=str,
    encoding="utf",
)
soccer_ratings = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=2,
    skip_header=1,
    encoding="utf",
)
soccer_positions = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=3,
    skip_header=1,
    encoding="utf",
    dtype=str,
)
soccer_heights = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=4,
    skip_header=1,
    encoding="utf",
)
soccer_shooting = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=8,
    skip_header=1,
    encoding="utf",
)
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here
df = pd.read_csv('soccer.csv')
df
Explore Datasets
Use the arrays imported in the first cell to explore the data and practice your skills!
- Print out the weight of the first ten baseball players.
- What is the median weight of all baseball players in the data?
- Print out the names of all players with a height greater than 80 (heights are in inches).
- Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
- The values in soccer_shooting are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
- Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out!
- What is the average rating for attacking players ('A')?
- Print out the weight of the first ten baseball players.
# Print out the weight of the first ten baseball players.
baseball = pd.read_csv('baseball.csv')
baseball[['Weight']].head(10)
- What is the median weight of all baseball players in the data?
# What is the median weight of all baseball players in the data?
print(baseball[["Weight"]].median())
Weight 200.0
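The exercise text asks for the arrays imported in the first cell, while the two answers above go through pandas. A minimal NumPy sketch of the same two tasks follows. It assumes a small made-up array, sample_weights, standing in for baseball_weights; the values are illustrative and are not taken from baseball.csv.

```python
import numpy as np

# Hypothetical stand-in for baseball_weights; values are made up for illustration.
sample_weights = np.array([180.0, 215.0, 210.0, 210.0, 188.0, 176.0,
                           209.0, 200.0, 231.0, 180.0, 188.0, 190.0])

# Slicing with [:10] returns the first ten elements of the array.
first_ten = sample_weights[:10]
print(first_ten)

# np.median returns the middle value of the sorted data
# (the mean of the two middle values when the length is even).
median_weight = np.median(sample_weights)
print(median_weight)
```

If the real array contains missing values (np.genfromtxt stores them as nan), np.nanmedian may be the safer choice, since it ignores nan entries.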
- Print out the names of all players with a height greater than 80 (heights are in inches).
# Print out the names of all players with a height greater than 80 (heights are in inches).
# Keep only the rows where Height exceeds 80, then show the name and height columns
height_80 = baseball[baseball['Height'] > 80]
height_80[['Name', 'Height']]
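The same filter can be sketched with plain NumPy arrays, which is what the exercise text suggests. The arrays sample_names and sample_heights below are hypothetical stand-ins for baseball_names and baseball_heights, with made-up values.

```python
import numpy as np

# Hypothetical stand-ins for baseball_names and baseball_heights (made-up values).
sample_names = np.array(["Player A", "Player B", "Player C", "Player D"])
sample_heights = np.array([74.0, 81.0, 79.0, 83.0])  # inches

# Comparing an array with a scalar yields a boolean mask of the same length.
tall_mask = sample_heights > 80

# Indexing the names array with the mask keeps only the matching players.
tall_names = sample_names[tall_mask]
print(tall_names)
```

This works because both arrays come from the same rows of the file, so position i in one array refers to the same player as position i in the other.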
- Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
# Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
# Convert every baseball height from inches to centimetres, then average
baseball_heights_cm = baseball['Height'] * 2.54
baseball_avgh = baseball_heights_cm.mean()
print(baseball_avgh)
# df is the soccer DataFrame loaded earlier; its heights are treated as centimetres
soccer_avgh = df['height'].mean()
print(soccer_avgh)
#==========================
# Calculate the average height of baseball players in inches
baseball_avg_height = baseball['Height'].mean()
# Convert the average height of baseball players to cm
baseball_avg_height_cm = baseball_avg_height * 2.54
# Calculate the average height of soccer players in cm
soccer_avg_height_cm = df['height'].mean()
# Compare the average heights and print the result
print("Baseball:", baseball_avg_height_cm, "cm; soccer:", soccer_avg_height_cm, "cm")
if baseball_avg_height_cm > soccer_avg_height_cm:
    print("Baseball players are taller than soccer players on average.")
else:
    print("Soccer players are taller than baseball players on average.")
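The last three exercises in the list (converting soccer_shooting to whole numbers, the rating/height correlation, and the average rating for attacking players) have no worked answer in this notebook. A minimal NumPy sketch is given here, assuming small made-up arrays that stand in for soccer_shooting, soccer_ratings, soccer_heights and soccer_positions. The printed numbers describe the sample values only and say nothing about the real data.

```python
import numpy as np

# Hypothetical stand-ins for the soccer arrays (made-up values).
sample_shooting = np.array([0.98, 0.45, 0.71, 0.60])
sample_ratings = np.array([90.0, 78.0, 85.0, 80.0])
sample_heights = np.array([170.0, 188.0, 181.0, 185.0])  # cm
sample_positions = np.array(["A", "D", "A", "M"])

# Scale to whole numbers; round before casting so floating-point noise
# is not truncated to the wrong integer.
shooting_whole = np.round(sample_shooting * 100).astype(int)
print(shooting_whole)

# np.corrcoef returns a 2x2 correlation matrix; entry [0, 1] is the
# Pearson correlation between the two inputs.
height_rating_corr = np.corrcoef(sample_ratings, sample_heights)[0, 1]
print(height_rating_corr)

# A boolean mask selects the ratings of attacking players ('A') only.
attacker_avg_rating = sample_ratings[sample_positions == "A"].mean()
print(attacker_avg_rating)
```

With the course data, the same three expressions would be applied to the arrays from the first cell instead of the sample arrays.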