Introduction to Python

Run the hidden code cell below to import the data used in this course.

# Importing course packages; you can add more too!
import numpy as np
import math

# Import columns as numpy arrays
baseball_names = np.genfromtxt(
    fname="baseball.csv",  # This is the filename
    delimiter=",",  # The file is comma-separated
    usecols=0,  # Use the first column
    skip_header=1,  # Skip the first line
    dtype=str,  # This column contains strings
)
baseball_heights = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=3, skip_header=1
)
baseball_weights = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=4, skip_header=1
)
baseball_ages = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=5, skip_header=1
)

soccer_names = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=1,
    skip_header=1,
    dtype=str,
    encoding="utf-8",
)
soccer_ratings = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=2,
    skip_header=1,
    encoding="utf-8",
)
soccer_positions = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=3,
    skip_header=1,
    encoding="utf-8",
    dtype=str,
)
soccer_heights = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=4,
    skip_header=1,
    encoding="utf-8",
)
soccer_shooting = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=8,
    skip_header=1,
    encoding="utf-8",
)
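
To see what the genfromtxt arguments above do without needing the course files, here is a small sketch that reads a CSV from memory instead of from disk. The CSV text, its column layout, and the variable names are made up for illustration and do not match baseball.csv or soccer.csv.

```python
import io

import numpy as np

# Made-up CSV text: one header line, then three rows (not course data)
csv_text = "name,team,height\nAnn,X,70\nBob,Y,75\nCy,X,72\n"

# String column: dtype=str keeps the values as text
names = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    usecols=0,       # first column
    skip_header=1,   # skip the header line
    dtype=str,
    encoding="utf-8",
)

# Numeric column: with no dtype given, values are parsed as floats
heights = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    usecols=2,       # third column
    skip_header=1,
    encoding="utf-8",
)

print(names, heights)
```

Each call reads a single column, which is why the import cell repeats the same file name once per array.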

Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

Add your notes here

# Import the plotting library
from matplotlib import pyplot as plt

# Get the average height of all soccer players
height_mean = soccer_heights.mean()
print(height_mean)

# Collect the indices of players taller than the mean
mean_taller = []
for index, player_height in enumerate(soccer_heights):
    if player_height > height_mean:
        #print(soccer_names[index] + " is number " + str(index))
        mean_taller.append(index)
    # Only check the first 32 players (indices 0 to 31)
    if index > 30:
        break

# Print the attackers ("A") among the taller-than-mean players
for index in mean_taller:
    if soccer_positions[index] == "A":
        print(soccer_names[index] + " " + str(index))

# Plot ratings against heights for players 10 to 99
ratings = soccer_ratings[10:100]
height = soccer_heights[10:100]
#print(ratings)
plt.scatter(ratings, height)
plt.xlabel("Player ratings")
plt.ylabel("Player heights")
plt.show()
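
The same kind of filtering can be done without an explicit loop by using NumPy boolean masks. The sketch below uses small made-up arrays in place of the course data (the names, heights, and positions are placeholders, not values from soccer.csv) and assumes attackers are labelled "A".

```python
import numpy as np

# Made-up stand-ins for soccer_names, soccer_heights, soccer_positions
names = np.array(["P1", "P2", "P3", "P4", "P5"])
heights = np.array([170.0, 185.0, 190.0, 175.0, 180.0])
positions = np.array(["A", "D", "A", "M", "A"])

height_mean = heights.mean()

# Element-wise comparisons give boolean arrays of the same length
is_taller = heights > height_mean
is_attacker = positions == "A"

# Combine both conditions with & and use the result to select entries
both = is_taller & is_attacker
tall_attackers = names[both]
tall_attacker_indices = np.flatnonzero(both)

print(tall_attackers, tall_attacker_indices)
```

A mask checks every player at once, so there is no need for an index counter or an early break.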

Explore Datasets

Use the arrays imported in the first cell to explore the data and practice your skills!

  • Print out the weight of the first ten baseball players.
  • What is the median weight of all baseball players in the data?
  • Print out the names of all players with a height greater than 80 (heights are in inches).
  • Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
  • The values in soccer_shooting are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
  • Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out!
  • What is the average rating for attacking players ('A')?
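
One possible way to approach the exercises above is sketched here. All arrays are small made-up stand-ins for the course arrays, so the printed numbers say nothing about the real data. The sketch also assumes that soccer heights are stored in centimetres, which the exercise text implies but does not state.

```python
import numpy as np

# Made-up stand-ins for the baseball arrays (12 players)
baseball_names = np.array([f"B{i}" for i in range(1, 13)])
baseball_heights = np.array([74, 81, 72, 83, 75, 76, 73, 77, 78, 79, 70, 72.0])
baseball_weights = np.array(
    [180, 215, 210, 190, 200, 205, 195, 185, 220, 225, 230, 175.0]
)

# Made-up stand-ins for the soccer arrays (5 players)
soccer_heights = np.array([170.0, 185.0, 190.0, 175.0, 180.0])  # assumed cm
soccer_ratings = np.array([80.0, 85.0, 90.0, 70.0, 76.0])
soccer_positions = np.array(["A", "D", "A", "M", "A"])
soccer_shooting = np.array([0.98, 0.75, 0.6, 0.82, 0.55])

# Weight of the first ten baseball players
first_ten_weights = baseball_weights[:10]
print(first_ten_weights)

# Median weight of all baseball players
median_weight = np.median(baseball_weights)
print(median_weight)

# Names of players taller than 80 inches
tall_names = baseball_names[baseball_heights > 80]
print(tall_names)

# Compare average heights in the same unit (1 inch = 2.54 cm)
baseball_mean_cm = baseball_heights.mean() * 2.54
soccer_mean_cm = soccer_heights.mean()
print(baseball_mean_cm, soccer_mean_cm)

# Convert shooting decimals to whole numbers; round first to avoid
# floating-point results such as 56.99999999999999 being truncated
shooting_whole = np.round(soccer_shooting * 100).astype(int)
print(shooting_whole)

# Correlation between ratings and heights (off-diagonal matrix entry)
rating_height_corr = np.corrcoef(soccer_ratings, soccer_heights)[0, 1]
print(rating_height_corr)

# Average rating for attacking players
attacker_mean_rating = soccer_ratings[soccer_positions == "A"].mean()
print(attacker_mean_rating)
```

With the real arrays from the first cell, the stand-in definitions at the top would be dropped and the rest could run unchanged.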