Introduction to Python

Run the hidden code cell below to import the data used in this course.

# Importing course packages; you can add more too!
import numpy as np
import math

# Import columns as numpy arrays
baseball_names = np.genfromtxt(
    fname="baseball.csv",  # This is the filename
    delimiter=",",  # The file is comma-separated
    usecols=0,  # Use the first column
    skip_header=1,  # Skip the first line
    dtype=str,  # This column contains strings
)
baseball_heights = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=3, skip_header=1
)
baseball_weights = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=4, skip_header=1
)
baseball_ages = np.genfromtxt(
    fname="baseball.csv", delimiter=",", usecols=5, skip_header=1
)

soccer_names = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=1,
    skip_header=1,
    dtype=str,
    encoding="utf", 
)
soccer_ratings = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=2,
    skip_header=1,
    encoding="utf", 
)
soccer_positions = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=3,
    skip_header=1,
    encoding="utf", 
    dtype=str,
)
soccer_heights = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=4,
    skip_header=1,
    encoding="utf", 
)
soccer_shooting = np.genfromtxt(
    fname="soccer.csv",
    delimiter=",",
    usecols=8,
    skip_header=1,
    encoding="utf", 
)

Python Basics

Variable types

Floats are numbers with a decimal point. Integers are whole numbers: positive, negative, or 0, with no decimal point. Strings are text. Booleans hold either True or False.

float(), str(), int(), and bool() convert a value to the corresponding type.
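As a rough illustration of these conversion functions, the sketch below converts a few made-up values (the variable names and values are assumptions, not course data).

```python
# Hypothetical values, chosen only to show each conversion.
height = float("1.79")   # string -> float
count = int(3.9)         # float -> int; the fractional part is dropped
label = str(42)          # int -> string
flag = bool(0)           # 0 converts to False; nonzero numbers convert to True

for value in (height, count, label, flag):
    print(type(value).__name__, repr(value))
```

Note that int() truncates rather than rounds, so int(3.9) gives 3.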

Index

Python indexing starts at 0, not 1, so fam[4] returns the fifth element of the list. A slice includes its start index and excludes its end index: fam[2:5] returns the elements at indices 2, 3, and 4, that is, the third through fifth elements. fam[2:] starts at index 2 and runs to the end of the list.
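A small sketch of these indexing rules. The contents of fam are an assumption (the notes only establish that "liz" is the sister and that "mom" is the fifth element).

```python
# Assumed example list: names followed by heights in metres.
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]

fifth = fam[4]      # index 4 is the fifth element
middle = fam[2:5]   # indices 2, 3, 4; index 5 is excluded
tail = fam[2:]      # index 2 through the end of the list

print(fifth)    # mom
print(middle)   # ['emma', 1.68, 'mom']
print(tail)     # ['emma', 1.68, 'mom', 1.71, 'dad', 1.89]
```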

Methods

sister.replace("z", "sa") returns "lisa" when sister is "liz"; strings are immutable, so sister itself is unchanged. sister.index("z") returns 2, because "z" is the third character of the string. fam.index("mom"), on the other hand, returns 4 when "mom" is the fifth element of the list.

fam.append("me") adds the string "me" to the end of the list.
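The method calls above can be tried out as follows. The list contents are assumed for illustration; only the positions of "liz" and "mom" come from the notes.

```python
sister = "liz"
fam = ["liz", 1.73, "emma", 1.68, "mom", 1.71, "dad", 1.89]  # assumed values

renamed = sister.replace("z", "sa")  # returns a new string; sister is unchanged
z_pos = sister.index("z")            # position of the substring "z"
mom_pos = fam.index("mom")           # position of the element "mom"
fam.append("me")                     # modifies the list in place

print(renamed)   # lisa
print(z_pos)     # 2
print(mom_pos)   # 4
print(fam[-1])   # me
```

String methods return new values, while list methods such as append() change the list itself.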

Download and install packages

import NameOfThePackage

from NameOfThePackage import NameOfTheFunction

To refer to a package or function by a different name (an alias), use as:

import numpy as np

from scipy.linalg import inv as my_inv

scipy: name of the package

linalg: name of the subpackage

inv: name of the function

my_inv: the alias chosen for the function
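The same three import forms can be sketched with standard-library modules, so the example runs without installing anything. The aliases m, my_sqrt, and file_name are arbitrary names picked for this illustration.

```python
import math as m                               # alias for a whole package
from math import sqrt as my_sqrt               # alias for a single function
from os.path import basename as file_name      # package.subpackage, function, alias

print(m.pi)                             # access through the package alias
print(my_sqrt(16))                      # 4.0
print(file_name("data/baseball.csv"))   # baseball.csv
```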

Arrays

Import numpy

import numpy as np

Calculate the BMI: bmi

np_height_m = np.array(height_in) * 0.0254
np_weight_kg = np.array(weight_lb) * 0.453592
bmi = np_weight_kg / np_height_m ** 2

Create the light array

light = bmi<21

Print out light

print(light)

Print out BMIs of all baseball players whose BMI is below 21

print(bmi[light])
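A self-contained version of the BMI steps, as a sketch: the height_in and weight_lb lists here are made-up stand-ins for the course data, with one player deliberately light enough to fall below 21.

```python
import numpy as np

# Hypothetical sample data: heights in inches, weights in pounds.
height_in = [74, 74, 72, 75, 73]
weight_lb = [180, 215, 210, 160, 188]

# Convert to metric units, then compute BMI element-wise.
np_height_m = np.array(height_in) * 0.0254
np_weight_kg = np.array(weight_lb) * 0.453592
bmi = np_weight_kg / np_height_m ** 2

# The comparison yields a boolean array; indexing with it keeps the True positions.
light = bmi < 21
print(light)
print(bmi[light])
```

The boolean array has the same length as bmi, which is what allows it to be used as a mask.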

Create baseball, a list of lists

baseball = [[180, 78.4], [215, 102.7], [210, 98.5], [188, 75.2]]

Import numpy

import numpy as np

Create a 2D numpy array from baseball: np_baseball

np_baseball = np.array(baseball)

Print out the type of np_baseball

print(type(np_baseball))

Print out the shape of np_baseball

print(np_baseball.shape)

Create np_baseball (2 cols)

np_baseball = np.array(baseball)

Print out the 50th row of np_baseball

print(np_baseball[49,:])

Select the entire second column of np_baseball: np_weight_lb

np_weight_lb = np_baseball[:,1]

Print out height of 124th player

print(np_baseball[123,0])
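Indices 49 and 123 assume the full course dataset. On the four-row baseball list defined earlier in these notes, the same row and column selections can be sketched with smaller indices:

```python
import numpy as np

baseball = [[180, 78.4], [215, 102.7], [210, 98.5], [188, 75.2]]
np_baseball = np.array(baseball)

print(np_baseball.shape)          # (4, 2): four rows, two columns

third_row = np_baseball[2, :]     # one whole row
second_col = np_baseball[:, 1]    # one whole column
single = np_baseball[3, 0]        # fourth row, first column

print(third_row)
print(second_col)
print(single)
```

The first index selects rows and the second selects columns; a bare colon means "everything along this axis".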

Stats

np.mean(np_city[:,0])

np.corrcoef(np_city[:,0], np_city[:,1])

np.std(np_city[:,0])
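To see what these calls return, here is a minimal sketch on a small invented np_city array (two columns of made-up measurements; the real np_city is not included in these notes).

```python
import numpy as np

# Hypothetical data: each row is one person, columns are two measurements.
np_city = np.array([
    [1.64, 71.78],
    [1.37, 63.35],
    [1.60, 55.09],
    [2.04, 74.85],
])

mean_first = np.mean(np_city[:, 0])                # average of the first column
std_first = np.std(np_city[:, 0])                  # spread of the first column
corr = np.corrcoef(np_city[:, 0], np_city[:, 1])   # 2x2 correlation matrix

print(mean_first)
print(std_first)
print(corr)
```

np.corrcoef returns a matrix rather than a single number; the correlation between the two columns is the off-diagonal entry, corr[0, 1].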

# Add your code snippets here

Explore Datasets

Use the arrays imported in the first cell to explore the data and practice your skills!

  • Print out the weight of the first ten baseball players.
  • What is the median weight of all baseball players in the data?
  • Print out the names of all players with a height greater than 80 (heights are in inches).
  • Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
  • The values in soccer_shooting are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
  • Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out!
  • What is the average rating for attacking players ('A')?
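A few of these exercises rely on the same handful of techniques. The sketch below shows them on small invented arrays (all names and values are hypothetical stand-ins, not the course data), so the pattern can be adapted to the real arrays from the first cell.

```python
import numpy as np

# Hypothetical stand-in data; replace with the arrays imported in the first cell.
names = np.array(["Ann", "Bo", "Cy", "Di"])
heights_in = np.array([74.0, 81.0, 72.0, 83.0])
shooting = np.array([0.98, 0.45, 0.07, 0.6])
positions = np.array(["A", "M", "A", "D"])
ratings = np.array([90.0, 80.0, 70.0, 60.0])

# A mask built from one array can filter another array of the same length.
tall_names = names[heights_in > 80]

# Convert inches to centimetres before comparing with heights stored in cm.
heights_cm = heights_in * 2.54

# Round before casting: astype(int) truncates, and floating-point products
# such as 0.29 * 100 can land just below the whole number.
shooting_pct = np.round(shooting * 100).astype(int)

# Average rating for one position, again using a boolean mask.
attacker_mean = np.mean(ratings[positions == "A"])

print(tall_names, heights_cm, shooting_pct, attacker_mean)
```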