Introduction to Python

    Run the hidden code cell below to import the data used in this course.

    # Importing course packages; you can add more too!
    import numpy as np
    import math
    import pandas as pd
    import matplotlib.pyplot as plt
    
    # Import columns as numpy arrays
    baseball_names = np.genfromtxt(
        fname="baseball.csv",  # This is the filename
        delimiter=",",  # The file is comma-separated
        usecols=0,  # Use the first column
        skip_header=1,  # Skip the first line
        dtype=str,  # This column contains strings
    )
    baseball_heights = np.genfromtxt(
        fname="baseball.csv", delimiter=",", usecols=3, skip_header=1
    )
    baseball_weights = np.genfromtxt(
        fname="baseball.csv", delimiter=",", usecols=4, skip_header=1
    )
    baseball_ages = np.genfromtxt(
        fname="baseball.csv", delimiter=",", usecols=5, skip_header=1
    )
    
    soccer_names = np.genfromtxt(
        fname="soccer.csv",
        delimiter=",",
        usecols=1,
        skip_header=1,
        dtype=str,
        encoding="utf", 
    )
    soccer_ratings = np.genfromtxt(
        fname="soccer.csv",
        delimiter=",",
        usecols=2,
        skip_header=1,
        encoding="utf", 
    )
    soccer_positions = np.genfromtxt(
        fname="soccer.csv",
        delimiter=",",
        usecols=3,
        skip_header=1,
        encoding="utf", 
        dtype=str,
    )
    soccer_heights = np.genfromtxt(
        fname="soccer.csv",
        delimiter=",",
        usecols=4,
        skip_header=1,
        encoding="utf", 
    )
    soccer_shooting = np.genfromtxt(
        fname="soccer.csv",
        delimiter=",",
        usecols=8,
        skip_header=1,
        encoding="utf", 
    )
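
    The calls above all follow one pattern: `usecols` picks a single zero-indexed column, `skip_header=1` drops the header row, and `dtype=str` is only needed for text columns. Here is a minimal sketch of that pattern, reading from an in-memory string instead of `baseball.csv`. The three rows, the player names, and the `Team` and `Position` columns are made-up placeholders, not values from the course file.

```python
import io
import numpy as np

# Hypothetical stand-in for baseball.csv: invented rows, assumed column order
csv_text = """Name,Team,Position,Height,Weight,Age
Player_A,AAA,Catcher,74,180,22.99
Player_B,BBB,Pitcher,72,215,34.69
Player_C,CCC,Shortstop,81,210,30.78
"""

# Text column: first column (index 0), read as strings
names = np.genfromtxt(
    io.StringIO(csv_text), delimiter=",", usecols=0, skip_header=1,
    dtype=str, encoding="utf-8",
)

# Numeric column: fourth column (index 3), read as floats by default
heights = np.genfromtxt(
    io.StringIO(csv_text), delimiter=",", usecols=3, skip_header=1
)

print(names)
print(heights)
```

    Each call re-reads the source and returns one flat array, so the arrays stay aligned by position: element `i` of `names` and element `i` of `heights` describe the same player.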

    Take Notes

    Add notes about the concepts you've learned and code cells with code you want to keep.

    # Load the full soccer dataset into a DataFrame; later cells refer to it as df
    df = pd.read_csv('soccer.csv')
    df

    Explore Datasets

    Use the arrays imported in the first cell to explore the data and practice your skills!

    • Print out the weight of the first ten baseball players.
    • What is the median weight of all baseball players in the data?
    • Print out the names of all players with a height greater than 80 (heights are in inches).
    • Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
    • The values in soccer_shooting are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
    • Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out!
    • What is the average rating for attacking players ('A')?
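
    The last three tasks in this list (rescaling, correlation, and a filtered average) can be sketched with plain NumPy as follows. The arrays are small invented samples that reuse the names from the import cell; they are not the course data, so the printed numbers are for illustration only.

```python
import numpy as np

# Hypothetical sample values standing in for the imported soccer arrays
soccer_shooting = np.array([0.98, 0.29, 0.75])
soccer_ratings = np.array([80.0, 70.0, 90.0, 75.0])
soccer_heights = np.array([185.0, 170.0, 180.0, 178.0])
soccer_positions = np.array(["A", "D", "A", "M"])

# Rescale decimals to whole numbers; round before casting, because
# astype(int) on its own truncates toward zero
shooting_whole = np.rint(soccer_shooting * 100).astype(int)
print(shooting_whole)

# Pearson correlation: entry [0, 1] of the 2x2 correlation matrix
rating_height_corr = np.corrcoef(soccer_ratings, soccer_heights)[0, 1]
print(rating_height_corr)

# Average rating of attackers: a boolean mask selects positions equal to 'A'
attacker_avg = soccer_ratings[soccer_positions == "A"].mean()
print(attacker_avg)
```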

    - Print out the weight of the first ten baseball players.

    # Load the baseball data into a DataFrame and show the first ten weights
    baseball = pd.read_csv('baseball.csv')
    baseball[['Weight']].head(10)
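
    The exercise asks for the arrays from the first cell, so the same result can also be reached by slicing `baseball_weights` directly. A minimal sketch, using an invented twelve-element array in place of the real one:

```python
import numpy as np

# Hypothetical weights in pounds (invented for illustration)
baseball_weights = np.array(
    [180.0, 215.0, 210.0, 210.0, 188.0, 176.0,
     209.0, 200.0, 231.0, 180.0, 188.0, 180.0]
)

# The slice [:10] keeps indices 0 through 9, i.e. the first ten players
first_ten = baseball_weights[:10]
print(first_ten)
```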
    

    - What is the median weight of all baseball players in the data?

    # Median weight across all baseball players
    print(baseball[["Weight"]].median())

    Weight 200.0
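
    With the NumPy arrays, the median comes from `np.median`. One caveat: `np.genfromtxt` stores blank numeric fields as `nan`, and `np.median` then returns `nan`, whereas the pandas `median` skips missing values by default. The sketch below uses an invented array with one missing entry to show the difference. Whether the course file actually contains blanks is an assumption to check.

```python
import numpy as np

# Hypothetical weights with one missing value (invented for illustration)
baseball_weights = np.array([180.0, 215.0, 210.0, np.nan, 188.0])

# np.median propagates nan; np.nanmedian ignores it
plain_median = np.median(baseball_weights)
safe_median = np.nanmedian(baseball_weights)

print(plain_median)
print(safe_median)
```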

    - Print out the names of all players with a height greater than 80 (heights are in inches).

    # Keep only the rows where Height exceeds 80 inches, then show name and height
    height_80 = baseball[baseball['Height'] > 80]
    height_80[['Name', 'Height']]
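
    The array version of this filter relies on boolean indexing: comparing `baseball_heights` with 80 gives a True/False mask, and indexing `baseball_names` with that mask keeps only the matching names. A short sketch with invented names and heights:

```python
import numpy as np

# Hypothetical aligned arrays (invented for illustration)
baseball_names = np.array(["Player_A", "Player_B", "Player_C", "Player_D"])
baseball_heights = np.array([74.0, 81.0, 80.0, 83.0])

# Strictly greater than 80, so a height of exactly 80 is excluded
tall_mask = baseball_heights > 80
tall_names = baseball_names[tall_mask]
print(tall_names)
```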

    - Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!

    # Convert baseball heights from inches to centimetres so both sports use the same unit
    # (soccer heights are already in cm)
    inchtocm = baseball['Height'] * 2.54
    baseball_avgh = inchtocm.mean()
    print(baseball_avgh)
    soccer_avgh = df['height'].mean()
    print(soccer_avgh)
    
    # Same comparison again, this time printing the conclusion as well
    
    # Calculate the average height of baseball players in inches
    baseball_avg_height = baseball['Height'].mean()
    
    # Convert the average height of baseball players to cm
    baseball_avg_height_cm = baseball_avg_height * 2.54
    
    # Calculate the average height of soccer players in cm
    soccer_avg_height_cm = df['height'].mean()
    
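    # Compare the average heights and print both values with the result
    if baseball_avg_height_cm > soccer_avg_height_cm:
        print(baseball_avg_height_cm, soccer_avg_height_cm, "Baseball players are taller than soccer players on average.")
    elif baseball_avg_height_cm < soccer_avg_height_cm:
        print(baseball_avg_height_cm, soccer_avg_height_cm, "Soccer players are taller than baseball players on average.")
    else:
        print(baseball_avg_height_cm, soccer_avg_height_cm, "Baseball and soccer players have the same average height.")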