Introduction to Python - first try of DataCamp workspace

    👋 Welcome to your workspace! Here, you can write and run Python code and add text. The purpose of this workspace is to allow you to experiment with the data from Introduction to Python and practice your newly learned skills with some challenges. You can find out more about DataCamp Workspace here.

    Cells with text (such as this one) are Markdown cells. Markdown cells can contain notes, explain code, and summarize findings. As this may be your first workspace, we have included some steps to get you started!

    1. Get Started

    Below is a code cell. It is used to execute Python code. The code below imports two packages you used in Introduction to Python: numpy and math. The code also imports data you used in the course.

    It contains a function you may not have seen before: np.genfromtxt(). This function can be used to read in data from a csv file. You can review the code comments on the first array (baseball_names) to see how the function works!

    🏃 To execute the code, select the cell and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell.

    # Importing course packages; you can add more too!
    import numpy as np
    import math
    
    # Import columns as numpy arrays
    baseball_names = np.genfromtxt(
        fname="datasets/baseball.csv",  # This is the filename
        delimiter=",",  # The file is comma-separated
        usecols=0,  # Use the first column
        skip_header=1,  # Skip the first line
        dtype=str,  # This column contains strings
    )
    baseball_heights = np.genfromtxt(
        fname="datasets/baseball.csv", delimiter=",", usecols=3, skip_header=1
    )
    baseball_weights = np.genfromtxt(
        fname="datasets/baseball.csv", delimiter=",", usecols=4, skip_header=1
    )
    baseball_ages = np.genfromtxt(
        fname="datasets/baseball.csv", delimiter=",", usecols=5, skip_header=1
    )
    
    # Print the first array
    print(baseball_names)
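
    To see what np.genfromtxt() does without needing the course files, here is a minimal sketch that reads a tiny in-memory CSV. The sample rows, the column layout, and the variable names are made up for illustration and only mirror the column positions used above (names in column 0, heights in column 3); they are not the real contents of datasets/baseball.csv.

```python
import io

import numpy as np

# Hypothetical stand-in for datasets/baseball.csv (made-up rows and columns)
csv_text = (
    "Name,Team,Position,Height,Weight,Age\n"
    "Player_A,AAA,Catcher,74,180,22.99\n"
    "Player_B,BBB,Pitcher,76,215,34.69\n"
)

# Read the first column as strings, skipping the header line
sample_names = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    usecols=0,
    skip_header=1,
    dtype=str,
    encoding="utf-8",
)

# Read the fourth column; without dtype, values are parsed as floats
sample_heights = np.genfromtxt(
    io.StringIO(csv_text), delimiter=",", usecols=3, skip_header=1
)

print(sample_names)
print(sample_heights)
```

    Each call reads a single column, which is why the cell above calls the function once per array.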

    2. Write Code

    After running the cell above, you have created four numpy arrays: baseball_names, baseball_heights, baseball_weights, and baseball_ages.

    Try one (or more) of the following tasks to get you started. Don't forget to add more code cells if you need them. This is your place to experiment!

    1. Print out the weight of the first ten baseball players. If you're stuck, try reviewing this video!
    2. What is the median weight of all baseball players in the data? If you're stuck, try reviewing this video!
    3. Print out the names of all players with a height greater than 80 (heights are in inches). If you're stuck, try reviewing this video!

    print('Weights of the first ten baseball players:')
    print(baseball_weights[:10])
    print('Median weight of all baseball players:', np.median(baseball_weights))
    print('Players with a height greater than 80 inches:')
    print(baseball_names[baseball_heights > 80])
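
    The three tasks rely on slicing, np.median(), and boolean indexing. The sketch below shows the same techniques on a small hand-made data set so the intermediate results are easy to check by eye. All names and values here are hypothetical and are not taken from the baseball data.

```python
import numpy as np

# Hypothetical toy data (not from datasets/baseball.csv)
toy_names = np.array(["A", "B", "C", "D", "E"])
toy_heights = np.array([74, 81, 72, 83, 79])       # inches
toy_weights = np.array([180, 215, 210, 231, 188])  # pounds

# Slicing: the first three elements
first_three = toy_weights[:3]

# Median: the middle value of the sorted weights
median_weight = np.median(toy_weights)

# Boolean indexing: the comparison yields an array of True/False values,
# which then selects the matching entries of another array of equal length
tall_mask = toy_heights > 80
tall_names = toy_names[tall_mask]

print(first_three)
print(median_weight)
print(tall_mask)
print(tall_names)
```

    The mask and the array it filters must have the same length, which holds for the course arrays because they are all read from the same file.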

    3. Load More Data

    In the final exercise of Introduction to Python, you experimented with soccer data. Below is the code to import several columns from this data as numpy arrays.

    🏃 To execute the code, select the cell and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell.

    # Import columns as numpy arrays
    soccer_names = np.genfromtxt(
        fname="datasets/soccer.csv",
        delimiter=",",
        usecols=1,
        skip_header=1,
        dtype=str,
        encoding="utf",  # Encoding set to utf so the data can be read in properly
    )
    soccer_ratings = np.genfromtxt(
        fname="datasets/soccer.csv",
        delimiter=",",
        usecols=2,
        skip_header=1,
        encoding="utf",  # Encoding set to utf so the data can be read in properly
    )
    soccer_positions = np.genfromtxt(
        fname="datasets/soccer.csv",
        delimiter=",",
        usecols=3,
        skip_header=1,
        encoding="utf",  # Encoding set to utf so the data can be read in properly
        dtype=str,
    )
    soccer_heights = np.genfromtxt(
        fname="datasets/soccer.csv",
        delimiter=",",
        usecols=4,
        skip_header=1,
        encoding="utf",  # Encoding set to utf so the data can be read in properly
    )
    soccer_shooting = np.genfromtxt(
        fname="datasets/soccer.csv",
        delimiter=",",
        usecols=8,
        skip_header=1,
        encoding="utf",  # Encoding set to utf so the data can be read in properly
    )
    
    # Print the first array
    print(soccer_names)

    4. Continue to Explore

    Just as with the baseball data, you now have a set of numpy arrays available containing different types of information about soccer players. Here is another set of challenges for you to try.

    1. Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches! If you're stuck, try reviewing this video.
    2. The values in soccer_shooting are whole numbers. Convert them to a decimal (e.g., 98 becomes 0.98). If you're stuck, try reviewing this video.
    3. Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out! If you're stuck, try reviewing this video.
    4. What is the average rating for attacking players ('A')? If you're stuck, try reviewing this video.
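
    # Who is taller on average?
    # Baseball heights are in inches, so multiply by 2.54 to get centimeters
    # (this assumes the soccer heights are stored in centimeters)
    if np.average(baseball_heights * 2.54) > np.average(soccer_heights):
        print('Baseball players are taller on average!')
    else:
        print('Soccer players are taller on average!')
    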
    # Convert numbers: the task describes whole numbers to be converted to decimals,
    # but the values loaded here are already decimals, so they are scaled up to whole numbers instead
    print('Numbers before conversion:')
    print(soccer_shooting)
    soccer_shooting_whole_numbers = soccer_shooting*100
    print('Converted numbers:')
    print(soccer_shooting_whole_numbers)
    
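    # Do taller players get higher ratings?
    # np.corrcoef returns a 2x2 matrix; the off-diagonal entries hold the coefficient
    correlation = np.corrcoef(soccer_heights, soccer_ratings)
    print('Correlation matrix between heights and ratings:')
    print(correlation)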
    
    # Average rating for attacking players
    average_rating = np.mean(soccer_ratings[soccer_positions == 'A'])
    print('Average rating for attacking players: ')
    print(average_rating)
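
    Three of the steps above are easier to follow on small numbers: converting inches to centimeters, turning whole-number percentages into decimals by dividing by 100 (as the task text describes), and pulling the single correlation coefficient out of the matrix that np.corrcoef() returns. This is only a sketch with hypothetical values and variable names, not a result from the soccer data.

```python
import numpy as np

# Hypothetical heights in inches, converted to centimeters (1 inch = 2.54 cm)
toy_heights_in = np.array([70.0, 72.0, 74.0])
toy_heights_cm = toy_heights_in * 2.54

# Hypothetical whole-number scores converted to decimals (98 becomes 0.98)
toy_shooting = np.array([98, 75, 60])
toy_shooting_decimal = toy_shooting / 100

# Two perfectly linearly related toy arrays
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# np.corrcoef returns a 2x2 matrix: the diagonal is each array's correlation
# with itself, and the off-diagonal entries are the correlation between x and y
corr_matrix = np.corrcoef(x, y)
r = corr_matrix[0, 1]

print(toy_heights_cm)
print(toy_shooting_decimal)
print(corr_matrix)
print(r)
```

    Indexing the matrix with [0, 1] gives one number that is easier to report than the full matrix.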

    Testing #Markdowns

    #Testing #AddCode