Introduction to Python

👋 Welcome to your workspace! Here, you can write and run Python code and add text. The purpose of this workspace is to allow you to experiment with the data from Introduction to Python and practice your newly learned skills with some challenges. You can find out more about DataCamp Workspace here.

Cells with text (such as this one) are Markdown cells. Markdown cells can contain notes, explain code, and summarize findings. As this may be your first workspace, we have included some steps to get you started!

1. Get Started

Below is a code cell. It is used to execute Python code. The code below imports two packages you used in Introduction to Python: numpy and math. The code also imports data you used in the course.

It contains a function you may not have seen before: np.genfromtxt(). This function can be used to read in data from a csv file. You can review the code comments on the first array (baseball_names) to see how the function works!

๐ŸƒTo execute the code, select the cell and click "Run" or the โ–บ icon. You can also use Shift-Enter to run a selected cell.

# Importing course packages; you can add more too!
import numpy as np
import math

# Import columns as numpy arrays
baseball_names = np.genfromtxt(
    fname="datasets/baseball.csv",  # This is the filename
    delimiter=",",  # The file is comma-separated
    usecols=0,  # Use the first column
    skip_header=1,  # Skip the first line
    dtype=str,  # This column contains strings
)
baseball_heights = np.genfromtxt(
    fname="datasets/baseball.csv", delimiter=",", usecols=3, skip_header=1
)
baseball_weights = np.genfromtxt(
    fname="datasets/baseball.csv", delimiter=",", usecols=4, skip_header=1
)
baseball_ages = np.genfromtxt(
    fname="datasets/baseball.csv", delimiter=",", usecols=5, skip_header=1
)

# Print the first array
print(baseball_names)
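
To see how the `np.genfromtxt()` arguments interact without needing the course files, here is a minimal sketch that reads from an in-memory string instead of a file. The CSV text, its column names, and its values are hypothetical and only stand in for a real file such as datasets/baseball.csv.

```python
import io

import numpy as np

# Hypothetical CSV content (an assumption, not the course data)
csv_text = "Name,Team,Height\nAnn,AAA,70\nBob,BBB,75\n"

# Column 0 holds text, so request strings explicitly
names = np.genfromtxt(
    io.StringIO(csv_text),  # a file-like object works in place of a filename
    delimiter=",",
    usecols=0,
    skip_header=1,  # skip the "Name,Team,Height" line
    dtype=str,
    encoding="utf-8",
)

# Column 2 holds numbers, so the default float dtype is fine
heights = np.genfromtxt(
    io.StringIO(csv_text), delimiter=",", usecols=2, skip_header=1
)

print(names)
print(heights)
```

Each call reads exactly one column, which is why the cell above repeats the call once per array.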

2. Write Code

After running the cell above, you have created four numpy arrays: baseball_names, baseball_heights, baseball_weights, and baseball_ages.

Try one (or more) of the following tasks to get you started. Don't forget to add more code cells if you need them. This is your place to experiment!

  1. Print out the weight of the first ten baseball players. If you're stuck, try reviewing this video!
  2. What is the median weight of all baseball players in the data? If you're stuck, try reviewing this video!
  3. Print out the names of all players with a height greater than 80 (heights are in inches). If you're stuck, try reviewing this video!
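
# Task 1: weights of the first ten players
print('Weights of the first ten baseball players:')
print(baseball_weights[:10])

# Task 2: median weight across all players
print('Median weight of all baseball players:', np.median(baseball_weights))

# Task 3: names of players taller than 80 inches
print('Players with a height greater than 80 inches:')
print(baseball_names[baseball_heights > 80])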
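
The third task relies on boolean indexing: a comparison on one array produces a mask of True/False values, and that mask selects the matching entries of another array of the same length. The sketch below illustrates the idea with small hypothetical arrays (the names and heights are made up, not taken from the baseball data).

```python
import numpy as np

# Hypothetical sample values, not from the course dataset
names = np.array(["Ann", "Bob", "Cy", "Dee"])
heights = np.array([74, 81, 79, 82])

# The comparison yields one boolean per player
tall_mask = heights > 80
print(tall_mask)

# Indexing with the mask keeps only the entries where it is True
tall_names = names[tall_mask]
print(tall_names)

# Slicing and the median work the same way as in the tasks above
first_two = heights[:2]
median_height = np.median(heights)
print(first_two, median_height)
```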

3. Load More Data

In the final exercise of Introduction to Python, you experimented with soccer data. Below is the code to import several columns from this data as numpy arrays.

๐ŸƒTo execute the code, select the cell and click "Run" or the โ–บ icon. You can also use Shift-Enter to run a selected cell.

# Import columns as numpy arrays
soccer_names = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=1,
    skip_header=1,
    dtype=str,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
soccer_ratings = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=2,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
soccer_positions = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=3,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
    dtype=str,
)
soccer_heights = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=4,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
soccer_shooting = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=8,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)

# Print the first array
print(soccer_names)
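
The cell above re-reads the same file once per column. As an alternative sketch, `np.genfromtxt()` also accepts a tuple for `usecols` and an `unpack=True` flag, so several numeric columns can be loaded in a single pass. The CSV text below is a hypothetical stand-in, and its column layout is an assumption for illustration, not the layout of datasets/soccer.csv.

```python
import io

import numpy as np

# Hypothetical CSV content (assumed layout: id, name, rating, height)
csv_text = "id,name,rating,height\n1,Ann,88,180\n2,Bob,75,175\n"

# Read two numeric columns at once; unpack=True returns one array per column
ratings, heights = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    usecols=(2, 3),
    skip_header=1,
    unpack=True,
)

print(ratings)
print(heights)
```

Text columns still need their own call with `dtype=str`, because a single plain array cannot mix strings and floats.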

4. Continue to Explore

Just as with the baseball data, you now have a set of numpy arrays available containing different types of information about soccer players. Here is another set of challenges for you to try.

  1. Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches! If you're stuck, try reviewing this video.
  2. The values in soccer_shooting are whole numbers. Convert them to a decimal (e.g., 98 becomes 0.98). If you're stuck, try reviewing this video.
  3. Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out! If you're stuck, try reviewing this video.
  4. What is the average rating for attacking players ('A')? If you're stuck, try reviewing this video.

# Who is taller on average?
# Baseball heights are in inches, so convert them to centimeters (1 in = 2.54 cm)
if np.mean(baseball_heights * 2.54) > np.mean(soccer_heights):
    print('Baseball players are taller on average!')
else:
    print('Soccer players are taller on average!')

# Convert the shooting values
# Note: the task asks for whole numbers to be converted to decimals, but the
# values in this data appear to be decimals already, so they are converted
# to whole numbers instead
print('Numbers before conversion:')
print(soccer_shooting)
soccer_shooting_whole_numbers = soccer_shooting * 100
print('Converted numbers:')
print(soccer_shooting_whole_numbers)

# Do taller players get higher ratings?
# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the coefficient
correlation = np.corrcoef(soccer_heights, soccer_ratings)
print('Correlation between heights and ratings:')
print(correlation[0, 1])

# Average rating for attacking players
average_rating = np.mean(soccer_ratings[soccer_positions == 'A'])
print('Average rating for attacking players:')
print(average_rating)
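
A note on reading the correlation output: `np.corrcoef()` with two 1-D arrays returns a 2x2 matrix, not a single number. The diagonal holds each array's correlation with itself (1.0), and the two off-diagonal entries both hold the correlation between the arrays. The sketch below uses small hypothetical values (not the soccer data) to show how to pull out that one coefficient.

```python
import numpy as np

# Hypothetical sample values, not from the course dataset
heights_cm = np.array([170.0, 175.0, 180.0, 185.0])
ratings = np.array([60.0, 70.0, 65.0, 80.0])

corr_matrix = np.corrcoef(heights_cm, ratings)
print(corr_matrix.shape)  # two inputs give a 2x2 matrix

# The off-diagonal entry is the height/rating correlation
r = corr_matrix[0, 1]
print(r)
```

A value of `r` near 1 or -1 indicates a strong linear relationship, and a value near 0 indicates a weak one.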
