Introduction to Python
👋 Welcome to your workspace! Here, you can write and run Python code and add text. The purpose of this workspace is to allow you to experiment with the data from Introduction to Python and practice your newly learned skills with some challenges. You can find out more about DataCamp Workspace here.
Cells with text (such as this one) are Markdown cells. Markdown cells can contain notes, explain code, and summarize findings. As this may be your first workspace, we have included some steps to get you started!
1. Get Started
Below is a code cell. It is used to execute Python code. The code below imports two packages you used in Introduction to Python: numpy and math. The code also imports data you used in the course.
It contains a function you may not have seen before: np.genfromtxt(). This function can be used to read in data from a csv file. You can review the code comments on the first array (baseball_names) to see how the function works!
🏃To execute the code, select the cell and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell.
# Importing course packages; you can add more too!
import numpy as np
import math
# Import columns as numpy arrays
baseball_names = np.genfromtxt(
    fname="datasets/baseball.csv",  # This is the filename
    delimiter=",",  # The file is comma-separated
    usecols=0,  # Use the first column
    skip_header=1,  # Skip the first line
    dtype=str,  # This column contains strings
)
baseball_heights = np.genfromtxt(
    fname="datasets/baseball.csv", delimiter=",", usecols=3, skip_header=1
)
baseball_weights = np.genfromtxt(
    fname="datasets/baseball.csv", delimiter=",", usecols=4, skip_header=1
)
baseball_ages = np.genfromtxt(
    fname="datasets/baseball.csv", delimiter=",", usecols=5, skip_header=1
)
# Print the first array
print(baseball_names)
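To see how the np.genfromtxt() arguments used above behave without the course files, here is a minimal sketch that reads from an in-memory string instead. The sample rows, the column layout, and the helper name load_column are made up for illustration and are not taken from baseball.csv.

```python
import io

import numpy as np

# Made-up sample data (assumption): the real baseball.csv may have a different layout.
csv_text = """Name,Team,Position,Height,Weight,Age
Player A,AAA,Catcher,74,180,22.99
Player B,BBB,Pitcher,72,215,34.69
Player C,CCC,Shortstop,81,210,30.78
"""


def load_column(text, col, **kwargs):
    """Hypothetical helper: read one column of a comma-separated string."""
    return np.genfromtxt(
        io.StringIO(text),  # A file-like object stands in for the filename
        delimiter=",",  # The text is comma-separated
        usecols=col,  # Zero-based index of the column to keep
        skip_header=1,  # Skip the header line
        encoding="utf-8",
        **kwargs,
    )


sample_names = load_column(csv_text, 0, dtype=str)  # String column
sample_heights = load_column(csv_text, 3)  # Numeric column, read as floats

print(sample_names)
print(sample_heights)
```

Without dtype=str, np.genfromtxt() tries to parse every value as a float, which is why the names column needs the extra argument.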
2. Write Code
After running the cell above, you have created four numpy arrays: baseball_names, baseball_heights, baseball_weights, and baseball_ages.
Try one (or more) of the following tasks to get you started. Don't forget to add more code cells if you need them. This is your place to experiment!
- Print out the weights of the first ten baseball players. If you're stuck, try reviewing this video!
- What is the median weight of all baseball players in the data? If you're stuck, try reviewing this video!
- Print out the names of all players with a height greater than 80 (heights are in inches). If you're stuck, try reviewing this video!
print('Weights of the first ten baseball players:')
print(baseball_weights[:10])
print('Median weight of all baseball players is:', np.median(baseball_weights))
print('Players with height greater than 80:')
print(baseball_names[baseball_heights > 80])
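The three tasks above rely on slicing, np.median(), and boolean indexing. As a rough sketch of how each step works, the example below uses small made-up arrays (the toy_ names and their values are assumptions, not course data) so the intermediate results are easy to check by hand.

```python
import numpy as np

# Made-up values for illustration; these are not taken from baseball.csv.
toy_names = np.array(["Ann", "Bo", "Cy", "Di", "Ed"])
toy_heights = np.array([72.0, 81.0, 79.0, 83.0, 80.0])  # inches
toy_weights = np.array([180.0, 215.0, 210.0, 188.0, 176.0])

# Slicing: [:3] keeps the elements at index 0, 1, and 2
first_three = toy_weights[:3]

# Median: the middle value once the weights are sorted
median_weight = np.median(toy_weights)

# Boolean indexing: the comparison yields one True/False per player,
# and indexing with that mask keeps only the True positions
tall_mask = toy_heights > 80
tall_names = toy_names[tall_mask]

print(first_three)
print(median_weight)
print(tall_mask)
print(tall_names)
```

Note that a height of exactly 80 is excluded, because the comparison is strictly greater than.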
3. Load More Data
In the final exercise of Introduction to Python, you experimented with soccer data. Below is the code to import several columns from this data as numpy arrays.
🏃To execute the code, select the cell and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell.
# Import columns as numpy arrays
soccer_names = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=1,
    skip_header=1,
    dtype=str,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
soccer_ratings = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=2,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
soccer_positions = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=3,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
    dtype=str,
)
soccer_heights = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=4,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
soccer_shooting = np.genfromtxt(
    fname="datasets/soccer.csv",
    delimiter=",",
    usecols=8,
    skip_header=1,
    encoding="utf",  # Encoding set to utf so the data can be read in properly
)
# Print the first array
print(soccer_names)
4. Continue to Explore
Just as with the baseball data, you now have a set of numpy arrays available containing different types of information about soccer players. Here is another set of challenges for you to try.
- Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches! If you're stuck, try reviewing this video.
- The values in soccer_shooting are whole numbers. Convert them to a decimal (e.g., 98 becomes 0.98). If you're stuck, try reviewing this video.
- Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out! If you're stuck, try reviewing this video.
- What is the average rating for attacking players ('A')? If you're stuck, try reviewing this video.
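# Who is taller on average?
# Baseball heights are in inches, so convert them to centimeters (1 in = 2.54 cm);
# this assumes the soccer heights are stored in centimeters
if np.average(baseball_heights * 2.54) > np.average(soccer_heights):
    print('Baseball players are taller on average!')
else:
    print('Soccer players are taller on average!')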
# Convert shooting values. Note: the task describes whole numbers, but the values
# loaded here are already decimals, so they are converted to whole numbers instead
print('Numbers before conversion:')
print(soccer_shooting)
soccer_shooting_whole_numbers = soccer_shooting*100
print('Converted numbers:')
print(soccer_shooting_whole_numbers)
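# Do taller players get higher ratings?
# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entries hold the coefficient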
correlation = np.corrcoef(soccer_heights, soccer_ratings)
print('Correlation between heights and ratings: ')
print(correlation)
# Average rating for attacking players
average_rating = np.mean(soccer_ratings[soccer_positions == 'A'])
print('Average rating for attacking players: ')
print(average_rating)
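The np.corrcoef() call above prints a full 2x2 matrix rather than a single number. As a small sketch of how to pull out just the coefficient, and of how the position filter works, the example below uses made-up arrays (the toy_ names and values are assumptions, not the soccer data) where the expected results are easy to verify.

```python
import numpy as np

# Made-up values for illustration; ratings rise exactly linearly with height here.
toy_heights_cm = np.array([170.0, 175.0, 180.0, 185.0, 190.0])
toy_ratings = np.array([80.0, 82.0, 84.0, 86.0, 88.0])
toy_positions = np.array(["A", "M", "A", "D", "GK"])

# corrcoef returns a 2x2 matrix: the diagonal is each array's correlation
# with itself, and the off-diagonal entries are the correlation between the two
corr_matrix = np.corrcoef(toy_heights_cm, toy_ratings)
r = corr_matrix[0, 1]

# Boolean mask keeps only the attackers before averaging
attacker_mean = np.mean(toy_ratings[toy_positions == "A"])

print(corr_matrix.shape)
print(round(r, 3))
print(attacker_mean)
```

A coefficient close to 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little or none; real data will rarely be as clean as this toy example.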