## Introduction to Python

Run the hidden code cell below to import the data used in this course.

```
# Importing course packages; you can add more too!
import numpy as np
import math
# Import columns as numpy arrays
baseball_names = np.genfromtxt(
fname="baseball.csv", # This is the filename
delimiter=",", # The file is comma-separated
usecols=0, # Use the first column
skip_header=1, # Skip the first line
dtype=str, # This column contains strings
)
baseball_heights = np.genfromtxt(
fname="baseball.csv", delimiter=",", usecols=3, skip_header=1
)
baseball_weights = np.genfromtxt(
fname="baseball.csv", delimiter=",", usecols=4, skip_header=1
)
baseball_ages = np.genfromtxt(
fname="baseball.csv", delimiter=",", usecols=5, skip_header=1
)
soccer_names = np.genfromtxt(
fname="soccer.csv",
delimiter=",",
usecols=1,
skip_header=1,
dtype=str,
encoding="utf",
)
soccer_ratings = np.genfromtxt(
fname="soccer.csv",
delimiter=",",
usecols=2,
skip_header=1,
encoding="utf",
)
soccer_positions = np.genfromtxt(
fname="soccer.csv",
delimiter=",",
usecols=3,
skip_header=1,
encoding="utf",
dtype=str,
)
soccer_heights = np.genfromtxt(
fname="soccer.csv",
delimiter=",",
usecols=4,
skip_header=1,
encoding="utf",
)
soccer_shooting = np.genfromtxt(
fname="soccer.csv",
delimiter=",",
usecols=8,
skip_header=1,
encoding="utf",
)
```

### Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

*Add your notes here*

### Explore Datasets

Use the arrays imported in the first cell to explore the data and practice your skills!

- Print out the weight of the first ten baseball players.
- What is the median weight of all baseball players in the data?
- Print out the names of all players with a height greater than 80 (heights are in inches).
- Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
- The values in `soccer_shooting` are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
- Do taller players get higher ratings? Calculate the correlation between `soccer_ratings` and `soccer_heights` to find out!
- What is the average rating for attacking players (`'A'`)?

```
# Print out the weight of the first ten baseball players.
print(baseball_weights[:10])
```

```
# What is the median weight of all baseball players in the data?
median_weight_baseball = np.median(baseball_weights)
print(median_weight_baseball)
```

```
# Print out the names of all players with a height greater than 80 (heights are in inches).
# The comparison builds a boolean mask; indexing with it keeps only the matching names.
baseball_players_above_80 = baseball_names[baseball_heights > 80]
print(baseball_players_above_80)
```
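The filtering step above relies on boolean masking. Here is a minimal sketch of the idea on its own, using small made-up arrays (the names `names`, `heights`, and `tall_mask` and their values are illustrative assumptions, not data from `baseball.csv`):

```python
import numpy as np

# Hypothetical sample data, invented for illustration only
names = np.array(["Ann", "Bo", "Cy", "Di"])
heights = np.array([78.0, 81.0, 80.0, 83.0])

# The comparison is applied element by element and yields a boolean array
tall_mask = heights > 80

# Indexing with the mask keeps the entries where the mask is True
tall_names = names[tall_mask]

print(tall_mask)
print(tall_names)
```

Because the two arrays are aligned by position, a mask built from one array can be used to select from the other. Note that `> 80` is strict, so a height of exactly 80 is excluded.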

```
# Who is taller on average? Baseball players or soccer players? Keep in mind that baseball heights are stored in inches!
# Convert inches to centimeters (1 in = 2.54 cm); this assumes soccer_heights is stored in centimeters.
baseball_heights_cm = baseball_heights * 2.54
soccer_heights_average = round(np.mean(soccer_heights), 1)
baseball_heights_average = round(np.mean(baseball_heights_cm), 1)
print(soccer_heights_average)
print(baseball_heights_average)
# Compare the two averages instead of hardcoding the conclusion.
if baseball_heights_average > soccer_heights_average:
    print("Baseball players are taller on average")
elif baseball_heights_average < soccer_heights_average:
    print("Soccer players are taller on average")
else:
    print("Both groups have the same average height")
```

```
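# The values in soccer_shooting are decimals. Convert them to whole numbers (e.g., 0.98 becomes 98).
print(soccer_shooting)
# Multiplying alone leaves floats; round to the nearest whole number, then cast to int.
# This assumes the column has no missing values (NaN cannot be cast to int).
soccer_shooting_whole = np.rint(soccer_shooting * 100).astype(int)
print(soccer_shooting_whole)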
```
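One detail worth checking when turning decimals into whole numbers is floating-point error: a product such as `0.29 * 100` evaluates to slightly less than 29, so casting straight to `int` truncates it to 28. The sketch below uses made-up values (the `shooting` array is an assumption, not data from `soccer.csv`) to compare truncating with rounding first:

```python
import numpy as np

# Hypothetical shooting scores, invented for illustration only
shooting = np.array([0.98, 0.29, 0.57])

scaled = shooting * 100  # still a float array

# Casting directly drops the fractional part, so a value just below 29 becomes 28
truncated = scaled.astype(int)

# Rounding to the nearest whole number first avoids the off-by-one
rounded = np.rint(scaled).astype(int)

print(truncated)
print(rounded)
```

Rounding before the cast is the safer choice whenever the scaled values are meant to be exact integers.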

```
# Do taller players get higher ratings? Calculate the correlation between soccer_ratings and soccer_heights to find out!
# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is the correlation between the two arrays.
correlation = np.corrcoef(soccer_ratings, soccer_heights)[0, 1]
print(correlation)
print("Taller soccer players do not get higher ratings than shorter soccer players")
```
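For reading the result, it may help to see what `np.corrcoef` returns on a tiny example. This sketch uses made-up arrays `x` and `y` (assumptions for illustration) that are perfectly linearly related, so the coefficient comes out at 1 up to floating-point error:

```python
import numpy as np

# Hypothetical data: y is exactly 2 * x, so the linear correlation is perfect
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# corrcoef returns a symmetric 2x2 matrix:
# [[corr(x, x), corr(x, y)],
#  [corr(y, x), corr(y, y)]]
matrix = np.corrcoef(x, y)

# The off-diagonal entry is the Pearson correlation between x and y
r = matrix[0, 1]

print(matrix.shape)
print(r)
```

The coefficient ranges from -1 to 1. Values near 0 indicate little or no linear relationship, which is how a claim like "taller players do not get higher ratings" would show up in the output.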

```
# What is the average rating for attacking players ('A')?
print(soccer_positions)
# Use a boolean mask to select the ratings of players whose position is "A".
attack_ratings = soccer_ratings[soccer_positions == "A"]
average_attack_rating = round(np.mean(attack_ratings), 2)
print("The average rating for attacking players is " + str(average_attack_rating))
```