Michael Jordan's first NBA season
Michael Jordan was one of the greatest basketball players of all time. Not only did he win six NBA titles with the Chicago Bulls, but he was selected as the Most Valuable Player (MVP) in five different seasons.
You want to explore the performances of "Air Jordan" in terms of points per game in his first season. You are especially interested in the variability of points scored over the course of the season.
Instructions

- `data_jordan`, which is available in your workspace, contains Michael Jordan's points per game in his first NBA season. Print the data to the console.
- Use the `plot()` function to make a scatterplot of the `points` column. Use the `main` argument to title the plot "Michael Jordan's first season".
- Calculate the mean points per game with `mean()` and save the result to `mean_jordan`.
- Add a horizontal line to the plot to show the mean points per game using the `abline()` function with one argument: `h` (for "horizontal"). For example, `abline(h = 7)` would create a horizontal line at the value 7.
data_jordan <- read.table("http://assets.datacamp.com/course/Conway/Lecture_Data/L15L16_game_points_jordan.txt", header = TRUE)
## The dataset `data_jordan` is already loaded
# View Michael Jordan's first season data
# Make a scatterplot of his points per game
# Calculate mean_jordan
# Add horizontal line with abline()
## The dataset `data_jordan` is already loaded
# View Michael Jordan's first season data
data_jordan
# Make a scatterplot of his points per game
plot(data_jordan$points, main = "Michael Jordan's first season")
# Calculate mean_jordan
mean_jordan <- mean(data_jordan$points)
# Add horizontal line with abline()
abline(h = mean_jordan)
test_output_contains("data_jordan", incorrect_msg = "Print <code>data_jordan</code> to the console. You can do this by just typing its name.")
test_function("plot", args = c("x", "main"))
test_object("mean_jordan", incorrect_msg = "Did you calculate the mean points per game and assign the result to <code>mean_jordan</code>?")
test_function("abline", args = "h", incorrect_msg = "Draw the mean amount of points that Michael Jordan scored as a horizontal line on the scatterplot. You should use the function <code>abline</code> with the argument <code>h</code> for this. If you don't know how this function works, you can always type <code>?abline</code> to see the help file.")
success_msg("Nice! Take a look at your plot. Do you understand the concept of deviation with respect to the mean? If so, move on to the next exercise.")
Call the plot function with two arguments: the column of data you wish to plot and the `main` argument with the appropriate title. Use the `$` operator to select the `points` column from `data_jordan`.
Calculate the variance manually
As a reminder, we use the following process to calculate the sample variance:
1. Calculate the sample mean
2. Calculate the squared difference between each data point and the sample mean
3. Sum these squared differences (i.e. compute the sum of squares)
4. Divide the sum of squares by \(N - 1\) (i.e. the sample size minus 1)
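Written as a single formula, the steps above amount to the following. The symbols \(x_i\) (the points scored in game \(i\)), \(\bar{x}\) (the sample mean) and \(s^2\) (the sample variance) are notation assumed here for illustration; they do not appear in the exercise text.

\[
s^2 = \frac{1}{N - 1} \sum_{i=1}^{N} \left( x_i - \bar{x} \right)^2
\]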
Let's calculate the sample variance of Michael Jordan's points per game!
Instructions
The dataset `data_jordan` is loaded into your workspace.
- Calculate the mean points per game and save the result to `mean_ppg`.
- Subtract the mean points per game from the vector of points scored in each game and assign the result to `diff`.
- Square this vector of differences and save it to `squared_diff`.
- Calculate the sample variance by summing the values in `squared_diff` with `sum()` and dividing by the sample size minus 1, using `length()` to count the number of games in the sample. Just print the result without saving it.
- Check your result by calculating the variance with R's built-in `var()` function.
data_jordan <- read.table("http://assets.datacamp.com/course/Conway/Lecture_Data/L15L16_game_points_jordan.txt", header = TRUE)
## The dataset `data_jordan` is already loaded
# Calculate mean points per game
mean_ppg <- ___
# Calculate deviations from mean
diff <- ___
# Calculate squared deviations
squared_diff <- ___
# Combine everything to compute sample variance
# Compare with the result of var()
## The dataset `data_jordan` is already loaded
# Calculate mean points per game
mean_ppg <- mean(data_jordan$points)
# Calculate deviations from mean
diff <- data_jordan$points - mean_ppg
# Calculate squared deviations
squared_diff <- diff^2
# Combine everything to compute sample variance
sum(squared_diff) / (length(data_jordan$points) - 1)
# Compare with the result of var()
var(data_jordan$points)
test_object("mean_ppg", undefined_msg = "Save the mean points per game to a new variable called `mean_ppg`", incorrect_msg = "Use `mean(data_jordan$points)` to compute the mean points per game and save the result to `mean_ppg`")
test_object("diff", undefined_msg = "Subtract `mean_ppg` from `data_jordan$points` and save the result to `diff`", incorrect_msg = "Subtract `mean_ppg` from `data_jordan$points` and save the result to `diff`")
test_object("squared_diff", undefined_msg = "Square `diff` with `diff^2` and save the result to `squared_diff`", incorrect_msg = "Square `diff` with `diff^2` and save the result to `squared_diff`")
test_correct({
test_function("sum", not_called_msg = "Don't forget to use the `sum()` function when manually computing the sample variance!")
test_function("length", not_called_msg = "Don't forget to use the `length()` function when manually computing the sample variance!")
}, {
test_output_contains("sum(squared_diff) / (length(data_jordan$points) - 1)", incorrect_msg = "Sum the values in `squared_diff` with `sum()`, then divide by `length(data_jordan$points) - 1`")
})
test_student_typed("var(data_jordan$points)", not_typed_msg = "Call `var()` with one argument: the `points` column of `data_jordan`")
test_error()
success_msg("Great work!")
- Calculate the mean points per game using the `mean()` function.
- You can select a column from a data frame by using the `$` operator.
- Use the `^` operator to square every element in a vector. For example, `c(1, 2, 3)^2` will result in `c(1, 4, 9)`.
- When you calculate the sample variance, do not forget to divide by `length(data) - 1` instead of `length(data)`.
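To see why the last hint matters, here is a minimal sketch on a small made-up vector. The values in `toy_points` and the names `sample_var` and `biased_var` are hypothetical, chosen only for illustration; they are not taken from `data_jordan`. Dividing by the sample size minus 1 reproduces the result of `var()`, while dividing by the sample size gives a smaller number.

```r
# Hypothetical toy data, NOT Michael Jordan's actual points
toy_points <- c(12, 25, 18, 30, 15)

n <- length(toy_points)                            # sample size: 5
squared_diff <- (toy_points - mean(toy_points))^2  # squared deviations from the mean (mean is 20)

sample_var <- sum(squared_diff) / (n - 1)  # sample variance: divide by n - 1
biased_var <- sum(squared_diff) / n        # dividing by n instead gives a smaller value

print(sample_var)                            # 54.5
print(biased_var)                            # 43.6
all.equal(sample_var, var(toy_points))       # TRUE: var() also divides by n - 1
```

The gap between the two results shrinks as the sample grows, but for a short vector like this one it is clearly visible.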