Skip to content
Investigating Lego Data by Mia Freidenburg
Investigating Lego Data
Includes answers to prompt questions and a summary of the findings.
1. What is the average number of Lego sets released per year?
Average number of unique sets per year:
DataFrameas
df
variable
WITH sets_per_year AS (
SELECT year, COUNT(DISTINCT(set_num)) AS total_sets_released
FROM sets
GROUP BY year
ORDER BY year ASC
)
SELECT AVG(total_sets_released) AS avg_sets_per_year
FROM sets_per_year;The average number of unique sets released per year is 176.86 (177 rounded).**
More investigation on sets released per year:
DataFrameas
total_sets_released_by_year
variable
SELECT year, COUNT(DISTINCT(set_num)) AS total_sets_released
FROM sets
GROUP BY year
ORDER BY year ASC;Current Type: Bar
Current X-axis: year
Current Y-axis: total_sets_released
Current Color: None
Sets Released Per Year Over Time
2. What is the average number of Lego parts per year?
Average parts for unique sets per year:
DataFrameas
avg_parts_per_year
variable
SELECT SUM(num_parts)/COUNT(DISTINCT(year)) AS avg_parts_per_year
FROM sets;The average number of parts in unique sets released per year is 28,698.
3. Create a visualization for item 2.
DataFrameas
total_parts_per_year
variable
SELECT year, SUM(num_parts) AS total_parts_per_year
FROM sets
GROUP BY year
ORDER BY year ASC;Ideally, there would be an average line on the chart as a reference! This could be accomplished within Python with the DataFrame saved as a csv and then imported using pandas (or a similar method).
Current Type: Bar
Current X-axis: year
Current Y-axis: total_parts_per_year
Current Color: None
Total Lego Parts in New Sets by Year