Skip to content
0

Investigating Lego Data

Includes answers to prompt questions and a summary of the findings.

1. What is the average number of Lego sets released per year?

Average number of unique sets per year:

Spinner
DataFrameas
df
variable
WITH sets_per_year AS (
    SELECT year, COUNT(DISTINCT(set_num)) AS total_sets_released
    FROM sets
    GROUP BY year
    ORDER BY year ASC
    )

SELECT AVG(total_sets_released) AS avg_sets_per_year
FROM sets_per_year;

The average number of unique sets released per year is 176.86 (177 rounded).**

More investigation on sets released per year:

Spinner
DataFrameas
total_sets_released_by_year
variable
SELECT year, COUNT(DISTINCT(set_num)) AS total_sets_released
FROM sets
GROUP BY year
ORDER BY year ASC;
Current Type: Bar
Current X-axis: year
Current Y-axis: total_sets_released
Current Color: None

Sets Released Per Year Over Time

2. What is the average number of Lego parts per year?

Average parts for unique sets per year:

Spinner
DataFrameas
avg_parts_per_year
variable
SELECT SUM(num_parts)/COUNT(DISTINCT(year)) AS avg_parts_per_year
FROM sets;

The average number of parts in unique sets released per year is 28,698.

3. Create a visualization for item 2.

Spinner
DataFrameas
total_parts_per_year
variable
SELECT year, SUM(num_parts) AS total_parts_per_year
FROM sets
GROUP BY year
ORDER BY year ASC;

Ideally, there would be an average line on the chart as a reference! This could be accomplished within Python with the DataFrame saved as a csv and then imported using pandas (or a similar method).

Current Type: Bar
Current X-axis: year
Current Y-axis: total_parts_per_year
Current Color: None

Total Lego Parts in New Sets by Year