Skip to content
PostgreSQL Lego biggest sets
Lego biggest sets query
Since the lego company was founded on the year 1932, it has remained a key player in the toy industry and market, and a lot of diverse themes and sets have been published and purchased by avid fans. Lego sets can generally be classified into several specific topics, ranging from Star Wars themed sets to technical sets that include generic spare parts.
To analise the history of this company, the Lego database provided by datacamp has been used in this analysis to obtain various key indicators stored in the data, and specifically;
- The years and names of the oldest and newest lego
themes - The total number of lego
themesandsets - The top 5 themes with greater number of
sets - The top 10
setspublished between the years 2000 and 2015 (inclusive) with more number of parts, excluding those within the all-time top 5 lego themes with more number ofsets
1. Years and names of the oldest and newest lego themes
Year of oldest and newest sets
DataFrameas
years
variable
SELECT
MIN(year) AS min_year_available,
MAX(year) AS max_year_available
FROM sets;Name and year of the oldest and newest themes
DataFrameas
df1
variable
SELECT
DISTINCT t.name AS theme,
s.year
FROM
themes as t
INNER JOIN
sets as s
ON t.id = s.theme_id
WHERE
s.year = (SELECT MIN(year) FROM sets) OR
s.year = (SELECT MAX(year) FROM sets)
ORDER BY
s.year,
theme;2. Total number of lego themes and sets
Number of themes
DataFrameas
num_themes
variable
SELECT COUNT(DISTINCT(name)) AS num_themes
FROM themes;Number of sets
DataFrameas
num_sets
variable
SELECT COUNT(DISTINCT(name)) AS num_sets
FROM sets;3. Top 5 lego themes with the greater number of sets
DataFrameas
top_themes
variable
SELECT
t.name AS theme,
COUNT(s.name) AS num_sets
FROM
themes AS t
INNER JOIN
sets AS s
ON t.id = s.theme_id
GROUP BY t.name
ORDER BY num_sets DESC
LIMIT 5;4. Top 10 sets published between the years 2000 and 2015 with more parts, excluding the all-time top 5 themes
DataFrameas
top_sets
variable
-- Set top_themes
WITH top_themes AS (
SELECT
t.name AS theme,
COUNT(s.name) as num_sets
FROM
themes as t
INNER JOIN
sets as s
ON t.id = s.theme_id
GROUP BY t.name
ORDER BY num_sets DESC
LIMIT 5)
-- Get name, year, number of parts and theme of the top 10 sets published between the years 2000 and
-- 2015 (inclusive) with more number of parts and excluding the all-time top 5 most popular lego themes
SELECT
s.name AS set_name,
s.year AS set_year,
s.num_parts AS set_num_parts,
t.name as theme
FROM
sets AS s
INNER JOIN
themes AS t
ON s.theme_id = t.id
WHERE
s.year BETWEEN 2000 AND 2015 AND
t.name NOT IN (SELECT theme FROM top_themes)
ORDER BY set_num_parts DESC
LIMIT 10;