Understanding Lego sets popularity
Information about the data
💾 The data
inventory_parts
- "inventory_id" - id of the inventory the part is in (as in the inventories table)
- "part_num" - unique id for the part (as in the parts table)
- "color_id" - id of the color
- "quantity" - the number of copies of the part included in the set
- "is_spare" - whether or not it is a spare part
parts
- "part_num" - unique id for the part (as in the inventory_parts table)
- "name" - name of the part
- "part_cat_id" - part category id (as in part_categories table)
part_categories
- "id" - part category id (as in parts table)
- "name" - name of the category the part belongs to
colors
- "id" - id of the color (as in inventory_parts table)
- "name" - color name
- "rgb" - rgb code of the color
- "is_trans" - whether or not the part is transparent/translucent
inventories
- "id" - id of the inventory the part is in (as in the inventory_sets and inventory_parts tables)
- "version" - version number
- "set_num" - set number (as in sets table)
inventory_sets
- "inventory_id" - id of the inventory the part is in (as in the inventories table)
- "set_num" - set number (as in sets table)
- "quantity" - the quantity of sets included
sets
- "set_num" - unique set id (as in inventory_sets and inventories tables)
- "name" - the name of the set
- "year" - the year the set was published
- "theme_id" - the id of the theme the set belongs to (as in themes table)
- "num_parts" - the number of parts in the set
themes
- "id" - the id of the theme (as in the sets table)
- "name" - the name of the theme
- "parent_id" - the id of the larger theme, if there is one
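The key relationships listed above (for example, "theme_id" in sets pointing at "id" in themes) can be sketched with a pandas merge. This is a minimal illustration only: the rows are invented placeholders, not Rebrickable data, and the suffixes "_set" and "_theme" are an arbitrary choice to keep the two "name" columns apart.

```python
import pandas as pd

# Toy rows invented for illustration only; they are not taken from the real tables.
sets = pd.DataFrame({
    "set_num": ["A-1", "B-1", "C-1"],
    "name": ["Example set A", "Example set B", "Example set C"],
    "year": [2000, 2000, 2001],
    "theme_id": [1, 2, 2],
    "num_parts": [10, 20, 30],
})
themes = pd.DataFrame({
    "id": [1, 2],
    "name": ["Example theme X", "Example theme Y"],
    "parent_id": [None, 1],
})

# sets.theme_id references themes.id; a left join keeps every set
# even if its theme were missing from the themes table.
sets_with_themes = sets.merge(
    themes,
    left_on="theme_id",
    right_on="id",
    how="left",
    suffixes=("_set", "_theme"),
)
print(sets_with_themes[["set_num", "name_set", "name_theme"]])
```

The same pattern applies to the other key pairs in the schema, such as "part_num" between inventory_parts and parts.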
Acknowledgments: Rebrickable.com
Importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
1. What is the average number of Lego sets released per year?
Query the sets table and count the number of sets released in each year
The query result is available as the DataFrame variable df
SELECT COUNT(set_num) as count_set_year, year
FROM sets
GROUP BY year
ORDER BY year ASC
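For readers working outside the SQL cell, the same aggregation can be approximated in pandas. This is a sketch under assumptions: the sets table is assumed to be loaded into a DataFrame named sets, the rows below are invented placeholders, and df_counts is a hypothetical name chosen so it does not clash with the notebook's df.

```python
import pandas as pd

# Toy stand-in for the sets table (invented rows, not real data).
sets = pd.DataFrame({
    "set_num": ["A-1", "B-1", "C-1", "D-1"],
    "year": [2001, 2000, 2001, 2001],
})

# Equivalent of: SELECT COUNT(set_num) AS count_set_year, year
#                FROM sets GROUP BY year ORDER BY year ASC
df_counts = (
    sets.groupby("year")["set_num"]
    .count()                                # COUNT(set_num), ignores nulls like SQL
    .reset_index(name="count_set_year")     # back to columns: year, count_set_year
    .sort_values("year")                    # ORDER BY year ASC
    .reset_index(drop=True)
)[["count_set_year", "year"]]               # same column order as the SELECT list
print(df_counts)
```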
Visualization of data on the number of sets produced for each year
[Bar chart: x-axis year, y-axis count_set_year]
To estimate the current average number of sets produced per year, only the data from 1998 through the latest available year is used in the calculation
# Keep only the yearly counts from 1998 onward, then average them
recent_counts = df.loc[df['year'] >= 1998, 'count_set_year']
print('Average number of Lego sets produced per year (current calculation):', round(recent_counts.mean(), 0))
Average number of Lego sets produced per year for all years
print('Average number of Lego sets produced per year:', round(df['count_set_year'].mean(), 0))
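One caveat on averages computed this way: a GROUP BY result only has rows for years in which at least one set exists, so any year without releases is silently left out of the mean. Whether such gaps occur in the actual data is not checked here. The sketch below uses invented counts and the hypothetical names df_toy, mean_present and mean_all to show how the two readings of "average per year" can differ.

```python
import pandas as pd

# Toy stand-in for the query result (invented counts; note 2002 has no row).
df_toy = pd.DataFrame({"year": [2000, 2001, 2003], "count_set_year": [2, 4, 6]})

# Mean over the rows that are present: years without releases are ignored.
mean_present = df_toy["count_set_year"].mean()

# Mean over every calendar year in the range, counting missing years as 0.
first_year = int(df_toy["year"].min())
last_year = int(df_toy["year"].max())
mean_all = (
    df_toy.set_index("year")["count_set_year"]
    .reindex(range(first_year, last_year + 1), fill_value=0)
    .mean()
)
print(mean_present, mean_all)
```

With these toy numbers the first mean is 4.0 and the second is 3.0. If the real data has no gap years, the two values coincide.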