Let's dive into the Lego world
Table of contents
- Executive summary
- Data and Method
- Analysis and results
- Summary of main findings
- Annex
1. Executive Summary
Itroduction
Lego is a Danish company founded in 1932 by Ole Kirk Christiansen. Originally, the company made wooden toys, but by 1934, it had begun creating plastic toys, including the Lego bricks that would become its most famous product. Lego has grown to be one of the world's most popular toys, noted for its versatility and creativity.
Lego has always had a strong appeal to both children and adults. Lego offers a fun and imaginative play experience for children, allowing them to build and design their own worlds and characters. For adults, Lego offers a nostalgic connection to their childhood and a creative opportunity for self-expression. Lego has recently expanded into video games, movies, and theme parks, further strengthening its appeal to a wider audience.
The task
We recently applied for a position as a data analyst intern at Denmark's well-known Lego Group. We were given the following take-home project as part of the job interview process:
We were tasked with analyzing the popularity of various Lego sets and themes using the dataset provided. The goal is to become comfortable with the data to be prepared for an interview with a business stakeholder. In doing so we will answer several questions:
- What is the average number of Lego sets released per year?
- What is the average number of Lego parts per year?
- Create a visualization for item 2.
- What are the 5 most popular colors used in Lego parts?
- What proportion of Lego parts are transparent?
- What are the 5 rarest lego bricks?
- What are the top 10 Lego themes?
The report starts with a concise introduction to the data and methodology utilized. This is followed by a comprehensive data analysis phase, where we meticulously examine all relevant information to achieve the project objectives. The report ends with a summarization of the most important findings. The aim of the report is to provide a comprehensive and in-depth examination of the data, leading to a better understanding of the results and their implications.
Results
The results of the analysis showed that:
- The company has released an average of 176.82 sets per year since its establishment.
- With an average of 162.26 pieces in each set.
- The most commonly used colors in Lego parts are:
- Black
- White
- Light Bluish Gray
- Red
- Dark Bluish Gray
- The analysis also revealed that 6.29% of the 23114 total parts were observed to be transparent with 1455 pieces, while 21695 were noted to be of solid color.
- The rarest Lego bricks were identified as id_part:
- 30603pb04
- 25123
- 30601
- 3001b
- 5306bc017.
- The top 10 lego themes are:
- Supplemental
- Technic
- City
- Friends
- Basic Set
- Creator
- Gear
- Service Packs
- Duplo
- Star Wars.
- These results provide valuable insights into the company's production and inventory, enabling informed decisions to be made regarding future production goals and inventory management.
2. Data and Methods
The database consists of eight tables having information regarding the inventories, parts, part color, sets and themes, spanning the entire history of the company up until the year 2017. Below there is a visualization of how the tables are related to each other. (source) About the method, the database has been querying with PostgreSQL using Common Table Expression, subqueries, aggregate functions, window functions and filtering. The visualizations have been created using the Python library Plotly.
inventory_parts
- "inventory_id" - id of the inventory the part is in (as in the inventories table)
- "part_num" - unique id for the part (as in the parts table)
- "color_id" - id of the color
- "quantity" - the number of copies of the part included in the set
- "is_spare" - whether or not it is a spare part
parts
- "part_num" - unique id for the part (as in the inventory_parts table)
- "name" - name of the part
- "part_cat_id" - part category id (as in part_catagories table)
part_categories
- "id" - part category id (as in parts table)
- "name" - name of the category the part belongs to
colors
- "id" - id of the color (as in inventory_parts table)
- "name" - color name
- "rgb" - rgb code of the color
- "is_trans" - whether or not the part is transparent/translucent
inventories
- "id" - id of the inventory the part is in (as in the inventory_sets and inventory_parts tables)
- "version" - version number
- "set_num" - set number (as in sets table)
inventory_sets
- "inventory_id" - id of the inventory the part is in (as in the inventories table)
- "set_num" - set number (as in sets table)
- "quantity" - the quantity of sets included
sets
- "set_num" - unique set id (as in inventory_sets and inventories tables)
- "name" - the name of the set
- "year" - the year the set was published
- "theme_id" - the id of the theme the set belongs to (as in themes table)
- num_parts - the number of parts in the set
themes
- "id" - the id of the theme (as in the sets table)
- "name" - the name of the theme
- "parent_id" - the id of the larger theme, if there is one
Acknowledgments: Rebrickable.com
3. Analysis and results
1 hidden cell
3.1 What is the average number of Lego sets released per year?
In order to gain a comprehensive understanding of the number of sets released by the company over the years, it is necessary to execute a query on the 'sets' table consisting of 11,673 rows and 5 columns. By performing the query, we will be able to determine the annual number of sets released. This information will give us a clear picture of the company's production over time, allowing us to track its growth and development. Furthermore, by analyzing the data contained within the table, we will have the ability to calculate the average number of sets released by the company since its establishment.
df
1 hidden cell
The average has been estimated to be 176.82, providing us with a baseline for understanding the company's production output over time and enabling us to make informed decisions about the company's future production goals.
3.2 What is the average number of Lego parts per year?
Through a more in-depth analysis of the 'sets' table, we can gain a deeper understanding of the number of pieces that the company has produced over time. This analysis not only provides us with the overall average number of pieces across all sets, but also offers us an opportunity to identify trends and patterns in the data.
df