Electrify your insights with BI
📖 Background
Equipment, such as computers and smartphones, is widely available today, and the global consumer electronics market generated an impressive $950 billion in revenue in 2024.
What better way than to utilize your BI skills on a dataset related to this vast industry? As a BI analyst for GlobalElec, a leading global electronics retailer, you will help analyze and visualize the company’s sales data, covering customers, products, sales, stores, and international exchange rates.
Your task is to uncover trends like the dominance of telephony, the rise of online sales, and regional market leaders while crafting impactful dashboards in a BI tool of your choice.
Showcase your ability to transform raw data into actionable insights and shape the future of GlobalElec!
import pandas as pd
# Load all CSVs
sales = pd.read_csv("data/Sales.csv")
exchange = pd.read_csv("data/Exchange_Rates.csv",encoding="latin1")
stores = pd.read_csv("data/Stores.csv")
customers = pd.read_csv("data/Customers.csv",encoding="latin1")
products = pd.read_csv("data/Products.csv")
# --- Step 1: Merge Sales with Customers ---
merged = pd.merge(sales, customers, on="CustomerKey", how="left")
# --- Step 2: Merge with Products ---
merged = pd.merge(merged, products, on="ProductKey", how="left")
# --- Step 3: Merge with Stores ---
merged = pd.merge(merged, stores, on="StoreKey", how="left")
# --- Step 4: Merge with Exchange Rates (composite key: Currency + Date) ---
# Create composite key in both dataframes
merged["CurrencyDateKey"] = merged["Currency Code"].astype(str) + "-" + merged["Order Date"].astype(str)
exchange["CurrencyDateKey"] = exchange["Currency"].astype(str) + "-" + exchange["Date"].astype(str)
# Now merge
merged = pd.merge(merged, exchange[["CurrencyDateKey", "Exchange"]], on="CurrencyDateKey", how="left")
# --- Step 5: Drop helper key (optional) ---
merged.drop(columns=["CurrencyDateKey"], inplace=True)
# Save final merged dataset
merged.to_csv("final_dataset.csv", index=False)
print("✅ Final dataset created:", merged.shape)
print(merged.head())
merged.info()💾 The data
The data can be downloaded from the Files section (File > Show workbook files).
| Group | Column name | Dataset | Definition |
|---|---|---|---|
| Customers | CustomerKey | Customers.csv | Primary key to identify customers |
| Customers | Gender | Customers.csv | Customer gender |
| Customers | Name | Customers.csv | Customer full name |
| Customers | City | Customers.csv | Customer city |
| Customers | State Code | Customers.csv | Customer state (abbreviated) |
| Customers | State | Customers.csv | Customer state (full) |
| Customers | Zip Code | Customers.csv | Customer zip code |
| Customers | Country | Customers.csv | Customer country |
| Customers | Continent | Customers.csv | Customer continent |
| Customers | Birthday | Customers.csv | Customer date of birth |
| Products | ProductKey | Products.csv | Primary key to identify products |
| Products | Product Name | Products.csv | Product name |
| Products | Brand | Products.csv | Product brand |
| Products | Color | Products.csv | Product color |
| Products | Unit Cost USD | Products.csv | Cost to produce the product in USD |
| Products | Unit Price USD | Products.csv | Product list price in USD |
| Products | SubcategoryKey | Products.csv | Key to identify product subcategories |
| Products | Subcategory | Products.csv | Product subcategory name |
| Products | CategoryKey | Products.csv | Key to identify product categories |
| Products | Category | Products.csv | Product category name |
| Sales | Order Number | Sales.csv | Unique ID for each order |
| Sales | Line Item | Sales.csv | Identifies individual products purchased |
| Sales | Order Date | Sales.csv | Date the order was placed |
| Sales | Delivery Date | Sales.csv | Date the order was delivered |
| Sales | CustomerKey | Sales.csv | Unique key identifying which customer ordered |
| Sales | StoreKey | Sales.csv | Unique key identifying which store processed |
| Sales | ProductKey | Sales.csv | Unique key identifying which product purchased |
| Sales | Quantity | Sales.csv | Number of items purchased |
| Sales | Currency Code | Sales.csv | Currency used to process the order |
| Stores | StoreKey | Stores.csv | Primary key to identify stores |
| Stores | Country | Stores.csv | Store country |
| Stores | State | Stores.csv | Store state |
| Stores | Square Meters | Stores.csv | Store footprint in square meters |
| Stores | Open Date | Stores.csv | Store open date |
| Exchange Rates | Date | Exchange_Rates.csv | Date |
| Exchange Rates | Currency | Exchange_Rates.csv | Currency code |
| Exchange Rates | Exchange | Exchange_Rates.csv | Exchange rate compared to USD |
merged_data.to_csv("ElectrifyYourInsights.csv")Exchange_rate = pd.read_csv("data/Exchange_Rates.csv", encoding="latin1")
Exchange_rate.head()
Sales = pd.read_csv("data/Sales.csv", encoding="latin1")
Sales.head()
Sales["Order Date"]print(Sales["Order Date"].value_counts(dropna=False)) # dropna=False shows NaN counts too
print("\n" + "="*50 + "\n")
Sales["Order Date"] = pd.to_datetime(Sales["Order Date"], format='%m/%d/%Y')
print(Sales["Order Date"].value_counts(dropna=False)) # dropna=False shows NaN counts too
print("\n" + "="*50 + "\n")
# Extract date components
Sales['Year'] = Sales['Order Date'].dt.year
Sales['Month'] = Sales['Order Date'].dt.month
Sales['Day_of_Week'] = Sales['Order Date'].dt.day_name()
# Filter by date ranges
recent_orders = Sales[Sales['Order Date'] >= '2020-01-01']
# Group by month/year
# Only sum numeric columns to avoid summing datetime columns
monthly_sales = Sales.groupby(Sales['Order Date'].dt.to_period('M')).sum(numeric_only=True)
# Sort by date
Sales_sorted = Sales.sort_values('Order Date')💪 Challenge
Using any BI tool you have access to, create interactive visuals and a dashboard to summarize your research. Things to consider:
- Use this Workspace to prepare your data (optional).
- Some ideas to get you started (Don't feel limited by these, you're encouraged to use your skills to consolidate as much information as possible):
- Which countries or regions contribute the most to overall revenue?
- Which product categories and regions yield the highest profit margins?
- Are there patterns or changes in delivery times across different regions or product categories?
- Create a screenshot of your main dashboard/visuals, and paste that into the designated field.
- Summarize your findings in an executive summary.
✍️ Judging criteria
This is a community-based competition. The top 5 most upvoted entries will win.
The winners will receive DataCamp merchandise.