Pet Supplies
Background
PetMind is a US-based retailer of pet products. It sells a mix of luxury items, such as toys, and everyday items, such as food.
The company wants to increase sales by selling more everyday products repeatedly. They have been testing this approach for the last year. They now want a report on how repeat purchases impact sales.
Dataset Overview
The dataset contains 1,500 records of pet supply products, capturing key information about each item.
Dataset Features:
- Product ID: A unique identifier for each product
- Category: The type of product (i.e., Housing, Food, Toys, Equipment, Medicine, Accessory)
- Animal: The type of pet the product is intended for (i.e., Dog, Cat, Fish, Bird)
- Size: The size variant of the product (i.e., Small, Medium, Large)
- Price: The selling price of the product
- Sales: The value of all sales of the product in the last year
- Rating: Customer rating for the product from 1 to 10
- Repeat_Purchase: A binary indicator showing whether the product was purchased more than once by the same customer
Purpose of the Dataset Analysis
The dataset was analyzed to:
- Understand overall sales trends across different pet product categories
- Identify which product types generate the highest revenue and customer engagement
- Analyze the impact of repeat purchases on total sales
- Explore how product attributes such as price, size, and customer ratings influence buying behavior
- Provide data-driven recommendations to improve sales strategy and customer retention
Data Validation
- Product_id: There were 1,500 unique values and no missing values. No changes were made to this column.
- Category: There were 6 unique values, matching the description. There were 25 missing values, which were replaced with "Unknown".
- Animal: There were 4 unique values, matching the description. There were no missing values. No changes were made to this column.
- Size: The underlying values were the same but appeared in different letter cases (some lowercase, some uppercase, etc.). I converted all values in the column to title case. There were no missing values.
- Price: The values ranged from 12.85 to 54.16. There were 150 missing values, which were replaced with the median of the remaining data, 28.06. I also converted the column from object to float, because the description states that the data is continuous, and rounded it to 2 decimal places.
- Sales: The values ranged from 286.94 to 2255.96. There were no missing values. I rounded the values to 2 decimal places, as stated in the description; no other changes were made to this column.
- Rating: The values were between 1 and 10. There were 150 missing values, which were replaced with 0. I converted the column from float to integer because the description states that it holds discrete values.
- Repeat_purchase: The values in this column were 1 and 0. There were no missing values. No changes were made to this column.
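The cleaning steps above could be sketched in pandas roughly as follows. This is a minimal illustration rather than the exact code used for the report: the lowercase column names, the function name `clean_pet_supplies`, the placeholder string "unlisted" for a missing price, and the four sample rows are all assumptions made for the example.

```python
import pandas as pd


def clean_pet_supplies(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the validation steps described above to a copy of df.

    Column names are assumed for illustration and may differ in the real file.
    """
    out = df.copy()

    # Category: replace missing values with "Unknown"
    out["category"] = out["category"].fillna("Unknown")

    # Size: normalise mixed letter case to title case
    out["size"] = out["size"].str.title()

    # Price: convert text to numbers (unparseable entries become NaN),
    # fill the gaps with the median of the remaining values, round to 2 dp
    price = pd.to_numeric(out["price"], errors="coerce")
    out["price"] = price.fillna(price.median()).round(2)

    # Sales: round to 2 decimal places
    out["sales"] = out["sales"].round(2)

    # Rating: replace missing values with 0, then store as integers
    out["rating"] = out["rating"].fillna(0).astype(int)

    return out


# Invented sample rows, used only to exercise the function
sample = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "category": ["Food", None, "Toys", "Housing"],
    "animal": ["Dog", "Cat", "Fish", "Bird"],
    "size": ["small", "MEDIUM", "Large", "large"],
    "price": ["20.5", "unlisted", "30.126", "40"],
    "sales": [1000.456, 500.0, 750.333, 1200.0],
    "rating": [7.0, None, 9.0, 4.0],
    "repeat_purchase": [1, 0, 1, 0],
})

cleaned = clean_pet_supplies(sample)
print(cleaned)
```

Working on a copy keeps the raw data available for comparison, which makes it easier to confirm counts such as the number of missing values before and after cleaning.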
How many products are repeat purchases?
The group flagged 1 (repeat purchase) has the most observations, meaning that more products were bought repeatedly than were not.
The two groups are unbalanced: 906 products were repeat purchases and 594 were not, a difference of 312 (906 - 594 = 312).
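As a quick check on the counts above, the arithmetic can be reproduced in a few lines. The variable names are chosen for this example; the two counts are the ones reported in this section.

```python
repeat_count = 906       # products flagged 1 (repeat purchase)
non_repeat_count = 594   # products flagged 0

total = repeat_count + non_repeat_count
difference = repeat_count - non_repeat_count
repeat_share = repeat_count / total

print(f"total products: {total}")             # total products: 1500
print(f"difference: {difference}")            # difference: 312
print(f"repeat share: {repeat_share:.1%}")    # repeat share: 60.4%
```

The total matches the 1,500 records in the dataset, so roughly three in five products were bought more than once by the same customer.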
Sales Distribution
The sales distribution reveals that Equipment and Toys are the highest-selling categories, contributing significantly to overall sales. Food and Medicine also perform well, while Housing and Accessory show moderate sales. The Unknown category has the least impact on overall sales. From this distribution, I would advise the team to focus on high-performing categories and make informed decisions to improve lower-performing ones.
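A ranking like the one described above could be produced with a group-by over the category column. The sketch below uses a handful of invented rows purely to show the shape of the calculation; the figures are not taken from the PetMind data, and the lowercase column names are assumptions.

```python
import pandas as pd

# Toy rows invented for illustration; they are not PetMind figures
toy = pd.DataFrame({
    "category": ["Equipment", "Toys", "Food", "Equipment", "Unknown", "Toys"],
    "sales": [1200.0, 900.0, 700.0, 800.0, 300.0, 600.0],
})

# Total sales per category, largest first
sales_by_category = (
    toy.groupby("category")["sales"].sum().sort_values(ascending=False)
)

# Each category's share of overall sales
share = (sales_by_category / sales_by_category.sum()).round(3)

print(sales_by_category)
print(share)
```

Reporting the share alongside the total makes it easier to judge how much each category contributes relative to the others.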
Relationship Between Repeat Purchases and Sales
Products that customers bought repeatedly contribute more to overall sales than products that were not bought repeatedly. The engagement and loyalty behind repeat purchases may help explain this higher revenue.
The difference in total sales between the two groups (884,046.17 - 610,850.60 = 273,195.57) is the additional revenue associated with repeat-purchase products. The repeat-purchase group also contains more products (906 versus 594), so part of this gap reflects group size.
Encouraging repeat purchases and building customer loyalty can be a valuable strategy for increasing sales and revenue.
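Because the two groups differ in size, it may be worth looking at average sales per product alongside the totals. The sketch below uses only the figures reported in this document and assumes that the 906 and 594 product counts correspond to the same two groups as the sales totals; the variable names are chosen for the example.

```python
# Totals and product counts as reported in this document
repeat_total, repeat_n = 884_046.17, 906
non_repeat_total, non_repeat_n = 610_850.60, 594

# Gap in total sales between the two groups
gap = round(repeat_total - non_repeat_total, 2)

# Average sales per product within each group
repeat_mean = round(repeat_total / repeat_n, 2)
non_repeat_mean = round(non_repeat_total / non_repeat_n, 2)

print(f"gap in totals: {gap:,.2f}")
print(f"mean sales, repeat: {repeat_mean:,.2f}")
print(f"mean sales, non-repeat: {non_repeat_mean:,.2f}")
```

On these figures the per-product average comes out slightly lower for the repeat-purchase group (about 975.77 versus 1,028.37), which suggests the gap in totals may largely reflect the larger number of repeat-purchase products. This is a rough check under the assumption stated above, not a substitute for a fuller comparison of the two distributions.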
Recommendations
I would recommend we focus on the following steps:
- Promote the high-performing categories (Equipment and Toys) to sustain and boost sales.
- Encourage repeat purchases through measures such as loyalty programs or personalized marketing to increase revenue.
- Investigate underperforming categories to identify areas for improvement or targeted marketing efforts.