PRODUCT SALES


Data Validation

The dataset contains 15000 rows and 8 columns before cleaning and validation. I have validated all the columns against the criteria in the dataset table:

  1. week: numeric values without missing values, same as the description with a minimum of 1 week and maximum of 6 weeks. No cleaning is needed.
  2. sales_method: There are no missing values, but 5 categories were found instead of the expected 3. The two extra categories are misspellings of existing categories and were mapped back to the correct ones.
  3. customer_id: These are unique values that have no missing values, same as the description. No cleaning is needed.
  4. nb_sold: These are numeric values without missing values, same as the description. No cleaning is needed.
  5. revenue: There are 1074 missing values. Revenue was grouped by sales_method, and each missing value was replaced with the median revenue of its sales_method.
  6. years_as_customer: numeric values without missing values, but two invalid entries, 47 and 63, were found. The maximum for this column should be 40, so these entries were replaced with the median.
  7. nb_site_visits: numeric values without missing values, same as the description. No cleaning is needed.
  8. state: 50 categories without missing values, same as the description. No cleaning is needed.
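
The three cleaning steps above (items 2, 5 and 6) can be sketched in plain Python. This is a minimal illustration, not the notebook's hidden code: the rows are toy values, the misspellings in `CANONICAL` are hypothetical stand-ins (the report does not name them), and taking the median over valid `years_as_customer` entries only is an assumption.

```python
from statistics import median

# Toy records standing in for the real dataset (illustrative values only).
rows = [
    {"sales_method": "Email", "revenue": 100.0, "years_as_customer": 3},
    {"sales_method": "email", "revenue": None, "years_as_customer": 47},
    {"sales_method": "Call", "revenue": 50.0, "years_as_customer": 1},
    {"sales_method": "Call", "revenue": None, "years_as_customer": 5},
    {"sales_method": "em + call", "revenue": 180.0, "years_as_customer": 63},
    {"sales_method": "Email + Call", "revenue": 190.0, "years_as_customer": 2},
    {"sales_method": "Email", "revenue": 90.0, "years_as_customer": 10},
]

# Hypothetical misspellings mapped to the three valid categories.
CANONICAL = {"email": "Email", "em + call": "Email + Call"}
MAX_YEARS = 40  # upper bound stated in the validation notes


def clean(rows):
    cleaned = [dict(r) for r in rows]  # work on copies, leave the input intact

    # Item 2: map misspelled sales methods back to their canonical names.
    for r in cleaned:
        r["sales_method"] = CANONICAL.get(r["sales_method"], r["sales_method"])

    # Item 5: fill missing revenue with the median revenue of the same method.
    observed = {}
    for r in cleaned:
        if r["revenue"] is not None:
            observed.setdefault(r["sales_method"], []).append(r["revenue"])
    medians = {method: median(values) for method, values in observed.items()}
    for r in cleaned:
        if r["revenue"] is None:
            r["revenue"] = medians[r["sales_method"]]

    # Item 6: replace out-of-range tenures with the median of the valid ones
    # (assumption: the median is taken over valid entries only).
    valid = [r["years_as_customer"] for r in cleaned
             if r["years_as_customer"] <= MAX_YEARS]
    years_median = median(valid)
    for r in cleaned:
        if r["years_as_customer"] > MAX_YEARS:
            r["years_as_customer"] = years_median

    return cleaned


cleaned = clean(rows)
```

No rows are dropped by any of these steps, which is consistent with the row count staying the same after validation.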

After data validation, the dataset contains 15000 rows and 8 columns.

About the Sales method

Three sales methods were used: Email, Call, and Email + Call.

Of the 15000 customers, 7466 (about 50%) were only sent emails, 4962 (about 33%) were only called, and 2572 (about 17%) were both emailed and called.
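
The shares can be reproduced from the customer counts quoted above. The snippet below is a small arithmetic check using only those counts; the variable names are illustrative.

```python
# Customer counts per sales method, as quoted in the report.
counts = {"Email": 7466, "Call": 4962, "Email + Call": 2572}

total = sum(counts.values())  # 15000, the full dataset
shares = {method: round(100 * n / total, 1) for method, n in counts.items()}

for method, share in shares.items():
    print(f"{method}: {share}%")
# Email: 49.8%
# Call: 33.1%
# Email + Call: 17.1%
```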

The boxplot shows that most customers bought at the lower end of the price range: 50% of customers spent between $53 and $108, and on average customers bought products worth $91.
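
The "middle 50%" range read off a boxplot is the span between the first and third quartiles. As a rough sketch of how that range is computed, the example below uses made-up revenue values (not the dataset), chosen so the quartiles echo the $53 and $108 quoted above.

```python
from statistics import mean, quantiles

# Made-up per-customer revenue values, for illustration only.
revenue = [30, 45, 53, 60, 75, 90, 108, 120, 240]

# Three cut points that split the sorted data into four equal parts.
q1, med, q3 = quantiles(revenue, n=4, method="inclusive")
iqr = q3 - q1   # width of the box in a boxplot
avg = mean(revenue)

print(f"Q1={q1}, median={med}, Q3={q3}, IQR={iqr}")
print(f"mean={avg:.2f}")
```

In this toy sample the mean sits above the median because a single large value pulls it up, which is the usual pattern when most purchases fall in the lower price range.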

What sales method gave the best result?

Looking first at total revenue per sales method, the Email method generated the most at $724,000, followed by the Email + Call method at $473,000 and the Call method at $236,000.

The boxplot shows that customers contacted through Email + Call spent the most on average ($184.74), followed by customers contacted through Email ($95.58) and customers contacted through Call ($49.07).
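
The two views of revenue rank the methods differently. The sketch below compares them using only the totals and per-customer averages quoted in this section; the helper `rank` is an illustrative name, not part of the original analysis.

```python
# Figures quoted in the report.
total_revenue = {"Email": 724_000, "Email + Call": 473_000, "Call": 236_000}
avg_per_customer = {"Email": 95.58, "Email + Call": 184.74, "Call": 49.07}


def rank(metric):
    """Return method names ordered from highest to lowest value."""
    return sorted(metric, key=metric.get, reverse=True)


by_total = rank(total_revenue)       # ['Email', 'Email + Call', 'Call']
by_average = rank(avg_per_customer)  # ['Email + Call', 'Email', 'Call']

# How many times larger the best average is than each of the others.
best = by_average[0]
ratios = {
    method: round(avg_per_customer[best] / value, 2)
    for method, value in avg_per_customer.items()
    if method != best
}
print(ratios)  # {'Email': 1.93, 'Call': 3.76}
```

Email leads on total revenue largely because it reached the most customers, while Email + Call leads on revenue per customer.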

In the first week, the Email sales method had the highest total revenue at close to $248,000, followed by the Call method at $27,000 and the Email + Call method at $20,000.

In the sixth week, the Email sales method generated $25,000, a 90% decrease from its first-week revenue. The Call method generated $29,000, a 7% increase over its first week. The Email + Call method generated about $129,000, a 545% increase over its first week.
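
The week-over-week comparisons use the standard percent-change formula, (new - old) / old × 100. The check below applies it to the rounded first-week and sixth-week totals quoted above; `pct_change` is an illustrative helper name.

```python
def pct_change(old, new):
    """Percent change from old to new, relative to old."""
    return 100 * (new - old) / old


# (week 1 revenue, week 6 revenue) per method, as quoted in the report.
weekly = {
    "Email": (248_000, 25_000),
    "Call": (27_000, 29_000),
    "Email + Call": (20_000, 129_000),
}

for method, (week1, week6) in weekly.items():
    print(f"{method}: {pct_change(week1, week6):+.1f}%")
# Email: -89.9%
# Call: +7.4%
# Email + Call: +545.0%
```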

Although the Email sales method made the most revenue in the first week, its revenue declined to the point that its sixth-week total was similar to the first-week revenue of the other sales methods. If this trend continues, the Email method will bring in even less revenue or stop bringing in any at all.

The Email + Call method, on the other hand, made the least revenue in the first week, but by the sixth week its revenue had increased by more than 500%. If this trend continues, its revenue will keep growing.