Project: Cost and Customer Satisfaction Analysis for VoltBike Innovations

VoltBike Innovations is a leading company in the electric bicycle (e-bike) industry, specializing in the design and manufacture of high-performance e-bikes. The company is dedicated to advancing urban mobility by delivering state-of-the-art e-bikes with a range of motor powers, high battery capacities, and efficient charging systems.

Recently, VoltBike Innovations has encountered some challenges in managing production costs while ensuring high levels of customer satisfaction. These issues have led to increased production expenses and variability in costs, impacting overall profitability.

You are part of the data analysis team tasked with providing actionable insights to help VoltBike Innovations address these challenges.

Task 1

Before you can start any analysis, you need to confirm that the data is accurate and reflects what you expect to see.

It is known that there are some issues with the production_data table, and the data team have provided the following data description.

Create a cleaned version of the dataframe that matches this description. You must match all column names and description criteria.

You should start with the data in the file ebike_data.csv.
Your output should be a dataframe named clean_data.
All column names and values should match the table below.

Column Name	Criteria
bike_type	Categorical. Type of e-bike. ['standard', 'folding', 'mountain', 'road']. Missing values should be replaced with 'standard'.
frame_material	Categorical. Material of the e-bike frame. ['aluminum', 'steel', 'carbon fiber']. Missing values should be replaced with 'unknown'.
production_cost	Continuous. Cost of production (in USD). Missing values should be replaced with median.
assembly_time	Continuous. Time taken for assembly (in minutes). Missing values should be replaced with mean.
top_speed	Continuous. Maximum speed of the e-bike (in km/h). Missing values should be replaced with mean.
battery_type	Categorical. Type of battery used. ['li-ion', 'nimh', 'lead acid']. Missing values should be replaced with 'other'.
motor_power	Continuous. Power output of the motor (in watts). Missing values should be replaced with median.
customer_score	Continuous. Customer satisfaction score (rating on a scale of 1 to 10). Missing values should be replaced with mean.

# Import libraries
import pandas as pd

# Read data
data = pd.read_csv('ebike_data.csv')

# Work on a copy so the raw data stays unchanged
clean_data = data.copy()

# Clean categorical values: trim whitespace, lowercase, then fill missing values
clean_data['bike_type'] = clean_data['bike_type'].str.strip().str.lower().fillna('standard')
clean_data['frame_material'] = clean_data['frame_material'].str.strip().str.lower().fillna('unknown')
# A '-' placeholder in battery_type is treated as missing
clean_data['battery_type'] = clean_data['battery_type'].str.strip().str.lower().replace('-', 'other').fillna('other')

# motor_power is assumed to be stored as text with a 'W' suffix; strip it before converting
clean_data['motor_power'] = pd.to_numeric(clean_data['motor_power'].str.replace('W', '', regex=False), errors='coerce')

# Clean continuous data: fill missing values with the median or mean given in the description
for col in ['production_cost', 'motor_power']:
    clean_data[col] = clean_data[col].astype(float)
    clean_data[col] = clean_data[col].fillna(clean_data[col].median()).round(2)
for col in ['assembly_time', 'top_speed', 'customer_score']:
    clean_data[col] = clean_data[col].astype(float)
    clean_data[col] = clean_data[col].fillna(clean_data[col].mean()).round(2)

clean_data.head()
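
Before moving on, it can help to check the cleaned frame against the data description. The following is a minimal sketch and not part of the original task: the helper name check_clean_data and the toy rows are hypothetical, and the allowed values are copied from the criteria table in Task 1.

```python
import pandas as pd

# Allowed values per categorical column, including the fill value for missing entries
ALLOWED = {
    'bike_type': {'standard', 'folding', 'mountain', 'road'},
    'frame_material': {'aluminum', 'steel', 'carbon fiber', 'unknown'},
    'battery_type': {'li-ion', 'nimh', 'lead acid', 'other'},
}
NUMERIC = ['production_cost', 'assembly_time', 'top_speed', 'motor_power', 'customer_score']


def check_clean_data(df):
    """Return a list of readable problems; an empty list means every check passed."""
    problems = []
    # Every expected column must exist and contain no missing values
    for col in list(ALLOWED) + NUMERIC:
        if col not in df.columns:
            problems.append(f'missing column: {col}')
        elif df[col].isna().any():
            problems.append(f'missing values in: {col}')
    # Categorical columns may only contain the allowed labels
    for col, allowed in ALLOWED.items():
        if col in df.columns:
            unexpected = set(df[col].dropna()) - allowed
            if unexpected:
                problems.append(f'unexpected values in {col}: {sorted(unexpected)}')
    # Continuous columns must have a numeric dtype
    for col in NUMERIC:
        if col in df.columns and not pd.api.types.is_numeric_dtype(df[col]):
            problems.append(f'non-numeric column: {col}')
    return problems


# Hypothetical toy rows, used only to show the helper in action;
# 'Folding' is deliberately left uncleaned so that one check fails
toy = pd.DataFrame({
    'bike_type': ['road', 'Folding'],
    'frame_material': ['steel', 'unknown'],
    'battery_type': ['li-ion', 'other'],
    'production_cost': [500.0, 450.5],
    'assembly_time': [60.0, 75.0],
    'top_speed': [25.0, 30.0],
    'motor_power': [250.0, 350.0],
    'customer_score': [7.0, 6.5],
})
print(check_clean_data(toy))
```

If the cleaning matches the description, calling check_clean_data(clean_data) on the real data would be expected to return an empty list.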

Task 2

You want to understand how different types of e-bikes influence production costs, assembly times, and customer satisfaction.

Calculate the average production_cost, assembly_time, and customer_score grouped by bike_type.

You should start with the data in the file ebike_data.csv.
Your output should be a data frame named bike_type_data.
It should include the four columns: bike_type, avg_production_cost, avg_assembly_time, and avg_customer_score.
Your answers should be rounded to 2 decimal places.

# Group by bike_type and calculate averages
bike_type_data = clean_data.groupby('bike_type')[['production_cost', 'assembly_time', 'customer_score']].mean().round(2)

# Reset the index to include bike_type as a column
bike_type_data = bike_type_data.reset_index()

# Rename the columns for clarity
bike_type_data.columns = ['bike_type', 'avg_production_cost', 'avg_assembly_time', 'avg_customer_score']

bike_type_data
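
The group, reset, and rename steps can also be written as a single named aggregation. This is an alternative sketch on hypothetical toy rows rather than the original solution; the frame toy and the result name toy_summary are made up for illustration.

```python
import pandas as pd

# Hypothetical toy rows standing in for clean_data
toy = pd.DataFrame({
    'bike_type': ['road', 'road', 'folding'],
    'production_cost': [400.0, 600.0, 300.0],
    'assembly_time': [50.0, 70.0, 40.0],
    'customer_score': [6.0, 8.0, 7.0],
})

# Named aggregation sets the output column names directly,
# and as_index=False keeps bike_type as a regular column
toy_summary = (
    toy.groupby('bike_type', as_index=False)
       .agg(
           avg_production_cost=('production_cost', 'mean'),
           avg_assembly_time=('assembly_time', 'mean'),
           avg_customer_score=('customer_score', 'mean'),
       )
       .round(2)
)
print(toy_summary)
```

Naming the output columns inside agg avoids assigning to .columns by position, which can silently mislabel columns if the selection order changes.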

Task 3

In order to proceed with further analysis, you need to understand how key production and satisfaction factors relate to each other. Start by calculating the mean and standard deviation for the following columns: production_cost and customer_score. These statistics will help in understanding the central tendency and variability of the data related to e-bike production and customer feedback.

Next, calculate the Pearson correlation coefficient between production_cost and customer_score. This correlation coefficient will provide insights into the strength and direction of the relationship between production costs and customer satisfaction.

You should start with the data in the file ebike_data.csv.
Calculate the mean and standard deviation for the columns production_cost and customer_score as: production_cost_mean, production_cost_sd, customer_score_mean, and customer_score_sd.
Calculate the Pearson correlation coefficient between production_cost and customer_score as corr_coef.
Your output should be a data frame named bike_analysis.
It should include the columns: production_cost_mean, production_cost_sd, customer_score_mean, customer_score_sd, and corr_coef.
Ensure that your answers are rounded to 2 decimal places.

# Calculate mean and standard deviation
agg_data = clean_data[['production_cost', 'customer_score']].agg(['mean', 'std']).round(2)
        
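# Create dictionary with mean and standard deviation values taken from agg_data
agg_dict = {
    'production_cost_mean': agg_data.loc['mean', 'production_cost'],
    'production_cost_sd': agg_data.loc['std', 'production_cost'],
    'customer_score_mean': agg_data.loc['mean', 'customer_score'],
    'customer_score_sd': agg_data.loc['std', 'customer_score']
}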

# Create a DataFrame from the dictionary
bike_analysis = pd.DataFrame([agg_dict])

# Calculate the Pearson correlation coefficient
bike_analysis['corr_coef'] = clean_data['production_cost'].corr(clean_data['customer_score']).round(2)

bike_analysis
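
As a cross-check on what Series.corr returns, the Pearson coefficient can be computed from its definition: the sample covariance of the two columns divided by the product of their sample standard deviations. The sketch below uses hypothetical toy values, not values from ebike_data.csv, and the variable names are illustrative only.

```python
import pandas as pd

# Hypothetical toy values, not taken from ebike_data.csv
cost = pd.Series([400.0, 500.0, 600.0, 700.0])
score = pd.Series([6.0, 7.0, 6.5, 8.0])

# Sample covariance (n - 1 in the denominator) from the deviations about each mean
dx = cost - cost.mean()
dy = score - score.mean()
cov_xy = (dx * dy).sum() / (len(cost) - 1)

# Pearson r: covariance divided by the product of the sample standard deviations
r_manual = cov_xy / (cost.std() * score.std())

# Series.corr uses the Pearson method by default
r_pandas = cost.corr(score)

print(round(r_manual, 2), round(r_pandas, 2))
```

The two values should agree up to floating-point error. A coefficient near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear association.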
