Certification - Data Scientist Associate - Electric Mopeds (copy)
Data Scientist Associate Practical Exam Submission
Use this template to complete your analysis and write up your summary for submission.
Importing packages
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
Loading the dataset from a local file
bike_df = pd.read_csv('electric_bike_ratings_2212.csv')
bike_df.info()  # info() prints its summary itself and returns None, so print() is not needed
Checking and resolving missing data
print(bike_df.isna().sum())
bike_df['web_browser'] = bike_df.web_browser.fillna('unknown')
bike_df.web_browser.value_counts()  # verify that the missing values were filled
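The fill-and-verify step above can be sketched on a small example. This is a minimal illustration only: the `toy` DataFrame and its browser names are hypothetical stand-ins, not rows from the real ratings file.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real ratings file.
toy = pd.DataFrame({"web_browser": ["Chrome", np.nan, "Firefox", np.nan]})

# Count missing entries before and after filling with a placeholder label.
missing_before = int(toy["web_browser"].isna().sum())
toy["web_browser"] = toy["web_browser"].fillna("unknown")
missing_after = int(toy["web_browser"].isna().sum())

# Every previously missing entry should now carry the placeholder.
unknown_count = int((toy["web_browser"] == "unknown").sum())
print(missing_before, missing_after, unknown_count)
```

Comparing the count of missing values before the fill with the count of `'unknown'` labels afterwards gives a direct check that nothing was lost or double counted.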
Checking features against the data description
# check the distinct values of some features
print(bike_df['owned'].value_counts())
print(bike_df['make_model'].value_counts())
print(bike_df['primary_use'].value_counts())
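The same check can be run over several columns in one pass. The sketch below assumes a hypothetical `toy` DataFrame with made-up category values. Passing `dropna=False` is an addition to the calls above, so that missing entries appear in the counts as well.

```python
import pandas as pd

# Hypothetical toy data; the category values are illustrative only.
toy = pd.DataFrame({
    "owned": [1, 0, 1, 1],
    "primary_use": ["Commuting", "Leisure", "Commuting", None],
})

# Collect the value counts of every column, keeping missing values visible.
summary = {col: toy[col].value_counts(dropna=False) for col in toy.columns}

for col, counts in summary.items():
    print(f"--- {col} ---")
    print(counts)
```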
Cleaning the reviewer_age, value_for_money and review_month features
# '-' marks a missing age; coerce it to NaN while converting the column to numeric
bike_df['reviewer_age'] = pd.to_numeric(bike_df['reviewer_age'], errors='coerce')
# impute missing ages with the mean, rounded so the column can be stored as integers
age_mean = round(bike_df['reviewer_age'].mean())
bike_df['reviewer_age'] = bike_df['reviewer_age'].fillna(age_mean).astype('Int64')
bike_df
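The age clean-up can be illustrated on a few hand-written values. This is a sketch under stated assumptions: the `toy_age` series is hypothetical, and `pd.to_numeric(..., errors="coerce")` is used here as one way to turn the `'-'` placeholder into a missing value. The mean is rounded before filling because a fractional fill value cannot be cast safely to an integer dtype.

```python
import pandas as pd

# Hypothetical ages; '-' stands for a missing entry.
toy_age = pd.Series(["23", "-", "41", "-", "30"])

# '-' cannot be parsed as a number, so errors="coerce" turns it into NaN.
age = pd.to_numeric(toy_age, errors="coerce")

# mean() skips NaN, so only the three known ages contribute.
age_mean = age.mean()

# Round the mean so the filled column holds whole numbers only,
# then store it with the nullable integer dtype.
age_clean = age.fillna(round(age_mean)).astype("Int64")
print(age_clean.tolist())
```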
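# strip the literal '/10' suffix, then convert the remaining digits to nullable integers
bike_df['value_for_money'] = bike_df['value_for_money'].str.replace('/10', '', regex=False)
bike_df['value_for_money'] = pd.to_numeric(bike_df['value_for_money']).astype('Int64')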
bike_df
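As a small illustration of the score parsing, the sketch below strips the `/10` suffix from a hypothetical `toy_value` series and converts the remainder to integers. The sample strings are assumptions about the format, inferred from the `'/10'` replacement above.

```python
import pandas as pd

# Hypothetical scores written as "<score>/10".
toy_value = pd.Series(["5/10", "10/10", "7/10"])

# regex=False treats "/10" as a literal substring rather than a pattern.
stripped = toy_value.str.replace("/10", "", regex=False)

# Convert the digit strings to numbers, then to the nullable integer dtype.
value = pd.to_numeric(stripped).astype("Int64")
print(value.tolist())
```

Note that `"10/10"` keeps its leading `10`, because only the `/10` suffix matches the substring being removed.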
# trim surrounding whitespace, then drop any two-character prefix followed by a hyphen
bike_df['review_month'] = bike_df['review_month'].str.strip()
bike_df['review_month'] = bike_df['review_month'].str.replace(r'..-', '', regex=True)
bike_df.review_month.value_counts()
bike_df.info()
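The review_month clean-up can be sketched as follows. The `toy_month` values, including the day-prefixed `"01-Apr"` form, are hypothetical. The anchored pattern `^\d+-` is an assumption and a stricter alternative to `..-`: it removes only a run of digits and a hyphen at the start of the string.

```python
import pandas as pd

# Hypothetical month labels, some with a day prefix or stray whitespace.
toy_month = pd.Series([" 01-Apr", "Mar ", "15-Dec", "Apr"])

# Trim whitespace first so the anchored pattern sees the real start of the text.
month = toy_month.str.strip().str.replace(r"^\d+-", "", regex=True)

# Each month should now appear under a single label.
counts = month.value_counts()
print(counts)
```

Anchoring the pattern avoids removing characters from the middle of a value, which the unanchored `..-` would do for any string containing a hyphen.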