
As electric vehicles (EVs) become more popular, there is an increasing need for access to charging stations, also known as ports. To that end, many modern apartment buildings have begun retrofitting their parking garages to include shared charging stations. A charging station is shared if it is accessible to anyone in the building.

But with increasing demand comes competition for these ports — nothing is more frustrating than coming home to find no charging stations available! In this project, you will use a dataset to help apartment building managers better understand their tenants’ EV charging habits.

The data has been loaded into an Azure Databricks database, containing a schema called vehicles and a single table named charging_sessions with the following columns:

vehicles.charging_sessions

| Column | Definition | Data type |
|---|---|---|
| garage_id | Identifier for the garage/building | STRING |
| user_id | Identifier for the individual user | STRING |
| user_type | Indicating whether the station is Shared or Private | STRING |
| start_plugin | The date and time the session started | TIMESTAMP |
| start_plugin_hour | The hour (in military time) that the session started | NUMERIC |
| end_plugout | The date and time the session ended | TIMESTAMP |
| end_plugout_hour | The hour (in military time) that the session ended | NUMERIC |
| duration_hours | The length of the session, in hours | NUMERIC |
| el_kwh | Amount of electricity used (in Kilowatt hours) | NUMERIC |
| month_plugin | The month that the session started | STRING |
| weekdays_plugin | The day of the week that the session started | STRING |

Let’s get started!

Sources
  • Data: CC BY 4.0, via Kaggle
  • Image: Julian Herzog, CC BY 4.0, via Wikimedia Commons
-- unique_users_per_garage
-- Number of distinct users of shared stations in each garage, busiest garage first
SELECT garage_id,
       COUNT(DISTINCT user_id) AS num_unique_users
FROM vehicles.charging_sessions
WHERE user_type = 'Shared'
GROUP BY garage_id
ORDER BY num_unique_users DESC;
-- most_popular_shared_start_times
-- Ten most common (weekday, start hour) combinations for shared-station sessions
SELECT weekdays_plugin,
       start_plugin_hour,
       COUNT(*) AS num_charging_sessions
FROM vehicles.charging_sessions
WHERE user_type = 'Shared'
GROUP BY weekdays_plugin, start_plugin_hour
ORDER BY num_charging_sessions DESC
LIMIT 10;
-- long_duration_shared_users
-- Shared-station users whose average session lasts more than 10 hours, longest first
SELECT user_id,
       AVG(duration_hours) AS avg_charging_duration
FROM vehicles.charging_sessions
WHERE user_type = 'Shared'
GROUP BY user_id
HAVING AVG(duration_hours) > 10
ORDER BY avg_charging_duration DESC;
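
To sanity-check the three queries without access to the Databricks workspace, one option is to run them against a tiny stand-in table. The sketch below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of Databricks SQL (the two dialects differ in places, though these particular queries use only common syntax), it attaches an in-memory database under the name vehicles so that vehicles.charging_sessions resolves, it includes only the columns the queries touch, and every row is invented rather than taken from the real dataset.

```python
import sqlite3

# SQLite stand-in for the Databricks table (an assumption for local testing).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS vehicles")
conn.execute(
    """
    CREATE TABLE vehicles.charging_sessions (
        garage_id         TEXT,
        user_id           TEXT,
        user_type         TEXT,
        start_plugin_hour INTEGER,
        duration_hours    REAL,
        weekdays_plugin   TEXT
    )
    """
)

# Invented rows, for illustration only.
rows = [
    ("G1", "U1", "Shared", 18, 12.0, "Monday"),
    ("G1", "U1", "Shared", 18, 14.0, "Monday"),
    ("G1", "U2", "Shared", 9, 2.0, "Tuesday"),
    ("G2", "U3", "Shared", 18, 11.0, "Monday"),
    ("G2", "U4", "Private", 20, 30.0, "Friday"),
]
conn.executemany(
    "INSERT INTO vehicles.charging_sessions VALUES (?, ?, ?, ?, ?, ?)", rows
)

# Distinct shared-station users per garage.
unique_users_per_garage = conn.execute(
    """
    SELECT garage_id, COUNT(DISTINCT user_id) AS num_unique_users
    FROM vehicles.charging_sessions
    WHERE user_type = 'Shared'
    GROUP BY garage_id
    ORDER BY num_unique_users DESC
    """
).fetchall()

# Most common (weekday, start hour) pairs for shared sessions.
most_popular_shared_start_times = conn.execute(
    """
    SELECT weekdays_plugin, start_plugin_hour, COUNT(*) AS num_charging_sessions
    FROM vehicles.charging_sessions
    WHERE user_type = 'Shared'
    GROUP BY weekdays_plugin, start_plugin_hour
    ORDER BY num_charging_sessions DESC
    LIMIT 10
    """
).fetchall()

# Shared-station users averaging more than 10 hours per session.
long_duration_shared_users = conn.execute(
    """
    SELECT user_id, AVG(duration_hours) AS avg_charging_duration
    FROM vehicles.charging_sessions
    WHERE user_type = 'Shared'
    GROUP BY user_id
    HAVING AVG(duration_hours) > 10
    ORDER BY avg_charging_duration DESC
    """
).fetchall()

print(unique_users_per_garage)
print(most_popular_shared_start_times)
print(long_duration_shared_users)
```

With these toy rows, the Private session (user U4) is excluded from all three results even though it is the longest session in the table, which is a quick way to confirm that the user_type filter is doing its job.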