Working with Dates and Times in Python
Run the hidden code cell below to import the data used in this course.
# Importing the course packages
import pandas as pd
import matplotlib.pyplot as plt
from datetime import date, datetime, timezone, timedelta
from dateutil import tz
import pickle
# Import the course datasets
rides = pd.read_csv('datasets/capital-onebike.csv')
with open('datasets/florida_hurricane_dates.pkl', 'rb') as f:
    florida_hurricane_dates = pickle.load(f)
florida_hurricane_dates = sorted(florida_hurricane_dates)
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- Count how many hurricanes made landfall each year in Florida using florida_hurricane_dates.
- Reload the dataset datasets/capital-onebike.csv so that it correctly parses date and time columns.
- Calculate the average trip duration of bike rentals on weekends in rides. Compare it with the average trip duration of bike rentals on weekdays.
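As a starting point for the first task, here is a minimal sketch of a per-year tally. It assumes florida_hurricane_dates is a list of date objects. The sample_dates list below is a made-up stand-in so the snippet runs on its own, and the name landfalls_per_year is only illustrative.

```python
from collections import Counter
from datetime import date

# Made-up placeholder dates standing in for florida_hurricane_dates
# (assumption: the real object is a list of datetime.date values)
sample_dates = [
    date(1950, 8, 31),
    date(1950, 9, 5),
    date(1964, 8, 27),
    date(2005, 7, 10),
]

# Tally how many dates fall in each calendar year
landfalls_per_year = Counter(d.year for d in sample_dates)

# Print the counts in chronological order
for year in sorted(landfalls_per_year):
    print(year, landfalls_per_year[year])
```

Swapping sample_dates for florida_hurricane_dates should give the real counts, provided the pickle does contain date objects.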
Course Overview and Introduction to Dates in Python
This course on working with dates and times in Python is structured into four chapters. The first covers dates and calendars. The second combines dates with times, and the third takes a deep dive into time zones and Daylight Saving. The final chapter demonstrates how pandas simplifies complex date-related queries.
Python's date class is essential for representing dates, akin to other data types like strings or numbers. It comes with specific rules for creation and methods for manipulation. This course will guide you through creating date objects, extracting information, and performing operations like calculating the elapsed days between dates, ordering dates, and determining weekdays.
To illustrate, consider the task of analyzing hurricane landfalls in Florida over 67 years. Without a dedicated date class, operations such as finding the number of days between two dates or sorting them would be cumbersome. Python, however, simplifies these tasks significantly.
Creating a date object involves importing the date class from the datetime module and calling date() with year, month, and day as arguments. The attributes of a date object, such as year, month, and day, can be accessed directly. Additionally, the weekday() method returns the day of the week as an integer, with Monday as 0 and Sunday as 6.
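The attribute access and weekday numbering described above can be sketched as follows. The date November 5th, 2017 and the weekday_names list are illustrative choices, not part of the course data.

```python
from datetime import date

# Create a date object for November 5th, 2017
d = date(2017, 11, 5)

# Access the individual components as attributes
print(d.year)   # 2017
print(d.month)  # 11
print(d.day)    # 5

# weekday() returns an integer: Monday is 0, Sunday is 6
print(d.weekday())  # 6

# Illustrative lookup list to turn the integer into a name
weekday_names = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]
print(weekday_names[d.weekday()])  # Sunday
```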
# Import date from datetime
from datetime import date
# Create a date object
hurricane_andrew = date(1992, 8, 24)
# Which day of the week is the date?
print(hurricane_andrew.weekday())
# Counter for how many before June 1
early_hurricanes = 0
# We loop over the dates
for hurricane in florida_hurricane_dates:
    # Check if the month is before June (month number 6)
    if hurricane.month < 6:
        early_hurricanes = early_hurricanes + 1
print(early_hurricanes)
In our last lesson, we covered creating date objects and accessing their attributes. This lesson focuses on performing mathematical operations with dates, such as counting days between events, adjusting dates by a certain number of days, and ordering dates.
Think back to basic arithmetic and the concept of a number line, which helps us understand order and distance between numbers. For example, on a number line ranging from 10 to 16, if we choose 11 and 14, represented by variables a and b, and put them into a list l, we can find the smallest number using the min() function, which would be 11. Subtracting 11 from 14 gives us 3, indicating the distance between these two numbers.
Applying this concept to dates involves using a "calendar line" where each point represents a specific day. For instance, taking November 5th, 2017 (d1) and December 4th, 2017 (d2), placing them in a list l, and calling min(l) returns the earliest date, November 5th, 2017.
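The number-line and calendar-line comparison can be sketched side by side. The variable names follow the text, except that the two lists are named numbers and dates here (instead of l) so both can coexist in one snippet.

```python
from datetime import date

# Number line: two integers in a list
a, b = 11, 14
numbers = [a, b]
print(min(numbers))  # 11
print(b - a)         # 3

# Calendar line: two dates in a list
d1 = date(2017, 11, 5)
d2 = date(2017, 12, 4)
dates = [d1, d2]

# min() returns the earliest date, just as it returns the smallest number
print(min(dates))  # 2017-11-05
```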
Subtracting two dates in Python yields a timedelta object, which represents the elapsed time between events. Accessing the days attribute of this object shows the number of days between the two dates.
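A short sketch of that subtraction, reusing the two example dates from the calendar-line discussion (the name delta is illustrative):

```python
from datetime import date

d1 = date(2017, 11, 5)
d2 = date(2017, 12, 4)

# Subtracting two dates gives a timedelta object
delta = d2 - d1
print(type(delta).__name__)  # timedelta

# The days attribute holds the number of elapsed days
print(delta.days)  # 29
```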
We can also add a timedelta to a date to move forward in time. By importing timedelta from datetime and creating a 29-day timedelta, adding it to November 5th, 2017, correctly returns December 4th, 2017, accounting for the varying lengths of months automatically.
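The addition described above might look like this (td is an illustrative name). Subtracting the same timedelta moves backward in time.

```python
from datetime import date, timedelta

# A timedelta representing 29 days
td = timedelta(days=29)

# Adding it to November 5th, 2017 lands on December 4th, 2017
d1 = date(2017, 11, 5)
print(d1 + td)  # 2017-12-04

# Subtracting it from December 4th, 2017 goes back to November 5th, 2017
d2 = date(2017, 12, 4)
print(d2 - td)  # 2017-11-05
```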
Lastly, the "plus-equals" operation (+=) is a shorthand for incrementing variables, which will be used frequently throughout the course. It simplifies adding a value to a variable. For example, incrementing x by 1 can be done with x += 1.
# Import date
from datetime import date
# Create a date object for May 9th, 2007
start = date(2007, 5, 9)
# Create a date object for December 13th, 2007
end = date(2007, 12, 13)
# Subtract the two dates and print the number of days
print((end - start).days)
# A dictionary to count hurricanes per calendar month
hurricanes_each_month = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0,
                         7: 0, 8: 0, 9: 0, 10: 0, 11: 0, 12: 0}
# Loop over all hurricanes
for hurricane in florida_hurricane_dates:
    # Pull out the month
    month = hurricane.month
    # Increment the count in your dictionary by one
    hurricanes_each_month[month] += 1
print(hurricanes_each_month)
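# dates_scrambled is assumed to be a list of date objects in arbitrary order;
# it is not defined in the cells above
# Print the first and last scrambled dates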
print(dates_scrambled[0])
print(dates_scrambled[-1])
# Put the dates in order
dates_ordered = sorted(dates_scrambled)
# Print the first and last ordered dates
print(dates_ordered[0])
print(dates_ordered[-1])
Turning dates into strings is essential for readability, especially when printing results, naming files, or writing dates to CSV or Excel files. Python defaults to printing dates in ISO 8601 format (YYYY-MM-DD), which ensures consistency and easy sorting due to its fixed length and clear structure.
For instance, creating a date object for November 5th, 2017, and printing it will display the date in this ISO format. To explicitly convert a date to an ISO 8601 string, use the isoformat() method. This format's advantage is evident when sorting dates as strings: ISO 8601 formatted strings sort chronologically, which is beneficial for organizing files or data.
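To make the sorting claim concrete, here is a small sketch. The string lists iso_strings and us_strings are illustrative examples; the US-style list is included only as a contrast.

```python
from datetime import date

# Printing a date uses ISO 8601 format by default
d = date(2017, 11, 5)
print(d)              # 2017-11-05

# isoformat() returns the same representation as a string
print(d.isoformat())  # 2017-11-05

# ISO 8601 strings sort chronologically
iso_strings = ["2000-01-01", "1999-12-31"]
print(sorted(iso_strings))  # ['1999-12-31', '2000-01-01']

# The same two dates as MM/DD/YYYY strings do not sort chronologically
us_strings = ["01/01/2000", "12/31/1999"]
print(sorted(us_strings))   # ['01/01/2000', '12/31/1999']
```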
If ISO 8601 doesn't meet your needs, Python's strftime() method offers flexibility in date representation. By passing a format string to strftime(), you can customize the date format. For example, %Y in a format string will be replaced with the four-digit year. Other placeholders like %m for month and %d for day allow for various date formats.
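A few format strings, as a sketch of how the placeholders are substituted (the example date is the same November 5th, 2017 used earlier):

```python
from datetime import date

d = date(2017, 11, 5)

# %Y is replaced with the four-digit year
print(d.strftime("%Y"))          # 2017

# Text outside the placeholders is kept as-is
print(d.strftime("Year is %Y"))  # Year is 2017

# %m and %d are the zero-padded month and day
print(d.strftime("%Y/%m/%d"))    # 2017/11/05
```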
In summary, Python supports converting dates to strings using the ISO 8601 format for consistency and sorting, and strftime() for customizable formats, catering to a wide range of needs.
# Assign the earliest date to first_date
first_date = min(florida_hurricane_dates)
# Convert to ISO and US formats
iso = "Our earliest hurricane date: " + first_date.isoformat()
us = "Our earliest hurricane date: " + first_date.strftime("%m/%d/%Y")
print("ISO: " + iso)
print("US: " + us)
# Import date
from datetime import date
# Create a date object
andrew = date(1992, 8, 26)
# Print the date in the format 'YYYY-MM'
print(andrew.strftime('%Y-%m'))
In this chapter, we'll expand our focus from working solely with dates to handling both dates and times, encompassing both the calendar day and the specific time within that day.
To represent dates and times together in Python, we start by importing the datetime class from the datetime module. The identical names can be confusing, but this import is standard practice.
Creating a datetime object involves specifying the year, month, and day first, followed by the hour in 24-hour format, then minutes, and finally seconds. For example, October 1, 2017, at 3:23:25 PM is represented as datetime(2017, 10, 1, 15, 23, 25). For additional precision, microseconds can be added as a further argument, with Python allowing for representation down to millionths of a second. If needed, nanosecond precision is available with pandas for applications requiring even finer detail.
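As a sketch of the constructor arguments described above, here is the same moment built with and without microseconds. The value of 500000 microseconds (half a second) is an illustrative choice.

```python
from datetime import datetime

# October 1, 2017 at 3:23:25 PM (hour given in 24-hour format)
dt = datetime(2017, 10, 1, 15, 23, 25)
print(dt.isoformat())  # 2017-10-01T15:23:25

# The same moment plus 500000 microseconds (half a second)
dt_precise = datetime(2017, 10, 1, 15, 23, 25, 500000)
print(dt_precise.isoformat())  # 2017-10-01T15:23:25.500000
```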
Arguments to the datetime constructor can be made more readable through the use of named arguments. Moreover, the replace() method allows for modifying specific parts of a datetime object, such as setting minutes, seconds, and microseconds to zero to round down to the start of an hour.
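Named arguments and replace() could be combined as in the sketch below. The name dt_hr is illustrative. Note that replace() returns a new datetime and leaves the original unchanged.

```python
from datetime import datetime

# Named arguments make each component explicit
dt = datetime(year=2017, month=10, day=1,
              hour=15, minute=23, second=25,
              microsecond=500000)

# Round down to the start of the hour by zeroing the smaller units
dt_hr = dt.replace(minute=0, second=0, microsecond=0)
print(dt_hr)  # 2017-10-01 15:00:00

# The original object is not modified
print(dt)     # 2017-10-01 15:23:25.500000
```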
The course will utilize data from Capital Bikeshare, the oldest municipal bike-sharing program in the U.S., focusing on the journeys of bike ID "W20529" during the last three months of 2017 across Washington, DC. This real-world dataset will help us practice creating and manipulating datetime objects to analyze and understand patterns in the bike's usage.
# Import datetime
from datetime import datetime
# Create a datetime object
dt = datetime(2017, 10, 1, 15, 26, 26)
# Print the results in ISO 8601 format
print(dt.isoformat())