Hacking Date Functions in SQLite

In this tutorial, learn how to use date functions in SQLite.

Some of the most useful functions in Postgres implementations of SQL (like Amazon Redshift)are DATE_DIFF and DATE_TRUNC:

  • DATE_DIFF gives the amount of time that has elapsed between two different dates. For example, the following code would give the number of days between date1 and date2:
DATE_DIFF('day', date1, date2)

DATE_DIFF is great for calculating the number of days from sign-up to cancellation or the number of hours from sign-in to sign-out.

  • DATE_TRUNC cuts off a date to the nearest day, week, month, or year. For example, the following code would give the nearest Monday, to the timestamp my_timestamp:
DATE_TRUNC('week', my_timestamp)

DATE_TRUNC is great for aggregating data. For instance, we can use it to find the number of Monthly Active Users (MAU) by truncating to the nearest month start date.

But not every SQL implementation had these great functions. For our Learn SQL from Scratch code, we use SQLite, a light-weight implementation of SQL that can be run on a single Docker instance. SQLite is great for website backends and small projects, but it's missing my two favorite functions. Luckily, there are workarounds.

To emulate, DATE_DIFF, we can use a little-known function called juliandate. According to Wikipedia, ​"the ​Julian Day Number​ (​JDN​) is the integer assigned to a whole solar day in the Julian day count starting from noon ​Universal time​, with Julian day number 0 assigned to the day starting at noon on Monday, January 1, ​4713 BC​". By converting a date to a floating point number, we can use subtraction to find the difference between two timestamps.

We can even convert our answer to hours by multiplying by 24 or to minutes by multiplying by 24 * 60.

We can emulate some of the functionality of DATE_TRUNC by using strftime. This function converts a timestamp to a string with a given output.

%dday of month: 00 %ffractional seconds: SS.SSS %Hhour: 00-24%jday of year: 001-366 %JJulian day number%mmonth: 01-12 %Mminute: 00-59 %sseconds since 1970-01-01 %Sseconds: 00-59 %wday of week 0-6 with Sunday==0 %Wweek of year: 00-53 %Yyear: 0000-9999

Normally, we might use this to convert between different timestamp formats like YYYY-MM-DD to MM-DD-YYYY:

strftime('%M-%D-%Y', mydate)

But we can cleverly choose our formatting to truncate to the appropriate point. For example, if we want to truncate to the nearest month, we can:

strftime('%M/%Y', mydate)

Or we can truncate to the nearest week, by using:

strftime('%Y-%w', mydate)

With these two easy tricks, you can use SQLite for some of the same great analyses as Amazon Redshift!

If you would like to learn more about the basics of SQL, take DataCamp's Intro to SQL for Data Science course.