Skip to content
Duplicate of 2022 12 SQL Live Training Workspace
  • AI Chat
  • Code
  • Report
  • Free Trial Performance Analysis in SQL

    How much is a free trial worth? In this SQL live training, you'll learn how to use SQL to analyze marketing data for a free trial of a product. You’ll summarize and aggregate data to calculate the sorts of metrics that are the bread and butter of any marketing analyst role.

    This live training will require only SQL and Workspace, it will not involve R or Python.

    The Data

    Data Access

    To access the data, create a PostgreSQL integration with the following details:

    host: workspacedemodb.datacamp.com

    port: 5432

    database: free_trial_performance

    username: competition_free_trial

    password: workspace

    Data Deals

    We'll be using synthetic data that was created for this training. This data represents a product that is sold via a 1-month free trial. The Free Trials table records instances of customers beginning a free trial. 1 month after the free trial period starts, the customer may choose to pay, and if so we will have a Purchase record.

    There are four tables:

    Free Trials

    A list of instances of free trials starts.

    • Trial ID - An ID unique to the Free Trial.
    • Free Trial Start Date - The date when the customer began their free trial.
    • Region - The world region where the customer is located.

    Purchases

    A list of instances of customers paying, following their free trial.

    • Trial ID - The ID of the free trial, from the Free Trials table. This ID is unique as each trial may have a maximum of 1 purchase asociated with it.
    • Purchase Date - The date when the customer made their purchase, 1 month after they began their free trial.
    • Purchase Value - The USD value of the Customer's purchase.

    Dates

    A list of dates, provided for convenience.

    • Date - A sequential list of dates.
    • Month - The first of the month for each date.

    Prices

    Optional - a list of prices of the product by region over time. This table will not be used in the live training, and is for optional follow-up activity. Prices are set on a Monthly basis, but the price for each customer is set at the beginning of their free trial, so subsequent price changes will not affect a customer.

    • Free Trial Start Month - the month of free trials that the price applies to.
    • Region - the customer's world region, as in the Free Trials table.
    • The price that will be locked in at the beginning of the customer's Free Trial, based on their Free Trial Start Month & Region.

    Unknown integration
    Data frameavailable as
    df
    variable
    select * from trials
    Unknown integration
    Data frameavailable as
    df
    variable
    select * from purchases

    1. Getting Familiar with the Data

    We will query the Free Trials & Purchases tables, and produce graphs showing the volume of both of these over time.

    a) Group trials by the month of free_trial_start_date (and order by the same).

    Count the rows as num_free_trials.

    Unknown integration
    Data frameavailable as
    df
    variable
    select
            date_trunc('month', free_trial_start_date) as month
        ,	count(*) as num_free_trials
    from trials
        group by 1
        order by 1

    b) Group purchases by the month of purchase_date (and order by the same).

    Count the rows as num_purchases, and sum purchase_value as usd_value.

    Call the output purchases_by_month.

    Unknown integration
    Data frameavailable as
    purchases_by_month
    variable
    select
            date_trunc('month', purchase_date) as month
        ,	count(*) as num_purchases
        ,	sum(purchase_value) as usd_value
    from purchases
        group by 1
        order by 1

    c) Create a line graph of num_purchases by month.

    Current Type: Line
    Current X-axis: month
    Current Y-axis: num_purchases
    Current Color: None

    Purchases per Day

    2. Data Aggregation 1 - Velocity Metrics by Month

    We will pull metrics for Free Trial Starts, Purchases, and Gross Merchandise Value by month from the Free Trials & Purchases tables.

    a) Now that we can aggregate the data by month, create both summaries as Common Table Expressions (CTEs), and left join our purchases per month against the free trials per month to get the results into a combined results table.

    Unknown integration
    Data frameavailable as
    df
    variable
    with free_trials_per_month as (
        select
                date_trunc('month', free_trial_start_date) as month
            ,	count(*) as num_free_trials
        from trials
            group by 1
            order by 1
    )
    
    ,	purchases_per_month as (
        select
                date_trunc('month', purchase_date) as month
            ,	count(*) as num_purchases
            ,	sum(purchase_value) as usd_value
        from purchases
            group by 1
            order by 1
    )
    
    select
    		free_trials_per_month.month
        ,	free_trials_per_month.num_free_trials
        ,	purchases_per_month.num_purchases
        ,	purchases_per_month.usd_value
    from free_trials_per_month
    	left join purchases_per_month
        	on purchases_per_month.month = free_trials_per_month.month

    Do you notice that there's some data missing? When we left join purchases_per_month, we only match against months that exist in free_trials_per_month. There are several ways to solve this.