Skip to content
Point of Sale(POS) Data from a Supermarket: Transactions and Cashier Operations - SQL
  • AI Chat
  • Code
  • Report
  • Point of Sale(POS) Data from a Supermarket: Transactions and Cashier Operations

    Retailers/Businesses need to understand their checkout data to plan effectively and potentially increase profit.

    We explore questions like

    • Do people spend more with card or cash?
    • How many items do people buy at different times?
    • Are self-service checkouts quicker than those with a cashier?
    • What is the trend in amount across days?
    • Should stores be open on Sundays?
    • and more

    Data from transaction data from a Polish Supermarket

    We will use SQL for queries and Plotly for visualization.

    Importing libraries that we may use.

    import plotly.express as px
    import numpy as np

    Task 1: Exploring the data

    A good first step in exploring a database is to select the first few rows from each table.

    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT *
    FROM 'pos_operator_logs.csv'
    LIMIT 10
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT *
    FROM 'pos_transactions.csv'
    LIMIT 10
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    Task 2: Do more people make transactions by card or by cash?

    Here we look to understand if people prefer to make transactions from their card or cash.

    Unknown integration
    DataFrameavailable as
    df1
    variable
    SELECT 
    SUM(t_cash::numeric) AS trans_cash,
    SUM(t_card::numeric) AS trans_card
    FROM 'pos_transactions.csv'
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    More transactions were done using cash than with card. Let's visualise this using pie plot

    px.pie(df1,values= df1.iloc[0],names=['Trans_cash','Trans_card'],
           title='RATIO OF TRANSACTION OF CASH AND CARD',
          color_discrete_sequence=px.colors.sequential.Jet)

    Task 3: Do people spend more per transaction when using cash or card?

    Similarly, knowing how much people spend per transaction using different payment methods is helpful for deciding on security arrangments, and marketing to encourage shoppers to use different payment methods.

    Unknown integration
    DataFrameavailable as
    df2
    variable
    SELECT 
    AVG(CASE WHEN t_cash == 'True' THEN amount END) as avg_amount_cash,
    AVG(CASE WHEN t_card == 'True' THEN amount END) AS avg_amount_card
    FROM 'pos_transactions.csv'
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.