Skip to content

Practical Exam: Loan Insights

EasyLoan offers a wide range of loan services, including personal loans, car loans, and mortgages.

EasyLoan offers loans to clients from Canada, United Kingdom and United States.

The analytics team wants to report performance across different geographic areas. They aim to identify areas of strength and weakness for the business strategy team.

They need your help to ensure the data is accessible and reliable before they start reporting.

Database Schema

The data you need is in the database named lending.

Task 1

The analytics team wants to use the client table to create a dashboard for client details. For them to proceed, they need to be sure the data is clean enough to use.

The client table below illustrates what the analytics team expects the data types and format to be.

Write an SQL query that returns the client table with the specified format, including identifying and cleaning all invalid values.

  • Your output should be a DataFrame with the name 'client'. Do not modify the client table.
  • Note that the DataLab environment formats dates as YYYY-MM-DD-hh-ss-SSS.
Column NameDescription
client_idUnique integer (set by the database, can’t take any other value)
date_of_birthDate of birth of the client, as a date
employment_statusCurrent employment status of the client, either employed or unemployed, as a lower case string
countryThe country where the client resides, either USA, UK or CA, as an upper case string
Spinner
DataFrameas
client
variable
SELECT DISTINCT
	client_id,
    date_of_birth,
	CASE
        WHEN TRIM(LOWER(employment_status)) = 'emplouyed' THEN 'employed'
        WHEN TRIM(LOWER(employment_status)) = 'fulltime' THEN 'employed'
        WHEN TRIM(LOWER(employment_status))= 'parttime' THEN 'employed'
        ELSE TRIM(LOWER(employment_status))
    END AS employment_status,
    TRIM(UPPER(country)) AS country
FROM client

Task 2

You have been told that there was a problem in the backend system as some of the repayment_channel values are missing.

The missing values are critical to the analysis so they need to be filled in before proceeding.

Luckily, they have discovered a pattern in the missing values:

  • Repayment higher than 4000 dollars should be made via bank account.
  • Repayment lower than 1000 dollars should be made via mail.

Write an SQL query that makes the repayment table match these criteria.

  • Your output should be a DataFrame with the name 'repayment'. Do not modify the original repayment table.
Spinner
DataFrameas
repayment
variable
SELECT
  repayment_id,
  repayment_date,
  repayment_amount,
  CASE
    WHEN repayment_amount > 4000 AND repayment_channel = '-' THEN 'bank account'
    WHEN repayment_amount < 1000 AND repayment_channel = '-' THEN 'mail'
    ELSE repayment_channel
  END AS repayment_channel
FROM
    repayment

Task 3

Starting on January 1st, 2022, all US clients started to use an online system to sign contracts.

The analytics team wants to analyze the loans for US clients who used the new online system.

Write a query that returns the data for the analytics team. Your output should include client_id,contract_date, principal_amount and loan_type columns.

Spinner
DataFrameas
us_loans
variable
SELECT 
	LO.client_id AS client_id,
    CO.contract_date AS contract_date,
    LO.principal_amount AS principal_amount,
    LO.loan_type AS loan_type
FROM loan LO
INNER JOIN contract CO 
	ON CO.contract_id = LO.contract_id
INNER JOIN client CL 
	ON CL.client_id = LO.client_id
WHERE CL.country LIKE '%USA%'
AND CO.contract_date >= '2022-01-01'

Task 4

The business strategy team is considering offering a more competitive rate to the US market.

The analytic team want to compare the average interest rates offered by the company for the same loan type in different countries to determine if there are significant differences.

Write a query that returns the data for the analytics team. Your output should include loan_type, country and avg_rate columns.

Spinner
DataFrameas
average_data
variable
SELECT LO.loan_type, CL.country, AVG(lo.interest_rate) avg_rate
FROM  loan LO
LEFT JOIN client CL
	ON CL.client_id = LO.client_id
GROUP BY LO.loan_type, CL.country