
GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

This project offers a hands-on opportunity to explore how data-driven insights can direct and enhance these humanitarian efforts. You'll work with the GoodThought PostgreSQL database, which holds detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. The dataset includes:

  • Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
  • Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
  • Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.

Refer to the ERD below for a visual representation of the relationships between these tables:

You will execute SQL queries to answer two questions, as listed in the instructions. Good luck!

Initial steps

To answer the two questions, we will start with an exploratory analysis of the three tables.

-- Table: assignments
-- Previewing 5 records
SELECT *
FROM public.assignments
LIMIT 5;
-- Table: assignments
-- Counting records
SELECT COUNT(*)
FROM public.assignments;
-- Table: donations
-- Previewing 5 records
SELECT *
FROM public.donations
LIMIT 5;
-- Table: donations
-- Counting records
SELECT COUNT(*)
FROM public.donations;
-- Table: donors
-- Previewing 5 records
SELECT *
FROM public.donors
LIMIT 5;

-- Table: donors
-- Counting records
SELECT COUNT(*)
FROM public.donors;

Task 1.

I used a CTE to build a clean, readable join of the three tables, keeping only the columns that will help me answer the question. I limited the output to the first 10 rows to preview the data.

-- Join the three tables once, keeping only the columns needed for the question
WITH goodthought AS (
    SELECT
        a.assignment_name,
        a.region,
        d.amount,
        dn.donor_type
    FROM public.assignments AS a
    INNER JOIN public.donations AS d
        ON a.assignment_id = d.assignment_id
    INNER JOIN public.donors AS dn
        ON d.donor_id = dn.donor_id
)
-- Preview the first 10 joined rows (one row per donation)
SELECT
    assignment_name,
    region,
    amount,
    donor_type
FROM goodthought
LIMIT 10;
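
The join pattern above can be tried on a few rows without access to the GoodThought database. The sketch below is only a stand-in under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory SQLite database instead of PostgreSQL, the table and column names mirror the query above, and every row is made up for illustration. It shows that the joined CTE yields one row per donation, so an assignment with several donations appears several times.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL database (assumption: toy data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY, assignment_name TEXT, region TEXT);
CREATE TABLE donors (donor_id INTEGER PRIMARY KEY, donor_type TEXT);
CREATE TABLE donations (donation_id INTEGER PRIMARY KEY, donor_id INTEGER,
                        assignment_id INTEGER, amount REAL);

-- Hypothetical rows, not taken from the real dataset
INSERT INTO assignments VALUES (1, 'Clean Water', 'East'), (2, 'School Kits', 'West');
INSERT INTO donors VALUES (10, 'Individual'), (11, 'Organization');
INSERT INTO donations VALUES (100, 10, 1, 50.0), (101, 11, 1, 200.0), (102, 10, 2, 75.0);
""")

query = """
WITH goodthought AS (
    SELECT a.assignment_name, a.region, d.amount, dn.donor_type
    FROM assignments AS a
    INNER JOIN donations AS d ON a.assignment_id = d.assignment_id
    INNER JOIN donors AS dn ON d.donor_id = dn.donor_id
)
SELECT assignment_name, region, amount, donor_type
FROM goodthought
ORDER BY assignment_name, amount;
"""

# 'Clean Water' has two donations, so it appears in two joined rows.
joined_rows = conn.execute(query).fetchall()
for row in joined_rows:
    print(row)
```

Because the join fans out to donation level, any per-assignment figure needs an aggregate (for example SUM or COUNT with GROUP BY) on top of the CTE.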

Task 2.

Since the task asks for the top regional impact assignments, I first reused the "goodthought" CTE to preview the 10 assignments with the highest impact scores, together with their donation counts.

-- Pair each assignment with its donations (assignments without donations drop out)
WITH goodthought AS (
    SELECT
        a.assignment_name,
        a.region,
        a.impact_score,
        d.donation_id
    FROM public.assignments AS a
    INNER JOIN public.donations AS d
        ON a.assignment_id = d.assignment_id
)
-- Count donations per assignment and preview the 10 highest impact scores
SELECT
    assignment_name,
    region,
    impact_score,
    COUNT(donation_id) AS num_total_donations
FROM goodthought
GROUP BY
    assignment_name,
    region,
    impact_score
ORDER BY impact_score DESC
LIMIT 10;
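
The grouped count can be sketched on made-up rows as well. As before, this is an illustration under assumptions, not the real data: an in-memory SQLite database via Python's sqlite3 stands in for PostgreSQL, and the rows (including the hypothetical 'Mobile Clinic' assignment with no donations) are invented. It illustrates one consequence of the INNER JOIN in the CTE: an assignment with no donations never reaches the GROUP BY, so every listed assignment has at least one donation.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL database (assumption: toy data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY, assignment_name TEXT,
                          region TEXT, impact_score REAL);
CREATE TABLE donations (donation_id INTEGER PRIMARY KEY, assignment_id INTEGER);

-- Hypothetical rows; 'Mobile Clinic' has the highest score but no donations
INSERT INTO assignments VALUES
    (1, 'Clean Water', 'East', 9.5),
    (2, 'School Kits', 'West', 8.0),
    (3, 'Mobile Clinic', 'East', 9.9);
INSERT INTO donations VALUES (100, 1), (101, 1), (102, 2);
""")

query = """
WITH goodthought AS (
    SELECT a.assignment_name, a.region, a.impact_score, d.donation_id
    FROM assignments AS a
    INNER JOIN donations AS d ON a.assignment_id = d.assignment_id
)
SELECT assignment_name, region, impact_score,
       COUNT(donation_id) AS num_total_donations
FROM goodthought
GROUP BY assignment_name, region, impact_score
ORDER BY impact_score DESC
LIMIT 10;
"""

# 'Mobile Clinic' is absent: the inner join removed it before grouping.
counted_rows = conn.execute(query).fetchall()
for row in counted_rows:
    print(row)
```

If assignments without donations should be listed with a count of zero, a LEFT JOIN from assignments to donations would be the usual alternative.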

Now, to obtain the top-impact assignments by region, I added a second CTE, "ranked," that partitions the rows by region and numbers them with ROW_NUMBER() in descending order of impact score. This ranks the assignments by impact score within each region.

-- Pair each assignment with its donations
WITH goodthought AS (
    SELECT
        a.assignment_name,
        a.region,
        a.impact_score,
        d.donation_id
    FROM public.assignments AS a
    INNER JOIN public.donations AS d
        ON a.assignment_id = d.assignment_id
),
-- Count donations per assignment and rank assignments within each region
ranked AS (
    SELECT
        assignment_name,
        region,
        impact_score,
        COUNT(donation_id) AS num_total_donations,
        ROW_NUMBER() OVER (PARTITION BY region ORDER BY impact_score DESC) AS region_rank
    FROM goodthought
    GROUP BY
        assignment_name,
        region,
        impact_score
)
-- Rank-1 rows (the top assignment in each region) sort first;
-- LIMIT 5 keeps the first five rows of that ordering
SELECT
    assignment_name,
    region,
    impact_score,
    num_total_donations,
    region_rank
FROM ranked
ORDER BY region_rank ASC, impact_score DESC
LIMIT 5;
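
A sketch of the ranking step on made-up rows follows. The assumptions are the same as for any toy example: an in-memory SQLite database via Python's sqlite3 (SQLite supports window functions from version 3.25 onward) stands in for PostgreSQL, and all rows are invented. The WHERE filter on region_rank is an illustrative variation, not the query above. It shows that ROW_NUMBER() restarts at 1 within each region, and that filtering on region_rank = 1 is one way to keep exactly one assignment per region regardless of how many regions exist.

```python
import sqlite3

# In-memory SQLite stand-in for the PostgreSQL database (assumption: toy data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY, assignment_name TEXT,
                          region TEXT, impact_score REAL);
CREATE TABLE donations (donation_id INTEGER PRIMARY KEY, assignment_id INTEGER);

-- Hypothetical rows: two assignments in East, one each in West and North
INSERT INTO assignments VALUES
    (1, 'Clean Water', 'East', 9.5),
    (2, 'Mobile Clinic', 'East', 7.0),
    (3, 'School Kits', 'West', 8.0),
    (4, 'Tree Planting', 'North', 6.5);
INSERT INTO donations VALUES (100, 1), (101, 1), (102, 2), (103, 3), (104, 4);
""")

query = """
WITH goodthought AS (
    SELECT a.assignment_name, a.region, a.impact_score, d.donation_id
    FROM assignments AS a
    INNER JOIN donations AS d ON a.assignment_id = d.assignment_id
),
ranked AS (
    SELECT assignment_name, region, impact_score,
           COUNT(donation_id) AS num_total_donations,
           ROW_NUMBER() OVER (PARTITION BY region ORDER BY impact_score DESC) AS region_rank
    FROM goodthought
    GROUP BY assignment_name, region, impact_score
)
SELECT assignment_name, region, impact_score, num_total_donations, region_rank
FROM ranked
WHERE region_rank = 1   -- illustrative filter: keep only the top assignment per region
ORDER BY region;
"""

top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

One design point to keep in mind: when two assignments in a region tie on impact score, ROW_NUMBER() assigns rank 1 to one of them arbitrarily, whereas RANK() would give both rank 1.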

Final results