Skip to content

GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

This project offers a hands-on opportunity to explore how data-driven insights can direct and enhance these humanitarian efforts. In this project, you'll engage with the GoodThought PostgreSQL database, which encapsulates detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. This comprehensive dataset includes:

  • Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
  • Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
  • Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.

Refer to the below ERD diagram for a visual representation of the relationships between these data tables:

You will execute SQL queries to answer two questions, as listed in the instructions. Good luck!

Spinner
DataFrameas
df
variable
SELECT * FROM information_schema.tables;
Spinner
DataFrameas
df1
variable
SELECT * FROM assignments;
Spinner
DataFrameas
df3
variable
SELECT * FROM public.donations
Spinner
DataFrameas
df4
variable
SELECT * FROM public.donors

Identifying the top five assignments with the highest total donations by donor type

Spinner
DataFrameas
withoutCTE_highest_donation_assignments
variable
--highest_donation_assignments
SELECT a.assignment_name,
	region,
	ROUND(SUM(dt.amount),2) AS rounded_total_donation_amount,
	dr.donor_type
FROM public.assignments AS a
JOIN public.donations AS dt 
	ON a.assignment_id = dt.assignment_id
JOIN public.donors AS dr 
	ON dt.donor_id = dr.donor_id
GROUP BY 
	a.assignment_name, 
	region, 
	dr.donor_type
ORDER BY rounded_total_donation_amount DESC
LIMIT 5;

Identifying the leading assignment by impact in each region

Spinner
DataFrameas
withoutCTE_top_regional_impact_assignments
variable
-- top_regional_impact_assignments
SELECT 
	a.assignment_name,
	region,impact_score,
	COUNT(dt.amount) AS num_total_donations
FROM public.assignments AS a
JOIN public.donations AS dt 
	ON a.assignment_id = dt.assignment_id
WHERE
    (a.region, a.impact_score) IN (
        SELECT
            region,
            MAX(impact_score)
        FROM
            public.assignments
        GROUP BY
            region
    )
GROUP BY 
	a.assignment_name,
    a.region,
	a.impact_score

ORDER BY region ASC;

Using CTE : Identifying the top five assignments with the highest total donations by donor type

Spinner
DataFrameas
highest_donation_assignments
variable
WITH highest_donations AS (
	SELECT 
	a.assignment_name,
	a.region,
	ROUND(SUM(dt.amount),2) AS rounded_total_donation_amount,
	donor_type
	FROM public.assignments AS a
	JOIN public.donations AS dt
		ON a.assignment_id = dt.assignment_id
	JOIN public.donors AS dr 
		ON dt.donor_id = dr.donor_id

	GROUP BY 
	a.assignment_name, 
	a.region, 
	dr.donor_type
)

SELECT 
	assignment_name,
	region,
	rounded_total_donation_amount,
	donor_type
FROM highest_donations
ORDER BY rounded_total_donation_amount DESC
LIMIT 5;

Using CTE:Identifying the assignment with the highest impact store in each region

Spinner
DataFrameas
top_regional_impact_assignments
variable
WITH highest_scoring AS (
	SELECT 
		a.assignment_name,
		region,impact_score,
		COUNT(dt.amount) AS num_total_donations
	FROM public.assignments AS a
	JOIN public.donations AS dt 
		ON a.assignment_id = dt.assignment_id
	GROUP BY 
		a.assignment_name,
    	a.region,
		a.impact_score
	HAVING COUNT(dt.amount) >=1
),

ranking_assignments AS (
	SELECT 
		assignment_name,
		region,
		impact_score,
		num_total_donations,
		ROW_NUMBER() OVER(PARTITION BY region ORDER BY impact_score DESC) AS assignment_rank
	FROM highest_scoring
)

SELECT 
	assignment_name,
	region,
	impact_score,
	num_total_donations
FROM ranking_assignments
	WHERE assignment_rank=1
ORDER BY region ASC, impact_score DESC;
Spinner
DataFrameas
online_retail_csv
variable
SELECT * FROM 'online_retail.csv'