GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

This project offers a hands-on opportunity to explore how data-driven insights can direct and enhance these humanitarian efforts. You'll work with the GoodThought PostgreSQL database, which holds detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. The dataset includes:

  • Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
  • Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
  • Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.
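To make the relationships between the three tables concrete, here is a minimal sketch that builds a toy version of the schema and joins it. It uses Python's built-in `sqlite3` as a stand-in for the PostgreSQL database. The column names `start_date`, `end_date`, and `budget` are assumptions based on the descriptions above, all rows are made up, and the donor type values are hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the GoodThought PostgreSQL database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (
    assignment_id   INTEGER PRIMARY KEY,
    assignment_name TEXT,
    start_date      TEXT,   -- assumed column name
    end_date        TEXT,   -- assumed column name
    budget          REAL,   -- assumed column name
    region          TEXT,
    impact_score    REAL
);
CREATE TABLE donors (
    donor_id   INTEGER PRIMARY KEY,
    donor_type TEXT
);
CREATE TABLE donations (
    donation_id   INTEGER PRIMARY KEY,
    donor_id      INTEGER REFERENCES donors(donor_id),
    assignment_id INTEGER REFERENCES assignments(assignment_id),
    amount        REAL
);
-- Made-up rows, for illustration only.
INSERT INTO assignments VALUES
    (1, 'Toy_Assignment_A', '2020-01-01', '2020-12-31', 1000.0, 'East', 9.5),
    (2, 'Toy_Assignment_B', '2021-01-01', '2021-06-30', 2000.0, 'West', 7.0);
INSERT INTO donors VALUES (10, 'Individual'), (11, 'Organization');
INSERT INTO donations VALUES
    (100, 10, 1, 50.0),
    (101, 11, 1, 150.0);
""")

# Each donation links one donor to one assignment.
rows = conn.execute("""
    SELECT a.assignment_name, dnr.donor_type, d.amount
    FROM donations AS d
    JOIN assignments AS a ON a.assignment_id = d.assignment_id
    JOIN donors AS dnr ON dnr.donor_id = d.donor_id
    ORDER BY d.donation_id
""").fetchall()
print(rows)
# [('Toy_Assignment_A', 'Individual', 50.0), ('Toy_Assignment_A', 'Organization', 150.0)]
```

Toy_Assignment_B has no donations, so it does not appear in the inner join. This is why the queries in this notebook use a LEFT JOIN from `assignments` when assignments without donations still need to be counted.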

Refer to the entity-relationship diagram (ERD) below for a visual representation of the relationships between these tables:

You will execute SQL queries to answer two questions, as listed in the instructions. Good luck!

-- df4
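-- Rank assignments within each region by impact score (1 = highest; ties share a rank).
SELECT assignment_id,
       DENSE_RANK() OVER (PARTITION BY region ORDER BY impact_score DESC) AS ranking
FROM assignments
LIMIT 5;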
-- df3
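-- Count donations per assignment; the LEFT JOIN keeps assignments with zero donations.
SELECT a.assignment_id, COUNT(d.donation_id) AS n_donations
FROM assignments AS a
LEFT JOIN donations AS d
  ON a.assignment_id = d.assignment_id
GROUP BY a.assignment_id
LIMIT 5;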
-- df5
-- Top-ranked assignments per region that received at least one donation (first 4 rows).
WITH rnk AS (
    SELECT assignment_id,
           DENSE_RANK() OVER (PARTITION BY region ORDER BY impact_score DESC) AS ranking
    FROM assignments
),
cnt AS (
    SELECT a.assignment_id, COUNT(d.donation_id) AS n_donations
    FROM assignments AS a
    LEFT JOIN donations AS d
      ON a.assignment_id = d.assignment_id
    GROUP BY a.assignment_id
)
SELECT a.assignment_name, a.region, a.impact_score, cnt.n_donations AS num_total_donations
FROM assignments AS a
LEFT JOIN rnk
  ON a.assignment_id = rnk.assignment_id
LEFT JOIN cnt
  ON a.assignment_id = cnt.assignment_id
WHERE rnk.ranking = 1 AND cnt.n_donations > 0
ORDER BY a.region ASC
LIMIT 4;
-- df2
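-- Rank assignments that have donations within each region (one row per donation).
-- DESC makes rank 1 the highest impact score, matching the other cells.
SELECT a.region,
       DENSE_RANK() OVER (PARTITION BY a.region ORDER BY a.impact_score DESC) AS ranking
FROM assignments AS a
INNER JOIN donations AS d
  ON a.assignment_id = d.assignment_id
LIMIT 5;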
-- df1
-- Highest impact score per region among assignments with donations.
-- Note: n_donations counts donations across the whole region, not per assignment.
WITH t AS (
    SELECT a.region, MAX(a.impact_score) AS impact, COUNT(d.donation_id) AS n_donations
    FROM assignments AS a
    INNER JOIN donations AS d
      ON a.assignment_id = d.assignment_id
    GROUP BY a.region
)
SELECT a.assignment_name, t.region, t.impact, t.n_donations
FROM t
LEFT JOIN assignments AS a
  ON t.region = a.region AND t.impact = a.impact_score;
-- highest_donation_assignments
-- Top five assignment / donor-type pairs by total donated amount.
WITH t AS (
    SELECT a.assignment_name, dnr.donor_type, SUM(d.amount) AS total
    FROM assignments AS a
    LEFT JOIN donations AS d
      ON a.assignment_id = d.assignment_id
    LEFT JOIN donors AS dnr
      ON d.donor_id = dnr.donor_id
    GROUP BY a.assignment_name, dnr.donor_type
)
SELECT t.assignment_name,
       t.donor_type,
       -- Rounded to match the column name; two decimal places is an assumption.
       ROUND(t.total::numeric, 2) AS rounded_total_donation_amount,
       a.region
FROM t
LEFT JOIN assignments AS a
  ON t.assignment_name = a.assignment_name
WHERE t.total IS NOT NULL  -- drop assignments that received no donations
ORDER BY t.total DESC
LIMIT 5;




-- top_regional_impact_assignments
-- Highest-impact assignment in each region that received at least one donation.
WITH rnk AS (
    SELECT assignment_id,
           DENSE_RANK() OVER (PARTITION BY region ORDER BY impact_score DESC) AS ranking
    FROM assignments
),
cnt AS (
    SELECT a.assignment_id, COUNT(d.donation_id) AS n_donations
    FROM assignments AS a
    LEFT JOIN donations AS d
      ON a.assignment_id = d.assignment_id
    GROUP BY a.assignment_id
)
SELECT a.assignment_name, a.region, a.impact_score, cnt.n_donations AS num_total_donations
FROM assignments AS a
LEFT JOIN rnk
  ON a.assignment_id = rnk.assignment_id
LEFT JOIN cnt
  ON a.assignment_id = cnt.assignment_id
WHERE rnk.ranking = 1
  AND cnt.n_donations > 0
  -- Manual exclusion kept from the original; presumably it removes a tied rank-1 row,
  -- since DENSE_RANK gives equal scores the same rank.
  AND a.assignment_name != 'Assignment_3764'
ORDER BY a.region ASC;
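The hard-coded exclusion of `'Assignment_3764'` in the query above suggests that two assignments tied for rank 1 in one region. One possible alternative, sketched here on made-up data, is to rank only the assignments that have donations and use `ROW_NUMBER()` so that each region returns exactly one row. This is a different technique from the query above, not a restatement of it: the tie-breaker on `assignment_id` is an arbitrary assumption, and a region whose top-scoring assignment has no donations now falls back to its best donated assignment instead of disappearing. The sketch runs on Python's built-in `sqlite3` rather than PostgreSQL, and the table contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (
    assignment_id INTEGER PRIMARY KEY,
    assignment_name TEXT,
    region TEXT,
    impact_score REAL
);
CREATE TABLE donations (
    donation_id INTEGER PRIMARY KEY,
    assignment_id INTEGER,
    amount REAL
);
-- Made-up rows, for illustration only.
INSERT INTO assignments VALUES
    (1, 'Toy_A', 'East', 9.9),
    (2, 'Toy_B', 'East', 9.9),  -- ties with Toy_A on impact score
    (3, 'Toy_C', 'West', 9.8),  -- best score in West, but no donations
    (4, 'Toy_D', 'West', 8.0);
INSERT INTO donations VALUES
    (100, 1, 10.0),
    (101, 2, 20.0),
    (102, 2, 5.0),
    (103, 4, 7.0);
""")

query = """
WITH cnt AS (
    -- Inner join: only assignments with at least one donation survive.
    SELECT a.assignment_id, a.assignment_name, a.region, a.impact_score,
           COUNT(d.donation_id) AS n_donations
    FROM assignments AS a
    JOIN donations AS d ON a.assignment_id = d.assignment_id
    GROUP BY a.assignment_id, a.assignment_name, a.region, a.impact_score
),
rnk AS (
    -- ROW_NUMBER never repeats a number, so ties are broken by assignment_id (assumed rule).
    SELECT cnt.*,
           ROW_NUMBER() OVER (
               PARTITION BY region
               ORDER BY impact_score DESC, assignment_id
           ) AS rn
    FROM cnt
)
SELECT assignment_name, region, impact_score, n_donations AS num_total_donations
FROM rnk
WHERE rn = 1
ORDER BY region ASC
"""
rows = conn.execute(query).fetchall()
print(rows)
# [('Toy_A', 'East', 9.9, 1), ('Toy_D', 'West', 8.0, 1)]
```

In the East region both toy assignments score 9.9, and only the one with the lower `assignment_id` is returned. Whether that tie-break matches the expected answer for the real data is not something the notebook states, so treat it as a starting point.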
-- df
SELECT *
FROM assignments
WHERE impact_score >= 9.98
LIMIT 20;