Skip to content

GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

This project offers a hands-on opportunity to explore how data-driven insights can direct and enhance these humanitarian efforts. In this project, you'll engage with the GoodThought PostgreSQL database, which encapsulates detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. This comprehensive dataset includes:

  • Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
  • Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
  • Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.

Refer to the below ERD diagram for a visual representation of the relationships between these data tables:

You will execute SQL queries to answer two questions, as listed in the instructions. Good luck!

Spinner
DataFrameas
highest_donation_assignments
variable
-- highest_donation_assignments
SELECT a.assignment_name, 
	a.region, 
	ROUND(SUM(d.amount), 2) AS rounded_total_donation_amount,
	donors.donor_type 
FROM assignments as a
JOIN donations AS d ON
	a.assignment_id = d.assignment_id 
JOIN donors ON
	d.donor_id = donors.donor_id 
GROUP BY donors.donor_type, a.assignment_name, a.region
ORDER BY rounded_total_donation_amount DESC LIMIT 5;
Spinner
DataFrameas
top_regional_impact_assignments
variable
-- top_regional_impact_assignments
SELECT subquery.assignment_name, 
	subquery.region, 
	subquery.impact_score, 
	subquery.num_total_donations
FROM (
	SELECT 
		assignments.assignment_name, 
		assignments.region, 
		MAX(assignments.impact_score) AS impact_score, 
		SUM(donations.amount) AS num_total_donations,
		ROW_NUMBER() OVER(PARTITION BY assignments.region ORDER BY MAX(assignments.impact_score) DESC) AS rn
	FROM 
		assignments
	LEFT JOIN donations ON 
		assignments.assignment_id = donations.assignment_id
	GROUP BY assignments.assignment_id, assignments.assignment_name, assignments.region
) AS subquery
WHERE subquery.rn = 1
ORDER BY subquery.region;
Spinner
DataFrameas
df
variable
--COUNT duplicate rows for DONORS 
SELECT donor_id, COUNT(*) AS count
FROM donors
GROUP BY donor_id
HAVING COUNT(*) > 1;
Spinner
DataFrameas
df1
variable
--COUNT each time a donor has donated and total amount donated
SELECT donors.donor_id, 
	COUNT(donations.donation_id) AS Number_of_Donations,
	SUM(donations.amount) AS Total_Donated
FROM donors
LEFT JOIN donations ON
	donors.donor_id = donations.donor_id
GROUP BY donors.donor_id
HAVING COUNT(donations.donation_id) > 0
ORDER BY Number_of_donations DESC, Total_Donated DESC;
Spinner
DataFrameas
df2
variable
--COUNT number of donations per region
SELECT assignments.region,
	COUNT(donations.donation_id) AS total_donations_per_region
FROM assignments
LEFT JOIN donations ON
	assignments.assignment_id = donations.assignment_id
GROUP BY assignments.region
ORDER BY total_donations_per_region DESC;