Skip to content

GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

This project offers insights into these humanitarian efforts. This project engages with the GoodThought PostgreSQL database, which encapsulates detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. This comprehensive dataset includes:

  • Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
  • Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
  • Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.

Refer to the below ERD diagram for a visual representation of the relationships between these data tables:

You will execute SQL queries to answer the following two questions:

  1. What are the top five regional assignments and donor type with the highest donations?
  2. What assignment in each region has the highest impact score?

NGOs (non-governmental organizations) are nonprofits that operate independently of any government, typically one whose purpose is to address social or political issues.

Upon initial query, it is noted that each table has 5,000 rows. Donations were made to 3,188 distinct assignments, and they were given by one of three distinct types of entities; individual, organization, or corporate.

Spinner
DataFrameas
df
variable
-- Initial query of donations table
SELECT *
FROM donations;
Spinner
DataFrameas
df1
variable
-- Initial query of assignments table
SELECT *
FROM assignments;
Spinner
DataFrameas
df2
variable
-- Initial query of donor table
SELECT *
FROM donors;
Spinner
DataFrameas
df4
variable
SELECT DISTINCT(assignment_id)
FROM donations;
Spinner
DataFrameas
df3
variable
SELECT DISTINCT(donor_type)
FROM donors;

To determine the highest dontation, I joined the donor and assignment tables to the donations table. The highest total donations given to an assignment were from individual donars to Assignment_3033. All but the southern region were included in the highest donations given.

Spinner
DataFrameas
highest_donation_assignments
variable
-- Determining the top 5 highest assignments donated to
SELECT
	assignment_name,
	region,
	SUM(amount) AS total_donation_amount,
	donor_type
FROM donations AS d1
JOIN assignments AS a
	ON d1.assignment_id = a.assignment_id
JOIN donors as d2
	ON d1.donor_id = d2.donor_id
GROUP BY assignment_name, region, donor_type
ORDER BY total_donation_amount DESC
LIMIT 5

NGOs use impact scores to track their progress towards their goals like poverty reduction, community development, or environmental sustainability. The scores in this data range from 1-10, 10 reflecting the highest impact.

To determine the assignments with the highest impact score per region required partitioning the table by region and then ordering the partitions by impact score. The number of donations were added to the result as well. This was done by creating a CTE and a Window Function in a subquery.

Spinner
DataFrameas
top_regional_impact_assignments
variable
-- Deteriming the assignments with the top regional impact
WITH donation_count AS (
	SELECT
		assignment_id,
		COUNT(*) AS num_total_donations
	FROM donations
	GROUP BY assignment_id
), donation_rank AS (
	SELECT
		a.assignment_name,
		a.region,
		a.impact_score,
		dc.num_total_donations,
		ROW_NUMBER() OVER (PARTITION BY a.region ORDER BY a.impact_score DESC) AS rank_in_region
	FROM assignments as a
	JOIN donation_count as dc
		ON a.assignment_id = dc.assignment_id
)

SELECT assignment_name,
	region,
	impact_score,
	num_total_donations
FROM donation_rank
WHERE rank_in_region = 1
ORDER by region ASC