Project: Impact Analysis of GoodThought NGO Initiatives

GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

This project offers insights into these humanitarian efforts. This project engages with the GoodThought PostgreSQL database, which encapsulates detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. This comprehensive dataset includes:

Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.

Refer to the below ERD diagram for a visual representation of the relationships between these data tables:

You will execute SQL queries to answer the following two questions:

What are the top five regional assignments and donor type with the highest donations?
What assignment in each region has the highest impact score?

NGOs (non-governmental organizations) are nonprofits that operate independently of any government, typically one whose purpose is to address social or political issues.

Upon initial query, it is noted that each table has 5,000 rows. Donations were made to 3,188 distinct assignments, and they were given by one of three distinct types of entities; individual, organization, or corporate.

DataFrameas

df

variable

-- Initial query of donations table
SELECT *
FROM donations;

DataFrameas

df1

variable

-- Initial query of assignments table
SELECT *
FROM assignments;

DataFrameas

df2

variable

-- Initial query of donor table
SELECT *
FROM donors;

DataFrameas

df4

variable

SELECT DISTINCT(assignment_id)
FROM donations;

DataFrameas

df3

variable

SELECT DISTINCT(donor_type)
FROM donors;

To determine the highest dontation, I joined the donor and assignment tables to the donations table. The highest total donations given to an assignment were from individual donars to Assignment_3033. All but the southern region were included in the highest donations given.

DataFrameas

highest_donation_assignments

variable

-- Determining the top 5 highest assignments donated to
SELECT
	assignment_name,
	region,
	SUM(amount) AS total_donation_amount,
	donor_type
FROM donations AS d1
JOIN assignments AS a
	ON d1.assignment_id = a.assignment_id
JOIN donors as d2
	ON d1.donor_id = d2.donor_id
GROUP BY assignment_name, region, donor_type
ORDER BY total_donation_amount DESC
LIMIT 5

NGOs use impact scores to track their progress towards their goals like poverty reduction, community development, or environmental sustainability. The scores in this data range from 1-10, 10 reflecting the highest impact.

To determine the assignments with the highest impact score per region required partitioning the table by region and then ordering the partitions by impact score. The number of donations were added to the result as well. This was done by creating a CTE and a Window Function in a subquery.

DataFrameas

top_regional_impact_assignments

variable

-- Deteriming the assignments with the top regional impact
WITH donation_count AS (
	SELECT
		assignment_id,
		COUNT(*) AS num_total_donations
	FROM donations
	GROUP BY assignment_id
), donation_rank AS (
	SELECT
		a.assignment_name,
		a.region,
		a.impact_score,
		dc.num_total_donations,
		ROW_NUMBER() OVER (PARTITION BY a.region ORDER BY a.impact_score DESC) AS rank_in_region
	FROM assignments as a
	JOIN donation_count as dc
		ON a.assignment_id = dc.assignment_id
)

SELECT assignment_name,
	region,
	impact_score,
	num_total_donations
FROM donation_rank
WHERE rank_in_region = 1
ORDER by region ASC