(SQL) Project: Impact Analysis of GoodThought NGO Initiatives

    Use SQL to explore and analyze GoodThought NGO's database, uncovering key insights from over 13 years of transformative projects.

    Since 2010, GoodThought NGO has led transformative efforts in education, healthcare, and sustainability worldwide. Dive into a PostgreSQL database to analyze key metrics from 2010 to 2023, track donations, and assess program effectiveness. This project offers a deep dive into the data, revealing the impact and outcomes of GoodThought's initiatives.

    GoodThought NGO has been a catalyst for positive change, focusing its efforts on education, healthcare, and sustainable development to make a significant difference in communities worldwide. With this mission, GoodThought has orchestrated an array of assignments aimed at uplifting underprivileged populations and fostering long-term growth.

    This project offers a hands-on opportunity to explore how data-driven insights can direct and enhance these humanitarian efforts. You'll work with the GoodThought PostgreSQL database, which holds detailed records of assignments, funding, impacts, and donor activities from 2010 to 2023. The dataset includes:

    • Assignments: Details about each project, including its name, duration (start and end dates), budget, geographical region, and the impact score.
    • Donations: Records of financial contributions, linked to specific donors and assignments, highlighting how financial support is allocated and utilized.
    • Donors: Information on individuals and organizations that fund GoodThought’s projects, including donor types.

    Refer to the entity-relationship diagram (ERD) below for a visual representation of the relationships between these tables:

    You will execute SQL queries to answer two questions, as listed in the instructions. Good luck!
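As a rough textual companion to the ERD, the sketch below builds the three tables in an in-memory SQLite database from Python. Only the columns that this project's queries actually reference (`assignment_id`, `assignment_name`, `region`, `impact_score`, `donation_id`, `donor_id`, `amount`, `donor_type`) come from the project itself. The other column names (`start_date`, `end_date`, `budget`, `donor_name`), all column types, and the helper names are assumptions made for illustration, and the real PostgreSQL schema may differ.

```python
import sqlite3

# Hypothetical schema. Columns marked "assumed" are guesses based on the
# dataset description, not names confirmed by the project.
SCHEMA = """
CREATE TABLE assignments (
    assignment_id   INTEGER PRIMARY KEY,
    assignment_name TEXT NOT NULL,
    start_date      TEXT,   -- assumed
    end_date        TEXT,   -- assumed
    budget          REAL,   -- assumed
    region          TEXT,
    impact_score    REAL
);
CREATE TABLE donors (
    donor_id   INTEGER PRIMARY KEY,
    donor_name TEXT,        -- assumed
    donor_type TEXT
);
CREATE TABLE donations (
    donation_id   INTEGER PRIMARY KEY,
    donor_id      INTEGER REFERENCES donors (donor_id),
    assignment_id INTEGER REFERENCES assignments (assignment_id),
    amount        REAL
);
"""


def build_schema():
    """Create the three empty tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def table_columns(conn, table):
    """Return the column names of a table, in declaration order."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


conn = build_schema()
for name in ("assignments", "donations", "donors"):
    print(name, table_columns(conn, name))
```

Each donation row points at one donor and one assignment, which is why both solutions in this project reach `donor_type` and `assignment_name` by joining through `donations`.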

    -- Preview the assignments table
    SELECT * FROM assignments
    LIMIT 10;

    -- Preview the donations table
    SELECT * FROM donations
    LIMIT 10;

    -- Preview the donors table
    SELECT * FROM donors
    LIMIT 10;

    1. Identifying the top five assignments with the highest total donations by donor type

    List the top five assignments based on the total value of donations, categorized by donor type. The output should include four columns: 1) assignment_name, 2) region, 3) rounded_total_donation_amount rounded to two decimal places, and 4) donor_type, sorted by rounded_total_donation_amount in descending order. Save the result as highest_donation_assignments.

    -- highest_donation_assignments
    -- Total donations per assignment and donor type, keeping the five largest totals.
    
    SELECT
        a1.assignment_name,
        a1.region,
        ROUND(SUM(d2.amount), 2) AS rounded_total_donation_amount,
        d1.donor_type
    FROM donors AS d1
    INNER JOIN donations AS d2
        USING (donor_id)
    INNER JOIN assignments AS a1
        USING (assignment_id)
    GROUP BY a1.assignment_name, a1.region, d1.donor_type
    ORDER BY rounded_total_donation_amount DESC
    LIMIT 5;
    -- Alternative solution to the query above: aggregate donations per assignment and donor type in a CTE, then join to assignments
    
    WITH donation_details AS (
        SELECT
            d.assignment_id,
            ROUND(SUM(d.amount), 2) AS rounded_total_donation_amount,
            dn.donor_type
        FROM donations d
        JOIN donors dn ON d.donor_id = dn.donor_id
        GROUP BY d.assignment_id, dn.donor_type
    )
    
    SELECT
        a.assignment_name,
        a.region,
        dd.rounded_total_donation_amount,
        dd.donor_type
    FROM assignments a
    JOIN donation_details dd ON a.assignment_id = dd.assignment_id
    ORDER BY dd.rounded_total_donation_amount DESC
    LIMIT 5;
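To see that the two approaches to question 1 agree, the sketch below runs both query shapes against a handful of made-up rows in an in-memory SQLite database. The table contents are invented purely for illustration, the tables carry only the columns the queries need, and SQLite stands in for PostgreSQL, so treat this as a sanity check of the logic rather than a reproduction of the project's results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY, assignment_name TEXT, region TEXT);
CREATE TABLE donors (donor_id INTEGER PRIMARY KEY, donor_type TEXT);
CREATE TABLE donations (donation_id INTEGER PRIMARY KEY, donor_id INTEGER,
                        assignment_id INTEGER, amount REAL);

-- Toy rows, not taken from the GoodThought database.
INSERT INTO assignments VALUES (1, 'Clean Water', 'East'), (2, 'School Kits', 'West');
INSERT INTO donors VALUES (10, 'Individual'), (11, 'Corporate'), (12, 'Individual');
INSERT INTO donations VALUES
    (100, 10, 1,  50.25),
    (101, 12, 1,  25.50),
    (102, 11, 1, 300.00),
    (103, 11, 2, 120.00),
    (104, 10, 2,  10.00);
""")

# Shape 1: join all three tables, then group.
JOIN_THEN_GROUP = """
SELECT a.assignment_name, a.region,
       ROUND(SUM(d.amount), 2) AS rounded_total_donation_amount,
       dn.donor_type
FROM donors AS dn
JOIN donations AS d USING (donor_id)
JOIN assignments AS a USING (assignment_id)
GROUP BY a.assignment_name, a.region, dn.donor_type
ORDER BY rounded_total_donation_amount DESC
LIMIT 5
"""

# Shape 2: aggregate donations per assignment and donor type first, then join.
GROUP_THEN_JOIN = """
WITH donation_details AS (
    SELECT d.assignment_id,
           ROUND(SUM(d.amount), 2) AS rounded_total_donation_amount,
           dn.donor_type
    FROM donations d
    JOIN donors dn ON d.donor_id = dn.donor_id
    GROUP BY d.assignment_id, dn.donor_type
)
SELECT a.assignment_name, a.region, dd.rounded_total_donation_amount, dd.donor_type
FROM assignments a
JOIN donation_details dd ON a.assignment_id = dd.assignment_id
ORDER BY dd.rounded_total_donation_amount DESC
LIMIT 5
"""

rows_join = conn.execute(JOIN_THEN_GROUP).fetchall()
rows_cte = conn.execute(GROUP_THEN_JOIN).fetchall()

for row in rows_join:
    print(row)
print("Both shapes agree:", rows_join == rows_cte)
```

One difference to keep in mind: the first shape groups by `assignment_name` and `region`, while the second groups by `assignment_id`. The results match whenever each name and region pair identifies a single assignment, as it does in this toy data; if two assignments shared a name and region, the first shape would merge their totals.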

    2. Identifying the leading assignment by impact in each region

    Identify the assignment with the highest impact score in each region, ensuring that each listed assignment has received at least one donation. The output should include four columns: 1) assignment_name, 2) region, 3) impact_score, and 4) num_total_donations, sorted by region in ascending order. Include only the highest-scoring assignment per region, avoiding duplicates within the same region. Save the result as top_regional_impact_assignments.

    -- top_regional_impact_assignments
    
    -- Count donations per assignment; assignments without donations never appear here.
    WITH num_donation AS (
        SELECT assignment_id, COUNT(donation_id) AS num_total_donations
        FROM donations
        GROUP BY assignment_id
        HAVING COUNT(donation_id) > 0
    ),
    
    -- Rank the funded assignments within each region by impact score.
    max_impact_score AS (
        SELECT
            a.assignment_name, a.region, a.impact_score, nd.num_total_donations,
            ROW_NUMBER() OVER (PARTITION BY a.region ORDER BY a.impact_score DESC) AS rank_in_region
        FROM assignments AS a
        INNER JOIN num_donation AS nd
            ON a.assignment_id = nd.assignment_id
    )
    
    -- Keep only the top-ranked assignment per region.
    SELECT assignment_name, region, impact_score, num_total_donations
    FROM max_impact_score
    WHERE rank_in_region = 1
    ORDER BY region;
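The requirement to avoid duplicates within a region is what makes `ROW_NUMBER()` a natural fit here, since it assigns distinct ranks even when impact scores tie, whereas `RANK()` would give tied rows the same rank. The sketch below illustrates the difference on invented rows using SQLite from Python (window functions need SQLite 3.25 or newer). The donations join is left out to keep the focus on ranking, and the extra `assignment_id` tie-breaker is a suggestion for making the chosen row predictable, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY, assignment_name TEXT,
                          region TEXT, impact_score REAL);
-- Toy rows: Alpha and Beta tie for the top score in the North region.
INSERT INTO assignments VALUES
    (1, 'Alpha', 'North', 9.5),
    (2, 'Beta',  'North', 9.5),
    (3, 'Gamma', 'South', 7.0);
""")


def top_per_region(window_fn, order_by):
    """Return the rank-1 rows per region for the given window function."""
    sql = f"""
        SELECT assignment_name, region
        FROM (
            SELECT assignment_name, region,
                   {window_fn}() OVER (PARTITION BY region ORDER BY {order_by}) AS rnk
            FROM assignments
        ) AS ranked
        WHERE rnk = 1
        ORDER BY region, assignment_name
    """
    return conn.execute(sql).fetchall()


# ROW_NUMBER with a tie-breaker: exactly one row per region.
row_number_rows = top_per_region("ROW_NUMBER", "impact_score DESC, assignment_id")
# RANK: tied rows share rank 1, so North appears twice.
rank_rows = top_per_region("RANK", "impact_score DESC")

print("ROW_NUMBER:", row_number_rows)
print("RANK:      ", rank_rows)
```

Without a tie-breaker in the `ORDER BY`, `ROW_NUMBER()` still returns one row per region, but which of the tied assignments it picks is up to the database.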
    -- Alternative solution to the query above: the donation-count filter moves from HAVING into the ranking CTE
    
    WITH donation_counts AS (
        SELECT
            assignment_id,
            COUNT(donation_id) AS num_total_donations
        FROM donations
        GROUP BY assignment_id
    ),
    
    ranked_assignments AS (
        SELECT
            a.assignment_name,
            a.region,
            a.impact_score,
            dc.num_total_donations,
            ROW_NUMBER() OVER (PARTITION BY a.region ORDER BY a.impact_score DESC) AS rank_in_region
        FROM assignments a
        JOIN donation_counts dc ON a.assignment_id = dc.assignment_id
        WHERE dc.num_total_donations > 0
    )
    
    SELECT
        assignment_name,
        region,
        impact_score,
        num_total_donations
    FROM ranked_assignments
    WHERE rank_in_region = 1
    ORDER BY region ASC;
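Both solutions to question 2 enforce the "at least one donation" rule mainly through the inner join to the donation counts: an assignment with no rows in `donations` has no count to join to, so it drops out before ranking. The sketch below checks this on invented rows in an in-memory SQLite database (SQLite 3.25 or newer for window functions). The data and the `extra_filter` placeholder are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assignments (assignment_id INTEGER PRIMARY KEY, assignment_name TEXT,
                          region TEXT, impact_score REAL);
CREATE TABLE donations (donation_id INTEGER PRIMARY KEY, assignment_id INTEGER, amount REAL);

-- Toy rows: Alpha has the best score in North but no donations at all.
INSERT INTO assignments VALUES
    (1, 'Alpha', 'North', 9.9),
    (2, 'Beta',  'North', 8.0),
    (3, 'Gamma', 'South', 6.5);
INSERT INTO donations VALUES (100, 2, 40.0), (101, 2, 15.0), (102, 3, 5.0);
""")

# The alternative solution, with the explicit "> 0" filter made optional.
QUERY = """
WITH donation_counts AS (
    SELECT assignment_id, COUNT(donation_id) AS num_total_donations
    FROM donations
    GROUP BY assignment_id
),
ranked_assignments AS (
    SELECT a.assignment_name, a.region, a.impact_score, dc.num_total_donations,
           ROW_NUMBER() OVER (PARTITION BY a.region ORDER BY a.impact_score DESC) AS rank_in_region
    FROM assignments a
    JOIN donation_counts dc ON a.assignment_id = dc.assignment_id
    {extra_filter}
)
SELECT assignment_name, region, impact_score, num_total_donations
FROM ranked_assignments
WHERE rank_in_region = 1
ORDER BY region ASC
"""

with_filter = conn.execute(
    QUERY.format(extra_filter="WHERE dc.num_total_donations > 0")
).fetchall()
without_filter = conn.execute(QUERY.format(extra_filter="")).fetchall()

print(with_filter)
print("Same result without the filter:", with_filter == without_filter)
```

In this toy data, Beta is reported for North even though Alpha scores higher, because Alpha never received a donation. The explicit `> 0` check does no harm and documents the intent, but the join already does the filtering.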