Skip to content
(SQL) Project: Analyzing Industry Carbon Emissions
  • AI Chat
  • Code
  • Report
  • Analyzing Industry Carbon Emissions

    Product emissions make up more than 75% of global emissions. But which industries are the worst offenders?

    Photo by Maxim Tolchinskiy on Unsplash

    When factoring heat generation required for the manufacturing and transportation of products, Greenhouse gas emissions attributable to products, from food to sneakers to appliances, make up more than 75% of global emissions. The Carbon Catalogue

    Our data, which is publicly available on nature.com, contains product carbon footprints (PCFs) for various companies. PCFs are the greenhouse gas emissions attributable to a given product, measured in CO2 (carbon dioxide equivalent).

    This data is stored in a PostgreSQL database containing one table, product_emissions, which looks at PCFs by product as well as the stage of production that these emissions occurred. Here's a snapshot of what product_emissions contains in each column:

    product_emissions

    fielddata type
    idVARCHAR
    yearINT
    product_nameVARCHAR
    companyVARCHAR
    countryVARCHAR
    industry_groupVARCHAR
    weight_kgNUMERIC
    carbon_footprint_pcfNUMERIC
    upstream_percent_total_pcfVARCHAR
    operations_percent_total_pcfVARCHAR
    downstream_percent_total_pcfVARCHAR

    You'll use this data to examine the carbon footprint of each industry in the dataset!

    Using the product_emissions table, find the number of unique companies and their total carbon footprint PCF for each industry group, filtering for the most recent year in the database. The query should return three columns: industry_group, num_companies, and total_industry_footprint, with the last column being rounded to one decimal place. The results should be sorted by total_industry_footprint from highest to lowest values.

    Spinner
    DataFrameavailable as
    carbon_emissions_by_industry
    variable
    -- Update your query here
    SELECT * FROM product_emissions
    LIMIT 5;
    Spinner
    DataFrameavailable as
    df1
    variable
    SELECT year, COUNT(*)
    FROM product_emissions
    GROUP BY year
    Spinner
    DataFrameavailable as
    carbon_emissions_by_industry
    variable
    WITH num_comp AS (
    SELECT 
    	industry_group, 
    	COUNT(DISTINCT company) AS num_companies
    FROM product_emissions
    WHERE year IN (SELECT MAX(year) 
    			   FROM product_emissions)
    GROUP BY industry_group),
    
    total_carbon AS (
    SELECT 
    	industry_group, 
    	ROUND(SUM(carbon_footprint_pcf),1) AS total_industry_footprint
    FROM product_emissions
    WHERE year IN (SELECT MAX(year) 
    			   FROM product_emissions)
    GROUP BY industry_group)
    
    SELECT t.industry_group, num_companies, total_industry_footprint
    FROM total_carbon AS t
    INNER JOIN num_comp n
    ON t.industry_group = n.industry_group
    ORDER BY total_industry_footprint DESC