Skip to content

Analyzing Product Carbon Footprints Using SQL

In the wake of increasing global awareness about climate change, understanding the environmental impact of products has become crucial for industries and consumers alike. Greenhouse gas emissions attributable to products, ranging from food to electronics, make up a significant portion of global emissions. This project aims to analyze the product carbon footprints (PCFs) of various products, companies, and industries by leveraging SQL queries on a PostgreSQL database. By focusing on key metrics such as product weight, carbon emissions, and production stages, we will gain insights into which industries and products contribute the most to greenhouse gas emissions.

When factoring heat generation required for the manufacturing and transportation of products, Greenhouse gas emissions attributable to products, from food to sneakers to appliances, make up more than 75% of global emissions. The Carbon Catalogue

This data, which is publicly available on nature.com, contains product carbon footprints (PCFs) for various companies. PCFs are the greenhouse gas emissions attributable to a given product, measured in CO2 (carbon dioxide equivalent).

This data is stored in a PostgreSQL database containing one table, product_emissions, which looks at PCFs by product as well as the stage of production that these emissions occurred. Here's a snapshot of what product_emissions contains in each column:

product_emissions

fielddata type
idVARCHAR
yearINT
product_nameVARCHAR
companyVARCHAR
countryVARCHAR
industry_groupVARCHAR
weight_kgNUMERIC
carbon_footprint_pcfNUMERIC
upstream_percent_total_pcfVARCHAR
operations_percent_total_pcfVARCHAR
downstream_percent_total_pcfVARCHAR

I'll use this data to examine the carbon footprint of each industry in the dataset!

Data Overview

Spinner
DataFrameas
carbon_emissions_by_industry
variable
SELECT *
FROM product_emissions
LIMIT 5;

How many total rows are present in the product_emissions table?

Spinner
DataFrameas
df6
variable
SELECT COUNT(*) AS total_rows
FROM product_emissions;

What distinct years are present in the product_emissions table?

Spinner
DataFrameas
df5
variable
SELECT DISTINCT year
FROM product_emissions;

How many distinct countries are represented in the product_emissions table?

Spinner
DataFrameas
df7
variable
SELECT COUNT(DISTINCT country)
FROM product_emissions;

How many distinct products are present in the product_emissions table?

Spinner
DataFrameas
df8
variable
SELECT COUNT(DISTINCT product_name)
FROM product_emissions;

How many distinct companies are represented in the product_emissions table?

Spinner
DataFrameas
df9
variable
SELECT COUNT(DISTINCT company)
FROM product_emissions;

How many distinct industry groups are represented in the product_emissions table?