What's in an Avocado Toast: A Supply Chain Analysis

You're in London, making an avocado toast, a quick-to-make dish that has soared in popularity on breakfast menus since the 2010s. A simple smashed avocado toast can be made with five ingredients: one ripe avocado, half a lemon, a big pinch of salt flakes, two slices of sourdough bread and a good drizzle of extra virgin olive oil.

It's no small feat that most of these ingredients are readily available in grocery stores. In this project, you'll conduct a supply chain analysis of the ingredients in an avocado toast using the Open Food Facts database, which contains extensive, openly sourced information on food products, including their origins. The analysis shows how complex the supply chain behind a single dish can be. The data is provided as .csv files in the data/ folder; despite the extension, the code below reads them as tab-separated.

After completing this project, you'll be armed with a list of ingredients and their countries of origin, and be well-positioned to launch into other analyses that explore how long, on average, these ingredients spend at sea.

import pandas as pd
# Load data
avocado = pd.read_csv('data/avocado.csv', sep='\t')
avocado.info()
# Subset the data
column = ['code', 'lc', 'product_name_en', 'quantity', 'serving_size', 'packaging_tags', 'brands', 'brands_tags', 'categories_tags', 'labels_tags', 'countries', 'countries_tags', 'origins', 'origins_tags']

avocado = avocado[column]
avocado.info()
# Drop rows with null categories_tags (copy so that adding a column later does not act on a slice)
avocado = avocado.dropna(subset=['categories_tags']).copy()
# Show unique categories list
avocado['categories_tags'].unique()
# Split the comma-separated tags into a column of lists
avocado['categories_list'] = avocado['categories_tags'].str.split(',')
# Identify relevant categories
relevant_categories = ['en:avocadoes', 
                       'en:avocados', 
                       'en:fresh-foods', 
                       'en:fresh-vegetables', 
                       'en:fruchte', 
                       'en:fruits', 
                       'en:raw-green-avocados', 
                       'en:tropical-fruits', 
                       'en:tropische-fruchte', 
                       'en:vegetables-based-foods',
                       'fr:hass-avocados'
                      ]

# Filter data based on relevant categories: keep rows with at least one matching tag
avocado = avocado[avocado['categories_list'].apply(lambda x: any(i in relevant_categories for i in x))]
avocado['categories_list'].head()
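The category filter above can be illustrated on a toy table. This is a minimal sketch, not the project's data: the rows, the tag strings, and the `wanted` set are all made up for illustration.

```python
import pandas as pd

# Toy rows standing in for the real data; the tag strings are invented for this example
toy = pd.DataFrame({
    'product_name_en': ['Ripe avocados', 'Guacamole dip', 'Unknown item'],
    'categories_tags': ['en:fruits,en:avocados', 'en:dips,en:guacamoles', None],
})

# Hypothetical subset of relevant categories
wanted = {'en:avocados', 'en:fresh-foods'}

# Drop rows without tags, then split each tag string into a list
toy = toy.dropna(subset=['categories_tags']).copy()
toy['categories_list'] = toy['categories_tags'].str.split(',')

# Keep a row when at least one of its tags is in the wanted set
mask = toy['categories_list'].apply(lambda tags: any(tag in wanted for tag in tags))
kept = toy[mask]
print(kept['product_name_en'].tolist())
```

Only the first row survives: the second has tags but none of them match, and the third has no tags at all.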
# Filter UK
avocados_uk = avocado[avocado['countries'] == 'United Kingdom']

avocados_uk['origins_tags'].value_counts()
avocado_origin = 'Peru'
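The origin above is read off the value counts by eye and typed in by hand. As a possible alternative, the most frequent tag could be extracted in code. The sketch below uses a hypothetical helper, `top_origin`, and assumes each `origins_tags` entry holds a single tag shaped like `en:south-africa`; entries with several comma-separated tags would need splitting first.

```python
import pandas as pd

def top_origin(df, column='origins_tags'):
    """Return the most frequent origin tag as a readable name, or None if none are recorded."""
    counts = df[column].dropna().value_counts()
    if counts.empty:
        return None
    tag = counts.idxmax()
    # Assumption: tags look like 'en:south-africa'; strip the language prefix
    name = tag.split(':', 1)[-1]
    return name.replace('-', ' ').title()

# Toy data, invented for illustration
toy_origins = pd.DataFrame({'origins_tags': ['en:peru', 'en:peru', 'en:south-africa', None]})
print(top_origin(toy_origins))
```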
# Reusable loader: read an ingredient file and keep UK products in the relevant categories
def read_and_filter_data(filepath, relevant_categories):
    """Load data/<filepath>, filter by category and UK, print origin counts, return the DataFrame."""
    df = pd.read_csv('data/' + filepath, sep='\t')

    # Subset data (uses the module-level `column` list defined earlier)
    df = df[column]

    # Split tags into lists
    df['categories_list'] = df['categories_tags'].str.split(',')

    # Drop null categories and filter data
    df = df.dropna(subset=['categories_list'])
    df = df[df['categories_list'].apply(lambda x: any(i in relevant_categories for i in x))]
    df = df[df['countries'] == 'United Kingdom']
    print(f'**{filepath[:-4]} origins**', '\n', df['origins_tags'].value_counts(), '\n')
    return df
# Lemon top supply
relevant_lemon_categories = ['en:aromatic-plants', 
                             'en:citron', 
                             'en:citrus', 
                             'en:fresh-fruits', 
                             'en:fresh-lemons', 
                             'en:fruits', 
                             'en:lemons', 
                             'en:unwaxed-lemons'
                            ]

lemon = read_and_filter_data('lemon.csv', relevant_lemon_categories)
lemon_origin = 'South Africa'
# Olive oil top supply
# The with block closes the file automatically, so no explicit close() is needed
with open("data/relevant_olive_oil_categories.txt", "r") as file:
    relevant_olive_oil_categories = file.read().splitlines()
    
olive_oil = read_and_filter_data('olive_oil.csv', relevant_olive_oil_categories)
olive_oil_origin = 'Greece'
# Sourdough top supply
# The with block closes the file automatically, so no explicit close() is needed
with open("data/relevant_sourdough_categories.txt", "r") as file:
    relevant_sourdough_categories = file.read().splitlines()
    
sourdough = read_and_filter_data('sourdough.csv', relevant_sourdough_categories)
sourdough_origin = 'United Kingdom'
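The olive oil and sourdough category lists are read from text files with the same few lines. One way to avoid the repetition is a small helper. This is only a sketch: `load_categories` is a hypothetical name, and the demo file and its tags are invented so the example runs on its own.

```python
from pathlib import Path
import tempfile

def load_categories(path):
    """Read one category tag per line, skipping blank lines."""
    lines = Path(path).read_text(encoding='utf-8').splitlines()
    return [line.strip() for line in lines if line.strip()]

# Demonstration with a throwaway file; the tags are made up for illustration
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / 'demo_categories.txt'
    demo_path.write_text('en:breads\n\nen:sourdough-breads\n', encoding='utf-8')
    demo_categories = load_categories(demo_path)

print(demo_categories)
```

Unlike a plain `splitlines()`, this version also drops blank lines and surrounding whitespace, which may or may not matter for the files in data/.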
# Salt top supply
relevant_salt_categories = ['en:edible-common-salt',
                            'en:salts',
                            'en:sea-salts'
                           ]

salt_flakes = read_and_filter_data('salt_flakes.csv', relevant_salt_categories)
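To end up with the list of ingredients and their countries of origin described in the introduction, the results could be gathered into one table. A minimal sketch follows. The four origins are the values assigned in the code above; no origin is assigned for salt flakes there, so it is left unset here rather than guessed.

```python
import pandas as pd

# Origins as assigned earlier in the analysis; salt flakes is left as None
# because no origin was recorded for it in the code above
origins = {
    'avocado': 'Peru',
    'lemon': 'South Africa',
    'olive oil': 'Greece',
    'sourdough': 'United Kingdom',
    'salt flakes': None,
}

summary = pd.DataFrame(list(origins.items()), columns=['ingredient', 'origin'])
print(summary)
```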