Context

The Human Freedom Index (HFI) presents a broad measure of human freedom, understood as the absence of coercive constraint. This ninth annual index uses 86 distinct indicators of personal and economic freedom in the following areas:

  • Rule of law
  • Security and safety
  • Movement
  • Religion
  • Association, assembly, and civil society
  • Expression and information
  • Relationships
  • Size of government
  • Legal system and property rights
  • Sound money
  • Freedom to trade internationally
  • Regulation

The Human Freedom Index encompasses Personal Freedom (PF) and Economic Freedom (EF).

Source: Fraser Institute, 2023.

Project Questions

  1. What are the top 10 countries with the highest levels of freedom in 2020?
  2. What is the freest region, on average, per year?
  3. How many countries are in the first quartile of the human freedom rank, compared to the fourth quartile, in 2020?
  4. What was the average global score per area in the last 3 years?
  5. What were the 10 least economically free countries in 2020?
  6. What was the standard deviation of personal freedom in Latin America & the Caribbean in the last 5 years?
  7. How many countries had an HFI above the average in 2020?

Data Preparation

-- Result stored as DataFrame variable: df
-- Explore the data in the table
SELECT *
FROM 'hfi_cc_2022.csv';
-- Some values are missing, and the data is detailed down to the score obtained in each sub-area.
-- For a cleaner dataset, these sub-scores won't be used. The first step is to create a new table with only the fields needed.
-- Result stored as DataFrame variable: df
-- Create a temporary table 
CREATE TEMP TABLE hfi_table AS
SELECT *
FROM 'hfi_cc_2022.csv';
-- Use information_schema.columns to see the data type of each field
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'hfi_table';
-- Result stored as DataFrame variable: hfi_cleaned
-- Create a cleaned table without the sub-scores, rounding each score to 2 decimals
SELECT year,
		countries,
		region,
		ROUND(hf_score, 2) AS hf_score,
		hf_rank,
		hf_quartile,
		ROUND(pf_rol, 2) AS rule_of_law,
		ROUND(pf_ss, 2) AS security_and_safety,
		ROUND(pf_movement, 2) AS movement,
		ROUND(pf_religion, 2) AS religion,
		ROUND(pf_assembly, 2) AS association,
		ROUND(pf_expression, 2) AS expression_and_information,
		ROUND(pf_identity, 2) AS relationships,
		ROUND(pf_score, 2) AS pf_score,
		pf_rank AS personal_rank,
		ROUND(ef_government, 2) AS size_of_government,
		ROUND(ef_legal, 2) AS legal_system_and_pr,
		ROUND(ef_money, 2) AS sound_money,
		ROUND(ef_trade, 2) AS freedom_to_trade,
		ROUND(ef_regulation, 2) AS regulation,
		ROUND(ef_score, 2) AS ef_score,
		ef_rank
FROM 'hfi_cc_2022.csv';
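
The exploration step above notes that some values are missing. As a minimal sketch of how the missing values per column could be counted, the Python snippet below uses only the standard library. The CSV text, the country and region names, and the helper `count_missing` are all hypothetical stand-ins, and it assumes missing values appear as empty cells, which may not match how `hfi_cc_2022.csv` encodes them.

```python
import csv
import io

# Toy CSV standing in for hfi_cc_2022.csv; all values are made up for illustration.
TOY_CSV = """year,countries,region,hf_score,pf_score,ef_score
2020,Country A,Region A,8.5,8.9,8.1
2020,Country B,Region B,,6.2,
2019,Country A,Region A,8.4,,8.0
"""

def count_missing(csv_text):
    """Return {column: number of empty cells} for a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            # Assumption: a missing value is an empty (or whitespace-only) cell.
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing

missing_counts = count_missing(TOY_CSV)
print(missing_counts)
```

Columns with a non-zero count are the ones to inspect before aggregating, since averages and rankings silently skip or mis-handle empty scores depending on the tool.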

Data Analysis

1. What are the top 10 countries with the highest levels of freedom in 2020?

-- Result stored as DataFrame variable: top10_2020
-- Select the top 10 countries with the highest levels of freedom in 2020
SELECT
	hf_rank,
	countries,
	hf_score,
	pf_score,
	ef_score
FROM hfi_cleaned
WHERE year = 2020
ORDER BY hf_rank 
LIMIT 10;

2. What is the freest region, on average, per year?

-- Result stored as DataFrame variable: df1
-- Explore the distinct regions in the dataset
SELECT
	DISTINCT region
FROM hfi_cleaned;
-- Result stored as DataFrame variable: free_regions
-- First, create a CTE with the average HFI score for every region, by year.
-- ROW_NUMBER ranks the regions within each year by average HFI score (ties are broken arbitrarily).
WITH avg_hfscore_region AS (
SELECT
	year,
	region,
	ROUND(AVG(hf_score), 2) AS avg_hf_score,
	ROW_NUMBER() OVER (PARTITION BY year ORDER BY AVG(hf_score) DESC) AS region_rank
FROM hfi_cleaned
GROUP BY year, region)
-- Query the CTE, keep only the top-ranked region per year, and order by year
SELECT
	year,
	region,
	avg_hf_score
FROM avg_hfscore_region
WHERE region_rank = 1
ORDER BY year DESC;
/*We can see that, on average, North America was the freest region over the last 20 years,
but Western Europe surpassed it in 2020.*/
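
The ranking logic in the query above (average per region and year, then keep the top-ranked region in each year) can be checked on a small example. The sketch below runs the same pattern with Python's built-in `sqlite3` module instead of the notebook's SQL engine, and it assumes a SQLite build with window-function support (3.25 or later). The table `toy_scores`, the region names, and the scores are made up for illustration and are not HFI values.

```python
import sqlite3

# Made-up scores for illustration only; these are not HFI values.
rows = [
    (2019, "Region A", 8.0),
    (2019, "Region A", 9.0),
    (2019, "Region B", 8.0),
    (2020, "Region A", 7.0),
    (2020, "Region B", 8.0),
    (2020, "Region B", 9.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE toy_scores (year INTEGER, region TEXT, hf_score REAL)")
conn.executemany("INSERT INTO toy_scores VALUES (?, ?, ?)", rows)

query = """
WITH avg_by_region AS (
    -- Step 1: average score per region and year
    SELECT year, region, AVG(hf_score) AS avg_hf_score
    FROM toy_scores
    GROUP BY year, region
),
ranked AS (
    -- Step 2: rank regions within each year, best average first
    SELECT
        year,
        region,
        avg_hf_score,
        ROW_NUMBER() OVER (PARTITION BY year ORDER BY avg_hf_score DESC) AS region_rank
    FROM avg_by_region
)
-- Step 3: keep only the top region of each year
SELECT year, region, ROUND(avg_hf_score, 2)
FROM ranked
WHERE region_rank = 1
ORDER BY year DESC;
"""

top_regions = conn.execute(query).fetchall()
conn.close()
print(top_regions)
```

Splitting the aggregation and the ranking into two CTEs keeps the window function from depending on an alias defined in the same SELECT, which not every SQL engine accepts.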

Visualization of the average Human Freedom Index score for the North America region over the last 20 years.

# Import the plotting and numeric packages
import matplotlib.pyplot as plt
import numpy as np

# Filter the data for the North America region.
# free_regions holds only the top-ranked region per year, so this keeps
# the years in which North America ranked first.
north_america_data = free_regions[free_regions['region'] == 'North America']

# Plot the data
plt.figure(figsize=(12, 6))
plt.plot(north_america_data['year'], north_america_data['avg_hf_score'], marker='o', linestyle='-', color='b')

# Add title, axis labels, and grid
plt.title('Average Human Freedom Score for North America (2000-2020)')
plt.xlabel('Year')
plt.ylabel('Average HF score')
plt.grid(True)

# Set the axis ranges
# Y-axis from 8.5 to 9
# X-axis from 2000 to 2020, with a tick every 2 years
plt.ylim([8.5, 9])
plt.xlim([2000, 2020])
plt.xticks(np.arange(2000, 2021, step=2))

# Show the plot
plt.show()

3. How many countries are in the first quartile of the human freedom rank, compared to the fourth quartile in 2020?

-- Result stored as DataFrame variable: df2
/*Create a query that returns the count of countries 
in the first and fourth quartile for the year 2020*/
SELECT
	hf_quartile,
	COUNT(countries) AS count_countries
FROM hfi_cleaned
WHERE year = 2020 AND hf_quartile IN (1, 4)
GROUP BY hf_quartile;
/*It appears that there are just 2 more countries
in the lowest quartile than in the highest one.*/
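
The same filter-and-count step can be sketched in plain Python, which makes it easy to compute the difference between the two quartiles directly. The rows, country names, and the helper `quartile_counts` below are hypothetical and exist only to illustrate the logic; they are not taken from the HFI data.

```python
from collections import Counter

# Hypothetical (year, country, hf_quartile) rows; made up for illustration.
toy_rows = [
    (2020, "Country A", 1),
    (2020, "Country B", 1),
    (2020, "Country C", 2),
    (2020, "Country D", 4),
    (2020, "Country E", 4),
    (2020, "Country F", 4),
    (2019, "Country A", 1),
]

def quartile_counts(rows, year, quartiles=(1, 4)):
    """Count rows per quartile for one year, keeping only the given quartiles."""
    return Counter(q for y, _, q in rows if y == year and q in quartiles)

counts_2020 = quartile_counts(toy_rows, 2020)
# Positive when the fourth quartile holds more countries than the first.
difference = counts_2020[4] - counts_2020[1]
print(dict(counts_2020), difference)
```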