Freedom Index Analysis
Context
The Human Freedom Index (HFI) presents a broad measure of human freedom, understood as the absence of coercive constraint. This ninth annual index uses 86 distinct indicators of personal and economic freedom in the following areas:
- Rule of law
- Security and safety
- Movement
- Religion
- Association, assembly, and civil society
- Expression and information
- Relationships
- Size of government
- Legal system and property rights
- Sound money
- Freedom to trade internationally
- Regulation
The Human Freedom Index encompasses Personal Freedom (PF) and Economic Freedom (EF).
Source: Fraser Institute, 2023.
Project Questions
- What are the top 10 countries with the highest levels of freedom in 2020?
- What is the most free region per year?
- How many countries are in the first quartile of the human freedom rank, compared to the fourth quartile in 2020?
- What was the average global score per area, in the last 3 years?
- What were the 10 least economically free countries in 2020?
- What was the standard deviation of personal freedom in Latin America & the Caribbean in the last 5 years?
- How many countries had an HFI above the average in 2020?
Data Preparation
Output DataFrame variable: df
-- Explore the data in the table
SELECT *
FROM 'hfi_cc_2022.csv';
-- There are some missing values, and the level of detail in the data shows the scores obtained in each sub-area.
-- To have a cleaner data set, these sub-scores won't be used. The first step is to create a new table with only the wanted fields.
Output DataFrame variable: df
-- Create a temporary table
CREATE TEMP TABLE hfi_table AS
SELECT *
FROM 'hfi_cc_2022.csv';
-- Use information_schema.columns to see the data type of each field
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'hfi_table';
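The exploration step above mentions missing values without quantifying them. A minimal sketch of one way to count them per column follows, written in Python with the standard-library `sqlite3` module instead of the notebook's own SQL engine. The table `hfi_sample`, its rows, and the helper `count_missing` are hypothetical and invented for illustration; only the column names mirror the data set.

```python
import sqlite3

# Hypothetical miniature of the HFI table; the values are invented for illustration.
rows = [
    (2020, "A", 8.5, 9.0),
    (2020, "B", None, 7.0),
    (2019, "C", 6.1, None),
    (2019, "D", None, None),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hfi_sample "
    "(year INTEGER, countries TEXT, hf_score REAL, pf_score REAL)"
)
conn.executemany("INSERT INTO hfi_sample VALUES (?, ?, ?, ?)", rows)


def count_missing(connection, table, columns):
    """Return {column: number of NULLs} for the given table.

    COUNT(*) counts every row while COUNT(col) skips NULLs,
    so their difference is the number of missing values.
    """
    select_list = ", ".join(
        f"COUNT(*) - COUNT({col}) AS {col}" for col in columns
    )
    cursor = connection.execute(f"SELECT {select_list} FROM {table}")
    return dict(zip(columns, cursor.fetchone()))


missing = count_missing(
    conn, "hfi_sample", ["year", "countries", "hf_score", "pf_score"]
)
print(missing)
```

The same `COUNT(*) - COUNT(column)` expression should carry over to the notebook's SQL cells, since it relies only on standard aggregate behaviour.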
Output DataFrame variable: hfi_cleaned
-- Create a new cleaned table without the sub-scores, rounding each value to 2 decimals
SELECT year,
countries,
region,
ROUND(hf_score, 2) AS hf_score,
hf_rank,
hf_quartile,
ROUND(pf_rol, 2) AS rule_of_law,
ROUND(pf_ss, 2) AS security_and_safety,
ROUND(pf_movement, 2) AS movement,
ROUND(pf_religion, 2) AS religion,
ROUND(pf_assembly, 2) AS association,
ROUND(pf_expression, 2) AS expression_and_information,
ROUND(pf_identity, 2) AS relationships,
ROUND(pf_score, 2) AS pf_score,
pf_rank AS personal_rank,
ROUND(ef_government, 2) AS size_of_government,
ROUND(ef_legal, 2) AS legal_system_and_pr,
ROUND(ef_money, 2) AS sound_money,
ROUND(ef_trade, 2) AS freedom_to_trade,
ROUND(ef_regulation, 2) AS regulation,
ROUND(ef_score, 2) AS ef_score,
ef_rank
FROM 'hfi_cc_2022.csv';
Data Analysis
1. What are the top 10 countries with the highest levels of freedom in 2020?
Output DataFrame variable: top10_2020
-- Select the top 10 countries with the highest levels of freedom in 2020
SELECT
hf_rank,
countries,
hf_score,
pf_score,
ef_score
FROM hfi_cleaned
WHERE year = 2020
ORDER BY hf_rank
LIMIT 10;
2. What is the most free region, on average, per year?
Output DataFrame variable: df1
-- Explore the different regions in the database
SELECT
DISTINCT region
FROM hfi_cleaned;
Output DataFrame variable: free_regions
-- First, create a CTE with the average HFI score for every region, by year.
-- ROW_NUMBER ranks the regions within each year by that average.
WITH avg_hfscore_region AS (
SELECT
year,
region,
ROUND(AVG(hf_score), 2) AS avg_hf_score,
ROW_NUMBER() OVER (PARTITION BY year ORDER BY AVG(hf_score) DESC) AS rank
FROM hfi_cleaned
GROUP BY year, region)
-- Query the CTE, keep the top-ranked region per year, and order by year
SELECT
year,
region,
avg_hf_score
FROM avg_hfscore_region
WHERE rank = 1
ORDER BY year DESC;
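As an aside, the per-year ranking used in the query above can be tried on a toy table. The sketch below assumes Python's standard-library `sqlite3` module (built against SQLite 3.25 or later, which added window functions) in place of the notebook's SQL engine. The table `scores`, the region names, and all values are invented for illustration.

```python
import sqlite3

# Invented scores for two hypothetical regions; not taken from the HFI data.
rows = [
    (2019, "Region A", 8.0), (2019, "Region A", 9.0),
    (2019, "Region B", 7.0), (2019, "Region B", 7.5),
    (2020, "Region A", 7.0), (2020, "Region A", 8.0),
    (2020, "Region B", 8.0), (2020, "Region B", 9.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (year INTEGER, region TEXT, hf_score REAL)")
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)

query = """
WITH region_avg AS (
    -- Average score per region and year
    SELECT year, region, AVG(hf_score) AS avg_hf_score
    FROM scores
    GROUP BY year, region
),
ranked AS (
    -- Position of each region within its year, best average first
    SELECT year, region, avg_hf_score,
           ROW_NUMBER() OVER (
               PARTITION BY year ORDER BY avg_hf_score DESC
           ) AS position
    FROM region_avg
)
SELECT year, region, ROUND(avg_hf_score, 2)
FROM ranked
WHERE position = 1
ORDER BY year DESC
"""

top_regions = conn.execute(query).fetchall()
print(top_regions)
```

One design point worth noting: `ROW_NUMBER` always returns exactly one row per year, so if two regions tie on the average, one of them is dropped arbitrarily. Swapping in `RANK()` would keep every tied region.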
/*We can see that, on average, North America was the freest region over the last 20 years,
but Western Europe surpassed it in 2020.*/
Visualization of the average Human Freedom Index in the North America region in the last 20 years.
# Import Matplotlib and NumPy
import matplotlib.pyplot as plt
import numpy as np
# Filter the data for the North America region.
# Note: free_regions holds only the top-ranked region per year, so years in which
# North America was not first are absent from this plot.
north_america_data = free_regions[free_regions['region'] == 'North America']
# Plot the data
plt.figure(figsize=(12, 6))
plt.plot(north_america_data['year'], north_america_data['avg_hf_score'], marker='o', linestyle='-', color='b')
# Add titles and labels
plt.title('Average Human Freedom Score for North America (2000-2020)')
plt.xlabel('Year')
plt.ylabel('Average HF score')
plt.grid(True)
# Y-axis from 8.5 to 9
plt.ylim([8.5, 9])
# X-axis from 2000 to 2020, with a tick every 2 years
plt.xlim([2000, 2020])
plt.xticks(np.arange(2000, 2021, step=2))
# Show the plot
plt.show()
3. How many countries are in the first quartile of the human freedom rank, compared to the fourth quartile in 2020?
Output DataFrame variable: df2
/*Create a query that returns the count of countries
in the first and fourth quartile for the year 2020*/
SELECT
hf_quartile,
COUNT(countries) AS count_countries
FROM hfi_cleaned
WHERE year = 2020 AND hf_quartile IN (1, 4)
GROUP BY hf_quartile;
/*It appears that there are just 2 more countries
in the lowest quartile compared to the highest one*/
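The comparison between the two quartiles comes down to a difference of two counts. A minimal Python sketch of that arithmetic follows. The records and the helper `quartile_counts` are hypothetical, and the numbers are invented, so they do not reproduce the result reported above.

```python
from collections import Counter

# Hypothetical (country, year, quartile) records; invented for illustration.
records = [
    ("A", 2020, 1), ("B", 2020, 1), ("C", 2020, 2), ("D", 2020, 3),
    ("E", 2020, 4), ("F", 2020, 4), ("G", 2020, 4), ("H", 2019, 4),
]


def quartile_counts(rows, year, quartiles=(1, 4)):
    """Count countries per quartile for one year, keeping only the given quartiles."""
    return Counter(
        quartile
        for _, row_year, quartile in rows
        if row_year == year and quartile in quartiles
    )


counts = quartile_counts(records, 2020)
# Positive when the lowest quartile holds more countries than the highest one.
difference = counts[4] - counts[1]
print(counts[1], counts[4], difference)
```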