Introduction

In this project I aim to replicate López & Sepúlveda (2022) "Los efectos de los choques de la demanda interna sobre la inflación en una economía pequeña y abierta: Chile en el período 2000-2021". The paper seeks to separate the effect of domestic demand shocks on Chile's overall inflation from the effect of international exposure, using a series of econometric specifications.

For me, replicating this work in Stata would be trivial. The reason I'm doing it in Python is that I'm trying to migrate from one tool to the other, and this is a good way to practice these new skills.

Loading the raw data and building our dataset

I've downloaded a number of time series from The Central Bank of Chile's repository and FRED.

[data description]

From raw data to a proper dataset

Although I have already looked at the Excel files and could skip the useless rows at load time, I think it's better to load them as they come in order to practice some data wrangling. This approach is also useful when you are loading a dataset too big to inspect beforehand.

import pandas as pd
bcch_data = pd.read_excel("20230410_raw_data_bcentral.xlsx")
fred_data = pd.read_excel("20230410_raw_data_fred.xls")

Let's clean up the data from the Central Bank of Chile first, starting with a look at it.

bcch_data.head()
bcch_data.tail()

We can see that the first three rows are of no use: the column labels are contained in them, but they carry too much text. The tail of the DataFrame, on the other hand, holds observations rather than comments or other elements we don't need for our analysis. There are also some NA values at the end of the DataFrame; we'll deal with these later on.

So let's drop the first rows and rename each column.

bcch_data = bcch_data.iloc[3:, :]
bcch_data.columns = ['date', 'gdp', 'gdp_sa', 'cpi', 'cpi_nvo', 'cpi_vol', 
                     'tcn', 'tpm', 'pi_usa', 'pi_eur', 'pi_chn', 'pi', 'pi_nvo', 'pi_vol', 'pr_cu']
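As a minimal sketch of this step on a toy frame, the same slice-and-rename pattern looks like this. The `raw` DataFrame, its contents, and the two short column names are made up for illustration and are not the Central Bank file. Adding `.copy()` and `reset_index(drop=True)` is my own suggestion, so that the result is an independent frame whose index starts at 0.

```python
import pandas as pd

# Hypothetical raw sheet: three junk rows on top, then the observations.
raw = pd.DataFrame({
    "Unnamed: 0": ["Title", None, "Period", "2000-01-01", "2000-04-01"],
    "Unnamed: 1": [None, None, "1. GDP (long description)", 100.0, 101.5],
})

# Drop the first three rows; .copy() gives a frame independent of `raw`.
clean = raw.iloc[3:, :].copy()

# Replace the verbose labels with short names (one per column, in order).
clean.columns = ["date", "gdp"]

# Restart the index at 0 instead of 3.
clean = clean.reset_index(drop=True)

print(clean)
```

Assigning to `.columns` requires exactly one name per column, so a mismatch in length raises an error right away instead of silently mislabelling a series.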

We need to check for duplicate rows, although these are highly unlikely given the source of our data.

bcch_data.duplicated().sum()
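`duplicated()` with no arguments only flags rows that match in every column. For time series it may also be worth checking the date column on its own, since a repeated date with different values would slip through. Here is a small sketch on made-up data (the `toy` frame is hypothetical):

```python
import pandas as pd

# Hypothetical data where one date appears twice with different values.
toy = pd.DataFrame({
    "date": ["2000-01-01", "2000-04-01", "2000-04-01"],
    "gdp": [100.0, 101.5, 101.7],
})

# Whole-row check: the two 2000-04-01 rows differ in gdp, so nothing is flagged.
n_row_dups = int(toy.duplicated().sum())

# Key-only check: the second occurrence of the repeated date is flagged.
n_date_dups = int(toy.duplicated(subset="date").sum())

print(n_row_dups, n_date_dups)
```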

No duplicates. Now let's deal with data types.

bcch_data.info()

We need to convert the first column to dates and the rest to numerical types. We'll let pandas decide which data type best suits each column.

bcch_data = bcch_data.convert_dtypes()
bcch_data.info()
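`convert_dtypes` infers types from the objects stored in each column, so a column that holds numbers as text would come out as a string column rather than a numeric one. If that happens, an explicit conversion is one alternative. The sketch below assumes a toy frame (`toy` and its values are hypothetical) with a `date` column first and numeric series after it:

```python
import pandas as pd

# Hypothetical frame where everything arrived as object dtype, numbers as text.
toy = pd.DataFrame({
    "date": ["2000-01-01", "2000-04-01", "2000-07-01"],
    "gdp": ["100.0", "101.5", None],
}, dtype="object")

# Parse the first column as dates; unparseable entries would become NaT.
toy["date"] = pd.to_datetime(toy["date"], errors="coerce")

# Convert every remaining column to a numeric dtype; bad or missing entries become NaN.
for col in toy.columns.drop("date"):
    toy[col] = pd.to_numeric(toy[col], errors="coerce")

print(toy.dtypes)
```

Note that `errors="coerce"` silently turns anything it cannot parse into a missing value, so it is worth comparing the NA count before and after the conversion.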