Skip to content

Analysing expenses spreadsheet

Main questions:

  • Avg earning/spending
  • Max earned/max spent
  • Pivot table with spending per month in each category
  • Pivot table of percent of whole salary spent each month in each category
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/ju-mk/expenses/main/expenses_2023.csv')
expenses = pd.DataFrame(df)
expenses.info()

Preparing the data

Changing the data types accordingly

expenses['date'] = pd.to_datetime(expenses['date'], format='%d/%m/%Y')
expenses.dtypes
expenses.type = expenses.type.astype('category')
expenses.category = expenses.category.astype('category')
expenses.who = expenses.who.astype('category')
expenses.card_account = expenses.card_account.astype('category')
expenses.income_expense = expenses.income_expense.astype('category')

Merging same categories

expenses['category'] = expenses['category'].replace('Beauty/Health/medical', 'Beauty/Health/Medical')
expenses.category.value_counts()
Hidden output
import matplotlib.pyplot as plt

expenses.plot(kind = 'scatter', x = 'date', y = 'amount')

plt.show()

Average income x average expenses

expenses.head()

1 hidden cell
expenses['mm/yyyy'] = expenses['date'].dt.strftime('%m/%Y')

expenses = expenses.drop(['month_year'], axis=1)

expenses.head(3)
expenses = expenses.rename(columns={'mm/yyyy': 'month_year'})