Skip to content

Help your hedge fund manager!

You have two datasets at your disposal: Balance_Sheet.xlsx and Income_Statement.xlsx. Both these datasets have three columns in common:

  • "Company": The company's ticker name.
  • "comp_type" The type of industry the company in question belongs to. It is either "tech" for companies in the technology industry, "fmcg" for companies in the fast-moving consumer goods industry, and "real_est" for companies in the real estate industry.
  • "Year": The year the company's information is from.

The rest of the columns in the datasets contain information from the financial statement of the "Company" in question. Note that the columns in Balance_Sheet.xlsx only contain financial information from the balance sheet. Similarly, the columns in Income_Statement.xlsx only contain financial information from the income statement. The columns are named accordingly. For instance, the column "Total Liab" from Balance_Sheet.xlsx is the total liability.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

#Data Import

We start by importing data and loading it into a dataframe.

balance_sheet_df = pd.read_excel('data/Balance_Sheet.xlsx')
balance_sheet_df.head()
income_statement_df = pd.read_excel('data/Income_Statement.xlsx')
income_statement_df.head()

EDA

Data Quality Check

balance_sheet_df.info()
income_statement_df.info()

We have a few missing values in the balance sheet on the "Inventory" and "Short Term Investments" Fields. Hopefully we will not need these for our analysis. Both DFs with 60 entries Lets check if they're for the same years

balance_sheet_df.Year.unique()==income_statement_df.Year.unique()
sorted(balance_sheet_df.Year.unique())

Test passed !

Both DFs are for the same 5 years ranging from 2018 to 2022

Computing Ratios

Debt-to-equity Ratio

The Debt-to-Equity Ratio compares the total Liabilities of the company to the Shareholders Equity [1]

The formula is

Therefore, we will use the Total Liab and Total Stockholder Equity columns from balance_sheet_df to compute the ratios.

Ratios will be saved in a newly created DataFrame for this purposedf_ratios, within the levrage_ratio column.

[1] https://www.investopedia.com/terms/d/debtequityratio.asp

We first Initiate the new dataframe, by keeping Year and company columns, for sake of indexing.

df_ratios = balance_sheet_df.loc[:,['Year','company']].copy()