Skip to content

Help your hedge fund manager!

You have two datasets at your disposal: Balance_Sheet.xlsx and Income_Statement.xlsx. Both these datasets have three columns in common:

  • "Company": The company's ticker name.
  • "comp_type" The type of industry the company in question belongs to. It is either "tech" for companies in the technology industry, "fmcg" for companies in the fast-moving consumer goods industry, and "real_est" for companies in the real estate industry.
  • "Year": The year the company's information is from.

The rest of the columns in the datasets contain information from the financial statement of the "Company" in question. Note that the columns in Balance_Sheet.xlsx only contain financial information from the balance sheet. Similarly, the columns in Income_Statement.xlsx only contain financial information from the income statement. The columns are named accordingly. For instance, the column "Total Liab" from Balance_Sheet.xlsx is the total liability.

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

balance_sheet = pd.read_excel('data/Balance_Sheet.xlsx')
income_statement = pd.read_excel('data/Income_Statement.xlsx')

print(balance_sheet.info())
print(income_statement.info())

# Year, comp_type, company are part of common 
df_ratios = pd.merge(balance_sheet, income_statement, on=['Year', 'comp_type', 'company'], suffixes=('bs', 'is'))

# leverage ratio and profitability ratio (net income / revenue)
df_ratios['leverage_ratio'] = df_ratios['Total Assets'] / df_ratios['Total Stockholder Equity']
df_ratios['profitability_ratio'] = (df_ratios['Total Revenue'] - df_ratios['Cost Of Goods Sold']) / df_ratios['Total Revenue']

# the industy which has the lowest profitanility ratio and the highest
ratios_company = df_ratios.groupby('comp_type', as_index=False)[['leverage_ratio', 'profitability_ratio']].mean()

lowest_profitability = str(ratios_company[ratios_company['profitability_ratio'] == ratios_company['profitability_ratio'].min()]['comp_type'].iloc[0])
highest_profitability = str(ratios_company[ratios_company['profitability_ratio'] == ratios_company['profitability_ratio'].max()]['comp_type'].iloc[0])
highest_leverage = str(ratios_company[ratios_company['leverage_ratio'] == ratios_company['leverage_ratio'].max()]['comp_type'].iloc[0])

print(ratios_company[ratios_company['comp_type'] == 'real_est'])

# relationship between leverage profitability and leverage
real_estate_company = df_ratios[df_ratios['comp_type'] == 'real_est']
sns.regplot(x='leverage_ratio', y='profitability_ratio', data=real_estate_company)
plt.show()

corr_leverage_profitability = df_ratios['profitability_ratio'].corr(df_ratios['leverage_ratio'])
if corr_leverage_profitability > 0:
    relationship = 'positive'
elif corr_leverage_profitability < 0:
    relationship = 'negative'
else:
    relationship = 'no relationship'
print(relationship)