ESG Analysis for Pfizer

Pfizer is one of the largest multinational biopharmaceutical companies and medicine suppliers, with a robust research capacity (MarketLine, 2021). Driven by its Covid-19 vaccine, its revenue rose sharply to over 24 billion U.S. dollars in the third quarter of 2021 (Statista, 2021).

Environmental, social and governance (ESG) scores evaluate companies' sustainability efforts: reducing their environmental footprint and organising eco-friendly activities, maintaining relationships with people, and keeping the management structure transparent (Dyllick and Hockerts, 2002). ESG scores have drawn wider attention to companies' corporate social responsibility.

Chapter 1 Project Overview

Objectives

This empirical analysis has 3 objectives.

  1. Identify Pfizer's position in the pharmaceutical industry.
  2. Visualise the trend of business aspects in Pfizer from 2016 to 2018.
  3. Apply linear regression to disclose the relationship between total assets and ESG scores.

Brief

The process includes data cleansing, modelling, and visualisation, combined with statistical checks on the validity of the regression equation.

The project shows that Pfizer has a strong foundation, with very large total assets and a large workforce; however, it sits at the middle level of the biopharmaceutical companies on return on assets and Tobin's Q ratio, which suggests Pfizer needs to focus on unlocking its development potential. Moreover, the regression model indicates a significant positive relationship between ESG scores and total assets: total assets are reported to increase by e^5.2366 million U.S. dollars when the environmental and governance disclosure scores increase by one unit.

Dataset

The dataset and corresponding dictionary can be found in the Data folder. The dataset contains all S&P 1500 companies listed in the US stock market over three years 2016 - 2018.

Features for each company:

  • Ticker
  • Name
  • Year
  • ISIN Number
  • SIC Code
  • GICS Industry
  • Country or Territory of Domicile
  • Number of Employees
  • Total Assets
  • R&D Expense
  • R&D Expense Adjusted
  • Operating Expenses R&D
  • Cash and Cash Equivalents
  • Environmental Disclosure Score
  • Social Disclosure Score
  • Governance Disclosure Score
  • Tobin's Q Ratio
  • Return on Assets
  • Return on Common Equity
  • Gross Margin

Chapter 2 Data Gathering

Import Data

The data covers 326 industries (distinct SIC codes) and includes 4,518 rows and 20 variables. Most variables are quantitative and record financial information such as total assets and return on assets; the data also includes disclosure scores from external institutions: the environmental, social and governance disclosure scores.

# Import all libraries that I will use in the project
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.formula.api import ols

# Create a dataframe by importing the data
df = pd.read_csv("./SP1500_Raw Dataset_Data Analytics in Business Assignment_2021.csv")
# Count the non-null values in each column
df.count()
# Check the number of variables
df.shape[1]

# Check how many industries are involved
len(df["SIC Code"].unique())

Chapter 3 Biopharmaceutical Company

This chapter addresses the first objective: identify Pfizer's position in the biopharmaceutical industry.

  1. Use SIC Code 2834 or 2836 to narrow down the range of companies (SIC Code 2834 - Pharmaceutical, SIC Code 2836 - Biological Products).
  2. Identify metrics
  • name
  • year
  • country or territory of domicile
  • total assets
  • the number of employees
  • return on assets
  • R&D expense adjusted
  • environmental score
  • Tobin's Q ratio
  3. Drop NaN values so that the results and figures are reliable.
  4. Calculate mean, median, maximum and minimum values.
  5. Visualise these metrics for comparison.
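The filtering and summary steps above can be sketched roughly as follows. This is a minimal illustration on a made-up four-row frame, not the project's notebook: the column names follow the feature list, while the helper name `describe_cohort`, the company names, and all values are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy data: every value here is invented for illustration only.
toy = pd.DataFrame({
    "Name": ["A Corp", "B Corp", "C Corp", "D Corp"],
    "SIC Code": [2834, 2836, 3571, 2834],
    "Total Assets": [100.0, 50.0, 999.0, np.nan],
    "Return on Assets": [5.0, 7.0, 1.0, 3.0],
})

def describe_cohort(df, metrics, sic_codes=(2834, 2836)):
    """Keep rows with the given SIC codes, drop rows with missing metrics,
    and return mean/median/max/min for each metric."""
    cohort = df[df["SIC Code"].isin(sic_codes)].dropna(subset=metrics)
    return cohort[metrics].agg(["mean", "median", "max", "min"])

stats = describe_cohort(toy, ["Total Assets", "Return on Assets"])
print(stats)
```

A single company's row can then be placed next to this table to see where it sits relative to the cohort mean, median and extremes.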

Table 1 shows descriptive statistics for the biopharmaceutical companies on the chosen variables, with Pfizer's figures set alongside to locate its performance level. There are 21 biopharmaceutical companies, such as AbbVie and Amgen. Two companies are domiciled in Ireland (Endo International PLC and Perrigo Co PLC), and the other 19 are headquartered in the United States.

Pfizer is an American corporation headquartered in the U.S. It is generally above the cohort average except on Tobin's Q ratio, where the cohort median is 7.33 while Pfizer scores only 1.94. Pfizer's figures are near the cohort maximum for number of employees and total assets.

Figure 1 uses four bar plots, covering (a) total assets, (b) the number of employees per year, (c) return on assets and (d) Tobin's Q ratio, to rank Pfizer within the cohort. Pfizer ranks among the highest by total assets and has a large workforce, although a smaller one than Johnson & Johnson, which has over 120,000 employees. On the indicators related to future prospects (return on assets and Tobin's Q ratio), Pfizer has no advantage over other companies; it ranks fourth from the bottom of the cohort on Tobin's Q ratio.

Chapter 4 Development of Pfizer

This chapter addresses the second objective: visualise the trend of business aspects in Pfizer from 2016 to 2018.

  1. Create a new dataframe for Pfizer
  2. Identify the analysis variables
  • total assets
  • the number of employees
  • return on assets
  • Tobin's Q Ratio
  • R&D Expense Adjusted
  • Environmental disclosure score
  3. Use line charts to describe Pfizer's development trend from 2016 to 2018
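The data preparation behind such line charts might look like the sketch below. It is only an illustration: the helper name `company_trend`, the company names, and every number are hypothetical and are not Pfizer's actual figures.

```python
import pandas as pd

# Toy data: invented values, not taken from the project's dataset.
toy = pd.DataFrame({
    "Name": ["X Inc"] * 3 + ["Y Inc"] * 3,
    "Year": [2016, 2017, 2018] * 2,
    "Total Assets": [100.0, 110.0, 99.0, 10.0, 11.0, 12.0],
    "Number of Employees": [50.0, 48.0, 51.0, 5.0, 6.0, 7.0],
})

def company_trend(df, name, variables):
    """Return one company's variables as a table indexed by year."""
    one = df[df["Name"] == name]
    return one.sort_values("Year").set_index("Year")[variables]

trend = company_trend(toy, "X Inc", ["Total Assets", "Number of Employees"])
change = trend.pct_change() * 100  # year-over-year change in percent

print(trend)
print(change.round(1))
# With matplotlib installed, trend.plot(subplots=True, marker="o")
# would draw one line chart per variable.
```

Indexing by year keeps each variable as its own column, so each one can be drawn on a separate axis despite the different scales.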

Figure 2 uses time-series line graphs to show Pfizer's financial aspects. First, the change in total assets from 2016 to 2018 shows that Pfizer did not perform well in 2018: assets dropped from over 170,000 million U.S. dollars to below 160,000 million U.S. dollars, while the number of employees decreased slightly and then increased to around 92,000 in 2018. Second, return on assets and Tobin's Q ratio, used as indicators of development prospects, suggest that Pfizer tried to strengthen the company's influence and management. However, Pfizer still ranks only at the middle level of the cohort on these indicators in Figure 1. Over the same period, Pfizer continued to invest in R&D and to disclose more environmental information.

Chapter 5 Relationship between total assets and ESG scores

This chapter addresses the last objective: apply linear regression to disclose the relationship between total assets and ESG scores.

  1. Drop NaN values and outliers of the independent variables (3-sigma method), leaving 1,965 usable rows
  2. Lag ESG scores by one year (the effect of the scores needs time to appear)
  3. Take the natural log of total assets (the distribution is skewed)
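These three preparation steps could be sketched as below. This is one possible reading, not the project's actual code: the function name `prepare`, the ` (t-1)` and `ln Total Assets` column names, and the toy values are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def prepare(df, score_cols, k=3.0):
    """Drop NaNs and k-sigma outliers in the scores, lag the scores by one
    year within each company, and take the natural log of total assets."""
    out = df.dropna(subset=score_cols + ["Total Assets"]).copy()
    # 3-sigma rule: keep rows whose scores all lie within k standard
    # deviations of the respective column mean.
    z = (out[score_cols] - out[score_cols].mean()) / out[score_cols].std()
    out = out[(z.abs() <= k).all(axis=1)].copy()
    out = out.sort_values(["Ticker", "Year"])
    lag_cols = []
    for col in score_cols:
        lag = col + " (t-1)"
        # shift(1) takes the previous available row of the same company;
        # this equals the previous year only if no year is missing.
        out[lag] = out.groupby("Ticker")[col].shift(1)
        lag_cols.append(lag)
    out["ln Total Assets"] = np.log(out["Total Assets"])
    # The first year of each company has no lagged score, so it is dropped.
    return out.dropna(subset=lag_cols).reset_index(drop=True)

# Toy data: invented values for two hypothetical tickers.
toy = pd.DataFrame({
    "Ticker": ["AAA"] * 3 + ["BBB"] * 3,
    "Year": [2016, 2017, 2018] * 2,
    "Environmental Disclosure Score": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
    "Total Assets": [100.0, 200.0, 400.0, 50.0, 60.0, 70.0],
})

ready = prepare(toy, ["Environmental Disclosure Score"])
print(ready)
```

Lagging costs one year of observations per company, which is one reason the usable row count is well below the raw row count.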

Figure 3 shows scatterplots of the environmental, social and governance disclosure scores vs. ln(total assets), respectively. Generally, the points are fairly evenly spread, and a positive relationship between the independent and dependent variables is easy to observe. The scatterplots therefore suggest that the coefficients in the equation will be positive. Least squares estimation is then used to generate the parameters and other statistics that quantify the relationships.

Table 2 presents the regression results. Of note, the social score variable is excluded from the equation because, in the first attempt, the P-value of the social disclosure score was 0.302, which means its relationship is not significant. Hence, the result includes only two independent variables: the environmental and governance disclosure scores.
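The least-squares step can be sketched with NumPy alone. This is a minimal illustration rather than the project's notebook, which imports statsmodels: the function name `fit_ols`, the toy scores, and the coefficients used to build the toy response are all hypothetical, and the sketch returns point estimates only, not P-values.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.
    Returns [intercept, b1, b2, ...]."""
    A = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    beta, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return beta

# Toy lagged scores (invented values).
env_lag = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
gov_lag = np.array([50.0, 40.0, 60.0, 45.0, 70.0, 55.0])
# Noise-free toy response built from made-up coefficients,
# so the fit can be checked against known values.
ln_assets = 5.0 + 0.03 * env_lag + 0.05 * gov_lag

beta = fit_ols(np.column_stack([env_lag, gov_lag]), ln_assets)
print(beta.round(4))
```

Dropping a predictor, as done here with the social score, simply means leaving its column out of `X` and refitting.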

In equation (1), t-1 denotes scores lagged by one year. The coefficients are 0.0306 and 0.0550, and both P-values are less than 0.05, which means the relationships are significant. For every one-unit increase in last year's environmental and governance disclosure scores, total assets are reported to increase by e^5.2366 million U.S. dollars this year. The direct impact is significantly positive.
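Because the dependent variable is ln(total assets), coefficients of this kind are commonly also read as approximate percentage changes. The sketch below works through that arithmetic for the two reported coefficients as a complementary reading. It assumes the fitted equation has the form ln(total assets_t) = intercept + 0.0306 · E_(t-1) + 0.0550 · G_(t-1); both that form and the assignment of the coefficients to the environmental (E) and governance (G) scores are assumptions, and the function name `pct_effect` is illustrative.

```python
import math

def pct_effect(coef, delta=1.0):
    """Percentage change in y implied by a change of `delta` in x
    when the model is ln(y) = a + coef * x + ..."""
    return (math.exp(coef * delta) - 1.0) * 100.0

# Coefficients reported in the text; which score each belongs to is assumed.
b_env, b_gov = 0.0306, 0.0550

env_pct = pct_effect(b_env)             # one-unit rise in the E score
gov_pct = pct_effect(b_gov)             # one-unit rise in the G score
both_pct = pct_effect(b_env + b_gov)    # both scores rise by one unit

print(round(env_pct, 2), round(gov_pct, 2), round(both_pct, 2))
```

Under this reading the effect is proportional: a one-unit rise corresponds to roughly a 3% and a 6% increase in total assets, respectively.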

Chapter 6 Evaluation of model

This chapter checks the validity of the model.

  1. The distribution of residuals
  2. The histogram of ln(total assets)
  3. Test multicollinearity with the VIF values

The figures show that (1) the residuals are evenly scattered in the residual plots on the left; similarly, the histogram is well-shaped and symmetric. The three graphs indicate that the errors are approximately normally distributed and that heteroscedasticity is not a concern. (2) Multicollinearity is not a problem, as the VIF values in the VIF table are around 2.2.
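The VIF check can be reproduced by hand: the VIF of a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on the remaining ones. A minimal NumPy sketch follows; the function name `vif` and the toy columns are hypothetical, and the project itself imports `variance_inflation_factor` from statsmodels instead.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features),
    obtained by regressing each column on the others plus an intercept."""
    X = np.asarray(X, dtype=float)
    factors = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - resid.var() / y.var()
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)

# Toy predictors (invented values) with a correlation of 14.5 / 17.5.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])

v = vif(np.column_stack([x1, x2]))
print(v.round(3))
```

With only two predictors both VIF values are equal, and a VIF of about 2.2 corresponds to an R² of roughly 0.55 between one predictor and the other.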