AN ANALYSIS OF THE TITANIC DATASET USING PIVOT TABLES

EXPLORING TITANIC DATA USING PIVOT TABLES

The following is an analysis of the Titanic dataset, a popular dataset for data analysis and machine learning. It contains information about the passengers on board the Titanic, including features such as age, gender, fare, cabin, and survival status. The aim of this analysis is to practise using pivot tables to compare features and to run numerical summaries across categorical variables.

## importing the necessary libraries and reading the data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
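## the 'seaborn' style was renamed to 'seaborn-v0_8' in Matplotlib 3.6 and the old name was later removed
plt.style.use('seaborn-v0_8')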
df = pd.read_csv("train.csv")
df.head()
## dropping certain columns to make it easier to analyse the data and demonstrate the abilities of pivot tables
df = df.drop(['PassengerId','Ticket','Name','Cabin','Embarked'], axis = 'columns')
## building the pivot tables using 'sex' column as the index
table = pd.pivot_table(data = df, index = ['Sex'])
table
## visualizing the data to compare features for both genders
table.plot(kind = 'bar')
plt.show()

On average, female passengers paid noticeably higher fares than male passengers. The mean age of male passengers is slightly higher than that of female passengers. Because pivot_table averages every column by default, the Survived value here is a survival rate rather than a count, and it is considerably higher for female passengers than for male passengers.
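To make the default behaviour concrete, here is a minimal sketch on a tiny made-up table (the `toy` values are hypothetical, not real Titanic records). It shows that calling `pd.pivot_table` without `aggfunc` gives per-group means, so a 0/1 column such as Survived turns into a rate, while a sum gives the head count.

```python
import pandas as pd

# Hypothetical toy data for illustration only, NOT the real Titanic records.
toy = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "male"],
    "Survived": [1, 1, 0, 1, 0],
    "Fare": [80.0, 20.0, 10.0, 30.0, 8.0],
})

# With no aggfunc, pivot_table averages every remaining numeric column per group.
default_table = pd.pivot_table(data=toy, index=["Sex"])

# The mean of a 0/1 column is the share of ones, i.e. a survival rate.
survival_rate = default_table["Survived"]

# Summing the same column gives the number of survivors instead.
survivor_count = toy.groupby("Sex")["Survived"].sum()

print(default_table)
print(survivor_count)
```

On the real data, the same distinction decides whether a bar in the chart should be read as a share of passengers or as a number of passengers.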

RUNNING A MULTI-LEVEL INDEX

## using 'Sex' and 'Pclass'
table1 = pd.pivot_table(data = df, index = ['Sex','Pclass'])
table1
table1.plot(kind = 'bar')
plt.show()

Using multiple index levels lets us conclude that the fare disparity between female and male passengers was apparent in every passenger class (Pclass) on the Titanic.
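One way to check a claim like this more directly is to restrict the pivot table to a single value column and spread the second key across the columns. The sketch below uses a small made-up table (the `toy` values are hypothetical) together with the `values` and `columns` parameters of `pd.pivot_table`.

```python
import pandas as pd

# Hypothetical toy data for illustration only, NOT the real Titanic records.
toy = pd.DataFrame({
    "Sex": ["female", "female", "female", "male", "male", "male"],
    "Pclass": [1, 3, 3, 1, 1, 3],
    "Fare": [100.0, 12.0, 8.0, 60.0, 40.0, 7.0],
})

# One row per sex, one column per passenger class, mean fare in each cell.
fare_table = pd.pivot_table(
    data=toy,
    values="Fare",
    index="Sex",
    columns="Pclass",
    aggfunc="mean",
)

# Difference in mean fare (female minus male) within each class.
fare_gap = fare_table.loc["female"] - fare_table.loc["male"]

print(fare_table)
print(fare_gap)
```

Calling `fare_table.plot(kind='bar')` on a table shaped like this draws one group of bars per sex with one bar per class, and the fares are not squeezed onto the same axis as the other features.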

DIFFERENT AGGREGATE FUNCTIONS FOR DIFFERENT FEATURES USING 'aggfunc'

## Comparing the mean age and total number of survivors for both genders and all Passenger classes
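## passing the function names as strings avoids the FutureWarning that newer pandas versions raise for np.mean and np.sum
table3 = pd.pivot_table(data = df, index = ['Sex','Pclass'], aggfunc = {'Age':'mean','Survived':'sum'})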
table3
table3.plot(kind = 'bar')
plt.show()

More female than male passengers survived, and first class had the highest number of female survivors. Interestingly, the number of male survivors was nearly the same in first and third class, while it was considerably lower in second class. These are raw counts, so they do not account for how many passengers travelled in each class; the survival data needs a closer look before drawing conclusions.
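A possible next step, sketched here on made-up data (the `toy` values are hypothetical), is to pass a list to `aggfunc` so that the survivor count, the group size, and the survival rate appear side by side. In this toy example two groups have the same number of survivors but very different rates, which is the kind of effect raw counts can hide.

```python
import pandas as pd

# Hypothetical toy data for illustration only, NOT the real Titanic records.
toy = pd.DataFrame({
    "Sex": ["male"] * 6 + ["female"] * 2,
    "Pclass": [1, 1, 3, 3, 3, 3, 1, 3],
    "Survived": [1, 0, 1, 0, 0, 0, 1, 1],
})

# A list of aggregations yields one column block per function:
# 'sum' = number of survivors, 'count' = group size, 'mean' = survival rate.
summary = pd.pivot_table(
    data=toy,
    index=["Sex", "Pclass"],
    values="Survived",
    aggfunc=["sum", "count", "mean"],
)

# The columns form a two-level index such as ('mean', 'Survived').
male_first_rate = summary.loc[("male", 1), ("mean", "Survived")]
male_third_rate = summary.loc[("male", 3), ("mean", "Survived")]

print(summary)
```

Applied to the real `df`, a table like this would show whether similar survivor counts in two classes come from similar survival rates or simply from different group sizes.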