
Data overview

Data (source):

Abalone characteristics:
  • "sex" - M, F, and I (infant).
  • "length" - longest shell measurement.
  • "diameter" - perpendicular to the length.
  • "height" - measured with meat in the shell.
  • "whole_wt" - whole abalone weight.
  • "shucked_wt" - the weight of abalone meat.
  • "viscera_wt" - gut-weight.
  • "shell_wt" - the weight of the dried shell.
  • "rings" - number of rings in a shell cross-section.
  • "age" - the age of the abalone: the number of rings + 1.5.
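
The "age" definition above is simple arithmetic on the ring count. As a minimal sketch (the function name `age_from_rings` is hypothetical and not part of the dataset or the original notebook), it can be written as:

```python
def age_from_rings(rings: int) -> float:
    """Estimated age in years: number of rings plus 1.5."""
    return rings + 1.5


# A few illustrative ring counts (made-up inputs, not rows from the dataset).
for rings in (7, 10, 15):
    print(rings, "rings ->", age_from_rings(rings), "years")
```

Because "age" is derived directly from "rings", the two columns carry the same information, which is worth keeping in mind when looking for predictors of age.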

Acknowledgments: Warwick J Nash, Tracy L Sellers, Simon R Talbot, Andrew J Cawthorn, and Wes B Ford (1994) "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait", Sea Fisheries Division, Technical Report No. 48 (ISSN 1034-3288).

Deliverables

Create a report that covers the following:

  1. How does weight change with age for each of the three sex categories?
  2. Can you estimate an abalone's age using its physical characteristics?
  3. Investigate which variables are better predictors of age for abalones.

Importing the Data

import pandas as pd
abalone = pd.read_csv('./data/abalone.csv')
abalone

01: How weight changes with age for each of the three sex categories

Approach: A line chart of average weight against age is a suitable way to see how weight changes as abalones get older.
Note: The analysis uses the 'whole_wt' feature as the measure of weight.

Splitting the DataFrame based on the three Sexes

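df = abalone
# Keep only the columns needed for the weight-vs-age analysis
cols = ['whole_wt', 'age']

# .copy() makes each subset independent of the original DataFrame, so a later
# in-place sort cannot trigger pandas' SettingWithCopyWarning
M_df = df.loc[df['sex'] == 'M', cols].copy()  # males
I_df = df.loc[df['sex'] == 'I', cols].copy()  # infants
F_df = df.loc[df['sex'] == 'F', cols].copy()  # females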

Sorting the rows by age

M_df.sort_values(by = ['age'], inplace = True)
I_df.sort_values(by = ['age'], inplace = True)
F_df.sort_values(by = ['age'], inplace = True)
print(M_df.head())
print(I_df.head())
print(F_df.head())

Plotting line chart for Sex 'M'

M_df = M_df.groupby(['age'], as_index=False)['whole_wt'].mean().rename({'whole_wt': 'avg_wt'}, axis=1)
Line chart: x-axis age, y-axis avg_wt

Change in Whole Weight with Age for Sex 'M'
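
The split, sort, and average steps used above for sex 'M' can also be done for all three categories in one pass. The following is only a sketch under stated assumptions: the `toy` DataFrame holds made-up values standing in for the real abalone data, and the names `avg_by_sex_age` and `wide` are hypothetical, not taken from the original notebook.

```python
import pandas as pd

# Toy stand-in for the abalone data; these values are made up for illustration.
toy = pd.DataFrame({
    "sex": ["M", "M", "M", "F", "F", "I", "I"],
    "age": [8.5, 8.5, 9.5, 8.5, 9.5, 6.5, 6.5],
    "whole_wt": [0.5, 0.7, 0.9, 0.6, 1.0, 0.2, 0.4],
})

# Average whole weight for every (sex, age) pair; groupby sorts the keys.
avg_by_sex_age = (
    toy.groupby(["sex", "age"], as_index=False)["whole_wt"]
    .mean()
    .rename(columns={"whole_wt": "avg_wt"})
)

# Reshape to one column per sex, indexed by age. Ages that a sex never
# reaches in the data show up as NaN.
wide = avg_by_sex_age.pivot(index="age", columns="sex", values="avg_wt")
print(wide)
```

With the real data, `toy` would be replaced by `abalone`. If matplotlib is installed, calling `wide.plot()` should draw one line per sex on shared axes, which makes the three categories easier to compare than three separate charts.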