Competition - Abalone Seafood Farming
  • Can you estimate the age of an abalone?

    📖 Background

    You are working as an intern for an abalone farming operation in Japan. For operational and environmental reasons, it is important to estimate the age of the abalones when they go to market.

    Determining an abalone's age involves counting the number of rings in a cross-section of the shell through a microscope. Since this method is somewhat cumbersome and complex, you are interested in helping the farmers estimate the age of the abalone using its physical characteristics.

    💾 The data

    You have access to the following historical data (source):

    Abalone characteristics:
    • "sex" - M, F, and I (infant).
    • "length" - longest shell measurement.
    • "diameter" - perpendicular to the length.
    • "height" - measured with meat in the shell.
    • "whole_wt" - whole abalone weight.
    • "shucked_wt" - the weight of abalone meat.
    • "viscera_wt" - gut-weight.
    • "shell_wt" - the weight of the dried shell.
    • "rings" - number of rings in a shell cross-section.
    • "age" - the age of the abalone: the number of rings + 1.5.
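The last definition is plain arithmetic, so it can be checked quickly. A minimal sketch, assuming a toy frame with made-up ring counts (these are not rows from the dataset):

```python
import pandas as pd

# Toy ring counts for illustration only; not taken from the abalone data.
toy = pd.DataFrame({"rings": [7, 10, 15]})

# age = rings + 1.5, as defined in the column list above
toy["age"] = toy["rings"] + 1.5

print(toy)
```

Because "age" is derived directly from "rings", the two columns carry the same information.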

    Acknowledgments: Warwick J Nash, Tracy L Sellers, Simon R Talbot, Andrew J Cawthorn, and Wes B Ford (1994) "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait", Sea Fisheries Division, Technical Report No. 48 (ISSN 1034-3288).

    💪 Competition challenge

    Create a report that covers the following:

    1. How does weight change with age for each of the three sex categories?
    2. Can you estimate an abalone's age using its physical characteristics?
    3. Investigate which variables are better predictors of age for abalones.

    Estimate the Age of Abalone:

    Import all the libraries needed for the analysis.

    import pandas as pd
    import seaborn as sns
    import plotly.express as px
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots
    import matplotlib.pyplot as plt

    Create a dataframe from the abalone dataset and separate the input and output columns.

    # creating a dataframe of the abalone dataset
    abalone = pd.read_csv('./data/abalone.csv')
    
    # splitting dependent and independent characteristics
    characteristics = abalone.iloc[:, :-1]
    age = abalone.iloc[:, -1]
    
    # printing abalone dataframe
    abalone.head()
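The positional split above relies on "age" being the last column, and it leaves "rings" among the inputs. A name-based variant is sketched below on a toy frame with made-up values standing in for the CSV. Dropping "rings" is an assumption made here, since age is defined as rings + 1.5; it is not something the original analysis does.

```python
import pandas as pd

# Toy rows with made-up values, standing in for ./data/abalone.csv
abalone = pd.DataFrame({
    "sex": ["M", "F", "I"],
    "length": [0.45, 0.53, 0.33],
    "shell_wt": [0.15, 0.21, 0.055],
    "rings": [15, 9, 7],
})
abalone["age"] = abalone["rings"] + 1.5

# Select by name rather than by position; "rings" is dropped as well
# (an assumption of this sketch) because it determines age exactly.
characteristics = abalone.drop(columns=["rings", "age"])
age = abalone["age"]

print(characteristics.columns.tolist())
```

Selecting by name keeps the split correct even if the column order in the file changes.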

    We could also convert sex to an integer, but we won't need it in the calculations, so we leave it as it is.
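For reference, such a conversion could look like the sketch below. The frame is a toy example and the code values in `sex_codes` are arbitrary, hypothetical choices, not part of the dataset.

```python
import pandas as pd

# Toy column for illustration only
toy = pd.DataFrame({"sex": ["M", "F", "I", "M"]})

# Hypothetical mapping; the integers themselves are arbitrary labels
sex_codes = {"M": 0, "F": 1, "I": 2}
toy["sex_code"] = toy["sex"].map(sex_codes)

print(toy)
```

Integer codes suggest an ordering that the categories do not have, so a model would usually be better served by indicator columns, for example from `pd.get_dummies(toy["sex"])`.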

    Explore Dataset

    Let's explore the dataset first:

    • Check the datatypes we are working with.
    • Check whether there are any null values in the dataset.
    • Look at the summary statistics of the data.

    # getting the datatypes of characteristics (columns)
    abalone.info()

    We can see that 8 of the 10 columns in the dataset hold float values. The exceptions are the rings column, which holds integer values, and sex, which is an object column.
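The same grouping by type can be obtained programmatically instead of reading it off the `info()` output. A small sketch on a toy frame with made-up values:

```python
import pandas as pd

# Toy frame with one text, one float and one integer column
toy = pd.DataFrame({
    "sex": ["M", "F"],
    "length": [0.45, 0.53],
    "rings": [15, 9],
})

# Split the column names into numeric and non-numeric groups
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
other_cols = toy.select_dtypes(exclude="number").columns.tolist()

print(numeric_cols, other_cols)
```

On the real dataframe, the same two calls on `abalone` would give the lists of columns to use for numeric and categorical summaries.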

    # checking for null values
    abalone.isnull().values.any()

    There are no null values in the dataset, so let's look at some summary statistics.

    # getting some descriptive statistics for all numeric columns
    abalone.describe()

    # getting descriptive statistics for object column (sex)
    abalone.describe(include=object)
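Since the first challenge question compares the three sex categories, per-group statistics are a natural next step. A minimal sketch, assuming a toy frame with made-up values rather than the real data:

```python
import pandas as pd

# Toy rows with made-up values, two per sex category
toy = pd.DataFrame({
    "sex": ["M", "M", "F", "F", "I", "I"],
    "whole_wt": [0.9, 1.1, 1.0, 1.2, 0.4, 0.5],
    "age": [11.5, 12.5, 12.5, 13.5, 8.5, 9.5],
})

# Mean weight and mean age within each sex category
by_sex = toy.groupby("sex")[["whole_wt", "age"]].mean()

print(by_sex)
```

Replacing `.mean()` with `.describe()` would give the full set of summary statistics per category.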