Can you estimate the age of an abalone?
Introduction
Background: You are working as an intern for an abalone farming operation in Japan. For operational and environmental reasons, it is important to estimate the age of the abalones when they go to market.
Determining an abalone's age involves counting the number of rings in a cross-section of the shell through a microscope. Since this method is somewhat cumbersome and complex, you are interested in helping the farmers estimate the age of the abalone using its physical characteristics.
Loading packages
pip install category_encoders
pip install colored
import pandas as pd
import numpy as np
from termcolor import colored
from colored import fore, back, style
import seaborn as sns
import matplotlib.pyplot as plt
import scipy as sp
from scipy import stats
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.compose import TransformedTargetRegressor
from sklearn.compose import make_column_selector
from sklearn.compose import make_column_transformer
from sklearn.pipeline import make_pipeline,Pipeline
from sklearn.preprocessing import PowerTransformer
from sklearn.preprocessing import OneHotEncoder
import category_encoders as ce
import xgboost as xgb
from sklearn.linear_model import LinearRegression,Lasso,Ridge
from sklearn.metrics import mean_squared_error, r2_score
Loading data
abalone = pd.read_csv('./data/abalone.csv')
abalone.head()
Abalone characteristics:
- "sex" - M, F, and I (infant).
- "length" - longest shell measurement.
- "diameter" - perpendicular to the length.
- "height" - measured with meat in the shell.
- "whole_wt" - whole abalone weight.
- "shucked_wt" - the weight of abalone meat.
- "viscera_wt" - gut-weight.
- "shell_wt" - the weight of the dried shell.
- "rings" - number of rings in a shell cross-section.
- "age" - the age of the abalone: the number of rings + 1.5.
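Since "age" is defined as the ring count plus 1.5, the column can be derived (or checked) directly from "rings". The snippet below is a minimal sketch of that arithmetic; the three sample rows are made up for illustration and are not taken from abalone.csv.

```python
import pandas as pd

# Hypothetical sample rows (illustrative values, not from abalone.csv).
sample = pd.DataFrame({"rings": [15, 7, 9]})

# age = number of rings + 1.5, as stated in the column description.
sample["age"] = sample["rings"] + 1.5

print(sample)
```

On the real data, a comparison such as `(abalone["age"] == abalone["rings"] + 1.5).all()` should confirm that the provided "age" column follows this definition.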
Acknowledgments: Warwick J Nash, Tracy L Sellers, Simon R Talbot, Andrew J Cawthorn, and Wes B Ford (1994) "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait", Sea Fisheries Division, Technical Report No. 48 (ISSN 1034-3288).
# Check for missing values in each column
abalone.isna().sum()
Findings:
Let's move on to the analysis
Part A: How does weight change with age for each of the three sex categories?
Description
There are three sex categories that we have to consider, each with different weights:
What are we trying to find?
What will the stages be?
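# Print a bold title for the plots
print(colored('ScatterPlot: Weight by age and sex category', 'grey', attrs=['bold']))
# Columns 4 to 7 are the four weight measurements (whole, shucked, viscera, shell),
# assuming the column order listed in the data description
for column_ in abalone.columns[4:8]:
    # One panel per sex category, age on the x-axis and the weight on the y-axis
    g = sns.FacetGrid(abalone, col="sex")
    g.map(sns.scatterplot, 'age', column_)
    plt.show()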
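To put numbers on what the scatter plots suggest, one option is a rank correlation between age and each weight column, computed separately for each sex category. The snippet below is a minimal sketch under stated assumptions: Spearman correlation is chosen here only for illustration and is not necessarily the measure used in this analysis, and `demo` is a small made-up DataFrame that reuses the column names from the data description.

```python
import pandas as pd

# Made-up illustrative rows (not from abalone.csv), using the dataset's column names.
demo = pd.DataFrame({
    "sex": ["M", "M", "M", "F", "F", "F", "I", "I", "I"],
    "age": [8.5, 10.5, 12.5, 9.5, 11.5, 13.5, 5.5, 6.5, 7.5],
    "whole_wt": [0.50, 0.80, 1.10, 0.60, 0.90, 1.20, 0.10, 0.20, 0.35],
    "shell_wt": [0.15, 0.24, 0.30, 0.18, 0.36, 0.27, 0.03, 0.06, 0.10],
})

weight_cols = ["whole_wt", "shell_wt"]

# Spearman correlation equals the Pearson correlation of the ranks,
# so rank both series first and then use the default (Pearson) corr.
rows = {}
for sex, group in demo.groupby("sex"):
    age_rank = group["age"].rank()
    rows[sex] = {col: age_rank.corr(group[col].rank()) for col in weight_cols}

# One row per sex category, one column per weight measurement.
corr_by_sex = pd.DataFrame.from_dict(rows, orient="index")
print(corr_by_sex)
```

With the real data, `demo` would be replaced by `abalone` and `weight_cols` by the four weight columns. A value close to 1 in a given cell would indicate that the weight rises steadily with age in that category.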
Findings:
How could this be proven?
First stage:
Second stage: