
Fashion Forward is a new AI-based e-commerce clothing retailer. They want to use image classification to automatically categorize new product listings, making it easier for customers to find what they're looking for. It will also assist in inventory management by quickly sorting items.

As a data scientist tasked with implementing a garment classifier, my primary objective is to develop a machine learning model capable of accurately categorizing images of clothing items into distinct garment types such as shirts, trousers, shoes, etc.

# Importing libraries
import numpy as np
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
from torchmetrics import Accuracy, Precision, Recall
import torchvision
from torchvision import datasets
import torchvision.transforms as transforms
# Load datasets
train_data = datasets.FashionMNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_data = datasets.FashionMNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
train_data
test_data

View the dataset

class_names = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'
]
fig, axes = plt.subplots(2, 5, figsize=(10, 5)) 

for i in range(10):
    image, label = train_data[i]
    ax = axes[i // 5, i % 5] 
    ax.imshow(image.squeeze(), cmap='gray')
    ax.set_title(class_names[label], fontsize=8)
    ax.axis('off')

plt.tight_layout()
plt.show()

Model Architecture

# Obtain the number of classes for prediction
num_classes = len(train_data.classes)
num_classes

🧠 GarmentClassifier CNN Model Architecture

This convolutional neural network (CNN) is designed to classify grayscale clothing images from the FashionMNIST dataset into 10 predefined garment categories.


πŸ“ Architecture Overview

The model has two main components:

  • Feature Extractor: Extracts visual patterns using convolutional layers
  • Classifier: Fully connected layer that maps extracted features to output classes

πŸ” Layer Breakdown

| Layer Type | Description |
|---|---|
| Conv2d | Input: 1 channel → 32 filters, 3×3 kernel, stride 1, no padding (28×28 → 26×26) |
| ReLU | Adds non-linearity |
| MaxPool2d | 2×2 window, stride 2; halves each spatial dimension (26×26 → 13×13) |
| Conv2d | 32 channels → 64 filters, 3×3 kernel, stride 1, no padding (13×13 → 11×11) |
| ReLU | Adds non-linearity |
| MaxPool2d | 2×2 window, stride 2; halves each spatial dimension, rounding down (11×11 → 5×5) |
| Flatten | Converts the 64×5×5 feature maps into a 1,600-element vector for the dense layer |
| Linear | Fully connected layer mapping the 1,600 features to num_classes outputs |
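
The input size of the final linear layer follows from the standard output-size formula for convolution and pooling layers, floor((size + 2·padding − kernel) / stride) + 1. The sketch below traces a 28×28 image through the two conv/pool blocks. The helper names `conv_out` and `flattened_features` are hypothetical and exist only for this illustration. It evaluates both PyTorch's default padding of 0, which is what the `GarmentClassifier` class uses, and a padding of 1 for comparison.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(padding, size=28, channels=64):
    """Trace a square input through conv -> pool -> conv -> pool."""
    size = conv_out(size, kernel=3, padding=padding)  # first 3x3 convolution
    size = conv_out(size, kernel=2, stride=2)         # first 2x2 max pool
    size = conv_out(size, kernel=3, padding=padding)  # second 3x3 convolution
    size = conv_out(size, kernel=2, stride=2)         # second 2x2 max pool
    return size, channels * size * size


print(flattened_features(padding=0))  # (5, 1600)
print(flattened_features(padding=1))  # (7, 3136)
```

With no padding the final feature maps are 5×5, which matches `nn.Linear(64*5*5, num_classes)` in the class definition. Adding `padding=1` to both convolutions would require changing that layer to `64*7*7` inputs.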

🧾 Input

  • Shape: (batch_size, 1, 28, 28)
  • Grayscale images from FashionMNIST
  • Converted to tensors with transforms.ToTensor(), which scales pixel values to [0, 1]; no resizing or mean/std normalization is applied

🧾 Output

  • Shape: (batch_size, num_classes)
  • Raw class scores (logits) for each clothing category
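
Because the model returns logits, probabilities and predicted labels have to be derived from them. A minimal sketch, using random logits as a stand-in for `model(images)` (the batch of 4 and the seed are arbitrary assumptions):

```python
import torch
import torch.nn.functional as F

class_names = [
    'T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat',
    'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'
]

torch.manual_seed(0)
logits = torch.randn(4, len(class_names))  # stand-in for model(images)

probs = F.softmax(logits, dim=1)   # each row now sums to 1
pred_idx = logits.argmax(dim=1)    # index of the highest score per image
pred_labels = [class_names[i] for i in pred_idx.tolist()]

print(probs.sum(dim=1))
print(pred_labels)
```

Softmax does not change which class ranks highest, so `argmax` can be taken directly on the logits. `nn.CrossEntropyLoss` also expects raw logits, which is why the model's `forward` does not apply softmax itself.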

πŸ“ Notes

  • Uses only basic CNN components for simplicity and transparency
  • No dropout or batch normalization layers included
  • Suitable for image classification tasks with small and structured datasets

# Defining the CNN architecture
class GarmentClassifier(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        # Block 1: (N, 1, 28, 28) -> conv -> (N, 32, 26, 26) -> pool -> (N, 32, 13, 13)
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)

        # Block 2: (N, 32, 13, 13) -> conv -> (N, 64, 11, 11) -> pool -> (N, 64, 5, 5)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)

        # Classifier: 64 * 5 * 5 = 1600 flattened features -> one logit per class
        self.fc = nn.Linear(64*5*5, num_classes)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = x.view(x.size(0), -1)  # flatten to (N, 1600)
        x = self.fc(x)             # raw logits, shape (N, num_classes)
        return x

Training the Model

# Hyperparameters
batch_size = 64
learning_rate = 0.001
num_epochs = 1
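
These hyperparameters would typically feed a `DataLoader` and an optimizer. The following is only a sketch of one possible training loop, not the notebook's own. The function name `train`, the choice of Adam, and the cross-entropy loss are assumptions. To stay runnable without downloading FashionMNIST, it trains a tiny stand-in model on random tensors.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset


def train(model, loader, num_epochs, learning_rate):
    """Run a plain supervised loop and return the mean loss of each epoch."""
    criterion = nn.CrossEntropyLoss()  # assumption: takes raw logits and integer labels
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)  # assumption: Adam
    history = []
    model.train()
    for _ in range(num_epochs):
        running_loss = 0.0
        for images, labels in loader:
            optimizer.zero_grad()                    # clear gradients from the last step
            loss = criterion(model(images), labels)
            loss.backward()                          # backpropagate
            optimizer.step()                         # update the weights
            running_loss += loss.item()
        history.append(running_loss / len(loader))
    return history


# Random stand-in data with the same shapes as FashionMNIST batches.
torch.manual_seed(0)
images = torch.rand(256, 1, 28, 28)
labels = torch.randint(0, 10, (256,))
loader = DataLoader(TensorDataset(images, labels), batch_size=64, shuffle=True)

# Tiny stand-in model; in the notebook this would be GarmentClassifier(num_classes).
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
history = train(model, loader, num_epochs=1, learning_rate=0.001)
print(history)
```

With the real data, the loader would be built as `DataLoader(train_data, batch_size=batch_size, shuffle=True)` and the model as `GarmentClassifier(num_classes)`.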