Mastering Backpropagation: A Comprehensive Guide for Neural Network

    Install relevant libraries

    !pip -q install numpy tensorflow keras matplotlib

    Step 1: Loading and Preprocessing the MNIST Dataset

    Loading the Dataset

    from keras.utils import to_categorical
    from keras.datasets import mnist
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()
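
A quick sanity check on the loaded arrays can catch surprises before any plotting. The sketch below uses a hypothetical helper, `describe_split` (not part of Keras), and assumes the inputs are NumPy arrays like those returned by `mnist.load_data()`. A small synthetic array stands in for the real data so the snippet runs without a download.

```python
import numpy as np

def describe_split(images, labels):
    """Summarize an image split: sample count, image shape, dtype, pixel range."""
    if len(images) != len(labels):
        raise ValueError("images and labels must have the same number of samples")
    return {
        "n_samples": len(images),
        "image_shape": images.shape[1:],
        "dtype": str(images.dtype),
        "pixel_min": int(images.min()),
        "pixel_max": int(images.max()),
    }

# Synthetic stand-in for MNIST (an assumption for illustration only):
# 5 grayscale images of 28x28 uint8 pixels with random labels 0-9
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=5)

print(describe_split(fake_images, fake_labels))
```

With the real data, `describe_split(train_images, train_labels)` should report 60,000 samples of shape `(28, 28)` stored as `uint8`.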

    Exploratory Analysis

    This section visualizes a few handwritten digits and plots the distribution of each label (0 to 9) in the training dataset.

    import matplotlib.pyplot as plt
    import random
    # Total number of samples across the training and test splits
    total_samples = train_images.shape[0] + test_images.shape[0]

    print("Training data")
    print(f"- X = {train_images.shape}, y = {train_labels.shape}")
    print(f"- Holds {train_images.shape[0] / total_samples * 100:.1f}% of the overall data")

    print("\n")

    print("Testing data")
    print(f"- X = {test_images.shape}, y = {test_labels.shape}")
    print(f"- Holds {test_images.shape[0] / total_samples * 100:.1f}% of the overall data")

    Random Digits

    Now let's plot 9 random images from the training dataset.

    def plot_images(nb_images_to_plot, train_data):
        # Draw distinct random indices from the training data
        random_indices = random.sample(range(len(train_data)), nb_images_to_plot)

        # Plot each image on a 3x3 grid (so at most 9 images)
        for i, idx in enumerate(random_indices):
            plt.subplot(3, 3, i + 1)
            plt.imshow(train_data[idx], cmap='gray')
            plt.axis('off')

        plt.show()

    nb_images_to_plot = 9

    plot_images(nb_images_to_plot, train_images)

    Digits Distribution

    import numpy as np
    
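    def plot_labels_distribution(data_labels):
        # Count how many samples carry each label (0 to 9)
        counts = np.bincount(data_labels, minlength=10)

        # Matplotlib 3.6 renamed the seaborn styles with a "seaborn-v0_8-" prefix
        style = 'seaborn-v0_8-dark-palette'
        if style not in plt.style.available:
            style = 'seaborn-dark-palette'
        plt.style.use(style)

        fig, ax = plt.subplots(figsize=(10, 5))
        ax.bar(range(10), counts, width=0.8, align='center')
        ax.set(xticks=range(10), xlim=[-1, 10], xlabel='Digit',
               ylabel='Number of samples', title='Training data distribution')

        plt.show()


    plot_labels_distribution(train_labels)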

    Note: the ten digits are almost evenly distributed in the training dataset, with each class holding roughly 5,400 to 6,700 of the 60,000 samples.
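
The visual impression from the bar chart can be backed with numbers. The sketch below uses a hypothetical helper, `class_balance`, which assumes non-negative integer labels in the range 0 to 9. The toy label array is only there to keep the snippet self-contained.

```python
import numpy as np

def class_balance(labels, num_classes=10):
    """Return per-class share (in %) and the largest/smallest non-empty class ratio."""
    counts = np.bincount(labels, minlength=num_classes)
    shares = counts / counts.sum() * 100
    # Ignore empty classes so the ratio is always defined
    nonzero = counts[counts > 0]
    imbalance_ratio = nonzero.max() / nonzero.min()
    return shares, imbalance_ratio

# Toy labels (illustrative only): digit 2 appears three times, most digits once
toy_labels = np.array([0, 1, 1, 2, 2, 2, 3, 3, 4, 5, 6, 7, 8, 9])
shares, ratio = class_balance(toy_labels)

for digit, share in enumerate(shares):
    print(f"digit {digit}: {share:.1f}%")
print(f"largest/smallest class ratio: {ratio:.2f}")
```

A ratio close to 1 indicates balanced classes. Calling `class_balance(train_labels)` applies the same check to the MNIST training labels.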