Hidden cameras have filmed countless hours of footage for a wildlife documentary. That footage must be examined to find the moments when animals appear. Automatically classifying the captured images by whether they feature animals could save weeks of manual work.
You are part of a data science team involved in making this documentary, and your task is to prepare an image processing pipeline. The pipeline will augment existing datasets of images featuring animals so that an object-detection model can be trained and used on the footage.
Data augmentation creates variations of the original images through transformations such as rotation, scaling, and histogram equalization, increasing the diversity of the training dataset and the robustness of the models trained on it. This technique enriches the dataset without collecting more data, helping models generalize better to new, unseen images.
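As a minimal sketch of these transformations, the snippet below applies one resize, one rotation, and one adaptive histogram equalization to a small synthetic image using scikit-image. The synthetic gradient image and the specific parameter values are assumptions made for illustration; real photographs would be loaded from disk instead.

```python
import numpy as np
from skimage.exposure import equalize_adapthist
from skimage.transform import resize, rotate

# Synthetic RGB image (assumption for illustration): a horizontal gradient
# with float values in [0, 1] and shape (64, 48, 3)
gradient = np.tile(np.linspace(0.0, 1.0, 48), (64, 1))
image = np.stack([gradient, gradient, gradient], axis=-1)

# Each transformation yields a new variant of the same source image
variants = {
    "resized": resize(image, (32, 32), anti_aliasing=True),
    "rotated": rotate(image, 15),  # angle in degrees, counter-clockwise
    "equalized": equalize_adapthist(image, clip_limit=0.01),
}

for name, variant in variants.items():
    print(name, variant.shape)
```

Each variant keeps the content of the source image while changing its geometry or contrast, which is what lets a single photograph contribute several training examples.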
In this project, you will perform image processing operations on five images featuring animals. The output of this project will be a pipeline transforming images to produce augmented datasets.
Data
The COCO (Common Objects in Context) dataset is designed for training and evaluating computer vision models on a variety of tasks, including object detection. Five images featuring animals have been downloaded from it.
The list file_names is already defined and contains the names of the image files in the current directory.
# Import Matplotlib to read and display images
import matplotlib.pyplot as plt
# Import NumPy for array operations
import numpy as np
# Import the necessary modules from scikit-image for image transformation and exposure adjustment
from skimage.transform import resize, rotate
from skimage.exposure import equalize_adapthist
# List of filenames for the images to be processed
file_names = ["000000546829.jpg", "000000012062.jpg", "000000417085.jpg", "000000269314.jpg", "000000575357.jpg"]
def image_processing(
    file_names: list,
    size: tuple = (250, 250),
    rotation_angle: float = 0,
    equalization_clip_limit: float = None
) -> list:
    """
    Imports and transforms multiple images: resize, rotate, and optionally apply CLAHE.

    Parameters:
        file_names (list): List of image file paths.
        size (tuple): Target resize shape (height, width). Default (250, 250).
        rotation_angle (float): Rotation angle in degrees. Default 0.
        equalization_clip_limit (float or None): CLAHE clip limit (0-1). If None, skip equalization.

    Returns:
        list: List of transformed images as numpy arrays.
    """
    processed_images = []
    for fname in file_names:
        # Read image
        img = plt.imread(fname)
        # Resize
        img_resized = resize(img, size, anti_aliasing=True, preserve_range=True)
        # Rotate
        img_rotated = rotate(img_resized, rotation_angle, resize=False, preserve_range=True)
        # Rescale to [0, 1] if the values are still in the 0-255 range
        # (equalize_adapthist expects float in [0, 1])
        img_float = img_rotated / 255.0 if img_rotated.max() > 1 else img_rotated
        # Optionally apply CLAHE (Contrast Limited Adaptive Histogram Equalization)
        if equalization_clip_limit is not None:
            # If image is color, apply CLAHE to each channel
            if img_float.ndim == 3 and img_float.shape[2] == 3:
                img_final = np.zeros_like(img_float)
                for c in range(3):
                    img_final[..., c] = equalize_adapthist(img_float[..., c], clip_limit=equalization_clip_limit)
            else:
                img_final = equalize_adapthist(img_float, clip_limit=equalization_clip_limit)
        else:
            img_final = img_float
        processed_images.append(img_final)
    return processed_images
# Apply the function to the dataset
transformed_images = image_processing(
    file_names=file_names,
    size=(500, 500),
    rotation_angle=15,
    equalization_clip_limit=0.01
)
# Visualize the transformed images
fig, axes = plt.subplots(1, len(transformed_images), figsize=(10, 10))
for ax, img, fname in zip(axes, transformed_images, file_names):
    ax.imshow(img)
    ax.set_title(fname)
    ax.axis('off')
plt.tight_layout()
plt.show()
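Because the goal is an augmented dataset rather than a single transformed copy of each image, image_processing would typically be called once per parameter combination. The sketch below builds such combinations with itertools.product. The helper name build_augmentation_grid and the example angles and clip limits are illustrative assumptions, not part of the original project.

```python
from itertools import product


def build_augmentation_grid(angles, clip_limits, size=(500, 500)):
    """Return one keyword-argument dict per (angle, clip limit) combination.

    Hypothetical helper: the keys mirror the parameters of image_processing.
    """
    return [
        {
            "size": size,
            "rotation_angle": angle,
            "equalization_clip_limit": clip_limit,
        }
        for angle, clip_limit in product(angles, clip_limits)
    ]


# Example values chosen for illustration only
grid = build_augmentation_grid(angles=[-15, 0, 15], clip_limits=[None, 0.01])

for params in grid:
    print(params)
```

Each dictionary could then be unpacked into a call such as image_processing(file_names=file_names, **params). With these example values, that would yield six augmented copies of every input image.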