Project: Predicting Traffic Volume with PyTorch

Traffic data fluctuates constantly and is strongly affected by time. Predicting it can be challenging, but this task will help sharpen your time-series skills. Deep learning models can pick up abstract patterns in the data that help improve predictions.

Your task is to build a system that predicts traffic volume, that is, the number of vehicles passing a specific point at a specific time. Determining this can help reduce road congestion, support new designs for roads or intersections, improve safety, and more! Or you can use it to plan your commute and avoid traffic!

The dataset provided contains the hourly traffic volume on an interstate highway in Minnesota, USA. It also includes weather features and holidays, which often impact traffic volume.

Time to predict some traffic!

The data:

The dataset is collected and maintained by the UCI Machine Learning Repository. The target variable is traffic_volume. The data has already been normalized and saved into the training and test files listed below, followed by a description of their columns:

train_scaled.csv, test_scaled.csv

Column	Type	Description
`temp`	Numeric	Average temperature in kelvin
`rain_1h`	Numeric	Amount in mm of rain that occurred in the hour
`snow_1h`	Numeric	Amount in mm of snow that occurred in the hour
`clouds_all`	Numeric	Percentage of cloud cover
`date_time`	DateTime	Hour of the data collected, in local time (CST)
`holiday_` (11 columns)	Categorical	US national holidays plus one regional holiday, the Minnesota State Fair
`weather_main_` (11 columns)	Categorical	Short textual description of the current weather
`weather_description_` (35 columns)	Categorical	Longer textual description of the current weather
`traffic_volume`	Numeric	Hourly I-94 ATR 301 reported westbound traffic volume
`hour_of_day`	Numeric	The hour of the day
`day_of_week`	Numeric	The day of the week (0 = Monday, 6 = Sunday)
`day_of_month`	Numeric	The day of the month
`month`	Numeric	The number of the month
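
The one-hot encoded groups in the table above share a common prefix, so one quick sanity check after loading a file is to count the columns per prefix. The sketch below is only an illustration: the helper name `count_columns_by_prefix` and the shortened column list are hypothetical, and in practice you would pass `train_scaled_df.columns` instead.

```python
def count_columns_by_prefix(columns, prefixes):
    """Return {prefix: number of column names that start with that prefix}."""
    return {p: sum(1 for c in columns if c.startswith(p)) for p in prefixes}

# Hypothetical, shortened column list (the real files have more columns per group)
example_columns = [
    "temp", "rain_1h", "snow_1h", "clouds_all",
    "holiday_Labor Day", "holiday_Christmas Day",
    "weather_main_Clear", "weather_main_Rain", "weather_main_Snow",
    "weather_description_light rain",
    "traffic_volume",
]

prefixes = ["holiday_", "weather_main_", "weather_description_"]
counts = count_columns_by_prefix(example_columns, prefixes)
print(counts)
```

On the real data, the counts would be expected to match the 11, 11, and 35 columns listed in the table.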

# Import the relevant libraries
import numpy as np
import pandas as pd

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

# Read the traffic data from the CSV training and test files
train_scaled_df = pd.read_csv('train_scaled.csv')
test_scaled_df = pd.read_csv('test_scaled.csv')

class trafficVolumePredictor(nn.Module):
    def __init__(self, input_size):
        super(trafficVolumePredictor, self).__init__()
        self.gru = nn.GRU(
            input_size=input_size,
            hidden_size=32,
            num_layers=2,
            batch_first=True  # Ensure the input has batch size as the first dimension
        )
        self.fc = nn.Linear(32, 1)

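    def forward(self, x):
        # Accept 2D input (batch, features) by adding a sequence dimension of length 1
        if x.dim() == 2:
            x = x.unsqueeze(1)
        # Initial hidden state of zeros: (num_layers, batch_size, hidden_size)
        h0 = torch.zeros(self.gru.num_layers, x.shape[0], self.gru.hidden_size, device=x.device)

        # Pass through the GRU and keep only the output of the last time step
        out, _ = self.gru(x, h0)
        out = self.fc(out[:, -1, :])
        return out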
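
To see why `forward` selects `out[:, -1, :]`, it can help to trace tensor shapes through a GRU on random data. This is a minimal sketch, assuming 8 input features and a batch of 4 samples, both made-up numbers chosen only for illustration. It rebuilds the two layers directly rather than reusing the class above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n_features = 8   # assumed feature count, for illustration only
batch_size = 4   # assumed batch size, for illustration only

gru = nn.GRU(input_size=n_features, hidden_size=32, num_layers=2, batch_first=True)
fc = nn.Linear(32, 1)

x = torch.randn(batch_size, 1, n_features)  # (batch, seq_length=1, features)
out, h_n = gru(x)                           # the initial hidden state defaults to zeros

print(out.shape)   # (batch, seq_length, hidden_size)
print(h_n.shape)   # (num_layers, batch, hidden_size)

last_step = out[:, -1, :]   # output of the top layer at the final time step
pred = fc(last_step)        # one prediction per sample
print(pred.shape)
```

Since PyTorch initializes the hidden state to zeros when none is given, passing an explicit zero `h0` as the class does should give the same result.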

# Convert the training and test sets into tensors
X_train = torch.tensor(train_scaled_df.drop(columns=['traffic_volume']).values, dtype=torch.float32)
y_train = torch.tensor(train_scaled_df['traffic_volume'].values, dtype=torch.float32)
X_test = torch.tensor(test_scaled_df.drop(columns=['traffic_volume']).values, dtype=torch.float32)
y_test = torch.tensor(test_scaled_df['traffic_volume'].values, dtype=torch.float32)

# Reshape X_train and X_test to be 3D (batch_size, seq_length=1, input_size)
X_train = X_train.unsqueeze(1)
X_test = X_test.unsqueeze(1)

# Create dataset
train_dataset = TensorDataset(X_train, y_train)
test_dataset = TensorDataset(X_test, y_test)

# Define batch size
batch_size = 32

# Use DataLoader to create mini-batches
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)
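
With `seq_length=1`, each sample carries a single hour of features, so the GRU never sees earlier hours. One possible alternative, which is not part of the solution above, is to build sliding windows so that each sample holds several consecutive hours. The sketch below assumes the rows are sorted by time and evenly spaced. `make_windows` and `window` are hypothetical names, and the toy arrays stand in for the real feature matrix and target.

```python
import numpy as np

def make_windows(features, targets, window):
    """Stack `window` consecutive rows per sample; the target is the row right after the window."""
    X, y = [], []
    for start in range(len(features) - window):
        X.append(features[start:start + window])
        y.append(targets[start + window])
    return np.stack(X), np.array(y)

# Toy data: 6 time steps with 2 features each
features = np.arange(12, dtype=np.float32).reshape(6, 2)
targets = np.arange(6, dtype=np.float32) * 10

X_seq, y_seq = make_windows(features, targets, window=3)
print(X_seq.shape, y_seq)
```

The resulting array already has the (batch_size, seq_length, input_size) layout, so it could be converted with `torch.from_numpy(X_seq)` and used in place of the `unsqueeze(1)` reshape.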

# Define model, loss, and optimizer
model = trafficVolumePredictor(X_train.shape[2])
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop with mini-batches
model.train()
epochs = 10

for epoch in range(epochs):
    epoch_loss = 0  # Track loss for this epoch

    for batch_X, batch_y in train_loader:  # Iterate over mini-batches
        optimizer.zero_grad()
        pred = model(batch_X)
        loss = criterion(pred, batch_y.unsqueeze(1))  # Match dimensions
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()

    print(f"Epoch {epoch+1}, Loss: {epoch_loss / len(train_loader):.4f}")

final_training_loss = epoch_loss / len(train_loader)


model.eval()
test_mse = 0

with torch.no_grad():
    for batch_X, batch_y in test_loader:
        pred_test = model(batch_X)
        loss = criterion(pred_test, batch_y.unsqueeze(1))
        test_mse += loss.item()

test_mse /= len(test_loader)

print(f"Final training loss: {final_training_loss:.4f}")
print(f"Test MSE: {test_mse:.4f}")
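
Mean squared error is expressed in squared units of the target. Taking the square root gives the root mean squared error (RMSE), which is on the same scale as the target itself (the normalized scale, if the target was scaled along with the features). A small sketch, where the MSE value is a made-up placeholder rather than a real result from this model:

```python
import math

example_mse = 0.04  # placeholder value, not an actual result from this model
example_rmse = math.sqrt(example_mse)
print(f"RMSE: {example_rmse:.4f}")
```

The same call could be applied to `test_mse` once the evaluation loop has run.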
