FraudDetectorDeepLearning
This is part three of a three-project series where I explore various approaches to classify credit card fraud with artificial intelligence. The other two projects can be found on my profile, titled FraudDetectorLogisticRegression and FraudDetectorVotingClassifier. Enjoy!
In this project, I used a neural network to detect credit card fraud.
Some Background
In the last project, I used a voting classifier model to classify credit card transactions in creditcard_sampledata.csv. Here’s the classification report and confusion matrix.
Classification report:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00      2390
           1       0.57      0.40      0.47        10

    accuracy                           1.00      2400
   macro avg       0.78      0.70      0.73      2400
weighted avg       1.00      1.00      1.00      2400

Confusion matrix:
[[2387    3]
 [   6    4]]
This ensemble model achieved a better precision score than the logistic regression model explained in the first project of the series (0.57 vs. 0.12) but a worse recall score (0.40 vs. 0.80). Can deep learning be used to improve both of these metrics?
Deep neural networks are composed of multiple layers of interconnected neurons that process input data to extract features and make predictions. Each neuron in a layer receives input, applies a weighted sum followed by an activation function, and passes the result to the next layer. Through a process called backpropagation, the network adjusts the weights based on the error of the prediction compared to the actual outcome. This iterative training process helps the network learn complex patterns and relationships in the data.
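To make the forward pass and backpropagation concrete, here is a minimal sketch (not part of the project code) of a single neuron: a weighted sum, a sigmoid activation, a loss, and one manual gradient-descent step. The input values, learning rate, and squared-error loss are illustrative choices.

```python
import torch

# A single neuron: weighted sum of inputs passed through an activation function
x = torch.tensor([0.5, -1.0, 2.0])                       # input features
w = torch.tensor([0.1, 0.2, -0.3], requires_grad=True)   # trainable weights
b = torch.tensor(0.0, requires_grad=True)                # trainable bias
target = torch.tensor(1.0)                               # actual outcome

prediction = torch.sigmoid(w @ x + b)   # forward pass: weighted sum + activation
loss = (prediction - target) ** 2       # error of prediction vs. actual outcome

loss.backward()                         # backpropagation computes the gradients
with torch.no_grad():
    w -= 0.1 * w.grad                   # adjust weights against the gradient
    b -= 0.1 * b.grad
```

Repeating this loop over many examples is exactly what the training code later in this post automates with an optimizer.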
Code Breakdown
creditcard_sampledata.csv contains information on credit card purchases. I started off by importing the necessary libraries and loading the dataset.
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import confusion_matrix, classification_report
import torch
import torch.nn as nn
import torch.optim as optim
from imblearn.over_sampling import SMOTE
from torch.utils.data import DataLoader, TensorDataset
# Load the dataset from a CSV file
df = pd.read_csv("creditcard_sampledata.csv")

I defined the function prep_data() that returns the feature and target variables (with the target variable being fraud or non-fraud). I then split the dataset into training and test sets and used SMOTE to achieve a balanced number of observations in each class when training. Finally, I scaled the features.
# Preprocess the data by separating features (X) and the target variable (y)
def prep_data(df):
    X = df.drop('Class', axis=1)  # Drop the target column 'Class' to get features
    y = df['Class']               # Extract the target column 'Class'
    return X, y

X, y = prep_data(df)
# Split the dataset into training and testing sets
# 30% of the data will be used for testing, and 70% for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
# Apply SMOTE to handle class imbalance by oversampling the minority class in the training set
sm = SMOTE(random_state=42)
X_train, y_train = sm.fit_resample(X_train, y_train)
# Normalize the feature data to have zero mean and unit variance
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train) # Fit and transform the training data
X_test = scaler.transform(X_test)  # Only transform the test data

Next, in order to utilize PyTorch for deep learning, I converted the feature and target variables to PyTorch tensors. I then instantiated a DataLoader to load batches of training data to the model when training.
# Convert the numpy arrays to PyTorch tensors for use in the neural network
X_train = torch.tensor(X_train, dtype=torch.float32)
X_test = torch.tensor(X_test, dtype=torch.float32)
y_train = torch.tensor(y_train.values, dtype=torch.float32)
y_test = torch.tensor(y_test.values, dtype=torch.float32)
# Create a DataLoader to handle batch processing of the training data
train_dataset = TensorDataset(X_train, y_train)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)

I defined the neural network with three hidden layers, an output layer, the ReLU activation function, and the Sigmoid activation function (since I’m dealing with a binary classification problem).
# Define a neural network model for fraud detection
class FraudDetectionNN(nn.Module):
    def __init__(self, input_dim):
        super(FraudDetectionNN, self).__init__()
        # Define the layers of the neural network
        self.fc1 = nn.Linear(input_dim, 128)  # First fully connected layer
        self.fc2 = nn.Linear(128, 64)         # Second fully connected layer
        self.fc3 = nn.Linear(64, 32)          # Third fully connected layer
        self.fc4 = nn.Linear(32, 1)           # Output layer
        self.relu = nn.ReLU()                 # ReLU activation function
        self.sigmoid = nn.Sigmoid()           # Sigmoid activation for binary classification

    def forward(self, x):
        # Define the forward pass through the network
        x = self.relu(self.fc1(x))     # Apply ReLU after first layer
        x = self.relu(self.fc2(x))     # Apply ReLU after second layer
        x = self.relu(self.fc3(x))     # Apply ReLU after third layer
        x = self.sigmoid(self.fc4(x))  # Apply Sigmoid at the output layer
        return x
# Initialize the model with the input dimension (number of features)
input_dim = X_train.shape[1]
model = FraudDetectionNN(input_dim)

I then defined my loss function and optimizer and trained the network for 50 epochs.
# Define the loss function and optimizer
criterion = nn.BCELoss() # Binary Cross Entropy Loss
optimizer = optim.Adam(model.parameters(), lr=0.001) # Adam optimizer
num_epochs = 50 # Number of epochs for training
# Train the neural network
for epoch in range(num_epochs):
    model.train()  # Set the model to training mode
    for batch_X, batch_y in train_loader:
        optimizer.zero_grad()               # Clear the gradients
        outputs = model(batch_X).squeeze()  # Forward pass
        loss = criterion(outputs, batch_y)  # Compute the loss
        loss.backward()                     # Backward pass to compute gradients
        optimizer.step()                    # Update the model parameters
    # Print the loss for each epoch
    print(f'Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}')
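To compare against the earlier projects, the trained network would be evaluated on the test set with the same classification_report and confusion_matrix calls. A sketch of that step is below; the 0.5 decision threshold is my assumption, and the tiny stand-in model and random tensors exist only so the snippet runs on its own. In the project, you would use the trained FraudDetectionNN with the X_test and y_test tensors prepared above.

```python
import torch
import torch.nn as nn
from sklearn.metrics import classification_report, confusion_matrix

# Stand-ins so this sketch is self-contained; replace with the trained
# FraudDetectionNN and the real test tensors from earlier in the post.
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 1), nn.Sigmoid())
X_test = torch.randn(50, 4)
y_test = (torch.rand(50) < 0.1).float()

model.eval()                        # Set the model to evaluation mode
with torch.no_grad():               # No gradients needed at inference time
    probs = model(X_test).squeeze()           # Predicted fraud probabilities
    y_pred = (probs >= 0.5).int().numpy()     # Threshold of 0.5 is an assumption

print(classification_report(y_test.int().numpy(), y_pred, zero_division=0))
print(confusion_matrix(y_test.int().numpy(), y_pred))
```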