Course Notes

Use this workspace to take notes, store code snippets, or build your own interactive cheatsheet! For courses that use data, the datasets will be available in the datasets folder.

# Import any packages you want to use here
import torch
import numpy as np
from numpy import array
list_a = [1, 2, 3, 4]

# Create a tensor from list_a
tensor_a = torch.tensor(list_a)

# Display the tensor device
print(tensor_a.device)

# Display the tensor data type
print(tensor_a.dtype)
array_a = array([[1, 1, 1], [2, 3, 4], [4, 5, 6]])
array_b = array([[7, 5, 4], [2, 2, 8], [6, 3, 8]])

# Create two tensors from the arrays
tensor_a = torch.tensor(array_a)
tensor_b = torch.tensor(array_b)

# Subtract tensor_b from tensor_a 
tensor_c = tensor_a - tensor_b

# Multiply tensor_a by tensor_b element-wise (each element with the one at the same position)
tensor_d = tensor_a * tensor_b

# Add tensor_c to tensor_d
tensor_e = tensor_c + tensor_d
print(tensor_e)

Neural Networks

  1. Define the architecture of the network in PyTorch: neurons and layers.
  2. Load data with PyTorch DataLoader objects.
  3. Define a loss function to measure the difference between predictions and true labels. Accuracy depends on the weights and biases, the parameters learned during training.
  4. Set up an optimizer to update the network weights during training, e.g. Stochastic Gradient Descent.
  5. Define the training loop: run a forward pass of the input data through the network to get predictions; compute the loss; run backpropagation to compute gradients; update the weights and biases accordingly.
  6. Test the trained network on a separate dataset to evaluate performance, using metrics such as accuracy.
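
The six steps above can be sketched end to end as follows. This is only a minimal illustration, not the course's reference solution: the synthetic data, the layer sizes, the ReLU activation, the learning rate and the number of epochs are all assumptions chosen to keep the example small.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)  # make the sketch reproducible

# Synthetic data (assumption): 80 samples, 4 features, binary label
all_features = torch.randn(80, 4)
all_labels = (all_features.sum(dim=1, keepdim=True) > 0).float()
train_features, test_features = all_features[:64], all_features[64:]
train_labels, test_labels = all_labels[:64], all_labels[64:]

# 1. Architecture (layer sizes and ReLU are illustrative choices)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())

# 2. Data loading
loader = DataLoader(TensorDataset(train_features, train_labels), batch_size=16, shuffle=True)

# 3. Loss function and 4. optimizer
criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 5. Training loop
epoch_losses = []
for epoch in range(50):
    running_loss = 0.0
    for batch_features, batch_labels in loader:
        optimizer.zero_grad()                        # clear gradients from the previous step
        predictions = model(batch_features)          # forward pass
        loss = criterion(predictions, batch_labels)  # compute loss
        loss.backward()                              # backpropagation
        optimizer.step()                             # update weights and biases
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(loader))

# 6. Evaluation on data the network has not seen during training
model.eval()
with torch.no_grad():
    predicted = (model(test_features) > 0.5).float()
    accuracy = (predicted == test_labels).float().mean().item()

print(f"first epoch loss: {epoch_losses[0]:.3f}, last epoch loss: {epoch_losses[-1]:.3f}")
print(f"test accuracy: {accuracy:.2f}")
```

Calling optimizer.zero_grad() at the start of each step matters because PyTorch accumulates gradients across backward() calls by default.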

Logistic Regression

y_predictions = sigmoid(weights × Dataset + biases)

Sigmoid is an activation function that applies a non-linear transformation to its input. It is used for binary classification problems and outputs a float between 0 and 1. In PyTorch: torch.nn.Sigmoid()
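
To make the logistic regression formula concrete, the sketch below computes the prediction twice: once with nn.Linear followed by nn.Sigmoid, and once by hand from the layer's own weights and biases. The input values and the sizes (3 samples, 4 features) are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical input: 3 samples with 4 features each
dataset = torch.tensor([[0.5, -1.2, 3.0, 0.7],
                        [1.5, 0.3, -0.8, 2.1],
                        [-0.4, 0.9, 1.1, -2.0]])

linear = nn.Linear(in_features=4, out_features=1)
sigmoid = nn.Sigmoid()

# Library version: linear layer followed by the sigmoid activation
y_layer = sigmoid(linear(dataset))

# Manual version: sigmoid(Dataset @ weights^T + biases)
z = dataset @ linear.weight.T + linear.bias
y_manual = 1 / (1 + torch.exp(-z))

print(y_layer.shape)                       # one prediction per sample
print(torch.allclose(y_layer, y_manual))   # both routes agree
```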

Softmax is another activation function. It outputs a probability distribution over the classes (non-negative values that sum to 1), so we use it for multi-class classification. In PyTorch: torch.nn.Softmax(dim=-1)
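
As a small worked check of what softmax computes, the sketch below compares nn.Softmax with the formula exp(score) / sum(exp(scores)) written out by hand. The three scores are arbitrary example values.

```python
import torch
import torch.nn as nn

# Arbitrary example scores for one sample and three classes
scores = torch.tensor([[2.0, 1.0, 0.1]])

softmax = nn.Softmax(dim=-1)  # dim=-1: normalize across the last dimension (the classes)
probabilities = softmax(scores)

# The same computation written out by hand
manual = torch.exp(scores) / torch.exp(scores).sum(dim=-1, keepdim=True)

# The class with the highest score also gets the highest probability
predicted_class = probabilities.argmax(dim=-1)

print(probabilities)
print(probabilities.sum(dim=-1))
print(predicted_class)
```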

We can stack multiple linear layers in a network. Each layer's input size must match the previous layer's output size.
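
A minimal sketch of stacking linear layers, assuming made-up layer sizes (8 to 4 to 2 to 1). It also counts the learnable parameters: each nn.Linear holds in_features * out_features weights plus out_features biases.

```python
import torch
import torch.nn as nn

# Each layer's in_features equals the previous layer's out_features
model = nn.Sequential(
    nn.Linear(8, 4),  # 8*4 weights + 4 biases = 36 parameters
    nn.Linear(4, 2),  # 4*2 weights + 2 biases = 10 parameters
    nn.Linear(2, 1),  # 2*1 weights + 1 bias   = 3 parameters
)

input_tensor = torch.rand(5, 8)  # batch of 5 samples, 8 features each
output = model(input_tensor)

n_parameters = sum(p.numel() for p in model.parameters())

print(output.shape)
print(n_parameters)
```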

import torch.nn as nn
vector = torch.tensor([[6.0]])
sigmoid = nn.Sigmoid()
print(sigmoid(vector))
# nn.Linear(in_features, out_features): in_features = number of input features, out_features = number of outputs

scores = torch.tensor([[1.0, -6.0, 2.5, -0.3, 1.2, 0.8]])
softmax = nn.Softmax(dim=-1)
probabilities = softmax(scores)
print(probabilities)
# Implement a small neural network with exactly two linear layers
input_tensor = torch.tensor([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]])
model_1 = nn.Sequential(nn.Linear(8,1),
                      nn.Sigmoid(),
                      nn.Linear(1,1))

output_1 = model_1(input_tensor)
print(output_1)

# multiclass
features = torch.tensor([[4.0, 6.0, 3.0, 2.0, 4.0 ]])
model_2 = nn.Sequential(nn.Linear(5,3), nn.Softmax(dim=-1))
output_2 = model_2(features)
print(output_2)

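The TensorDataset class is used when your dataset can be loaded as NumPy arrays and converted to tensors. A TensorDataset takes one or more tensors as input; all of them must have the same size in the first dimension (the number of samples).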
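
A TensorDataset is usually wrapped in a DataLoader to iterate over it in batches. The sketch below assumes random data of the same shape as in these notes (12 samples, 8 features) and an arbitrary batch size of 4. The conversion to float32 is an added step, included because nn.Linear layers use float32 parameters by default.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

rng = np.random.default_rng(0)
np_features = rng.random((12, 8))
np_target = rng.random((12, 1))

# NumPy arrays must be converted to tensors before building the dataset
features = torch.from_numpy(np_features).float()
target = torch.from_numpy(np_target).float()

dataset = TensorDataset(features, target)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Each iteration yields one batch of features and the matching targets
batch_shapes = []
for batch_features, batch_target in loader:
    batch_shapes.append((tuple(batch_features.shape), tuple(batch_target.shape)))

print(len(dataset))
print(batch_shapes)
```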

from torch.utils.data import TensorDataset, Dataset, DataLoader

np_features = np.random.rand(12, 8)
np_target = np.random.rand(12, 1)

torch_features = torch.tensor(np_features)
torch_target = torch.tensor(np_target)

# Create a TensorDataset from two tensors
dataset = TensorDataset(torch_features, torch_target)

# Return the last element of this dataset
print(dataset[-1])
# Custom dataset class
import pandas as pd

class WaterDataset(Dataset):
    def __init__(self, dataset_path):
        super().__init__()
        # Load the CSV and keep its values as a NumPy array
        df = pd.read_csv(dataset_path, index_col=0)
        self.data = df.to_numpy()

    def __len__(self):
        # Number of rows (samples) in the dataset
        return self.data.shape[0]

    def __getitem__(self, idx):
        # All columns except the last are features; the last column is the label
        features = self.data[idx, :-1]
        label = self.data[idx, -1]
        return features, label
water_potability = pd.read_csv('datasets/water_potability.csv')
water_potability
# Load the different columns into two PyTorch tensors
features = torch.from_numpy(water_potability[["ph", "Sulfate", "Conductivity", "Organic_carbon"]].to_numpy())
target = torch.from_numpy(water_potability["Potability"].to_numpy())

# Create a dataset from the two generated tensors
dataset = TensorDataset(features, target)
print(dataset[-1])
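
A custom Dataset such as WaterDataset can be passed to a DataLoader in the same way as a TensorDataset. The sketch below uses a hypothetical ArrayDataset that wraps an in-memory NumPy array instead of reading a CSV file, so it runs without the dataset on disk. The indexing logic mirrors WaterDataset; the data values and batch size are made up.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ArrayDataset(Dataset):
    """Hypothetical stand-in for WaterDataset that wraps an in-memory array."""

    def __init__(self, data):
        super().__init__()
        self.data = data

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        # Last column is the label, everything before it is a feature
        features = self.data[idx, :-1]
        label = self.data[idx, -1]
        return features, label

# Made-up data: 10 rows with 3 feature columns and 1 label column
data = np.arange(40, dtype=np.float32).reshape(10, 4)
dataset = ArrayDataset(data)

# The DataLoader's default collation stacks the NumPy rows into tensors
loader = DataLoader(dataset, batch_size=4, shuffle=False)

batch_sizes = [batch_features.shape[0] for batch_features, batch_labels in loader]
first_features, first_labels = next(iter(loader))

print(batch_sizes)           # the last batch is smaller: 10 is not divisible by 4
print(first_features.shape)
print(first_labels)
```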