Starting Development with PyTorch
Created : 18/01/2022 | on Linux: 5.4.0-91-generic
Updated: 18/01/2022 | on Linux: 5.4.0-91-generic
Status: Draft
Check the Installing and configuring PyTorch sections if you haven’t already.
In this section, we will delve into the process of creating a usable model using PyTorch. We will start from the data, build a simple model according to our use case, and look at the outputs and how to interpret them.
The pipeline of a typical Neural Network follows a well-defined flow:
- Data Formation: We start by organizing our data in a suitable format that aligns with our model’s requirements.
- Data Preparation: Next, we prepare our data by performing the preprocessing steps necessary for it to interface with the model.
- Data Propagation: Once our data is ready and prepared, we propagate it through the layers of our network. Each layer applies a transformation to the incoming data, which could be a learned representation or a geometric transformation based on the shape of the subsequent layer.
- Output Formulation: Finally, we formulate the output of our network, which could be a classification label, a regression value, or any other desired prediction.
It’s important to note that the nonlinearities between layers play a crucial role: without them, stacking layers offers no substantial change to the values being passed through. A sequence of purely linear layers is mathematically equivalent to a single matrix operation with a single set of learnable parameters, so the nonlinearities are what stop the network’s layers from collapsing into one redundant transformation.
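To make this concrete, here is a small sketch (the layer sizes are arbitrary choices for illustration) showing that two linear layers stacked without a nonlinearity collapse into a single linear map:

import torch
from torch import nn

x = torch.randn(1, 8)  # a dummy input vector

# two linear layers with no nonlinearity in between (biases omitted for clarity)
lin1 = nn.Linear(8, 16, bias=False)
lin2 = nn.Linear(16, 4, bias=False)

# a single linear layer whose weight matrix is the product of the two above
combined = nn.Linear(8, 4, bias=False)
with torch.no_grad():
    combined.weight.copy_(lin2.weight @ lin1.weight)

# the outputs match: stacking linear layers adds no expressive power
print(torch.allclose(lin2(lin1(x)), combined(x), atol=1e-6))  # True

Inserting a ReLU between lin1 and lin2 breaks this equivalence, which is exactly why activation functions appear between the Linear layers in the model we build below.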
As we progress further, we will explore more intriguing concepts, such as networks with skip connections, where layers are not connected in a strict linear progression.
I will now address the topics of data, model creation, and model execution (training in this instance) in separate subsections, and at the end put everything together with some model output testing.
Data
PyTorch has two primitives to work with data, these are:
torch.utils.data.Dataset
torch.utils.data.DataLoader
Dataset stores the samples and their corresponding labels, while DataLoader wraps an iterable around the Dataset. Once a DataLoader wraps the Dataset, it supports automatic batching, sampling, shuffling, and multiprocess data loading.
e.g. The code below downloads the FashionMNIST dataset. Notice the train=True
argument: it selects which split is retrieved (the training data in this instance). To retrieve the test data instead, we can set train=False.
training_data = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=ToTensor(),
)
In the code above we refer to the FashionMNIST dataset. If it is not already available locally, datasets will download it and save it under the directory pointed to by root.
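Similarly, the test split can be fetched by flipping the flag; a sketch mirroring the snippet above:

# same dataset, but the held-out test split
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)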
Once we have the data, we can wrap it in a DataLoader object as shown in the code snippet below:
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
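As a quick sanity check, we can pull a single batch from the DataLoader and inspect its shape (a sketch; the shapes shown assume FashionMNIST’s 28x28 grayscale images):

for X, y in train_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")  # torch.Size([64, 1, 28, 28])
    print(f"Shape of y: {y.shape}")               # torch.Size([64])
    break  # only inspect the first batch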
Creating Neural Network Models
To define a custom Neural Network in PyTorch, we need to fetch the required modules and assemble the blueprint of our network. We can do that by following the steps below.
- Create a class that inherits from nn.Module.
- Define the layers in the __init__ method of the class defined in the step above.
- Specify the data flow in the forward method of the class.
- Move the NN to the GPU if the resource is available to us (a sketch of this step follows the class definition below).
Imports
import torch
from torch import nn # import the nn module
from torch.utils.data import DataLoader # import DataLoader to iteratively load the dataset into the model in batches
from torchvision import datasets # get visual processing datasets
from torchvision.transforms import ToTensor # convert images to tensors
Class Definition
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()  # flatten 28x28 images into 784-element vectors
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)  # 10 output logits, one per FashionMNIST class
        )
        self.loss_fn = nn.CrossEntropyLoss()
        self.optimizer = torch.optim.SGD(self.parameters(), lr=1e-3)

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
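The last step from the list above, moving the network to the GPU, happens outside the class definition; a minimal sketch:

# instantiate the model and move its parameters to the available device
model = NeuralNetwork()
model = model.to(model.device)  # uses the device string set in __init__
print(model)  # prints the layer structure for a quick visual check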
Train method
The train method uses the training set to make predictions, compares them with the labels, and backpropagates the prediction error to update the weights. The loss function here measures the error (the difference between the model output and the training labels), while the optimizer determines how to navigate the error surface, i.e. how the error is optimally reduced.
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(model.device), y.to(model.device)

        # compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backprop
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
Putting everything together
The Python script below downloads training and test data from the FashionMNIST dataset, batches them into bundles of size n (e.g. 64), and runs training and test loops so we can watch the accuracy grow with every epoch. As its final action, the script saves the model as a state dictionary in a “.pth” file.
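A minimal sketch of such a script, assuming the NeuralNetwork class and train function defined above (the test helper here is an assumption that mirrors the structure of train):

import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

def test(dataloader, model, loss_fn):
    # assumed helper: report average loss and accuracy on the test split
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()  # evaluation mode (no weight updates)
    test_loss, correct = 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for X, y in dataloader:
            X, y = X.to(model.device), y.to(model.device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Accuracy: {(100 * correct):>0.1f}%, Avg loss: {test_loss:>8f}")

# download the training and test splits of FashionMNIST
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())

# wrap both splits in DataLoaders with batches of size 64
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

# instantiate the model and move it to the available device
model = NeuralNetwork()
model = model.to(model.device)

# run a few epochs of training, evaluating after each one
epochs = 5
for t in range(epochs):
    print(f"Epoch {t + 1}\n-------------------------------")
    train(train_dataloader, model, model.loss_fn, model.optimizer)
    test(test_dataloader, model, model.loss_fn)

# save the learned parameters as a state dictionary
torch.save(model.state_dict(), "model.pth")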
Loading models
Loading models can be done easily if we have the class definition for the model and the “.pth” file that contains the learned parameters.
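A minimal sketch, assuming the NeuralNetwork class defined above and a “model.pth” file produced by the script in the previous section (the file name is an assumption):

model = NeuralNetwork()  # re-create the architecture from the class definition
model.load_state_dict(torch.load("model.pth"))  # load the learned parameters
model.eval()  # switch to evaluation mode before running inference

The complete code for this section is available in the following notebook: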
nn_refresher.ipynb
Source: PyTorch Tutorial