Starting Development with PyTorch
Created : 18/01/2022 | on Linux: 5.4.0-91-generic
Updated: 18/01/2022 | on Linux: 5.4.0-91-generic
Status: Draft
Check the Installing and configuring PyTorch sections if you haven’t already.
In this section, we will delve into the process of creating a usable model using PyTorch. We will start from the data, build a simple model according to our use case, and look at the outputs and how to interpret them.
The pipeline of a typical Neural Network follows a well-defined flow:
- Data Formation: We start by organizing our data in a suitable format that aligns with our model’s requirements.
- Data Preparation: Next, we prepare our data by performing the preprocessing steps necessary for it to interface with the model.
- Data Propagation: Once our data is ready and prepared, we propagate it through the layers of our network. Each layer applies a transformation to the incoming data, which could be a learned representation or a geometric transformation based on the shape of the subsequent layer.
- Output Formulation: Finally, we formulate the output of our network, which could be a classification label, a regression value, or any other desired prediction.
It’s important to note that the nonlinearities between layers play a crucial role: without them, stacking layers offers no substantial change to the values being passed through. A sequence of purely linear layers is mathematically equivalent to a single matrix operation with a single set of learnable parameters, so the nonlinearities are what stop the network’s layers from collapsing into one redundant transformation.
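To make this concrete, here is a small sketch (the layer sizes are arbitrary choices for illustration) showing that two linear layers stacked without a nonlinearity collapse into a single linear map:

import torch
from torch import nn

x = torch.randn(1, 8)  # a dummy input vector

# two linear layers with no nonlinearity in between (biases omitted for clarity)
lin1 = nn.Linear(8, 16, bias=False)
lin2 = nn.Linear(16, 4, bias=False)

# a single linear layer whose weight matrix is the product of the two above
combined = nn.Linear(8, 4, bias=False)
with torch.no_grad():
    combined.weight.copy_(lin2.weight @ lin1.weight)

# the outputs match: stacking linear layers adds no expressive power
print(torch.allclose(lin2(lin1(x)), combined(x), atol=1e-6))  # True

Inserting a ReLU between lin1 and lin2 breaks this equivalence, which is exactly why activation functions appear between the Linear layers in the model we build below.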
As we progress further, we will explore more intriguing concepts, such as networks with skip connections, where layers are not connected in a strict linear progression.
I will now address the topics of data, model creation, and model execution (training in this instance) in separate subsections, and at the end put everything together with some model output testing.
Data
PyTorch has two primitives to work with data, these are:
torch.utils.data.Dataset
torch.utils.data.DataLoader
Dataset stores the samples and their corresponding labels, while DataLoader wraps an iterable around the Dataset. Once a DataLoader wraps the Dataset, it supports automatic batching, sampling, shuffling, and multiprocess data loading.
e.g. The code below downloads the FashionMNIST dataset. Notice the train=True
argument: it selects which split is retrieved (the training data in this instance). To retrieve the test data instead, we can set train=False.
training_data = datasets.FashionMNIST(
root="data",
train=True,
download=True,
transform=ToTensor(),
)
In the code above we refer to the FashionMNIST dataset. If it is not already available locally, datasets will download it and save it under the directory pointed to by root.
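Similarly, the test split can be fetched by flipping the flag; a sketch mirroring the snippet above:

# same dataset, but the held-out test split
test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor(),
)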
Once we have the data, we can wrap it in a DataLoader object as shown in the code snippet below:
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
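As a quick sanity check, we can pull a single batch from the DataLoader and inspect its shape (a sketch; the shapes shown assume FashionMNIST’s 28x28 grayscale images):

for X, y in train_dataloader:
    print(f"Shape of X [N, C, H, W]: {X.shape}")  # torch.Size([64, 1, 28, 28])
    print(f"Shape of y: {y.shape}")               # torch.Size([64])
    break  # only inspect the first batch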
Creating Neural Network Models
To define a custom Neural Network in PyTorch, we need to fetch the required modules and assemble the blueprint of our network. We can do that by following the steps below.
- Create a class that inherits from nn.Module.
- Define the layers in the __init__ method of the class defined in the step above.
- Specify the data flow in the forward method of the class.
- Move the NN to the GPU if the resource is available to us (a sketch of this step follows the class definition below).
Imports
import torch
from torch import nn # import the nn module
from torch.utils.data import DataLoader # import DataLoader to iteratively load the dataset into the model in batches
from torchvision import datasets # get visual processing datasets
from torchvision.transforms import ToTensor # convert images to tensors
Class Definition
class NeuralNetwork(nn.Module):
    def __init__(self):
        super(NeuralNetwork, self).__init__()
        self.flatten = nn.Flatten()  # flatten 28x28 images into 784-element vectors
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, 512),
            nn.ReLU(),
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, 10)  # 10 output logits, one per FashionMNIST class
        )
        self.loss_fn = nn.CrossEntropyLoss()
        self.optimizer = torch.optim.SGD(self.parameters(), lr=1e-3)

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits
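The last step from the list above, moving the network to the GPU, happens outside the class definition; a minimal sketch:

# instantiate the model and move its parameters to the available device
model = NeuralNetwork()
model = model.to(model.device)  # uses the device string set in __init__
print(model)  # prints the layer structure for a quick visual check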
Train method
The train method uses the training set to make predictions, compares them with the labels, and backpropagates the prediction error to update the weights. The loss function here measures the error (the difference between the model output and the training labels), while the optimizer determines how to navigate the error surface, i.e. how the error is optimally reduced.
def train(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    model.train()
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(model.device), y.to(model.device)

        # compute prediction error
        pred = model(X)
        loss = loss_fn(pred, y)

        # Backprop
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f} [{current:>5d}/{size:>5d}]")
Putting everything together
The Python script below downloads training and test data from the FashionMNIST dataset, batches them into bundles of size n (e.g. 64), and runs training and test loops so we can watch the accuracy grow with every epoch. As its final action, the script saves the model as a state dictionary in a “.pth” file.
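A minimal sketch of such a script, assuming the NeuralNetwork class and train function defined above (the test helper here is an assumption that mirrors the structure of train):

import torch
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor

def test(dataloader, model, loss_fn):
    # assumed helper: report average loss and accuracy on the test split
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    model.eval()  # evaluation mode (no weight updates)
    test_loss, correct = 0, 0
    with torch.no_grad():  # gradients are not needed for evaluation
        for X, y in dataloader:
            X, y = X.to(model.device), y.to(model.device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).type(torch.float).sum().item()
    test_loss /= num_batches
    correct /= size
    print(f"Accuracy: {(100 * correct):>0.1f}%, Avg loss: {test_loss:>8f}")

# download the training and test splits of FashionMNIST
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=ToTensor())
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=ToTensor())

# wrap both splits in DataLoaders with batches of size 64
batch_size = 64
train_dataloader = DataLoader(training_data, batch_size=batch_size)
test_dataloader = DataLoader(test_data, batch_size=batch_size)

# instantiate the model and move it to the available device
model = NeuralNetwork()
model = model.to(model.device)

# run a few epochs of training, evaluating after each one
epochs = 5
for t in range(epochs):
    print(f"Epoch {t + 1}\n-------------------------------")
    train(train_dataloader, model, model.loss_fn, model.optimizer)
    test(test_dataloader, model, model.loss_fn)

# save the learned parameters as a state dictionary
torch.save(model.state_dict(), "model.pth")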
Loading models
Loading models can be done easily if we have the class definition for the model and the “.pth” file that contains the learned parameters.
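A minimal sketch, assuming the NeuralNetwork class defined above and a “model.pth” file produced by the script in the previous section (the file name is an assumption):

model = NeuralNetwork()  # re-create the architecture from the class definition
model.load_state_dict(torch.load("model.pth"))  # load the learned parameters
model.eval()  # switch to evaluation mode before running inference

The complete code for this section is available in the following notebook: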
nn_refresher.ipynb
Source: PyTorch Tutorial