PyTorch Cheat Sheet ✔

Asad iqbal

Learn everything you need to know about PyTorch

Aspiring data whiz! Want to build awesome AI stuff like chatbots, image recognition software, or even self-driving cars? Look no further than PyTorch!

PyTorch is a super cool open-source library created by Facebook’s AI experts that makes diving into machine learning (ML) and deep learning (DL) a breeze. This cheat sheet will be your one-stop shop to learn all the basics of PyTorch, making you a machine learning maestro in no time! ✨

Pandas Cheatsheet: https://www.deepnexus.tech/2024/07/python-cheatsheet.html

Git Commands cheatsheet: https://www.deepnexus.tech/2024/07/git-commands-cheat-sheet-master-your.html

Top 50+ SQL Interview Questions: https://www.deepnexus.tech/2024/07/top-50-sql-questions-asked-in-interview.html

Python CheatSheet: https://www.deepnexus.tech/2024/07/ultimate-python-cheat-sheet-master.html

Importing PyTorch

Here’s a breakdown of the essential imports you’ll need:

# Import the top-level package for core functionality
import torch

# Import neural network functionality
from torch import nn

# Import functional programming tools
import torch.nn.functional as F

# Import optimization functionality
import torch.optim as optim

# Import dataset functions
from torch.utils.data import TensorDataset, DataLoader

# Import evaluation metrics
import torchmetrics
  • import torch: This is the main course! It brings in all the core functionality of PyTorch, like building and manipulating tensors (fancy multi-dimensional arrays).
  • from torch import nn: Calling all neural network enthusiasts! This import serves up the building blocks for creating powerful neural networks.
  • import torch.nn.functional as F: Feeling fancy? This import provides functional versions of layers and operations (like activations and one-hot encoding), adding flexibility to your code ✨.
  • import torch.optim as optim: Training a neural network is like training a puppy - it needs some optimization! This import offers a variety of optimizers to help your network learn efficiently.
  • from torch.utils.data import TensorDataset, DataLoader: Data is king in machine learning! These tools help you wrap and batch your datasets efficiently, making training a breeze.
  • import torchmetrics: Not sure how well your model performs? This import offers a toolbox of metrics to evaluate your network's accuracy and effectiveness.

Working with Tensors

Tensors are the fundamental building blocks in PyTorch, kind of like Lego bricks for your AI projects! Here’s how to create and play with them:

# Create tensor from list with tensor()
tnsr = torch.tensor([1, 3, 6, 10])

# Get data type of tensor elements with .dtype
tnsr.dtype # Returns torch.int64

# Get dimensions of tensor with .shape
tnsr.shape # Returns torch.Size([4])

# Get the device the tensor is stored on with .device
tnsr.device # Returns device(type='cpu') or a CUDA device such as cuda:0

# Create a tensor of zeros with zeros()
tnsr_zrs = torch.zeros(2, 3)

# Create a random tensor with rand()
tnsr_rndm = torch.rand(size=(3, 4)) # Tensor has 3 rows, 4 columns
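
Once you have tensors, you can do math with them much like NumPy arrays. Here's a minimal sketch of a few everyday operations (the values are made up for illustration):

a = torch.tensor([[1., 2.], [3., 4.]])
b = torch.ones(2, 2)

a + b           # Element-wise addition -> tensor([[2., 3.], [4., 5.]])
a * b           # Element-wise multiplication
a @ b           # Matrix multiplication (2x2 @ 2x2 -> 2x2)
a.reshape(4)    # Change the shape -> tensor([1., 2., 3., 4.])

# Move a tensor to the GPU if one is available (falls back to CPU otherwise)
device = "cuda" if torch.cuda.is_available() else "cpu"
a = a.to(device)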

Datasets and Dataloaders

In machine learning, data is king! But feeding your data to your model efficiently can be a beast to handle. That’s where datasets and dataloaders come in, like trusty dragons by your side.

  1. Building a Dataset: Imagine a dataset as a collection of your training data, neatly organized. If your data is stored in a pandas DataFrame, you can use TensorDataset() to create a dataset from it. This function takes two arguments:
  • The features (independent variables) converted to a tensor of floating-point numbers (.float()) using torch.tensor().
  • The targets (dependent variables) also converted to a floating-point tensor.
# Create a dataset from a pandas DataFrame with TensorDataset()
X = df[feature_columns].values
y = df[target_column].values
dataset = TensorDataset(torch.tensor(X).float(), torch.tensor(y).float())

# Load the data in batches with DataLoader()
dataloader = DataLoader(dataset, batch_size=n, shuffle=True)
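
Each pass through the DataLoader hands you one batch of (features, targets) pairs, with batch_size set by the n you chose above. A quick sketch to peek at the first batch:

# Iterate over the DataLoader to get one batch at a time
for batch_features, batch_targets in dataloader:
    print(batch_features.shape)  # (batch_size, number of features)
    print(batch_targets.shape)   # (batch_size,) for a single target column
    break                        # Just look at the first batch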

Preprocessing

Before feeding data into your neural network, some prep work is essential! This is called preprocessing. Here’s a cool feature of PyTorch for handling categorical variables (like text labels):

One-Hot Encoding with F.one_hot()

Imagine you have data with categories like “red,” “green,” and “blue.” A computer might struggle with these words. One-hot encoding is a trick to convert these categories into a special kind of tensor. This tensor will have one column for each possible category (in this case, “red,” “green,” and “blue”). For each data point, there will be a 1 in the column corresponding to its category, and zeros everywhere else.

Here’s how to use F.one_hot() in PyTorch to achieve this:

# One-hot encode categorical variables with one_hot()
F.one_hot(torch.tensor([0, 1, 2]), num_classes=3) # Returns tensor of 0s and 1s
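
For reference, here's what that call produces for a slightly longer list of category indices (the indices are made up for illustration):

F.one_hot(torch.tensor([2, 0, 1, 1]), num_classes=3)
# tensor([[0, 0, 1],
#         [1, 0, 0],
#         [0, 1, 0],
#         [0, 1, 0]])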

Sequential Model Architecture

Now it’s time to construct your neural network architecture! PyTorch’s nn.Sequential class makes stacking layers a breeze. Here's a breakdown:

Linear Layers: The workhorses of your network! Use nn.Linear(m, n) to create a linear layer that takes m inputs and generates n outputs. Think of it as multiplying inputs by weights and adding a bias.

Peeking at the Weights and Biases: Curious about the inner workings of your linear layer? Use .weight to access the weight matrix and .bias to see the bias vector. These are what get updated during training!

Activation Functions: These are the non-linear heroes that add complexity to your network. Here are a few popular choices:

  • nn.Sigmoid(): Squishes values between 0 and 1, often used for binary classification.
  • nn.Softmax(dim=-1): Softens outputs into probabilities that sum to 1, ideal for multi-class classification (one-of-many).
  • nn.ReLU(): Rectified Linear Unit, sets negative values to zero. Helps prevent vanishing gradients.
  • nn.LeakyReLU(negative_slope=0.05): Similar to ReLU, but multiplies negative inputs by a small slope (here 0.05) so their gradients stay non-zero.

Dropout Layer: Fight overfitting (when your model memorizes training data too well) with nn.Dropout(p=0.5). During training, this randomly zeroes each activation with probability p, encouraging the network not to rely too heavily on any single feature.

Building the Model: Use nn.Sequential to chain your layers together in the desired order. Remember, the input size of each layer must match the output size of the previous layer! Here's an example:

# Create a linear layer with m inputs, n outputs with Linear()
lnr = nn.Linear(m, n)

# Get weight of layer with .weight
lnr.weight

# Get bias of layer with .bias
lnr.bias

# Create a sigmoid activation layer for binary classification with Sigmoid()
nn.Sigmoid()

# Create a softmax activation layer for multi-class classification with Softmax()
nn.Softmax(dim=-1)

# Create a rectified linear unit activation layer to avoid saturation with ReLU()
nn.ReLU()

# Create a leaky rectified linear unit activation layer to avoid saturation with LeakyReLU()
nn.LeakyReLU(negative_slope=0.05)

# Create a dropout layer to regularize and prevent overfitting with Dropout()
nn.Dropout(p=0.5)

# Create a sequential model from layers
model = nn.Sequential(
    nn.Linear(n_features, i),
    nn.Linear(i, j), # Input size must match output from previous layer
    nn.Linear(j, n_classes),
    nn.Softmax(dim=-1) # Activation layer comes last
)
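
To make that concrete, here's a small sketch that fills in the placeholders with made-up sizes (4 input features, hidden layers of 16 and 8 units, 3 output classes) and adds ReLU activations between the linear layers:

model = nn.Sequential(
    nn.Linear(4, 16),   # 4 input features -> 16 hidden units
    nn.ReLU(),
    nn.Linear(16, 8),   # 16 -> 8 hidden units
    nn.ReLU(),
    nn.Linear(8, 3),    # 8 -> 3 output classes
    nn.Softmax(dim=-1)  # Turn the outputs into class probabilities
)

sample = torch.rand(2, 4)   # A made-up batch of 2 samples with 4 features each
print(model(sample).shape)  # torch.Size([2, 3])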

Fitting a Model and Calculating Loss

Now that you have your data prepped and your model built, it’s time to train it!

# Fit the model to input data (model is a variable created by, e.g., Sequential())
prediction = model(input_data).double()

# Get target values
actual = torch.tensor(target_values).double()

# Calculate the mean-squared error loss for regression with MSELoss()
mse_loss = nn.MSELoss()(prediction, actual) # Returns tensor(x)

# Calculate the smooth L1 (Huber-style) loss for robust regression with SmoothL1Loss()
l1_loss = nn.SmoothL1Loss()(prediction, actual) # Returns tensor(x)

# Calculate binary cross-entropy loss for binary classification with BCELoss()
bce_loss = nn.BCELoss()(prediction, actual) # Returns tensor(x)

# Calculate cross-entropy loss for multi-class classification with CrossEntropyLoss()
# (expects raw scores of shape (batch, classes) and integer class labels as targets)
ce_loss = nn.CrossEntropyLoss()(prediction, actual) # Returns tensor(x)

# Calculate the gradients via backpropagation with .backward()
# (call it on whichever loss you computed, e.g. mse_loss)
mse_loss.backward()
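
As a quick sanity check, here's a tiny made-up example of the multi-class case, where the targets are integer class labels rather than floats (the numbers are illustrative only):

logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 1.5, 0.3]], requires_grad=True)  # Raw scores: 2 samples, 3 classes
labels = torch.tensor([0, 1])                                  # Correct class index per sample

loss = nn.CrossEntropyLoss()(logits, labels)
print(loss)       # A single scalar tensor
loss.backward()   # Populates logits.grad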

Working with Optimizers

Training a neural network is like training an athlete — it needs the right tools and techniques to improve! This is where optimizers come in.

# Create a stochastic gradient descent optimizer with SGD(), setting learning rate and momentum
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.95)

# Update neuron parameters with .step()
optimizer.step()

In PyTorch, you can use the optim module to create different optimizers, each with its own approach to adjusting the weights and biases of your neural network. Here's a look at how to use a common optimizer called Stochastic Gradient Descent (SGD):

  1. Creating the Optimizer: Use optim.SGD() to create an SGD optimizer. Pass it the following arguments:
  • model.parameters(): This tells the optimizer which parameters (weights and biases) in your model it should update.
  • lr=0.01: This is the learning rate, which controls how much the optimizer adjusts the parameters in each step (see the sketch below). A smaller learning rate leads to more cautious updates, while a larger one can lead to faster learning (but also potential instability).
  • momentum=0.95: Momentum lets each update carry along a fraction of the previous update, smoothing the optimizer's path and often speeding up convergence.
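
To see what .step() and the learning rate actually do, here's a tiny illustration using a single made-up parameter and plain SGD without momentum, where each update is simply parameter minus learning rate times gradient:

w = torch.tensor([1.0], requires_grad=True)  # A single made-up parameter
opt = optim.SGD([w], lr=0.01)                # Plain SGD, no momentum

loss = (3 * w).sum()   # d(loss)/dw = 3
loss.backward()        # Fills in w.grad
opt.step()             # w becomes 1.0 - 0.01 * 3 = 0.97
opt.zero_grad()        # Clear the gradient before the next update

print(w)               # tensor([0.9700], requires_grad=True)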

The Training Loop

# Set model to training mode
model.train()

# Set a loss criterion and an optimizer
loss_criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.95)

# Loop over chunks of data in the training set
for data in dataloader:
    # Set the gradients to zero with .zero_grad()
    optimizer.zero_grad()
    # Get features and targets for current chunk of data
    features, targets = data
    # Run a "forward pass" to fit the model to the data
    predictions = model(features)
    # Calculate loss
    loss = loss_criterion(predictions, targets)
    # Calculate gradients using backpropagation
    loss.backward()
    # Update the model parameters
    optimizer.step()
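
In practice, this loop is usually wrapped in an outer epoch loop so the model sees the full training set several times. A minimal sketch, assuming you pick the number of epochs yourself:

n_epochs = 20  # A made-up number of passes over the training data
for epoch in range(n_epochs):
    for features, targets in dataloader:
        optimizer.zero_grad()
        predictions = model(features)
        loss = loss_criterion(predictions, targets)
        loss.backward()
        optimizer.step()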

The Evaluation Loop

After training your amazing PyTorch model, it’s time to see how well it performs!

# Set model to evaluation mode
model.eval()

# Create accuracy metric with Accuracy()
metric = torchmetrics.Accuracy(task="multiclass", num_classes=3)

# Loop over chunks of data in the validation set
for i, data in enumerate(dataloader, 0):
    # Get features and targets for current chunk of data
    features, targets = data
    # Run a "forward pass" to get the model's predictions
    predictions = model(features)
    # Calculate accuracy over the batch (assuming one-hot encoded targets)
    accuracy = metric(predictions, targets.argmax(dim=-1))

# Calculate accuracy over all the validation data
accuracy = metric.compute()
print(f"Accuracy on all data: {accuracy}")

# Reset the metric for the next dataset (training or validation)
metric.reset()
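
One small, standard refinement: during evaluation you don't need gradients, so it's common to wrap the loop above in torch.no_grad() to save memory and compute. A quick sketch of the same loop, shortened:

with torch.no_grad():
    for features, targets in dataloader:
        predictions = model(features)
        metric(predictions, targets.argmax(dim=-1))

print(f"Accuracy on all data: {metric.compute()}")
metric.reset()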

Transfer Learning and Fine-Tuning

In machine learning, we can leverage pre-trained models to jumpstart our own projects. This is called transfer learning, and fine-tuning is a powerful technique within it. Here’s how PyTorch helps you do it:

Saving a Layer for Later: Trained a specific layer of a model and want to use it again later? Use torch.save() to serialize the layer's weights and biases into a file. Think of it as saving a blueprint for future AI projects!

torch.save(layer, 'layer.pth')  # Saves the layer to a file named 'layer.pth'

Loading a Saved Layer: Need to bring back a previously saved layer? torch.load() brings it back to life! This is useful for incorporating pre-trained layers into your new models.

new_layer = torch.load('layer.pth')  # Loads the layer from the 'layer.pth' file
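
A related, widely used pattern (not shown in the snippet above) is saving a model's state_dict, i.e. just its learned parameters, and loading them back into a model built with the same layers. The file name here is made up, and new_model is assumed to have the same architecture as model:

torch.save(model.state_dict(), 'model_weights.pth')         # Save only the learned parameters

# Later: rebuild a model with the same architecture, then load the weights into it
new_model.load_state_dict(torch.load('model_weights.pth'))
new_model.eval()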

Fine-Tuning with Freeze Power! Let’s say you have a pre-trained model and only want to fine-tune the final layers for your specific task. PyTorch lets you freeze the weights of earlier layers using .requires_grad = False. This prevents them from being updated during training, focusing the learning process on the final layers.

for name, param in model.named_parameters():
    if name == "0.weight":
        param.requires_grad = False # Freezes the weights of layer 0 (adjust layer number as needed)

# Now you can train your model with only the final layers updating!
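
For instance, a common fine-tuning setup, sketched here under the assumption that model is an nn.Sequential like the one built earlier, freezes everything except the last linear layer:

# Freeze every parameter in the model...
for param in model.parameters():
    param.requires_grad = False

# ...then unfreeze only the last linear layer
last_linear = [m for m in model if isinstance(m, nn.Linear)][-1]
for param in last_linear.parameters():
    param.requires_grad = True

# Only pass the still-trainable parameters to the optimizer
optimizer = optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=0.01
)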

By mastering transfer learning and fine-tuning, you can leverage pre-trained knowledge and save training time, making you a machine learning efficiency champion!

Thanks for reading✨; if you liked my content and want to support me, the best way is to support me on Patreon.
