If you’re a Mac user looking to leverage the power of your new Apple Silicon M2 chip for machine learning with PyTorch, you’re in luck. In this blog post, we’ll cover how to set up PyTorch and optimize your training performance with GPU acceleration on your M2 chip.
We’ll also include some benchmark results to give you an idea of the potential speedup you can expect. So if you’re ready to get started with PyTorch on your M2 chip, read on!
How to Install
Note that MPS acceleration is only available on macOS 12.3 and later.
If you have Anaconda or Miniconda installed, you can install the MPS-enabled nightly build with:

```sh
conda install pytorch torchvision -c pytorch-nightly
```
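Once the nightly build is installed, you can quickly confirm that the MPS backend is usable and pick the right device for training. The snippet below is a minimal check using PyTorch's `torch.backends.mps` helpers:

```python
import torch

# True if this PyTorch build was compiled with MPS support
print(torch.backends.mps.is_built())

# True if MPS can actually be used on this machine (Apple Silicon, macOS 12.3+)
print(torch.backends.mps.is_available())

# Use the GPU when possible, otherwise fall back to the CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")
print(device)
```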
Here is the GPU utilisation after using this version of PyTorch to train on the MNIST handwritten-digit dataset.
Show Me the Code
This demo uses PyTorch to build a handwriting recognition model. It trains a convolutional neural network (CNN) on the MNIST dataset, which consists of images of handwritten digits, and classifies those images.
```python
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

# Use the MPS (Apple GPU) backend when available, otherwise fall back to the CPU
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")


class HandwritingRecognitionModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional layers (the channel and hidden sizes here are illustrative;
        # they only need to be consistent with the 32 * 7 * 7 reshape in forward())
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.25)
        # Fully connected layers
        self.fc1 = nn.Linear(32 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        # Pass the input through the convolutional layers
        x = self.conv1(x)
        x = self.pool(x)
        x = self.dropout1(x)
        x = self.conv2(x)
        x = self.pool(x)
        x = self.dropout2(x)

        # Reshape the output for the fully connected layers
        x = x.view(-1, 32 * 7 * 7)

        # Pass the output through the fully connected layers
        x = self.fc1(x)
        x = self.fc2(x)
        return x


# Load the MNIST training and test sets
train_dataset = MNIST(root="data", train=True, download=True, transform=ToTensor())
test_dataset = MNIST(root="data", train=False, download=True, transform=ToTensor())

# Define the data loaders
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Define the model
model = HandwritingRecognitionModel().to(device)

# Define the loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Train the model for 10 epochs
for epoch in range(10):
    # Set the model to training mode
    model.train()

    # Iterate over the training data
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)

        # Pass the input through the model
        outputs = model(images)

        # Compute the loss
        loss = loss_fn(outputs, labels)

        # Backpropagate the error
        optimizer.zero_grad()
        loss.backward()

        # Update the model parameters
        optimizer.step()

    # Set the model to evaluation mode
    model.eval()

    # Evaluate the model on the test set
    with torch.no_grad():
        correct = 0
        total = 0

        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)

            # Pass the input through the model
            outputs = model(images)

            # Get the predicted labels
            _, predicted = torch.max(outputs.data, 1)

            # Update the total and correct counts
            total += labels.size(0)
            correct += (predicted == labels).sum().item()

    # Report accuracy for this epoch
    print(f"Epoch {epoch + 1}: test accuracy {100 * correct / total:.2f}%")
```
```
________________________________________________________
Executed in  141.26 secs    fish           external
   usr time  202.16 secs    0.07 millis  202.16 secs
   sys time   69.79 secs    1.19 millis   69.79 secs
```
Both the CPU and GPU in this benchmark were on the same M2 chip.
The CPU run took 141.26 seconds, about 2.5 times as long as the GPU version.
Although that is a modest improvement compared to the newest NVIDIA GPUs, it is still a great leap forward for Mac users in the machine learning field.
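If you want to run a similar comparison on your own machine, one option is a small micro-benchmark like the sketch below. It is not the exact script timed above: it reuses the `HandwritingRecognitionModel` class from the listing, feeds it random MNIST-shaped batches, and the step and batch counts are arbitrary, so treat the numbers as rough indicators only.

```python
import time

import torch
from torch import nn


def time_training_steps(device_name: str, steps: int = 200, batch_size: int = 64) -> float:
    """Time `steps` forward/backward/update passes of the CNN above on one device."""
    device = torch.device(device_name)
    model = HandwritingRecognitionModel().to(device)  # class from the listing above
    loss_fn = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Random MNIST-shaped data keeps the benchmark self-contained
    images = torch.randn(batch_size, 1, 28, 28, device=device)
    labels = torch.randint(0, 10, (batch_size,), device=device)

    start = time.perf_counter()
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
    if device.type == "mps":
        # Wait for queued GPU work before stopping the clock
        # (torch.mps.synchronize() is available in recent PyTorch releases)
        torch.mps.synchronize()
    return time.perf_counter() - start


cpu_secs = time_training_steps("cpu")
print(f"CPU: {cpu_secs:.2f}s")

if torch.backends.mps.is_available():
    mps_secs = time_training_steps("mps")
    print(f"MPS: {mps_secs:.2f}s (speedup: {cpu_secs / mps_secs:.1f}x)")
```

GPU work is queued asynchronously, which is why the sketch synchronizes before stopping the clock; without that step, the MPS timing would look unrealistically fast.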