Diffstat (limited to '.ipynb_checkpoints/quickstart_tutorial-checkpoint.ipynb')
-rw-r--r-- .ipynb_checkpoints/quickstart_tutorial-checkpoint.ipynb | 956
1 files changed, 710 insertions, 246 deletions
diff --git a/.ipynb_checkpoints/quickstart_tutorial-checkpoint.ipynb b/.ipynb_checkpoints/quickstart_tutorial-checkpoint.ipynb
index 95d56f7..3ea080e 100644
--- a/.ipynb_checkpoints/quickstart_tutorial-checkpoint.ipynb
+++ b/.ipynb_checkpoints/quickstart_tutorial-checkpoint.ipynb
@@ -1,283 +1,747 @@
{
- "cells": [
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "%matplotlib inline"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "\n`Learn the Basics <intro.html>`_ ||\n**Quickstart** ||\n`Tensors <tensorqs_tutorial.html>`_ ||\n`Datasets & DataLoaders <data_tutorial.html>`_ ||\n`Transforms <transforms_tutorial.html>`_ ||\n`Build Model <buildmodel_tutorial.html>`_ ||\n`Autograd <autogradqs_tutorial.html>`_ ||\n`Optimization <optimization_tutorial.html>`_ ||\n`Save & Load Model <saveloadrun_tutorial.html>`_\n\nQuickstart\n===================\nThis section runs through the API for common tasks in machine learning. Refer to the links in each section to dive deeper.\n\nWorking with data\n-----------------\nPyTorch has two `primitives to work with data <https://pytorch.org/docs/stable/data.html>`_:\n``torch.utils.data.DataLoader`` and ``torch.utils.data.Dataset``.\n``Dataset`` stores the samples and their corresponding labels, and ``DataLoader`` wraps an iterable around\nthe ``Dataset``.\n\n\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "import torch\nfrom torch import nn\nfrom torch.utils.data import DataLoader\nfrom torchvision import datasets\nfrom torchvision.transforms import ToTensor, Lambda, Compose\nimport matplotlib.pyplot as plt"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "PyTorch offers domain-specific libraries such as `TorchText <https://pytorch.org/text/stable/index.html>`_,\n`TorchVision <https://pytorch.org/vision/stable/index.html>`_, and `TorchAudio <https://pytorch.org/audio/stable/index.html>`_,\nall of which include datasets. For this tutorial, we will be using a TorchVision dataset.\n\nThe ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like\nCIFAR, COCO (`full list here <https://pytorch.org/vision/stable/datasets.html>`_). In this tutorial, we\nuse the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and\n``target_transform`` to modify the samples and labels respectively.\n\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "# Download training data from open datasets.\ntraining_data = datasets.FashionMNIST(\n    root=\"data\",\n    train=True,\n    download=True,\n    transform=ToTensor(),\n)\n\n# Download test data from open datasets.\ntest_data = datasets.FashionMNIST(\n    root=\"data\",\n    train=False,\n    download=True,\n    transform=ToTensor(),\n)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We pass the ``Dataset`` as an argument to ``DataLoader``. This wraps an iterable over our dataset, and supports\nautomatic batching, sampling, shuffling and multiprocess data loading. Here we define a batch size of 64, i.e. each element\nin the dataloader iterable will return a batch of 64 features and labels.\n\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "batch_size = 64\n\n# Create data loaders.\ntrain_dataloader = DataLoader(training_data, batch_size=batch_size)\ntest_dataloader = DataLoader(test_data, batch_size=batch_size)\n\nfor X, y in test_dataloader:\n    print(\"Shape of X [N, C, H, W]: \", X.shape)\n    print(\"Shape of y: \", y.shape, y.dtype)\n    break"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Read more about `loading data in PyTorch <data_tutorial.html>`_.\n\n\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "--------------\n\n\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Creating Models\n------------------\nTo define a neural network in PyTorch, we create a class that inherits\nfrom `nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_. We define the layers of the network\nin the ``__init__`` function and specify how data will pass through the network in the ``forward`` function. To accelerate\noperations in the neural network, we move it to the GPU if available.\n\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "# Get cpu or gpu device for training.\ndevice = \"cuda\" if torch.cuda.is_available() else \"cpu\"\nprint(\"Using {} device\".format(device))\n\n# Define model\nclass NeuralNetwork(nn.Module):\n    def __init__(self):\n        super(NeuralNetwork, self).__init__()\n        self.flatten = nn.Flatten()\n        self.linear_relu_stack = nn.Sequential(\n            nn.Linear(28*28, 512),\n            nn.ReLU(),\n            nn.Linear(512, 512),\n            nn.ReLU(),\n            nn.Linear(512, 10),\n            nn.ReLU()\n        )\n\n    def forward(self, x):\n        x = self.flatten(x)\n        logits = self.linear_relu_stack(x)\n        return logits\n\nmodel = NeuralNetwork().to(device)\nprint(model)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Read more about `building neural networks in PyTorch <buildmodel_tutorial.html>`_.\n\n\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "--------------\n\n\n"
- ]
- },
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "%matplotlib inline"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "\n",
+ "`Learn the Basics <intro.html>`_ ||\n",
+ "**Quickstart** ||\n",
+ "`Tensors <tensorqs_tutorial.html>`_ ||\n",
+ "`Datasets & DataLoaders <data_tutorial.html>`_ ||\n",
+ "`Transforms <transforms_tutorial.html>`_ ||\n",
+ "`Build Model <buildmodel_tutorial.html>`_ ||\n",
+ "`Autograd <autogradqs_tutorial.html>`_ ||\n",
+ "`Optimization <optimization_tutorial.html>`_ ||\n",
+ "`Save & Load Model <saveloadrun_tutorial.html>`_\n",
+ "\n",
+ "Quickstart\n",
+ "===================\n",
+ "This section runs through the API for common tasks in machine learning. Refer to the links in each section to dive deeper.\n",
+ "\n",
+ "Working with data\n",
+ "-----------------\n",
+ "PyTorch has two `primitives to work with data <https://pytorch.org/docs/stable/data.html>`_:\n",
+ "``torch.utils.data.DataLoader`` and ``torch.utils.data.Dataset``.\n",
+ "``Dataset`` stores the samples and their corresponding labels, and ``DataLoader`` wraps an iterable around\n",
+ "the ``Dataset``.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import torch\n",
+ "from torch import nn\n",
+ "from torch.utils.data import DataLoader\n",
+ "from torchvision import datasets\n",
+ "from torchvision.transforms import ToTensor, Lambda, Compose\n",
+ "import matplotlib.pyplot as plt"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "PyTorch offers domain-specific libraries such as `TorchText <https://pytorch.org/text/stable/index.html>`_,\n",
+ "`TorchVision <https://pytorch.org/vision/stable/index.html>`_, and `TorchAudio <https://pytorch.org/audio/stable/index.html>`_,\n",
+ "all of which include datasets. For this tutorial, we will be using a TorchVision dataset.\n",
+ "\n",
+ "The ``torchvision.datasets`` module contains ``Dataset`` objects for many real-world vision data like\n",
+ "CIFAR, COCO (`full list here <https://pytorch.org/vision/stable/datasets.html>`_). In this tutorial, we\n",
+ "use the FashionMNIST dataset. Every TorchVision ``Dataset`` includes two arguments: ``transform`` and\n",
+ "``target_transform`` to modify the samples and labels respectively.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Optimizing the Model Parameters\n----------------------------------------\nTo train a model, we need a `loss function <https://pytorch.org/docs/stable/nn.html#loss-functions>`_\nand an `optimizer <https://pytorch.org/docs/stable/optim.html>`_.\n\n"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz\n"
+ ]
},
{
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "47918efb82854fc7a269ce73230391b0",
+ "version_major": 2,
+ "version_minor": 0
},
- "outputs": [],
- "source": [
- "loss_fn = nn.CrossEntropyLoss()\noptimizer = torch.optim.SGD(model.parameters(), lr=1e-3)"
+ "text/plain": [
+ " 0%| | 0/26421880 [00:00<?, ?it/s]"
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and\nbackpropagates the prediction error to adjust the model's parameters.\n\n"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw\n",
+ "\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz\n"
+ ]
},
{
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "9abecd52d9144d53bd028f14a2cfd60b",
+ "version_major": 2,
+ "version_minor": 0
},
- "outputs": [],
- "source": [
- "def train(dataloader, model, loss_fn, optimizer):\n    size = len(dataloader.dataset)\n    for batch, (X, y) in enumerate(dataloader):\n        X, y = X.to(device), y.to(device)\n\n        # Compute prediction error\n        pred = model(X)\n        loss = loss_fn(pred, y)\n\n        # Backpropagation\n        optimizer.zero_grad()\n        loss.backward()\n        optimizer.step()\n\n        if batch % 100 == 0:\n            loss, current = loss.item(), batch * len(X)\n            print(f\"loss: {loss:>7f} [{current:>5d}/{size:>5d}]\")"
+ "text/plain": [
+ " 0%| | 0/29515 [00:00<?, ?it/s]"
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "We also check the model's performance against the test dataset to ensure it is learning.\n\n"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n",
+ "\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz\n"
+ ]
},
{
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "df61f428b0c44a818d2ab0f64420d9b3",
+ "version_major": 2,
+ "version_minor": 0
},
- "outputs": [],
- "source": [
- "def test(dataloader, model, loss_fn):\n    size = len(dataloader.dataset)\n    num_batches = len(dataloader)\n    model.eval()\n    test_loss, correct = 0, 0\n    with torch.no_grad():\n        for X, y in dataloader:\n            X, y = X.to(device), y.to(device)\n            pred = model(X)\n            test_loss += loss_fn(pred, y).item()\n            correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n    test_loss /= num_batches\n    correct /= size\n    print(f\"Test Error: \\n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")"
+ "text/plain": [
+ " 0%| | 0/4422102 [00:00<?, ?it/s]"
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "The training process is conducted over several iterations (*epochs*). During each epoch, the model learns\nparameters to make better predictions. We print the model's accuracy and loss at each epoch; we'd like to see the\naccuracy increase and the loss decrease with every epoch.\n\n"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw\n",
+ "\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz\n",
+ "Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz\n"
+ ]
},
{
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "418ca86b3df24c84979a54ca66cebe56",
+ "version_major": 2,
+ "version_minor": 0
},
- "outputs": [],
- "source": [
- "epochs = 5\nfor t in range(epochs):\n    print(f\"Epoch {t+1}\\n-------------------------------\")\n    train(train_dataloader, model, loss_fn, optimizer)\n    test(test_dataloader, model, loss_fn)\nprint(\"Done!\")"
+ "text/plain": [
+ " 0%| | 0/5148 [00:00<?, ?it/s]"
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Read more about `Training your model <optimization_tutorial.html>`_.\n\n\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "--------------\n\n\n"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw\n",
+ "\n"
+ ]
},
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Saving Models\n-------------\nA common way to save a model is to serialize the internal state dictionary (containing the model parameters).\n\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "torch.save(model.state_dict(), \"model.pth\")\nprint(\"Saved PyTorch Model State to model.pth\")"
- ]
- },
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/home/ta180m/.local/lib/python3.9/site-packages/torchvision/datasets/mnist.py:498: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /build/python-pytorch/src/pytorch-1.9.0-opt/torch/csrc/utils/tensor_numpy.cpp:174.)\n",
+ " return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Download training data from open datasets.\n",
+ "training_data = datasets.FashionMNIST(\n",
+ "    root=\"data\",\n",
+ "    train=True,\n",
+ "    download=True,\n",
+ "    transform=ToTensor(),\n",
+ ")\n",
+ "\n",
+ "# Download test data from open datasets.\n",
+ "test_data = datasets.FashionMNIST(\n",
+ "    root=\"data\",\n",
+ "    train=False,\n",
+ "    download=True,\n",
+ "    transform=ToTensor(),\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We pass the ``Dataset`` as an argument to ``DataLoader``. This wraps an iterable over our dataset, and supports\n",
+ "automatic batching, sampling, shuffling and multiprocess data loading. Here we define a batch size of 64, i.e. each element\n",
+ "in the dataloader iterable will return a batch of 64 features and labels.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Loading Models\n----------------------------\n\nThe process for loading a model includes re-creating the model structure and loading\nthe state dictionary into it.\n\n"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Shape of X [N, C, H, W]: torch.Size([64, 1, 28, 28])\n",
+ "Shape of y: torch.Size([64]) torch.int64\n"
+ ]
+ }
+ ],
+ "source": [
+ "batch_size = 64\n",
+ "\n",
+ "# Create data loaders.\n",
+ "train_dataloader = DataLoader(training_data, batch_size=batch_size)\n",
+ "test_dataloader = DataLoader(test_data, batch_size=batch_size)\n",
+ "\n",
+ "for X, y in test_dataloader:\n",
+ "    print(\"Shape of X [N, C, H, W]: \", X.shape)\n",
+ "    print(\"Shape of y: \", y.shape, y.dtype)\n",
+ "    break"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Read more about `loading data in PyTorch <data_tutorial.html>`_.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "--------------\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Creating Models\n",
+ "------------------\n",
+ "To define a neural network in PyTorch, we create a class that inherits\n",
+ "from `nn.Module <https://pytorch.org/docs/stable/generated/torch.nn.Module.html>`_. We define the layers of the network\n",
+ "in the ``__init__`` function and specify how data will pass through the network in the ``forward`` function. To accelerate\n",
+ "operations in the neural network, we move it to the GPU if available.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "model = NeuralNetwork()\nmodel.load_state_dict(torch.load(\"model.pth\"))"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Using cpu device\n",
+ "NeuralNetwork(\n",
+ "  (flatten): Flatten(start_dim=1, end_dim=-1)\n",
+ "  (linear_relu_stack): Sequential(\n",
+ "    (0): Linear(in_features=784, out_features=512, bias=True)\n",
+ "    (1): ReLU()\n",
+ "    (2): Linear(in_features=512, out_features=512, bias=True)\n",
+ "    (3): ReLU()\n",
+ "    (4): Linear(in_features=512, out_features=10, bias=True)\n",
+ "    (5): ReLU()\n",
+ "  )\n",
+ ")\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Get cpu or gpu device for training.\n",
+ "device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
+ "print(\"Using {} device\".format(device))\n",
+ "\n",
+ "# Define model\n",
+ "class NeuralNetwork(nn.Module):\n",
+ "    def __init__(self):\n",
+ "        super(NeuralNetwork, self).__init__()\n",
+ "        self.flatten = nn.Flatten()\n",
+ "        self.linear_relu_stack = nn.Sequential(\n",
+ "            nn.Linear(28*28, 512),\n",
+ "            nn.ReLU(),\n",
+ "            nn.Linear(512, 512),\n",
+ "            nn.ReLU(),\n",
+ "            nn.Linear(512, 10),\n",
+ "            nn.ReLU()\n",
+ "        )\n",
+ "\n",
+ "    def forward(self, x):\n",
+ "        x = self.flatten(x)\n",
+ "        logits = self.linear_relu_stack(x)\n",
+ "        return logits\n",
+ "\n",
+ "model = NeuralNetwork().to(device)\n",
+ "print(model)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Read more about `building neural networks in PyTorch <buildmodel_tutorial.html>`_.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "--------------\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Optimizing the Model Parameters\n",
+ "----------------------------------------\n",
+ "To train a model, we need a `loss function <https://pytorch.org/docs/stable/nn.html#loss-functions>`_\n",
+ "and an `optimizer <https://pytorch.org/docs/stable/optim.html>`_.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "loss_fn = nn.CrossEntropyLoss()\n",
+ "optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "In a single training loop, the model makes predictions on the training dataset (fed to it in batches), and\n",
+ "backpropagates the prediction error to adjust the model's parameters.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "def train(dataloader, model, loss_fn, optimizer):\n",
+ "    size = len(dataloader.dataset)\n",
+ "    for batch, (X, y) in enumerate(dataloader):\n",
+ "        X, y = X.to(device), y.to(device)\n",
+ "\n",
+ "        # Compute prediction error\n",
+ "        pred = model(X)\n",
+ "        loss = loss_fn(pred, y)\n",
+ "\n",
+ "        # Backpropagation\n",
+ "        optimizer.zero_grad()\n",
+ "        loss.backward()\n",
+ "        optimizer.step()\n",
+ "\n",
+ "        if batch % 100 == 0:\n",
+ "            loss, current = loss.item(), batch * len(X)\n",
+ "            print(f\"loss: {loss:>7f} [{current:>5d}/{size:>5d}]\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "We also check the model's performance against the test dataset to ensure it is learning.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [],
+ "source": [
+ "def test(dataloader, model, loss_fn):\n",
+ "    size = len(dataloader.dataset)\n",
+ "    num_batches = len(dataloader)\n",
+ "    model.eval()\n",
+ "    test_loss, correct = 0, 0\n",
+ "    with torch.no_grad():\n",
+ "        for X, y in dataloader:\n",
+ "            X, y = X.to(device), y.to(device)\n",
+ "            pred = model(X)\n",
+ "            test_loss += loss_fn(pred, y).item()\n",
+ "            correct += (pred.argmax(1) == y).type(torch.float).sum().item()\n",
+ "    test_loss /= num_batches\n",
+ "    correct /= size\n",
+ "    print(f\"Test Error: \\n Accuracy: {(100*correct):>0.1f}%, Avg loss: {test_loss:>8f} \\n\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The training process is conducted over several iterations (*epochs*). During each epoch, the model learns\n",
+ "parameters to make better predictions. We print the model's accuracy and loss at each epoch; we'd like to see the\n",
+ "accuracy increase and the loss decrease with every epoch.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "This model can now be used to make predictions.\n\n"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Epoch 1\n",
+ "-------------------------------\n",
+ "loss: 1.758146 [ 0/60000]\n",
+ "loss: 1.820034 [ 6400/60000]\n",
+ "loss: 1.846449 [12800/60000]\n",
+ "loss: 1.975245 [19200/60000]\n",
+ "loss: 1.612495 [25600/60000]\n",
+ "loss: 1.748993 [32000/60000]\n",
+ "loss: 1.628008 [38400/60000]\n",
+ "loss: 1.655061 [44800/60000]\n",
+ "loss: 1.770255 [51200/60000]\n",
+ "loss: 1.654287 [57600/60000]\n",
+ "Test Error: \n",
+ " Accuracy: 37.7%, Avg loss: 1.749445 \n",
+ "\n",
+ "Epoch 2\n",
+ "-------------------------------\n",
+ "loss: 1.670408 [ 0/60000]\n",
+ "loss: 1.743051 [ 6400/60000]\n",
+ "loss: 1.773547 [12800/60000]\n",
+ "loss: 1.924395 [19200/60000]\n",
+ "loss: 1.529726 [25600/60000]\n",
+ "loss: 1.692361 [32000/60000]\n",
+ "loss: 1.559834 [38400/60000]\n",
+ "loss: 1.593531 [44800/60000]\n",
+ "loss: 1.712157 [51200/60000]\n",
+ "loss: 1.605115 [57600/60000]\n",
+ "Test Error: \n",
+ " Accuracy: 38.1%, Avg loss: 1.694516 \n",
+ "\n",
+ "Epoch 3\n",
+ "-------------------------------\n",
+ "loss: 1.607648 [ 0/60000]\n",
+ "loss: 1.684907 [ 6400/60000]\n",
+ "loss: 1.716139 [12800/60000]\n",
+ "loss: 1.888849 [19200/60000]\n",
+ "loss: 1.474264 [25600/60000]\n",
+ "loss: 1.652733 [32000/60000]\n",
+ "loss: 1.514825 [38400/60000]\n",
+ "loss: 1.549373 [44800/60000]\n",
+ "loss: 1.670293 [51200/60000]\n",
+ "loss: 1.571395 [57600/60000]\n",
+ "Test Error: \n",
+ " Accuracy: 39.0%, Avg loss: 1.653676 \n",
+ "\n",
+ "Epoch 4\n",
+ "-------------------------------\n",
+ "loss: 1.561757 [ 0/60000]\n",
+ "loss: 1.640771 [ 6400/60000]\n",
+ "loss: 1.669458 [12800/60000]\n",
+ "loss: 1.862879 [19200/60000]\n",
+ "loss: 1.435348 [25600/60000]\n",
+ "loss: 1.623189 [32000/60000]\n",
+ "loss: 1.482370 [38400/60000]\n",
+ "loss: 1.515045 [44800/60000]\n",
+ "loss: 1.638349 [51200/60000]\n",
+ "loss: 1.545919 [57600/60000]\n",
+ "Test Error: \n",
+ " Accuracy: 39.9%, Avg loss: 1.621615 \n",
+ "\n",
+ "Epoch 5\n",
+ "-------------------------------\n",
+ "loss: 1.525517 [ 0/60000]\n",
+ "loss: 1.604991 [ 6400/60000]\n",
+ "loss: 1.630397 [12800/60000]\n",
+ "loss: 1.841878 [19200/60000]\n",
+ "loss: 1.406707 [25600/60000]\n",
+ "loss: 1.599460 [32000/60000]\n",
+ "loss: 1.456716 [38400/60000]\n",
+ "loss: 1.485950 [44800/60000]\n",
+ "loss: 1.612476 [51200/60000]\n",
+ "loss: 1.525381 [57600/60000]\n",
+ "Test Error: \n",
+ " Accuracy: 40.7%, Avg loss: 1.595456 \n",
+ "\n",
+ "Done!\n"
+ ]
+ }
+ ],
+ "source": [
+ "epochs = 5\n",
+ "for t in range(epochs):\n",
+ "    print(f\"Epoch {t+1}\\n-------------------------------\")\n",
+ "    train(train_dataloader, model, loss_fn, optimizer)\n",
+ "    test(test_dataloader, model, loss_fn)\n",
+ "print(\"Done!\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Read more about `Training your model <optimization_tutorial.html>`_.\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "--------------\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Saving Models\n",
+ "-------------\n",
+ "A common way to save a model is to serialize the internal state dictionary (containing the model parameters).\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "collapsed": false
- },
- "outputs": [],
- "source": [
- "classes = [\n    \"T-shirt/top\",\n    \"Trouser\",\n    \"Pullover\",\n    \"Dress\",\n    \"Coat\",\n    \"Sandal\",\n    \"Shirt\",\n    \"Sneaker\",\n    \"Bag\",\n    \"Ankle boot\",\n]\n\nmodel.eval()\nx, y = test_data[0][0], test_data[0][1]\nwith torch.no_grad():\n    pred = model(x)\n    predicted, actual = classes[pred[0].argmax(0)], classes[y]\n    print(f'Predicted: \"{predicted}\", Actual: \"{actual}\"')"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Saved PyTorch Model State to model.pth\n"
+ ]
+ }
+ ],
+ "source": [
+ "torch.save(model.state_dict(), \"model.pth\")\n",
+ "print(\"Saved PyTorch Model State to model.pth\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Loading Models\n",
+ "----------------------------\n",
+ "\n",
+ "The process for loading a model includes re-creating the model structure and loading\n",
+ "the state dictionary into it.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
+ }
+ },
+ "outputs": [
{
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "Read more about `Saving & Loading your model <saveloadrun_tutorial.html>`_.\n\n\n"
+ "data": {
+ "text/plain": [
+ "<All keys matched successfully>"
]
+ },
+ "execution_count": 13,
+ "metadata": {},
+ "output_type": "execute_result"
}
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.6.13"
+ ],
+ "source": [
+ "model = NeuralNetwork()\n",
+ "model.load_state_dict(torch.load(\"model.pth\"))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "This model can now be used to make predictions.\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 17,
+ "metadata": {
+ "collapsed": false,
+ "jupyter": {
+ "outputs_hidden": false
}
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Predicted: \"Ankle boot\", Actual: \"Ankle boot\"\n"
+ ]
+ }
+ ],
+ "source": [
+ "classes = [\n",
+ "    \"T-shirt/top\",\n",
+ "    \"Trouser\",\n",
+ "    \"Pullover\",\n",
+ "    \"Dress\",\n",
+ "    \"Coat\",\n",
+ "    \"Sandal\",\n",
+ "    \"Shirt\",\n",
+ "    \"Sneaker\",\n",
+ "    \"Bag\",\n",
+ "    \"Ankle boot\",\n",
+ "]\n",
+ "\n",
+ "model.eval()\n",
+ "x, y = test_data[0][0], test_data[0][1]\n",
+ "with torch.no_grad():\n",
+ "    pred = model(x)\n",
+ "    predicted, actual = classes[pred[0].argmax(0)], classes[y]\n",
+ "    print(f'Predicted: \"{predicted}\", Actual: \"{actual}\"')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Read more about `Saving & Loading your model <saveloadrun_tutorial.html>`_.\n",
+ "\n",
+ "\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
},
- "nbformat": 4,
- "nbformat_minor": 0
-}
\ No newline at end of file
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.9.6"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
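
Note on the recorded training output: the sample counters in the log added by this diff (0, 6400, ..., 57600 out of 60000) follow from the batch size of 64 and the `batch % 100 == 0` logging condition in `train`. The sketch below reproduces that arithmetic in plain Python, without PyTorch. It assumes the loaders keep the final partial batch, which is the `DataLoader` default when `drop_last` is not set. The helper names `batch_layout` and `log_points` are hypothetical and exist only for this illustration.

```python
import math


def batch_layout(num_samples, batch_size):
    """Return (number of batches, size of the final batch).

    Assumes the last partial batch is kept rather than dropped.
    """
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch


def log_points(num_samples, batch_size, every=100):
    """Sample counters printed by a loop that logs batch * batch_size
    whenever the batch index is a multiple of `every`."""
    num_batches, _ = batch_layout(num_samples, batch_size)
    return [b * batch_size for b in range(0, num_batches, every)]


# FashionMNIST sizes as shown in the log: 60000 training samples;
# the test split used here is assumed to hold 10000 samples.
train_batches, train_last = batch_layout(60000, 64)
test_batches, test_last = batch_layout(10000, 64)
points = log_points(60000, 64)

print(train_batches, train_last)
print(test_batches, test_last)
print(points)
```

Under these assumptions each epoch logs ten loss values, matching the ten `loss:` lines per epoch in the output above. The same layout explains the two divisors in `test`: the summed loss is averaged over `num_batches`, while the count of correct predictions is divided by the dataset `size`.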