02 - Quick Start: The Basic PyTorch Workflow for Machine Learning and Deep Learning (Notes and Code)

析模界 · Published 2023-10-23 from Sichuan


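This article is a set of study notes on PyTorch Workflow Fundamentals^[1]; the original has been translated and edited. The course introduction and table of contents for this series are in the "Deep Learning with PyTorch" course introduction^[2].
The article is published first on my blog^[3]; other platforms have restrictions that prevent it from being updated in real time.
Hyperlinks cannot be embedded in WeChat official-account posts, so click Read the original^[4] for a better reading experience.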

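1. Data (preparing and loading)

Splitting data into training and test sets

2. Building a model

Checking the contents of a PyTorch model
Making predictions with `torch.inference_mode()`

3. Training the model

Creating a loss function and optimizer in PyTorch
Creating an optimization loop in PyTorch
Training loop
Testing loop

4. Making predictions with a trained PyTorch model (inference)
5. Saving and loading a PyTorch model

Saving a PyTorch model's `state_dict()`
Loading a saved PyTorch model's `state_dict()`

6. Putting it all together

6.1 Data
6.2 Building a PyTorch linear model
6.3 Training
6.4 Making predictions
6.5 Saving the model

7. Further resources
8. Acknowledgements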

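In this chapter we walk through the standard PyTorch workflow by training and using a linear regression model.

We will import torch, torch.nn (nn stands for neural network; this package contains the building blocks for creating neural networks in PyTorch) and matplotlib.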

import torch
from torch import nn # nn contains all of PyTorch's building blocks for neural networks
import matplotlib.pyplot as plt

# Check PyTorch version
torch.__version__
>>> '2.0.1'

1. Data (preparing and loading)

"Data" in machine learning can be almost anything you can imagine: a table of numbers (such as a large Excel spreadsheet), images of any kind, videos, audio files (such as songs or podcasts), protein structures, text and more.

We'll use the linear regression formula to create data with known parameters (things a model can learn), then use PyTorch to see whether we can build a model that estimates those parameters with gradient descent.

# Create known parameters
weight = 0.7
bias = 0.3

# Create data
start = 0
end = 1
step = 0.02
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias

X[:10], y[:10]

Output:

(tensor([[0.0000],
         [0.0200],
         [0.0400],
         [0.0600],
         [0.0800],
         [0.1000],
         [0.1200],
         [0.1400],
         [0.1600],
         [0.1800]]),
 tensor([[0.3000],
         [0.3140],
         [0.3280],
         [0.3420],
         [0.3560],
         [0.3700],
         [0.3840],
         [0.3980],
         [0.4120],
         [0.4260]]))

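Now we'll move on to building a model that can learn the relationship between X (features) and y (labels).

Splitting data into training and test sets

One of the most important steps in a machine learning project is creating a training set and a test set (and, when required, a validation set).

In general, a dataset can be divided into three parts: the training set, the validation set and the test set. Each serves a different purpose.

Training set, roughly 60-80% of the data: the main dataset used to train the model. The model's parameters are optimized on these samples; by iterating over the training set repeatedly, the model gradually learns the features, patterns and regularities in the data, which improves its performance and accuracy.
Validation set, roughly 10-20% of the data: the dataset used for model selection and tuning. It is an independent set of samples for estimating how the model performs on data it was not trained on, so that hyperparameters, network architecture and other settings can be adjusted to improve generalization.
Test set, roughly 10-20% of the data: the dataset used to evaluate the final trained model. It is independent of both the training and validation sets and contains samples the model has never seen. Running the trained model on the test set gives an objective estimate of how it will perform on real-world data and whether it generalizes well enough for practical use.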
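
To make the three-way split concrete, here is a minimal sketch in plain Python. The helper name `three_way_split` and the 60/20/20 percentages are assumptions chosen for illustration; like the slicing used in this chapter, it splits by position without shuffling, which is fine for this toy data but usually not for real datasets.

```python
def three_way_split(samples, train_pct=60, val_pct=20):
    """Split an ordered sequence into train/validation/test parts by position.

    Hypothetical helper for illustration; percentages are whole numbers so the
    boundary arithmetic stays exact.
    """
    n = len(samples)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]


# 50 evenly spaced values, standing in for the X values used in this chapter
data = [round(i * 0.02, 2) for i in range(50)]
train, val, test = three_way_split(data)
print(len(train), len(val), len(test))  # 30 10 10
```

Whatever is left after the training and validation slices becomes the test set, so the three parts always cover the whole dataset without overlap.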

For this chapter we create the training and test sets by slicing the X and y tensors manually.

train_split = int(0.8 * len(X)) # 80% of data used for training set, 20% for testing 
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

len(X_train), len(y_train), len(X_test), len(y_test)
>>> (40, 40, 10, 10)

We now have 40 samples for training ( X_train and y_train ) and 10 samples for testing ( X_test and y_test ).

The model we create will try to learn the relationship between X_train and y_train, and then we will evaluate how it performs on X_test and y_test.

But right now our data is just numbers on a page. Let's create a function to visualize it.

def plot_predictions(train_data=X_train, 
                     train_labels=y_train, 
                     test_data=X_test, 
                     test_labels=y_test, 
                     predictions=None):
  '''
  Plots training data, test data and compares predictions.
  '''
  plt.figure(figsize=(10, 7))

  # Plot training data in blue
  plt.scatter(train_data, train_labels, c='b', s=4, label='Training data')
  
  # Plot test data in green
  plt.scatter(test_data, test_labels, c='g', s=4, label='Testing data')

  if predictions is not None:
    # Plot the predictions in red (predictions were made on the test data)
    plt.scatter(test_data, predictions, c='r', s=4, label='Predictions')

  # Show the legend
  plt.legend(prop={'size': 14});


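Note: Visualizing your data is a good habit in machine learning.

2. Building a model

Now that we have some data, let's build a model that uses the blue dots to predict the green dots.

We'll replicate a standard linear regression model in PyTorch. If you're not familiar with Python classes, I recommend reading the guide to object-oriented programming in Python 3^[5].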

# Create a linear regression model class
class LinearRegressionModel(nn.Module):  # <- inherits from PyTorch's nn.Module (neural network) class
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1,  # <- start with a random weight (adjusted as the model learns)
                                                dtype=torch.float),  # <- PyTorch uses float32 by default
                                    requires_grad=True)  # <- can this value be updated with gradient descent?

        self.bias = nn.Parameter(torch.randn(1,  # <- start with a random bias (adjusted as the model learns)
                                             dtype=torch.float),  # <- PyTorch uses float32 by default
                                 requires_grad=True)  # <- can this value be updated with gradient descent?

    # forward() defines the computation the model performs
    def forward(self, x: torch.Tensor) -> torch.Tensor:  # <- 'x' is the input data (e.g. training/testing features)
        return self.weights * x + self.bias  # <- this is the linear regression formula (y = m*x + b)

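Let's break down the code above.

PyTorch has four essential modules that you can use to create almost any kind of neural network you can imagine.

They are `torch.nn`^[6], `torch.optim`^[7], `torch.utils.data.Dataset`^[8] and `torch.utils.data.DataLoader`^[9]. For now we'll focus on the first two and come back to the other two later.

| PyTorch module | What does it do? |
| --- | --- |
| torch.nn | Contains all of the building blocks for computational graphs (essentially a series of computations executed in a particular way). |
| torch.nn.Parameter | Stores tensors that can be used with nn.Module. If requires_grad=True, gradients (used to update model parameters via gradient descent) are calculated automatically; this is often referred to as "autograd". |
| torch.nn.Module | The base class for all neural network modules; all the building blocks for neural networks are subclasses of it. If you build a neural network in PyTorch, your model should subclass nn.Module and must implement a forward() method. |
| torch.optim | Contains various optimization algorithms (these tell the model parameters stored in nn.Parameter how best to change to improve gradient descent and in turn reduce the loss). |
| def forward() | All nn.Module subclasses require a forward() method, which defines the computation performed on the data passed to that particular nn.Module (e.g. the linear regression formula above). |

Resource: see more of these essential modules and their use cases in the PyTorch Cheat Sheet^[10].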

Checking the contents of a PyTorch model

torch.manual_seed(42)

# Create an instance of the model
model_0 = LinearRegressionModel()

# Check the nn.Parameter(s) within the model
list(model_0.parameters())

>>> 
[Parameter containing:
 tensor([0.3367], requires_grad=True),
 Parameter containing:
 tensor([0.1288], requires_grad=True)]

We can also get the state of the model (what the model contains) using `.state_dict()`^[11].

# List named parameters 
model_0.state_dict()
>>> OrderedDict([('weights', tensor([0.3367])), ('bias', tensor([0.1288]))])

Notice that the values for weights and bias in model_0.state_dict() are randomly assigned initial values.

Essentially we want to start from random parameters and have the model update them towards the parameters that best fit our data (the hard-coded weight and bias values we set when creating the straight-line data).

Because our model starts with random values, right now it has poor predictive power.

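Making predictions with `torch.inference_mode()`

To check this, we can pass the test data X_test to the model and see how closely it predicts y_test. When we pass data to the model, it goes through the model's forward() method and produces a result using the computation we defined.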

In [10]:

# Make predictions with model
with torch.inference_mode(): 
    y_preds = model_0(X_test)

# Note: in older PyTorch code you might also see torch.no_grad()
# with torch.no_grad():
#   y_preds = model_0(X_test)

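You may have noticed that we used `torch.inference_mode()`^[12] as a context manager to make the predictions.

torch.inference_mode() turns off a number of things (such as gradient tracking, which is necessary for training but not for inference) to make forward passes (data going through the forward() method) faster.

Note: in older PyTorch code you may also see torch.no_grad() used for inference. torch.inference_mode() and torch.no_grad() do similar things, but torch.inference_mode() is newer, potentially faster and preferred.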
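
A small sketch of what "turning off gradient tracking" looks like in practice. The tensors `w` and `x` are made up for illustration; the point is only that outputs produced inside either context manager are not tracked by autograd, and that tensors created under torch.inference_mode() are additionally marked as inference tensors.

```python
import torch

w = torch.tensor([0.5], requires_grad=True)  # stands in for a model parameter
x = torch.tensor([[1.0], [2.0]])             # stands in for input features

tracked = w * x                      # normal forward pass: autograd records the operation
with torch.no_grad():
    no_grad_out = w * x              # older style: no gradient tracking
with torch.inference_mode():
    inference_out = w * x            # newer style: no tracking, plus extra shortcuts

print(tracked.requires_grad)         # True
print(no_grad_out.requires_grad)     # False
print(inference_out.requires_grad)   # False
print(inference_out.is_inference())  # True
```

One practical consequence is that a tensor created in inference mode cannot later take part in operations recorded for backpropagation, so inference mode is best reserved for code paths that only make predictions.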

We've made some predictions, so let's see what they look like.

In [11]:

# Check the predictions
print(f'Number of testing samples: {len(X_test)}') 
print(f'Number of predictions made: {len(y_preds)}')
print(f'Predicted values:\n{y_preds}')

Out[11]：

Number of testing samples: 10
Number of predictions made: 10
Predicted values:
tensor([[0.3982],
        [0.4049],
        [0.4116],
        [0.4184],
        [0.4251],
        [0.4318],
        [0.4386],
        [0.4453],
        [0.4520],
        [0.4588]])

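Notice that there is one prediction value per test sample.

For our straight line, one X value maps to one y value.

Machine learning models are very flexible, however. You could map 100 X values to one, two, three or ten y values, for example using 100 features X to decide between two or three classes y.

Our predictions are still numbers on a page, so let's visualize them with the plot_predictions() function we created above.

In [12]: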

plot_predictions(predictions=y_preds)

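The element-wise difference between the test labels and these predictions, y_test - y_preds, shows how far off they are:

Out[13]: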

tensor([[0.4618],
        [0.4691],
        [0.4764],
        [0.4836],
        [0.4909],
        [0.4982],
        [0.5054],
        [0.5127],
        [0.5200],
        [0.5272]])

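Because our model is only using random parameter values to make predictions, the red prediction dots look completely off. The model has not been trained yet (no rounds of backpropagation), so its predictions sit far from the test data.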

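3. Training the model

Right now our model makes predictions using random parameters, so it is basically guessing. To fix that, we can update its internal parameters (I also refer to parameters as patterns), the weights and bias values we set randomly with nn.Parameter() and torch.randn(), so that they better represent the data.

We could hard-code them (since we know the true values weight=0.7 and bias=0.3), but that would defeat the purpose. Much of the time you won't know what the ideal parameters of a model are. It is far more interesting to write code and see whether the model can work them out by itself.

Creating a loss function and optimizer in PyTorch

For our model to update its parameters on its own, we need to add two more things to our code: a loss function and an optimizer.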

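| Function | What does it do? | Where does it live in PyTorch? | Common values |
| --- | --- | --- | --- |
| Loss function | Measures how wrong the model's predictions (e.g. y_preds) are compared with the truth labels (e.g. y_test). Lower is better. | PyTorch has plenty of built-in loss functions in torch.nn^[13]. | Mean absolute error (MAE) for regression problems ( torch.nn.L1Loss()^[14] ). Binary cross entropy for binary classification problems ( torch.nn.BCELoss()^[15] ). |
| Optimizer | Tells the model how to update its internal parameters to best lower the loss. | You can find various optimization function implementations in torch.optim^[16]. | Stochastic gradient descent ( torch.optim.SGD()^[17] ). Adam optimizer ( torch.optim.Adam() ). |

For more on gradient descent, watch "[Official bilingual] Deep Learning: Gradient Descent, Part 2"^[18].

Since we are predicting a number, we'll use MAE (found under torch.nn.L1Loss()) as our loss function.

Mean absolute error (MAE, in PyTorch: torch.nn.L1Loss ) measures the absolute difference between two points (a prediction and its label) and then takes the mean across all examples.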
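
As a quick illustration of that definition, here is MAE computed by hand in plain Python. The function name `mean_absolute_error` and the three sample values are made up for the example; it is a sketch of the formula, not a replacement for torch.nn.L1Loss.

```python
def mean_absolute_error(predictions, labels):
    """Average of |prediction - label| over all examples."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return sum(abs(p - t) for p, t in zip(predictions, labels)) / len(labels)


preds = [2.0, 4.0, 7.0]
labels = [1.0, 4.0, 5.0]
mae = mean_absolute_error(preds, labels)  # (|2-1| + |4-4| + |7-5|) / 3
print(mae)  # 1.0
```

Because every error counts in proportion to its size, a single prediction that is off by 2 contributes exactly twice as much as one that is off by 1.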

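We'll use SGD, torch.optim.SGD(params, lr) , where:

params is the target model parameters you'd like to optimize (e.g. the weights and bias values we set randomly earlier).
lr is the learning rate, which controls how much the optimizer changes the parameters at each step. A higher learning rate produces larger updates and can speed up convergence, but may make training unstable. A lower learning rate produces smaller updates and may take longer to converge. The learning rate is considered a hyperparameter (because it is set by the machine learning engineer). Common starting values are 0.01 , 0.001 and 0.0001 , and they can also be adjusted over time (this is called learning rate scheduling^[19]).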
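
The role of lr can be sketched with the plain SGD update rule, parameter <- parameter - lr * gradient. The helper `sgd_step` and the gradient values below are assumptions for illustration (in PyTorch the gradients come from loss.backward()); the starting parameters are the rounded random values printed earlier in this chapter.

```python
def sgd_step(params, grads, lr):
    """One plain SGD update: move each parameter against its gradient, scaled by lr."""
    return [p - lr * g for p, g in zip(params, grads)]


params = [0.3367, 0.1288]  # weight and bias (rounded starting values)
grads = [-0.39, -1.0]      # illustrative gradient values, assumed for this sketch

for lr in (0.1, 0.01, 0.001):
    new_weight, new_bias = sgd_step(params, grads, lr)
    print(f"lr={lr}: weight={new_weight:.4f}, bias={new_bias:.4f}")
```

With the same gradients, a ten times larger learning rate moves the parameters ten times further in a single step, which is why too large a value can overshoot and too small a value can make training slow.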

In [14]:

# Create the loss function
loss_fn = nn.L1Loss() # MAE loss is same as L1Loss

# Create the optimizer
optimizer = torch.optim.SGD(params=model_0.parameters(), # parameters of target model to optimize
                            lr=0.01) # learning rate

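Creating an optimization loop in PyTorch

Now that we have a loss function and an optimizer, it's time to create a training loop (and a testing loop).

The training loop has the model go through the training data repeatedly and learn the relationship between the features and the labels. The testing loop goes through the test data and evaluates how well the patterns learned from the training data hold up (the model never sees the test data during training).

Each of these is called a "loop" because we want the model to look at (loop through) every sample in each dataset.

Training loop

For the training loop, we'll build the following steps: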

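| Step | Step name | What does it do? | Code example |
| --- | --- | --- | --- |
| 1 | Forward pass | The model goes through all of the training data once, performing its forward() function calculations. | model(x_train) |
| 2 | Calculate the loss | The model's outputs (predictions) are compared to the ground truth and evaluated to see how wrong they are. | loss = loss_fn(y_pred, y_train) |
| 3 | Zero gradients | The optimizer's gradients are set to zero (they accumulate by default) so they can be recalculated for the specific training step. | optimizer.zero_grad() |
| 4 | Perform backpropagation on the loss | Computes the gradient of the loss with respect to every model parameter to be updated (each parameter with requires_grad=True ). This is known as backpropagation, hence "backward". | loss.backward() |
| 5 | Step the optimizer (gradient descent) | Updates the parameters with requires_grad=True according to the loss gradients in order to improve them. | optimizer.step() |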


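**Note:** on the order of the steps above
The order above is a good default, but you may see slightly different orders. Some rules of thumb:

Calculate the loss ( loss = ... ) before performing backpropagation on it ( loss.backward() ).

Zero the gradients ( optimizer.zero_grad() ) before computing new ones with loss.backward() , and therefore before stepping the optimizer ( optimizer.step() ).

Step the optimizer ( optimizer.step() ) after performing backpropagation on the loss ( loss.backward() ).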
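
To see why the gradients need zeroing, here is a minimal sketch with a single made-up parameter `w`. Calling backward() twice without zeroing adds the two gradients together; zeroing first leaves only the gradient of the current pass.

```python
import torch

w = torch.tensor([1.0], requires_grad=True)  # a single illustrative parameter
optimizer = torch.optim.SGD([w], lr=0.1)

# Two backward passes without zeroing in between: the gradients accumulate.
(w * 3).sum().backward()
(w * 3).sum().backward()
accumulated = w.grad.clone()   # 3 + 3

# Zero first, then run one backward pass: only the current gradient remains.
optimizer.zero_grad()
(w * 3).sum().backward()
fresh = w.grad.clone()         # 3

print(accumulated, fresh)
```

If the stale gradient were left in place, optimizer.step() would move the parameter using the sum of old and new gradients, which is rarely what a standard training step intends.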

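If you want help understanding the principles behind machine learning, such as backpropagation and gradient descent, I strongly recommend watching "[Official bilingual] Deep Learning: Backpropagation, Part 3"^[20].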

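Testing loop

As for the testing loop (evaluating our model), the typical steps are:

Forward pass, calculate the loss, calculate evaluation metrics (optional).

Note that the testing loop contains no backpropagation ( loss.backward() ) and no optimizer step ( optimizer.step() ). That is because no parameters change during testing; they have already been calculated. For testing we are only interested in the output of the forward pass. Let's put all of the above together and train our model for 100 epochs (forward passes through the data), evaluating it every 10 epochs.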

In [15]:

torch.manual_seed(42)

# Set the number of epochs (how many times the model will pass over the training data)
epochs = 100

# Create empty loss lists to track values
train_loss_values = []
test_loss_values = []
epoch_count = []

for epoch in range(epochs):
    ### Training

    # Put model in training mode (this is the default state of a model)
    model_0.train()

    # 1. Forward pass on train data using the model's forward() method
    y_pred = model_0(X_train)
    # print(y_pred)

    # 2. Calculate the loss (how different are our models predictions to the ground truth)
    loss = loss_fn(y_pred, y_train)

    # 3. Zero grad of the optimizer
    optimizer.zero_grad()

    # 4. Loss backwards
    loss.backward()

    # 5. Progress the optimizer
    optimizer.step()

    ### Testing

    # Put the model in evaluation mode
    model_0.eval()

    with torch.inference_mode():
        # 1. Forward pass on test data
        test_pred = model_0(X_test)

        # 2. Calculate loss on test data
        test_loss = loss_fn(test_pred, y_test.type(torch.float)) # predictions come in torch.float datatype, so comparisons need to be done with tensors of the same type

        # Print out what's happening
        if epoch % 10 == 0:
            epoch_count.append(epoch)
            train_loss_values.append(loss.detach().numpy())
            test_loss_values.append(test_loss.detach().numpy())
            print(f'Epoch: {epoch} | MAE Train Loss: {loss} | MAE Test Loss: {test_loss} ')

Out[15]:

Epoch: 0 | MAE Train Loss: 0.31288138031959534 | MAE Test Loss: 0.48106518387794495 
Epoch: 10 | MAE Train Loss: 0.1976713240146637 | MAE Test Loss: 0.3463551998138428 
Epoch: 20 | MAE Train Loss: 0.08908725529909134 | MAE Test Loss: 0.21729660034179688 
Epoch: 30 | MAE Train Loss: 0.053148526698350906 | MAE Test Loss: 0.14464017748832703 
Epoch: 40 | MAE Train Loss: 0.04543796554207802 | MAE Test Loss: 0.11360953003168106 
Epoch: 50 | MAE Train Loss: 0.04167863354086876 | MAE Test Loss: 0.09919948130846024 
Epoch: 60 | MAE Train Loss: 0.03818932920694351 | MAE Test Loss: 0.08886633068323135 
Epoch: 70 | MAE Train Loss: 0.03476089984178543 | MAE Test Loss: 0.0805937647819519 
Epoch: 80 | MAE Train Loss: 0.03132382780313492 | MAE Test Loss: 0.07232122868299484 
Epoch: 90 | MAE Train Loss: 0.02788739837706089 | MAE Test Loss: 0.06473556160926819
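
The numbers above can be approximated without PyTorch, which helps show what the five training steps are doing. The sketch below is plain Python under stated assumptions: it starts from the rounded initial values 0.3367 and 0.1288, uses the same data and learning rate, and replaces loss.backward() with the hand-derived MAE gradients (the sign of each error, averaged). It is an illustration of the idea, not how PyTorch computes gradients internally.

```python
def mae(preds, labels):
    return sum(abs(p - t) for p, t in zip(preds, labels)) / len(labels)


def sign(v):
    return (v > 0) - (v < 0)


# Same data as in this chapter: y = 0.7 * x + 0.3 on 50 points, 80/20 split
xs = [i * 0.02 for i in range(50)]
ys = [0.7 * x + 0.3 for x in xs]
x_train, y_train, x_test, y_test = xs[:40], ys[:40], xs[40:], ys[40:]

w, b, lr = 0.3367, 0.1288, 0.01  # rounded starting values, same learning rate
history = []
for epoch in range(100):
    # Steps 1-2: forward pass and loss on the training data
    preds = [w * x + b for x in x_train]
    train_loss = mae(preds, y_train)
    # Steps 3-4: gradients of MAE w.r.t. w and b, recomputed from scratch each epoch
    errors = [p - t for p, t in zip(preds, y_train)]
    grad_w = sum(sign(e) * x for e, x in zip(errors, x_train)) / len(errors)
    grad_b = sum(sign(e) for e in errors) / len(errors)
    # Step 5: gradient descent update
    w, b = w - lr * grad_w, b - lr * grad_b
    # Testing: forward pass and loss only, no parameter update
    test_loss = mae([w * x + b for x in x_test], y_test)
    if epoch % 10 == 0:
        history.append((epoch, train_loss, test_loss))
        print(f"Epoch: {epoch} | train: {train_loss:.4f} | test: {test_loss:.4f}")
```

The first printed line should be close to the epoch 0 values in the output above (about 0.3129 and 0.4811); later lines may drift slightly because of the rounded starting values and float64 arithmetic.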

It looks like our loss is going down with every epoch; let's plot it to find out.

In [16]:

# Plot the loss curves
plt.plot(epoch_count, train_loss_values, label='Train loss')
plt.plot(epoch_count, test_loss_values, label='Test loss')
plt.title('Training and test loss curves')
plt.ylabel('Loss')
plt.xlabel('Epochs')
plt.legend();

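The loss is a measure of how wrong the model is, so the lower the better.

Thanks to our loss function and optimizer, the model's internal parameters ( weights and bias ) were updated with every epoch to better reflect the underlying pattern in the data. The loss curves show the loss going down over time.

Let's inspect the model's .state_dict() to see how close the model got to the original values we set for weight and bias.

In [17]:

# Find our model's learned parameters
print('The model learned the following values for `weights` and `bias`:')
print(model_0.state_dict())
print('\nAnd the original values for `weights` and `bias` are:')
print(f'weights: {weight}, bias: {bias}')

The model learned the following values for `weights` and `bias`:
OrderedDict([('weights', tensor([0.5784])), ('bias', tensor([0.3513]))])

And the original values for `weights` and `bias` are:
weights: 0.7, bias: 0.3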

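Our model got fairly close to the original values of weight and bias (0.5784 vs 0.7 and 0.3513 vs 0.3), and it would probably get closer still if we trained it for longer.

This is the whole idea of machine learning and deep learning: there are some ideal values that describe our data, and rather than working them out by hand, we can train a model to work them out programmatically.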

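4. Making predictions with a trained PyTorch model (inference)

There are three things to remember when making predictions (also called performing inference) with a PyTorch model:

Set the model to evaluation mode ( model.eval() ).
Make the predictions using the inference mode context manager ( with torch.inference_mode(): ... ).
Make all predictions with objects on the same device (e.g. data and model on GPU only, or data and model on CPU only).

The first two ensure that the calculations and settings PyTorch uses during training, but which are not needed for inference, are turned off. The third ensures that you won't run into cross-device errors.

In [18]:

# 1. Set the model to evaluation mode
model_0.eval()

# 2. Use the inference mode context manager
with torch.inference_mode():
  # 3. Make sure all objects are on the same device
  # (just in case, both can be moved with .to(device))
  # model_0.to(device)
  # X_test = X_test.to(device)
  y_preds = model_0(X_test)
  y_preds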

Out[18]:

tensor([[0.8141],
        [0.8256],
        [0.8372],
        [0.8488],
        [0.8603],
        [0.8719],
        [0.8835],
        [0.8950],
        [0.9066],
        [0.9182]])

Nice! We've made some predictions with the trained model; now let's plot them.

In [19]:

plot_predictions(predictions=y_preds)

That looks good.

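5. Saving and loading a PyTorch model

There are three main methods^[21] for saving and loading models in PyTorch (all of the following is taken from the PyTorch saving and loading models guide):

| PyTorch method | What does it do? |
| --- | --- |
| torch.save | Saves a serialized object to disk using Python's pickle utility. Models, tensors and various other Python objects such as dictionaries can be saved with torch.save. |
| torch.load | Uses pickle's unpickling features to deserialize pickled Python object files (such as models, tensors or dictionaries) and load them into memory. You can also choose which device to load the object to (CPU, GPU, etc.). |
| torch.nn.Module.load_state_dict | Loads a model's parameter dictionary ( model.state_dict() ) from a saved state_dict() object. |

Note: as stated in Python's `pickle` documentation^[22], the pickle module is not secure. That means you should only unpickle (load) data you trust. The same goes for loading PyTorch models: only ever use saved PyTorch models from sources you trust.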
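
The reason for that warning can be shown with a deliberately harmless sketch in plain Python. The class `RunsOnLoad` is made up for illustration: its pickled form asks the loader to call sorted() on a list, so loading the bytes runs a function chosen by whoever created them rather than simply restoring data.

```python
import pickle


class RunsOnLoad:
    """Harmless stand-in: unpickling calls sorted(), but it could name other callables."""

    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])"
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(RunsOnLoad())
result = pickle.loads(payload)  # the callable stored in the payload runs here
print(result)                   # [1, 2, 3], not a RunsOnLoad instance
```

Recent PyTorch releases also offer a `weights_only` argument on torch.load that restricts what may be unpickled; its default has changed between versions, so it is worth checking the documentation for the version you use.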

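Saving a PyTorch model's `state_dict()`

The recommended way^[23] to save and load a model for inference (making predictions) is to save and load the model's state_dict() .

We save it in three steps:

Create a directory named models for saving models, using Python's pathlib module (the os module works too).
Create a file path to save the model to.
Call torch.save(obj, f) , where obj is the target model's state_dict() and f is the file name to save the model under.

Note: PyTorch saved models or objects commonly end with .pt or .pth , for example saved_model_01.pth .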
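
The path handling in those steps can be tried on its own with the standard library. In this sketch the temporary directory and the placeholder bytes are assumptions for illustration; write_bytes() merely stands in for the torch.save call so the example runs without a model.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:           # throwaway location for the demo
    model_dir = Path(tmp) / "models"
    model_dir.mkdir(parents=True, exist_ok=True)     # creates the directory
    model_dir.mkdir(parents=True, exist_ok=True)     # calling again is safe with exist_ok=True

    save_path = model_dir / "saved_model_01.pth"     # '/' joins path components
    save_path.write_bytes(b"placeholder")            # stands in for torch.save(obj, f)

    suffix = save_path.suffix
    exists = save_path.exists()
    print(save_path.name, suffix, exists)            # saved_model_01.pth .pth True
```

Using exist_ok=True means the same notebook cell can be re-run without failing once the directory already exists.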

In [20]:

from pathlib import Path

# 1. Create the models directory used for saving models
MODEL_PATH = Path('models')
MODEL_PATH.mkdir(parents=True, exist_ok=True)

# 2. Create the model save path
MODEL_NAME = '01_pytorch_workflow_model_0.pth'
MODEL_SAVE_PATH = MODEL_PATH / MODEL_NAME

# 3. Save the model state dict
print(f'Saving model to: {MODEL_SAVE_PATH}')
torch.save(obj=model_0.state_dict(), # only saves the model's learned parameters
           f=MODEL_SAVE_PATH)

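Loading a saved PyTorch model's `state_dict()`

We load it with torch.nn.Module.load_state_dict(torch.load(f)) , where f is the file path of the saved state_dict() .

Why call torch.load() inside torch.nn.Module.load_state_dict() ?

Because we saved only the model's state_dict() (a dictionary of learned parameters) and not the entire model, we first have to load the state_dict() with torch.load() and then pass that state_dict() to a new instance of our model (which is a subclass of nn.Module ).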
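
The two-step nature of loading can be sketched in memory. Here two small nn.Linear layers stand in for a trained model and a fresh instance, and an io.BytesIO buffer stands in for the .pth file; these substitutions are assumptions made so the example does not touch the disk.

```python
import io

import torch
from torch import nn

torch.manual_seed(0)
source = nn.Linear(in_features=1, out_features=1)  # stands in for a trained model
target = nn.Linear(in_features=1, out_features=1)  # fresh instance with its own random values

buffer = io.BytesIO()                               # in-memory file instead of a .pth path
torch.save(source.state_dict(), buffer)
buffer.seek(0)

loaded_state = torch.load(buffer)                   # step 1: bytes -> dictionary of tensors
result = target.load_state_dict(loaded_state)       # step 2: dictionary -> model parameters

same = all(
    torch.equal(source.state_dict()[name], target.state_dict()[name])
    for name in loaded_state
)
print(sorted(loaded_state.keys()))
print(result)
print(same)
```

torch.load only gives back the dictionary; it is load_state_dict that copies the values into the parameters of an existing model, which is why a model instance of the right class has to be created first.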

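Why not save the entire model?

Saving the entire model^[24] rather than just the state_dict() is more intuitive. However, to quote the PyTorch documentation (italics mine):

The disadvantage of this approach (saving the whole model) is that the serialized data is bound to the specific classes and the exact directory structure used when the model is saved......
Because of this, your code can break in various ways when used in other projects or after refactors.

So instead we use the more flexible approach of saving and loading just the state_dict() , which again is basically a dictionary of model parameters.

Let's test it by creating another instance of LinearRegressionModel() , which is a subclass of torch.nn.Module and therefore has the built-in method load_state_dict() .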

In [22]:

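# Instantiate a new model (this also creates fresh random weights and bias)
loaded_model_0 = LinearRegressionModel()

# Load the saved state_dict into it (this overwrites the random values)
loaded_model_0.load_state_dict(torch.load(f=MODEL_SAVE_PATH))

Now we make predictions with the loaded model and check whether they match the predictions made earlier.

In [23]:

# 1. Put the loaded model into evaluation mode
loaded_model_0.eval()

# 2. Make predictions using the inference mode context manager
with torch.inference_mode():
    loaded_model_preds = loaded_model_0(X_test) 
    
# Compare with the previous model's predictions
y_preds == loaded_model_preds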

Out[24]:

tensor([[True],
        [True],
        [True],
        [True],
        [True],
        [True],
        [True],
        [True],
        [True],
        [True]])

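It looks like the loaded model's predictions are the same as the previous model's predictions (the ones made before saving).

Note: there are more ways to save and load PyTorch models, but I'll leave those for extra-curricular and further reading. For more information, see the PyTorch saving and loading models guide^[25].

6. Putting it all together

Import the libraries and set up the device variable.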

# Import PyTorch and matplotlib
import torch
from torch import nn # nn contains all of PyTorch's building blocks for neural networks
import matplotlib.pyplot as plt

# Check PyTorch version
print(torch.__version__)

# Setup device agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
print(f'Using device: {device}')

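6.1 Data

(1) Loading the data

First we hard-code some weight and bias values. Then we create a range of numbers between 0 and 1; these are our X values. Finally we use the X values together with the weight and bias values to create y with the linear regression formula ( y = weight * X + bias ).

In [27]: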

# Create weight and bias
weight = 0.7
bias = 0.3

# Create range values
start = 0
end = 1
step = 0.02

# Create X and y (features and labels)
X = torch.arange(start, end, step).unsqueeze(dim=1) # without unsqueeze, errors will happen later on (shapes within linear layers)
y = weight * X + bias 
X[:10], y[:10]

Out[27]:

(tensor([[0.0000],
         [0.0200],
         [0.0400],
         [0.0600],
         [0.0800],
         [0.1000],
         [0.1200],
         [0.1400],
         [0.1600],
         [0.1800]]),
 tensor([[0.3000],
         [0.3140],
         [0.3280],
         [0.3420],
         [0.3560],
         [0.3700],
         [0.3840],
         [0.3980],
         [0.4120],
         [0.4260]]))

(2) Splitting the dataset

# Split data
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

len(X_train), len(y_train), len(X_test), len(y_test)

(3) Visualization

# Visualize using the plot_predictions function defined earlier
plot_predictions(X_train, y_train, X_test, y_test)

6.2 Building a PyTorch linear model

We'll create the same style of model as before, except this time, instead of defining the weight and bias parameters by hand with nn.Parameter() , we'll use `nn.Linear(in_features, out_features)`^[26] to do it for us. Here in_features is the number of dimensions of the input data and out_features is the number of dimensions you'd like the output to have. In our case both are 1 , because our data has 1 input feature ( X ) per label ( y ).

Figure: creating a linear regression model with nn.Parameter versus with nn.Linear . There are plenty more examples of torch.nn modules with pre-built computations, including many popular and useful neural network layers.
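
A minimal sketch of that equivalence: for one input feature and one output feature, calling an nn.Linear layer gives the same result as multiplying by its weight and adding its bias by hand. The layer and the four sample inputs here are made up for illustration and are separate from the model defined in this section.

```python
import torch
from torch import nn

torch.manual_seed(42)
layer = nn.Linear(in_features=1, out_features=1)

x = torch.arange(0, 1, 0.25).unsqueeze(dim=1)   # shape [4, 1]: 4 samples, 1 feature each

with torch.inference_mode():
    from_layer = layer(x)
    by_hand = x * layer.weight + layer.bias      # weight has shape [1, 1], bias has shape [1]

print(from_layer.shape)                          # torch.Size([4, 1])
print(torch.allclose(from_layer, by_hand))       # True
```

The layer stores its parameters under the names weight and bias, which is why the state_dict() of a model built this way uses keys such as linear_layer.weight rather than the weights attribute defined by hand earlier.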

In [30]:

# Build the model by subclassing nn.Module
class LinearRegressionModelV2(nn.Module):
    def __init__(self):
        super().__init__()
        # Use nn.Linear() to create the model parameters
        self.linear_layer = nn.Linear(in_features=1, 
                                      out_features=1)
    
    # Define the forward computation
    def forward(self, x: torch.Tensor) -> torch.Tensor: 
        return self.linear_layer(x)

# Optional: set the manual seed for reproducible random values
torch.manual_seed(42)
model_1 = LinearRegressionModelV2()
model_1, model_1.state_dict()

Out[30]:

(LinearRegressionModelV2(
   (linear_layer): Linear(in_features=1, out_features=1, bias=True)
 ),
 OrderedDict([('linear_layer.weight', tensor([[0.7645]])),
              ('linear_layer.bias', tensor([0.8300]))]))

Notice the output of model_1.state_dict() : the nn.Linear() layer created a random weight and bias parameter for us. Now let's use .to(device) to put the model on the GPU, if one is available.

In [31]:

# Check model device
print(next(model_1.parameters()).device)

# Move the model to the device we defined earlier
model_1.to(device) 
print(next(model_1.parameters()).device)

>>>
cpu
cuda:0

6.3 Training

# Create loss function
loss_fn = nn.L1Loss()

# Create optimizer
optimizer = torch.optim.SGD(params=model_1.parameters(), # optimize newly created model's parameters
                            lr=0.01)

torch.manual_seed(42)

# Set the number of epochs 
epochs = 1000 

# Put data on the available device
# Without this, error will happen (not all model/data on device)
X_train = X_train.to(device)
X_test = X_test.to(device)
y_train = y_train.to(device)
y_test = y_test.to(device)

for epoch in range(epochs):
    ### Training
    model_1.train() # train mode is on by default after construction

    # 1. Forward pass
    y_pred = model_1(X_train)

    # 2. Calculate loss
    loss = loss_fn(y_pred, y_train)

    # 3. Zero grad optimizer
    optimizer.zero_grad()

    # 4. Loss backward
    loss.backward()

    # 5. Step the optimizer
    optimizer.step()

    ### Testing
    model_1.eval() # put the model in evaluation mode for testing (inference)
    # 1. Forward pass
    with torch.inference_mode():
        test_pred = model_1(X_test)
    
        # 2. Calculate the loss
        test_loss = loss_fn(test_pred, y_test)

    if epoch % 100 == 0:
        print(f'Epoch: {epoch} | Train loss: {loss} | Test loss: {test_loss}')

OUT：

Epoch: 0 | Train loss: 0.5551779866218567 | Test loss: 0.5739762187004089
Epoch: 100 | Train loss: 0.006215683650225401 | Test loss: 0.014086711220443249
Epoch: 200 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 300 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 400 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 500 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 600 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 700 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 800 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882
Epoch: 900 | Train loss: 0.0012645035749301314 | Test loss: 0.013801801018416882

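Note: due to the random nature of machine learning, you will likely get slightly different results (different loss and prediction values) depending on whether your model was trained on CPU or GPU. This is true even if you use the same random seed on either device. If the difference is large, you may want to look for errors; if it is small (ideally it is), you can ignore it.

This loss looks pretty low.

Let's check the parameters our model has learned and compare them to the original parameters we hard-coded.

In [35]: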

# Find our model's learned parameters
from pprint import pprint # pprint = pretty print, see: https://docs./3/library/pprint.html 
print('The model learned the following values for `weights` and `bias`:')
pprint(model_1.state_dict())
print('\nAnd the original values for `weights` and `bias` are:')
print(f'weights: {weight}, bias: {bias}')

OUT:

The model learned the following values for `weights` and `bias`:
OrderedDict([('linear_layer.weight', tensor([[0.6968]], device='cuda:0')),
             ('linear_layer.bias', tensor([0.3025], device='cuda:0'))])

And the original values for `weights` and `bias` are:
weights: 0.7, bias: 0.3

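6.4 Making predictions

# 1. Set the model to evaluation mode
model_1.eval()

# 2. Use the inference mode context manager
with torch.inference_mode():
  # 3. Make sure all objects are on the same device
  # (just in case, both can be moved with .to(device))
  # model_1.to(device)
  # X_test = X_test.to(device)
  y_preds = model_1(X_test)
  y_preds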

OUT:

tensor([[0.8600],
        [0.8739],
        [0.8878],
        [0.9018],
        [0.9157],
        [0.9296],
        [0.9436],
        [0.9575],
        [0.9714],
        [0.9854]], device='cuda:0')

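Now let's plot the model's predictions.

Note: many data science libraries, such as pandas, matplotlib and NumPy, cannot use data stored on the GPU. You may therefore run into problems when calling functions from these libraries on tensors that are not stored on the CPU. To fix this, call .cpu() on the target tensor to return a copy of it on the CPU.

In [37]: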

# plot_predictions(predictions=y_preds) # -> won't work... data not on CPU

# Put data on the CPU and plot it
plot_predictions(predictions=y_preds.cpu())

6.5 Saving the model

from pathlib import Path

# 1. Create models directory 
MODEL_PATH = Path('models')
MODEL_PATH.mkdir(parents=True, exist_ok=True)

# 2. Create model save path 
MODEL_NAME = '01_pytorch_workflow_model_1.pth'
MODEL_SAVE_PATH = MODEL_PATH / MODEL_NAME

# 3. Save the model state dict 
print(f'Saving model to: {MODEL_SAVE_PATH}')
torch.save(obj=model_1.state_dict(), # saving only the state_dict() saves just the model's learned parameters
           f=MODEL_SAVE_PATH)

7. Further resources

Highly recommended: three videos on deep learning by 3Blue1Brown^[27], which show in an accessible way what a neural network is actually doing:

"[Official bilingual] Deep Learning: The Structure of Neural Networks, Part 1 ver 2.0" https://www.bilibili.com/video/BV1bx411M7Zx/?share_source=copy_web&vd_source=bbeafbcfe326916409d46b815d8cb3a3
"[Official bilingual] Deep Learning: Gradient Descent, Part 2 ver 0.9 beta" https://www.bilibili.com/video/BV1Ux411j7ri/?share_source=copy_web&vd_source=bbeafbcfe326916409d46b815d8cb3a3
"[Official bilingual] Deep Learning: Backpropagation, Part 3 ver 0.9 beta" https://www.bilibili.com/video/BV16x411V7Qg/?share_source=copy_web&vd_source=bbeafbcfe326916409d46b815d8cb3a3

A book that explains deep learning from the underlying mathematics: Dive into Deep Learning (2nd edition)^[28]; section 3.1, Linear Regression^[29], pairs well with this chapter.

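8. Acknowledgements

Thanks to the original author, Daniel Bourke. Visit https://www./^[30] to read the original English text, and see the author's GitHub repository, https://github.com/mrdbourke/pytorch-deep-learning/^[31], for help and further information.

This article is likewise released under the MIT license^[32], without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell. The license notice of the original author must be retained: renhai-lab: https://cdn./.

If you find this series useful, you're welcome to follow the blog, like and bookmark it, and join the discussion in the comments:

My blog^[33]
My GitHub^[34]
My Gitee^[35]
WeChat official account: renhai-lab
My Zhihu^[36]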

References

[1] PyTorch Workflow Fundamentals: https://www./01_pytorch_workflow/
[2] "Deep Learning with PyTorch" course introduction: https://cdn./archives/DL-Home
[3] My blog: https://cdn./categories/deep-learning
[4] Read the original: https://cdn./archives/DL-02-pytorch-workflow
[5] Guide to object-oriented programming in Python 3: https://www.runoob.com/python3/python3-class.html
[6] torch.nn: https://pytorch.org/docs/stable/nn.html
[7] torch.optim: https://pytorch.org/docs/stable/optim.html
[8] torch.utils.data.Dataset: https://pytorch.org/docs/stable/data.html#torch.utils.data.Dataset
[9] torch.utils.data.DataLoader: https://pytorch.org/docs/stable/data.html
[10] PyTorch Cheat Sheet: https://pytorch.org/tutorials/beginner/ptcheat.html
[11] .state_dict(): https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.state_dict
[12] torch.inference_mode(): https://pytorch.org/docs/stable/generated/torch.inference_mode.html
[13] torch.nn: https://pytorch.org/docs/stable/nn.html#loss-functions
[14] torch.nn.L1Loss(): https://pytorch.org/docs/stable/generated/torch.nn.L1Loss.html
[15] torch.nn.BCELoss(): https://pytorch.org/docs/stable/generated/torch.nn.BCELoss.html
[16] torch.optim: https://pytorch.org/docs/stable/optim.html
[17] torch.optim.SGD(): https://pytorch.org/docs/stable/generated/torch.optim.SGD.html#torch.optim.SGD
[18] "[Official bilingual] Deep Learning: Gradient Descent, Part 2": https://www.bilibili.com/video/BV1Ux411j7ri/?share_source=copy_web&vd_source=bbeafbcfe326916409d46b815d8cb3a3
[19] Learning rate scheduling: https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
[20] "[Official bilingual] Deep Learning: Backpropagation, Part 3": https://www.bilibili.com/video/BV16x411V7Qg/?share_source=copy_web&vd_source=bbeafbcfe326916409d46b815d8cb3a3
[21] Three main methods: https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-for-inference
[22] Python's pickle documentation: https://docs./3/library/pickle.html
[23] Recommended way: https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-loading-model-for-inference
[24] Saving the entire model: https://pytorch.org/tutorials/beginner/saving_loading_models.html#save-load-entire-model
[25] PyTorch saving and loading models guide: https://pytorch.org/tutorials/beginner/saving_loading_models.html#saving-and-loading-models
[26] nn.Linear(in_features, out_features): https://pytorch.org/docs/stable/generated/torch.nn.Linear.html
[27] 3Blue1Brown: https://space.bilibili.com/88461692
[28] Dive into Deep Learning (2nd edition): http://zh./
[29] 3.1 Linear Regression: http://zh./chapter_linear-networks/linear-regression.html#id2
[30] https://www./: https://www./
[31] https://github.com/mrdbourke/pytorch-deep-learning/: https://github.com/mrdbourke/pytorch-deep-learning/
[32] MIT license: https://github.com/renhai-lab/pytorch-deep-learning/blob/cb770bbe688f5950421a76c8b3a47aaa00809c8c/LICENSE
[33] My blog: https://cdn./
[34] My GitHub: https://github.com/renhai-lab
[35] My Gitee: https:///renhai-lab
[36] My Zhihu: https://www.zhihu.com/people/Ing_ideas
