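Loading a TorchScript Model in C++

Created On: Apr 01, 2025 | Last Updated: Apr 01, 2025 | Last Verified: Nov 05, 2024

Warning

TorchScript is no longer in active development.

As its name suggests, the primary interface to PyTorch is the Python programming language. While Python is a suitable and preferred language for many scenarios requiring dynamism and ease of iteration, there are equally many situations where precisely these properties of Python are unfavorable. One environment in which the latter often applies is production, the land of low latencies and strict deployment requirements. For production scenarios, C++ is very often the language of choice, even if only to bind it into another language like Java, Rust or Go. The following paragraphs outline the path PyTorch provides to go from an existing Python model to a serialized representation that can be loaded and executed purely from C++, with no dependency on Python.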

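Step 1: Converting Your PyTorch Model to Torch Script

A PyTorch model's journey from Python to C++ is enabled by Torch Script, a representation of a PyTorch model that can be understood, compiled and serialized by the Torch Script compiler. If you are starting out from an existing PyTorch model written in the vanilla "eager" API, you must first convert your model to Torch Script. In the most common cases, discussed below, this requires only little effort. If you already have a Torch Script module, you can skip to the next section of this tutorial.

There are two ways of converting a PyTorch model to Torch Script. The first is known as tracing, a mechanism in which the structure of the model is captured by evaluating it once using example inputs and recording the flow of those inputs through the model. This is suitable for models that make limited use of control flow. The second approach is to add explicit annotations to your model that inform the Torch Script compiler that it may directly parse and compile your model code, subject to the constraints imposed by the Torch Script language.

Tip

You can find the complete documentation for both of these methods, as well as further guidance on which to use, in the official Torch Script reference.

Converting to Torch Script via Tracing

To convert a PyTorch model to Torch Script via tracing, you must pass an instance of your model along with an example input to the torch.jit.trace function. This will produce a torch.jit.ScriptModule object with the trace of your model evaluation embedded in the module's forward method: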

import torch
import torchvision

# An instance of your model.
model = torchvision.models.resnet18()

# An example input you would normally provide to your model's forward() method.
example = torch.rand(1, 3, 224, 224)

# Use torch.jit.trace to generate a torch.jit.ScriptModule via tracing.
traced_script_module = torch.jit.trace(model, example)

The traced ScriptModule can now be evaluated identically to a regular PyTorch module:

In[1]: output = traced_script_module(torch.ones(1, 3, 224, 224))
In[2]: output[0, :5]
Out[2]: tensor([-0.2698, -0.0381,  0.4023, -0.3010, -0.0448], grad_fn=<SliceBackward>)
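
As a rough check that tracing preserved the model's behavior, one can compare the eager and traced outputs on the same input. The sketch below uses a small hypothetical stand-in model (not the ResNet18 from the tutorial) so that it runs without torchvision; the layer sizes are assumptions chosen only for illustration.

```python
import torch

# Hypothetical stand-in model; the tutorial itself traces torchvision's resnet18.
model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 4, kernel_size=3),  # (1, 3, 8, 8) -> (1, 4, 6, 6)
    torch.nn.ReLU(),
    torch.nn.Flatten(),                    # -> (1, 144)
    torch.nn.Linear(4 * 6 * 6, 5),         # -> (1, 5)
)
model.eval()

# Trace with an example input of the shape the model will see later.
example = torch.rand(1, 3, 8, 8)
traced = torch.jit.trace(model, example)

# Evaluate both versions on a fresh input of the same shape.
test_input = torch.ones(1, 3, 8, 8)
with torch.no_grad():
    eager_out = model(test_input)
    traced_out = traced(test_input)

outputs_match = torch.allclose(eager_out, traced_out)
print("traced output matches eager output:", outputs_match)
```

If the two outputs differ for a model without randomness, that usually suggests the trace did not capture some input-dependent behavior.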

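Converting to Torch Script via Annotation

Under certain circumstances, such as if your model employs particular forms of control flow, you may want to write your model in Torch Script directly and annotate your model accordingly. For example, say you have the following vanilla PyTorch model: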

import torch

class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output

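Because the forward method of this module uses control flow that depends on the input, it is not suitable for tracing. Instead, we can convert it to a ScriptModule. To do so, compile the module with torch.jit.script as follows: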

class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super(MyModule, self).__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output

my_module = MyModule(10,20)
sm = torch.jit.script(my_module)

If you need to exclude some methods of your nn.Module because they use features that TorchScript does not support yet, you can annotate them with @torch.jit.ignore.
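
A minimal sketch of @torch.jit.ignore follows. The module LoggingModule, its record_shape method, and the SEEN_SHAPES list are hypothetical names introduced only for this illustration. The ignored method is left as a call into the Python interpreter, so it may use plain Python state that the compiler would not accept.

```python
import torch

# Plain Python state, touched only by the ignored method (hypothetical helper).
SEEN_SHAPES = []


class LoggingModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(3, 2)

    @torch.jit.ignore
    def record_shape(self, x: torch.Tensor) -> None:
        # Not compiled: this body runs in the Python interpreter.
        SEEN_SHAPES.append(tuple(x.shape))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        self.record_shape(x)
        return self.linear(x)


scripted_logger = torch.jit.script(LoggingModule())
result = scripted_logger(torch.ones(4, 3))
print(SEEN_SHAPES, tuple(result.shape))
```

Note that, according to the TorchScript documentation, a module that calls an ignored method cannot be exported with save, since the call depends on Python. For a model headed to C++, @torch.jit.unused is the documented alternative for code that is never reached at inference time.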

sm is an instance of ScriptModule that is ready for serialization.
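
To illustrate why the module above calls for scripting rather than tracing, the following sketch compiles the same MyModule both ways and feeds each an input that takes the else branch. The variable names positive, negative, scripted and traced are assumptions made for this example.

```python
import warnings

import torch


class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output


module = MyModule(10, 20)
positive = torch.ones(20)    # sum > 0, takes the mv branch
negative = -torch.ones(20)   # sum < 0, takes the addition branch

# Scripting compiles the source, so both branches are kept.
scripted = torch.jit.script(module)

# Tracing records only the branch taken for the example input.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the tracer's control-flow warning
    traced = torch.jit.trace(module, positive)

print(scripted(positive).shape)  # matrix-vector product: shape (10,)
print(scripted(negative).shape)  # broadcast addition: shape (10, 20)
print(traced(negative).shape)    # still the mv branch: shape (10,)
```

The traced module silently keeps the branch it saw during tracing, which is the failure mode scripting avoids.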

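Step 2: Serializing Your Script Module to a File

Once you have a ScriptModule in your hands, either from tracing or annotating a PyTorch model, you are ready to serialize it to a file. Later on, you will be able to load the module from this file in C++ and execute it without any dependency on Python. Say we want to serialize the ResNet18 model shown earlier in the tracing example. To perform this serialization, simply call save on the module and pass it a filename: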

traced_script_module.save("traced_resnet_model.pt")

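This will produce a traced_resnet_model.pt file in your working directory. If you also would like to serialize sm, call sm.save("my_module_model.pt"). We have now officially left the realm of Python and are ready to cross over to the sphere of C++.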
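
Before moving to C++, it can be useful to confirm in Python that the saved file loads back and behaves the same. The sketch below does this with torch.jit.load on a small hypothetical stand-in model and a temporary directory; the file name traced_linear_model.pt is an assumption for illustration only.

```python
import os
import tempfile

import torch

# Hypothetical stand-in for the traced ResNet18 from the tutorial.
model = torch.nn.Linear(4, 2)
model.eval()
traced = torch.jit.trace(model, torch.rand(1, 4))

sample = torch.ones(1, 4)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "traced_linear_model.pt")

    # Serialize the ScriptModule, then read it back from disk.
    traced.save(path)
    file_exists = os.path.isfile(path)
    reloaded = torch.jit.load(path)

    # The reloaded module should reproduce the original outputs.
    round_trip_ok = torch.allclose(traced(sample), reloaded(sample))

print("file written:", file_exists, "| outputs match:", round_trip_ok)
```

The same file is what torch::jit::load() consumes on the C++ side.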

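Step 3: Loading Your Script Module in C++

To load your serialized PyTorch model in C++, your application must depend on the PyTorch C++ API, also known as LibTorch. The LibTorch distribution encompasses a collection of shared libraries, header files and CMake build configuration files. While CMake is not a requirement for depending on LibTorch, it is the recommended approach and will be well supported into the future. For this tutorial, we will build a minimal C++ application using CMake and LibTorch that simply loads and executes a serialized PyTorch model.

A Minimal C++ Application

Let's begin by discussing the code to load a module. The following will already do: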

#include <torch/script.h> // One-stop header.

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
  if (argc != 2) {
    std::cerr << "usage: example-app <path-to-exported-script-module>\n";
    return -1;
  }


  torch::jit::script::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load(argv[1]);
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::cout << "ok\n";
}

The <torch/script.h> header includes all relevant includes from the LibTorch library necessary to run the example. Our application accepts the file path to a serialized PyTorch ScriptModule as its only command line argument and then proceeds to deserialize the module using the torch::jit::load() function, which takes this file path as input. In return we receive a torch::jit::script::Module object. We will examine how to execute it in a moment.

Depending on LibTorch and Building the Application

Assume we stored the above code in a file called example-app.cpp. A minimal CMakeLists.txt to build it could look as simple as:

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(custom_ops)

find_package(Torch REQUIRED)

add_executable(example-app example-app.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}")
set_property(TARGET example-app PROPERTY CXX_STANDARD 17)

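The last thing we need to build the example application is the LibTorch distribution. You can always grab the latest stable release from the download page on the PyTorch website. If you download and unzip the latest archive, you should receive a folder with the following directory structure: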

libtorch/
  bin/
  include/
  lib/
  share/

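The lib/ folder contains the shared libraries you must link against.
The include/ folder contains the header files your program will need to include.
The share/ folder contains the CMake configuration needed to enable the simple find_package(Torch) command above.

Tip

On Windows, debug and release builds are not ABI-compatible. If you plan to build your project in debug mode, please use the debug version of LibTorch. Also, make sure you specify the correct configuration in the cmake --build . line below.

The last step is building the application. For this, assume our example directory is laid out like this: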

example-app/
  CMakeLists.txt
  example-app.cpp

We can now run the following commands to build the application from within the example-app/ folder:

mkdir build
cd build
cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
cmake --build . --config Release

where /path/to/libtorch should be the full path to the unzipped LibTorch distribution. If all goes well, it will look something like this:

root@4b5a67132e81:/example-app# mkdir build
root@4b5a67132e81:/example-app# cd build
root@4b5a67132e81:/example-app/build# cmake -DCMAKE_PREFIX_PATH=/path/to/libtorch ..
-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /example-app/build
root@4b5a67132e81:/example-app/build# make
Scanning dependencies of target example-app
[ 50%] Building CXX object CMakeFiles/example-app.dir/example-app.cpp.o
[100%] Linking CXX executable example-app
[100%] Built target example-app

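If we supply the path to the traced ResNet18 model traced_resnet_model.pt we created earlier to the resulting example-app binary, we should be rewarded with a friendly "ok". Please note that if you run my_module_model.pt through this example with the 4D input used in Step 4, you will get an error saying that your input is of an incompatible shape: my_module_model.pt expects a 1D input instead of a 4D one.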

root@4b5a67132e81:/example-app/build# ./example-app <path_to_model>/traced_resnet_model.pt
ok
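
The shape mismatch mentioned above for my_module_model.pt can be reproduced in Python before any C++ is involved. The sketch below re-declares MyModule from Step 1 so that it is self-contained, then calls the scripted module with a 4D input and with the 1D input it expects; the variable names are assumptions for this example.

```python
import torch


class MyModule(torch.nn.Module):
    def __init__(self, N, M):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.rand(N, M))

    def forward(self, input):
        if input.sum() > 0:
            output = self.weight.mv(input)
        else:
            output = self.weight + input
        return output


sm = torch.jit.script(MyModule(10, 20))

# A 4D image-shaped input reaches mv(), which requires a 1D vector.
try:
    sm(torch.ones(1, 3, 224, 224))
    raised = False
except RuntimeError as err:
    raised = True
    print("4D input rejected:", type(err).__name__)

# A 1D vector whose length matches the weight's second dimension works.
good_output = sm(torch.ones(20))
print("1D input accepted, output shape:", tuple(good_output.shape))
```

In the C++ application the same failure would surface as an exception thrown from module.forward(), so a model other than the traced ResNet18 needs an input of matching shape.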

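Step 4: Executing the Script Module in C++

Having successfully loaded our serialized ResNet18 in C++, we are now just a couple of lines of code away from executing it! Let's add those lines to our C++ application's main() function: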

// Create a vector of inputs.
std::vector<torch::jit::IValue> inputs;
inputs.push_back(torch::ones({1, 3, 224, 224}));

// Execute the model and turn its output into a tensor.
at::Tensor output = module.forward(inputs).toTensor();
std::cout << output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) << '\n';

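The first two lines set up the inputs to our model. We create a vector of torch::jit::IValue (a type-erased value type that script::Module methods accept and return) and add a single input. To create the input tensor, we use torch::ones(), the C++ API equivalent of torch.ones. We then run the script::Module's forward method, passing it the input vector we created. In return we get a new IValue, which we convert to a tensor by calling toTensor().

Tip

To learn more about functions like torch::ones and the PyTorch C++ API in general, refer to its documentation at https://maskerprc.github.io/cppdocs. The PyTorch C++ API provides near feature parity with the Python API, allowing you to further manipulate and process tensors just like in Python.

In the last line, we print the first five entries of the output. Since we supplied the same input to our model in Python earlier in this tutorial, we should ideally see the same output. Let's try it out by re-compiling our application and running it with the same serialized model: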

root@4b5a67132e81:/example-app/build# make
Scanning dependencies of target example-app
[ 50%] Building CXX object CMakeFiles/example-app.dir/example-app.cpp.o
[100%] Linking CXX executable example-app
[100%] Built target example-app
root@4b5a67132e81:/example-app/build# ./example-app traced_resnet_model.pt
-0.2698 -0.0381  0.4023 -0.3010 -0.0448
[ Variable[CPUFloatType]{1,5} ]

For reference, the output in Python previously was:

tensor([-0.2698, -0.0381,  0.4023, -0.3010, -0.0448], grad_fn=<SliceBackward>)

Looks like a good match!
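
One detail in this comparison: the C++ printout reports a {1,5} tensor while the Python output is one-dimensional. That follows from how each slice was taken. The sketch below uses a hypothetical stand-in tensor for the model output and shows the Python indexing that corresponds to output.slice(/*dim=*/1, /*start=*/0, /*end=*/5) next to the output[0, :5] indexing used earlier.

```python
import torch

# Stand-in for a model output of shape (batch, num_classes); values are arbitrary.
output = torch.arange(12.0).reshape(1, 12)

# Python counterpart of the C++ call output.slice(1, 0, 5):
# slicing along dim 1 keeps the batch dimension.
cpp_style = output[:, 0:5]

# The indexing used in the Python session earlier drops the batch dimension.
python_style = output[0, :5]

print(tuple(cpp_style.shape), tuple(python_style.shape))
print(torch.equal(cpp_style[0], python_style))
```

Both forms select the same five values, so comparing the printed numbers is still valid.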

Tip

To move your model to GPU memory, you can write model.to(at::kCUDA);. Make sure the inputs to the model also live in CUDA memory by calling tensor.to(at::kCUDA), which returns a new tensor in CUDA memory.
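
The same device handling can be sketched on the Python side, which may help when preparing or testing a model before export. The stand-in model and variable names below are assumptions. The code falls back to the CPU when CUDA is unavailable, so it runs on any machine.

```python
import torch

# Pick CUDA when present; otherwise stay on the CPU so the sketch still runs.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model, compiled to a ScriptModule.
scripted = torch.jit.script(torch.nn.Linear(4, 2).eval())

# Module.to() moves the module's parameters to the target device.
scripted.to(device)

# Tensor.to() returns a tensor on the target device; `inputs` itself stays on the CPU.
inputs = torch.ones(1, 4)
inputs_on_device = inputs.to(device)

output = scripted(inputs_on_device)
print("model output lives on:", output.device.type)
```

Passing a CPU tensor to a module whose parameters sit on the GPU raises a device mismatch error, which is why both moves are needed.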

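Step 5: Getting Help and Exploring the API

This tutorial has hopefully equipped you with a general understanding of a PyTorch model's path from Python to C++. With the concepts described in this tutorial, you should be able to go from a vanilla, "eager" PyTorch model, to a compiled ScriptModule in Python, to a serialized file on disk and, to close the loop, to an executable script::Module in C++.

Of course, there are many concepts we did not cover. For example, you may find yourself wanting to extend your ScriptModule with a custom operator implemented in C++ or CUDA, and to execute this custom operator inside your ScriptModule loaded in your pure C++ production environment. The good news is: this is possible, and well supported! For now, you can explore this folder for examples, and we will follow up with a tutorial shortly. In the meantime, the following links may be generally helpful:

The Torch Script reference: https://maskerprc.github.io/docs/master/jit.html
The PyTorch C++ API documentation: https://maskerprc.github.io/cppdocs/
The PyTorch Python API documentation: https://maskerprc.github.io/docs/

As always, if you run into any problems or have questions, you can use our forum or GitHub issues to get in touch.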