Windows FAQ

Building from source

Include optional components

Windows PyTorch supports two optional components: MKL and MAGMA. The steps below build PyTorch with both.

REM Make sure you have 7z and curl installed.

REM Download MKL files
curl https://s3.amazonaws.com/ossci-windows/mkl_2020.2.254.7z -k -O
7z x -aoa mkl_2020.2.254.7z -omkl

REM Download MAGMA files
REM version available:
REM 2.5.4 (CUDA 10.1 10.2 11.0 11.1) x (Debug Release)
REM 2.5.3 (CUDA 10.1 10.2 11.0) x (Debug Release)
REM 2.5.2 (CUDA 9.2 10.0 10.1 10.2) x (Debug Release)
REM 2.5.1 (CUDA 9.2 10.0 10.1 10.2) x (Debug Release)
set CUDA_PREFIX=cuda102
set CONFIG=release
curl -k https://s3.amazonaws.com/ossci-windows/magma_2.5.4_%CUDA_PREFIX%_%CONFIG%.7z -o magma.7z
7z x -aoa magma.7z -omagma

REM Setting essential environment variables
set "CMAKE_INCLUDE_PATH=%cd%\mkl\include"
set "LIB=%cd%\mkl\lib;%LIB%"
set "MAGMA_HOME=%cd%\magma"
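The download URL in the batch commands above follows a fixed naming pattern, so it can also be generated in a script. The following is a minimal Python sketch, assuming that the pattern `magma_<version>_<CUDA_PREFIX>_<CONFIG>.7z` holds for every version listed in the comments and that the CUDA prefix is always `cuda` followed by the version digits, as in `cuda102`. The helper `magma_url` and the `AVAILABLE` table are illustrative names, not part of PyTorch.

```python
# Versions and CUDA releases copied from the REM comments above.
AVAILABLE = {
    "2.5.4": ("10.1", "10.2", "11.0", "11.1"),
    "2.5.3": ("10.1", "10.2", "11.0"),
    "2.5.2": ("9.2", "10.0", "10.1", "10.2"),
    "2.5.1": ("9.2", "10.0", "10.1", "10.2"),
}
BASE_URL = "https://s3.amazonaws.com/ossci-windows"


def magma_url(magma_version, cuda_version, config="release"):
    """Build the MAGMA archive URL (hypothetical helper)."""
    if cuda_version not in AVAILABLE.get(magma_version, ()):
        raise ValueError(
            f"MAGMA {magma_version} is not listed for CUDA {cuda_version}"
        )
    if config not in ("debug", "release"):
        raise ValueError("config must be 'debug' or 'release'")
    # Assumption: "10.2" maps to the prefix "cuda102", as in the batch example.
    cuda_prefix = "cuda" + cuda_version.replace(".", "")
    return f"{BASE_URL}/magma_{magma_version}_{cuda_prefix}_{config}.7z"


print(magma_url("2.5.4", "10.2"))
```

Rejecting combinations that are not in the table gives an early, readable error instead of a failed download.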

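Speeding up CUDA builds on Windows

Visual Studio does not currently support parallel custom tasks. As an alternative, we can use Ninja to parallelize CUDA build tasks. It takes only a few lines to set up.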

REM Let's install ninja first.
pip install ninja

REM Set it as the cmake generator
set CMAKE_GENERATOR=Ninja
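If the build is driven from a Python script rather than a command prompt, the same setting can be applied to the environment passed to the build process. This is a minimal sketch; the helper name `select_cmake_generator` is hypothetical, and it only sets the variable when a `ninja` executable is actually found on `PATH`.

```python
import os
import shutil


def select_cmake_generator(env, which=shutil.which):
    """Return a copy of env with CMAKE_GENERATOR=Ninja if ninja is on PATH."""
    new_env = dict(env)
    if which("ninja") is not None:
        new_env["CMAKE_GENERATOR"] = "Ninja"
    return new_env


# Build the environment to hand to the build subprocess.
build_env = select_cmake_generator(os.environ)
print(build_env.get("CMAKE_GENERATOR", "<default generator>"))
```

The `which` parameter exists only so the lookup can be replaced in tests; normal callers can leave it at its default.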

One key install script

You can take a look at this set of scripts. It will point you in the right direction.

Extension

CFFI Extension

Support for CFFI extensions is highly experimental. To build on Windows, you must specify additional libraries in the Extension object.

ffi = create_extension(
    '_ext.my_lib',
    headers=headers,
    sources=sources,
    define_macros=defines,
    relative_to=__file__,
    with_cuda=with_cuda,
    extra_compile_args=["-std=c99"],
    libraries=['ATen', '_C'] # Append cuda libraries when necessary, like cudart
)

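Cpp Extension

This type of extension has better support than the previous one. However, it still needs some manual configuration. First, open the x86_x64 Cross Tools Command Prompt for VS 2017. Then you can start the compilation process.

Installation

Package not found in win-32 channel.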

Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

- pytorch

Current channels:
- https://conda.anaconda.org/pytorch/win-32
- https://conda.anaconda.org/pytorch/noarch
- https://repo.continuum.io/pkgs/main/win-32
- https://repo.continuum.io/pkgs/main/noarch
- https://repo.continuum.io/pkgs/free/win-32
- https://repo.continuum.io/pkgs/free/noarch
- https://repo.continuum.io/pkgs/r/win-32
- https://repo.continuum.io/pkgs/r/noarch
- https://repo.continuum.io/pkgs/pro/win-32
- https://repo.continuum.io/pkgs/pro/noarch
- https://repo.continuum.io/pkgs/msys2/win-32
- https://repo.continuum.io/pkgs/msys2/noarch

PyTorch does not work on 32-bit systems. Please use 64-bit versions of both Windows and Python.

Import error
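A `win-32` channel in the error output usually means the Python interpreter itself is a 32-bit build. The sketch below is one way to check this using only the standard library; the helper name `pointer_bits` is illustrative. Note that it reports on the interpreter, not on the Windows installation.

```python
import struct
import sys


def pointer_bits():
    """Size of a C pointer, in bits, for the running interpreter."""
    return struct.calcsize("P") * 8


bits = pointer_bits()
print(f"Python {sys.version_info.major}.{sys.version_info.minor} is a {bits}-bit build")
if bits != 64:
    print("This interpreter is not 64-bit; install a 64-bit Python.")
```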

from torch._C import *

ImportError: DLL load failed: The specified module could not be found.

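This problem is caused by missing essential files. The conda package already includes almost all the files PyTorch needs, except the VC2017 redistributable and some MKL libraries. You can resolve this by running the following commands.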

conda install -c peterjc123 vc vs2017_runtime
conda install mkl_fft intel_openmp numpy mkl

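The wheel package does not bundle some libraries or the VS2017 redistributable files, so make sure you install them manually. The VS 2017 redistributable installer can be downloaded. You should also pay attention to your NumPy installation: make sure it uses MKL instead of OpenBLAS. You may need to run the following command.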

pip install numpy mkl intel-openmp mkl_fft
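To see which of these packages are already present before reinstalling, you can query the installed distribution metadata. This is a minimal sketch using the standard library `importlib.metadata` module (Python 3.8 or later); the helper name `missing_distributions` is hypothetical. It only reports whether a distribution is installed in the current environment. It does not tell you whether NumPy is linked against MKL or OpenBLAS.

```python
from importlib import metadata


def missing_distributions(names):
    """Return the names for which no installed distribution metadata is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Distribution names taken from the pip command above.
required = ["numpy", "mkl", "intel-openmp", "mkl_fft"]
print("Missing:", missing_distributions(required) or "none")
```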

Another possible cause is that you are using the GPU version without an NVIDIA graphics card. In that case, replace your GPU package with the CPU one.

from torch._C import *

ImportError: DLL load failed: The operating system cannot run %1.

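This is actually an upstream issue in Anaconda. It appears when you initialize your environment with the conda-forge channel. You can fix the intel-openmp libraries with the following command.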

conda install -c defaults intel-openmp -f

Usage (multiprocessing)

Multiprocessing error without if-clause protection

RuntimeError:
       An attempt has been made to start a new process before the
       current process has finished its bootstrapping phase.

   This probably means that you are not using fork to start your
   child processes and you have forgotten to use the proper idiom
   in the main module:

       if __name__ == '__main__':
           freeze_support()
           ...

   The "freeze_support()" line can be omitted if the program
   is not going to be frozen to produce an executable.

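The implementation of multiprocessing is different on Windows, which uses spawn instead of fork. With spawn, each child process imports the main module again, so any code that starts processes must be wrapped in an if-clause to keep it from executing multiple times. Please refactor your code into the following structure.

import torch

def main():
    for i, data in enumerate(dataloader):
        # do something here
        pass

if __name__ == '__main__':
    main()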
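The same structure can be tried without PyTorch. The following is a small, self-contained sketch that uses the standard library `multiprocessing` module as a stand-in for a `DataLoader` with worker processes; the function `square` is purely illustrative. It requests the `spawn` start method explicitly, so it behaves the same way on other platforms as it does on Windows.

```python
import multiprocessing as mp


def square(x):
    # Must live at module level so spawned workers can import it.
    return x * x


def main():
    # "spawn" is the only start method on Windows; requesting it
    # explicitly reproduces the same behavior on other platforms.
    ctx = mp.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3]))


if __name__ == "__main__":
    # Without this guard, every spawned worker would re-run main()
    # while importing this module and raise the RuntimeError above.
    main()
```

Save this as a script file and run it directly; removing the guard and calling `main()` at module level should reproduce the bootstrapping error.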

Multiprocessing error "Broken pipe"

ForkingPickler(file, protocol).dump(obj)

BrokenPipeError: [Errno 32] Broken pipe

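This issue happens when the child process ends before the parent process finishes sending data. There may be something wrong with your code. To debug it, set the num_workers argument of DataLoader to zero and see whether the issue persists.

Multiprocessing error "driver shut down"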
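One way to switch to single-process loading without editing every call site is to derive a debug copy of the `DataLoader` keyword arguments. This is a sketch only: the helper `single_process_kwargs` is hypothetical, and it assumes, based on recent PyTorch versions, that `prefetch_factor` and `persistent_workers` are rejected when `num_workers` is 0.

```python
def single_process_kwargs(loader_kwargs):
    """Return a copy of DataLoader keyword arguments with workers disabled."""
    debug_kwargs = dict(loader_kwargs)
    debug_kwargs["num_workers"] = 0
    # Assumption: recent PyTorch versions raise an error if these
    # worker-only options are set while num_workers is 0.
    debug_kwargs.pop("prefetch_factor", None)
    debug_kwargs.pop("persistent_workers", None)
    return debug_kwargs


kwargs = {"batch_size": 32, "shuffle": True, "num_workers": 4, "persistent_workers": True}
print(single_process_kwargs(kwargs))
# Usage, assuming `dataset` is your own Dataset instance:
#   loader = torch.utils.data.DataLoader(dataset, **single_process_kwargs(kwargs))
```

With workers disabled, any exception in the dataset code is raised directly in the main process, which usually gives a clearer traceback than the broken pipe error.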

Couldn't open shared file mapping: <torch_14808_1591070686>, error code: <1455> at torch\lib\TH\THAllocator.c:154

[windows] driver shut down

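Please update your graphics driver. If the problem persists, your graphics card may be too old or the computation may be too heavy for it. Please update the TDR settings according to this post.

CUDA IPC operations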

THCudaCheck FAIL file=torch\csrc\generic\StorageSharing.cpp line=252 error=63 : OS call failed or operation not supported on this OS

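CUDA IPC operations are not supported on Windows. Operations such as multiprocessing on CUDA tensors cannot succeed. There are two alternatives.

1. Don't use multiprocessing. Set the num_workers argument of DataLoader to 0.

2. Share CPU tensors instead. Make sure your custom Dataset returns CPU tensors.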