torch.cuda.comm.scatter

torch.cuda.comm.scatter(tensor, devices=None, chunk_sizes=None, dim=0, streams=None, *, out=None)

Scatters a tensor across multiple GPUs.

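Parameters:

tensor (Tensor) – the tensor to scatter. Can be on CPU or GPU.
devices (Iterable[torch.device, str or int], optional) – an iterable of GPU devices among which to scatter.
chunk_sizes (Iterable[int], optional) – sizes of the chunks to be placed on each device. It should match devices in length and sum to tensor.size(dim). If not specified, tensor is divided into equal chunks.
dim (int, optional) – the dimension along which to chunk tensor. Default: 0.
streams (Iterable[torch.cuda.Stream], optional) – an iterable of Streams on which to execute the scatter. If not specified, the default stream is used.
out (Sequence[Tensor], optional, keyword-only) – the GPU tensors in which to store the output. Their sizes must match that of tensor in every dimension except dim, where their sizes must sum to tensor.size(dim).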
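
The interaction between devices and chunk_sizes can be illustrated with a minimal sketch. The helper name scatter_uneven and the tensor shape are assumptions made for illustration only. The helper returns None when no CUDA device is visible, so it is safe to run on a CPU-only machine.

```python
import torch
import torch.cuda.comm as comm


def scatter_uneven(rows_per_device=2):
    """Scatter a CPU tensor over all visible GPUs with explicit chunk sizes.

    Hypothetical helper for illustration; returns None without a CUDA device.
    """
    n = torch.cuda.device_count()
    if n == 0:
        return None
    devices = list(range(n))
    # Give the first device one extra row. The sizes must have one entry
    # per device and must sum to tensor.size(dim).
    chunk_sizes = [rows_per_device + 1] + [rows_per_device] * (n - 1)
    tensor = torch.arange(sum(chunk_sizes) * 4, dtype=torch.float32).view(-1, 4)
    chunks = comm.scatter(tensor, devices=devices, chunk_sizes=chunk_sizes, dim=0)
    return tensor, chunk_sizes, chunks


result = scatter_uneven()
if result is not None:
    _, sizes, chunks = result
    for size, chunk in zip(sizes, chunks):
        print(size, chunk.device, tuple(chunk.shape))
```

Concatenating the chunks along dim after moving them back to one device should reproduce the original tensor.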

Note

Exactly one of devices and out must be specified. When out is specified, chunk_sizes must not be specified and will be inferred from sizes of out.
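
As a hedged sketch of the out form described in this note, the snippet below pre-allocates one buffer per GPU and passes only out, leaving devices and chunk_sizes unset. The helper name scatter_into_out and the buffer shapes are illustrative assumptions. It returns None when no CUDA device is visible.

```python
import torch
import torch.cuda.comm as comm


def scatter_into_out(rows_per_device=2, cols=3):
    """Scatter into pre-allocated GPU buffers (hypothetical helper)."""
    n = torch.cuda.device_count()
    if n == 0:
        return None
    tensor = torch.arange(n * rows_per_device * cols, dtype=torch.float32)
    tensor = tensor.view(n * rows_per_device, cols)
    # One buffer per device. Sizes match tensor except along dim=0,
    # where the buffer sizes sum to tensor.size(0).
    out = [
        torch.empty(rows_per_device, cols, device=f"cuda:{i}")
        for i in range(n)
    ]
    # devices and chunk_sizes are deliberately omitted: chunk sizes are
    # inferred from the sizes of the out tensors.
    result = comm.scatter(tensor, dim=0, out=out)
    return tensor, out, result
```

This form can be convenient when the destination buffers are reused across iterations, since no new output tensors need to be allocated by the caller.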

Returns:

If devices is specified,
a tuple containing chunks of tensor, placed on devices.

If out is specified,
a tuple containing the out tensors, each containing a chunk of tensor.
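
The shape of the return value when only devices is given can be inspected with a short sketch. The helper name scatter_default and the tensor shape are assumptions for illustration. With chunk_sizes omitted, the split along dim is left to the library. The helper returns None when no CUDA device is visible.

```python
import torch
import torch.cuda.comm as comm


def scatter_default(rows_per_device=3):
    """Scatter with default chunking and return the resulting tuple.

    Hypothetical helper for illustration.
    """
    n = torch.cuda.device_count()
    if n == 0:
        return None
    tensor = torch.randn(n * rows_per_device, 5)
    # chunk_sizes is omitted, so the library chooses the chunk sizes.
    chunks = comm.scatter(tensor, devices=list(range(n)))
    return tensor, chunks


result = scatter_default()
if result is not None:
    _, chunks = result
    for i, chunk in enumerate(chunks):
        # Each element of the tuple is a tensor on the matching device.
        print(i, chunk.device, tuple(chunk.shape))
```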
