torch.compiler

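torch.compiler is a namespace through which some of the internal compiler methods are surfaced for user consumption. The main function in this namespace is torch.compile.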

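torch.compile is a function introduced in PyTorch 2.x that aims to solve the problem of accurate graph capture in PyTorch and, ultimately, to help software engineers run their PyTorch programs faster. torch.compile is written in Python, and it marks the transition of PyTorch from C++ to Python.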
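
As a minimal sketch of typical usage, the snippet below wraps a small module and a plain function with torch.compile. The module `TinyNet`, the function `scaled_tanh`, the helper `run_demo`, and the tensor shapes are illustrative assumptions and do not come from this page. Compilation is lazy: nothing is compiled until the wrapped callable is first invoked, and the default backend may need a working C++ compiler (CPU) or Triton (GPU) at that point.

```python
import torch


class TinyNet(torch.nn.Module):
    """Hypothetical example model, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(8, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))


model = TinyNet()

# Wrap an nn.Module. This returns a callable wrapper; no compilation yet.
compiled_model = torch.compile(model)


# torch.compile can also be applied to a plain function as a decorator.
@torch.compile
def scaled_tanh(x):
    return torch.tanh(x) * 2.0


def run_demo():
    """The first call of each compiled callable triggers compilation."""
    x = torch.randn(4, 8)
    return compiled_model(x), scaled_tanh(x)
```

Calling `run_demo()` once pays the compilation cost; later calls with inputs of the same shape and dtype can reuse the compiled code.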

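torch.compile leverages the following underlying technologies:

TorchDynamo (torch._dynamo) is an internal API that uses a CPython feature called the Frame Evaluation API to safely capture PyTorch graphs. Methods that are available externally for PyTorch users are surfaced through the torch.compiler namespace.
TorchInductor is the default torch.compile deep learning compiler. It generates fast code for multiple accelerators and backends. A backend compiler is required for torch.compile to deliver speedups. For NVIDIA, AMD, and Intel GPUs, TorchInductor leverages OpenAI Triton as its key building block.
AOT Autograd captures not only the user-level code but also backpropagation, so the backward pass is recorded "ahead of time". This makes it possible to accelerate both the forward and the backward pass with TorchInductor.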
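
To make the division of labor above more concrete, here is a minimal sketch of a custom backend. It relies on the fact that torch.compile accepts a callable as `backend`: TorchDynamo passes it the captured `torch.fx.GraphModule` together with example inputs, and the callable must return something callable. The names `inspecting_backend`, `captured_graphs`, and `fn` are hypothetical. This backend returns the graph's own `forward` unchanged, so it performs no optimization and does not reflect how TorchInductor works internally.

```python
import torch
from torch.fx import GraphModule

# Filled in by the hypothetical backend below, one entry per captured graph.
captured_graphs = []


def inspecting_backend(gm: GraphModule, example_inputs):
    """Hypothetical backend: record the captured graph, then run it as-is."""
    captured_graphs.append(gm)
    # A backend must return a callable; here it is the unoptimized graph.
    return gm.forward


def fn(x, y):
    return torch.sin(x) + torch.cos(y)


compiled_fn = torch.compile(fn, backend=inspecting_backend)

x, y = torch.randn(4), torch.randn(4)
result = compiled_fn(x, y)  # the first call triggers graph capture

# Print the FX graph that TorchDynamo handed to the backend.
print(captured_graphs[0].graph)
```

A real backend would transform or lower `gm` before returning a callable; recording the graph like this is mainly useful for inspecting what was captured.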

Note

In some cases, the terms torch.compile, TorchDynamo, and torch.compiler may be used interchangeably in this documentation.

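As mentioned above, to run your workflows faster, torch.compile (through TorchDynamo) requires a backend that converts the captured graphs into fast machine code. Different backends can yield different optimization gains. The default backend is TorchInductor, also known as inductor. TorchDynamo maintains a list of supported backends developed by our partners; running torch.compiler.list_backends() shows the available backends, each of which has its own optional dependencies.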

Some of the most commonly used backends include:

Training & inference backends

Backend	Description
`torch.compile(m, backend="inductor")`	Uses the TorchInductor backend.
`torch.compile(m, backend="cudagraphs")`	Uses CUDA graphs with AOT Autograd.
`torch.compile(m, backend="ipex")`	Uses IPEX on CPU.
`torch.compile(m, backend="onnxrt")`	Uses ONNX Runtime for training on CPU/GPU.

Inference-only backends

Backend	Description
`torch.compile(m, backend="tensorrt")`	Uses Torch-TensorRT for inference optimizations. Requires `import torch_tensorrt` in the calling script to register the backend.
`torch.compile(m, backend="ipex")`	Uses IPEX for inference on CPU.
`torch.compile(m, backend="tvm")`	Uses Apache TVM for inference optimizations.
`torch.compile(m, backend="openvino")`	Uses OpenVINO for inference optimizations.

Read More

Getting Started for PyTorch Users

Deep Dive for PyTorch Developers

HowTo for PyTorch Backend Vendors