torch.func.functional_call

torch.func.functional_call(module, parameter_and_buffer_dicts, args, kwargs=None, *, tie_weights=True, strict=False)

Performs a functional call on the module by replacing the module's parameters and buffers with the provided ones.

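Note

If the module has active parametrizations, passing a value in the parameter_and_buffer_dicts argument under the regular parameter name completely disables the parametrization. If you want the parametrization function applied to the value you pass, set the key to {submodule_name}.parametrizations.{parameter_name}.original.

Note

If the module performs in-place operations on parameters or buffers, those changes are reflected in the parameter_and_buffer_dicts input.

Example: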

>>> a = {'foo': torch.zeros(())}
>>> mod = Foo()  # does self.foo = self.foo + 1
>>> print(mod.foo)  # tensor(0.)
>>> functional_call(mod, a, torch.ones(()))
>>> print(mod.foo)  # tensor(0.)
>>> print(a['foo'])  # tensor(1.)
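
The example above leaves Foo undefined. A minimal runnable sketch follows, assuming a hypothetical module named Counter that increments a registered buffer in place with add_ (the original only describes Foo in a comment, so this class is an illustration, not the module the documentation had in mind):

```python
import torch
import torch.nn as nn
from torch.func import functional_call


class Counter(nn.Module):
    """Hypothetical stand-in for Foo: mutates its buffer in place."""

    def __init__(self):
        super().__init__()
        self.register_buffer("foo", torch.zeros(()))

    def forward(self, x):
        # In-place update: acts on whichever tensor currently backs `foo`.
        self.foo.add_(1)
        return x + self.foo


mod = Counter()
a = {"foo": torch.zeros(())}

out = functional_call(mod, a, (torch.ones(()),))

print(mod.foo)   # the module's own buffer is untouched
print(a["foo"])  # the tensor passed in the dict received the in-place update
```

Because the tensor in the dict stands in for the buffer during the call, the in-place add_ lands on a['foo'] rather than on the buffer stored in the module.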

Note

If the module has tied weights, whether functional_call respects the tying is determined by the tie_weights flag.

Example:

>>> a = {'foo': torch.zeros(())}
>>> mod = Foo()  # has both self.foo and self.foo_tied which are tied. Returns x + self.foo + self.foo_tied
>>> print(mod.foo)  # tensor(1.)
>>> mod(torch.zeros(()))  # tensor(2.)
>>> functional_call(mod, a, torch.zeros(()))  # tensor(0.) since it will change self.foo_tied too
>>> functional_call(mod, a, torch.zeros(()), tie_weights=False)  # tensor(1.)--self.foo_tied is not updated
>>> new_a = {'foo': torch.zeros(()), 'foo_tied': torch.zeros(())}
>>> functional_call(mod, new_a, torch.zeros(()))  # tensor(0.)
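
To make the tied-weight behaviour concrete, here is a small sketch under the assumption that the tying is done by assigning the same nn.Parameter to two attributes. The class name Tied is hypothetical and only mirrors the comments in the example above:

```python
import torch
import torch.nn as nn
from torch.func import functional_call


class Tied(nn.Module):
    """Hypothetical module with two attribute names sharing one parameter."""

    def __init__(self):
        super().__init__()
        self.foo = nn.Parameter(torch.ones(()))
        self.foo_tied = self.foo  # same Parameter object, so the weights are tied

    def forward(self, x):
        return x + self.foo + self.foo_tied


mod = Tied()
zero = torch.zeros(())
a = {"foo": torch.zeros(())}

baseline = mod(zero)                                      # uses the module's own weights
tied = functional_call(mod, a, (zero,))                   # replacement also applies to foo_tied
untied = functional_call(mod, a, (zero,), tie_weights=False)  # only foo is replaced

print(baseline.item(), tied.item(), untied.item())
```

With tie_weights=False the replacement only reaches the name that appears in the dictionary, so foo_tied keeps the module's original value.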

An example of passing multiple dictionaries:

a = ({'weight': torch.ones(1, 1)}, {'buffer': torch.zeros(1)})  # two separate dictionaries
mod = nn.Bar(1, 1)  # return self.weight @ x + self.buffer
print(mod.weight)  # tensor(...)
print(mod.buffer)  # tensor(...)
x = torch.randn((1, 1))
print(x)
functional_call(mod, a, x)  # same as x
print(mod.weight)  # same as before functional_call
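
nn.Bar in the snippet above is a placeholder, not a real torch.nn class. The following sketch defines a hypothetical Bar with one parameter and one buffer so the multiple-dictionary call can actually be run:

```python
import torch
import torch.nn as nn
from torch.func import functional_call


class Bar(nn.Module):
    """Hypothetical module: returns self.weight @ x + self.buffer."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features))
        self.register_buffer("buffer", torch.ones(out_features))

    def forward(self, x):
        return self.weight @ x + self.buffer


mod = Bar(1, 1)
weight_before = mod.weight.detach().clone()

# Two separate dictionaries with distinct keys, passed as a tuple.
a = ({"weight": torch.ones(1, 1)}, {"buffer": torch.zeros(1)})
x = torch.randn(1, 1)

out = functional_call(mod, a, (x,))

print(torch.allclose(out, x))                    # weight of 1 and buffer of 0 reproduce x
print(torch.equal(mod.weight, weight_before))    # the module's own weight is unchanged
```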

Here is an example of applying the grad transform over the parameters of a model.

import torch
import torch.nn as nn
from torch.func import functional_call, grad

x = torch.randn(4, 3)
t = torch.randn(4, 3)
model = nn.Linear(3, 3)

def compute_loss(params, x, t):
    y = functional_call(model, params, x)
    return nn.functional.mse_loss(y, t)

grad_weights = grad(compute_loss)(dict(model.named_parameters()), x, t)
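
As a sanity check, the result of the grad transform can be compared with ordinary autograd on the same loss. This sketch repeats the setup above and adds only the comparison. The names ref and ref_by_name are introduced here for illustration:

```python
import torch
import torch.nn as nn
from torch.func import functional_call, grad

torch.manual_seed(0)
x = torch.randn(4, 3)
t = torch.randn(4, 3)
model = nn.Linear(3, 3)


def compute_loss(params, x, t):
    y = functional_call(model, params, (x,))
    return nn.functional.mse_loss(y, t)


params = dict(model.named_parameters())
# grad differentiates with respect to the first argument, so the result
# is a dict with the same keys as `params`.
grad_weights = grad(compute_loss)(params, x, t)

# Reference gradients from regular autograd on the module's own parameters.
loss = nn.functional.mse_loss(model(x), t)
ref = torch.autograd.grad(loss, list(model.parameters()))
ref_by_name = dict(zip(params.keys(), ref))

for name in params:
    print(name, torch.allclose(grad_weights[name], ref_by_name[name]))
```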

Note

If the user does not need gradient tracking outside of grad transforms, they can detach all of the parameters for better performance and lower memory usage.

Example:

>>> detached_params = {k: v.detach() for k, v in model.named_parameters()}
>>> grad_weights = grad(compute_loss)(detached_params, x, t)
>>> grad_weights['weight'].grad_fn  # None--it's not tracking gradients outside of grad

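This means that the user cannot call .backward() on the tensors in grad_weights. However, if they do not need autograd tracking outside of the transforms, this results in less memory usage and faster execution.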
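
A self-contained version of the detached-parameter pattern is sketched below. It assumes the same nn.Linear model and loss as the grad example, and the flag all_untracked is a name introduced here only to summarize the check:

```python
import torch
import torch.nn as nn
from torch.func import functional_call, grad

x = torch.randn(4, 3)
t = torch.randn(4, 3)
model = nn.Linear(3, 3)


def compute_loss(params, x, t):
    y = functional_call(model, params, (x,))
    return nn.functional.mse_loss(y, t)


# Detached copies share storage with the parameters but carry no autograd history.
detached_params = {k: v.detach() for k, v in model.named_parameters()}
grad_weights = grad(compute_loss)(detached_params, x, t)

# The result is a dict of plain tensors; none of them is tracked by autograd.
all_untracked = all(
    g.grad_fn is None and not g.requires_grad for g in grad_weights.values()
)
print(all_untracked)
```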

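Parameters

module (torch.nn.Module) – the module to call
parameter_and_buffer_dicts (Dict[str, Tensor] or tuple of Dict[str, Tensor]) – the parameters and buffers that will be used in the module call. If given a tuple of dictionaries, they must have distinct keys so that all dictionaries can be used together.
args (Any or tuple) – arguments to be passed to the module call. If not a tuple, it is treated as a single argument.
kwargs (dict) – keyword arguments to be passed to the module call
tie_weights (bool, optional) – If True, parameters and buffers that are tied in the original model are treated as tied in the reparametrized version. Therefore, if True and different values are passed for the tied parameters and buffers, an error is raised. If False, the original tying is not respected unless the values passed for both weights are the same. Default: True.
strict (bool, optional) – If True, the parameters and buffers passed in must match the parameters and buffers in the original module. Therefore, if True and there are any missing or unexpected keys, an error is raised. Default: False.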
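
The effect of strict and of a non-tuple args value can be illustrated with a partial dictionary. This is a sketch, assuming the missing-key error surfaces as a RuntimeError. The variable names partial and strict_raised are hypothetical:

```python
import torch
import torch.nn as nn
from torch.func import functional_call

model = nn.Linear(3, 3)
x = torch.randn(2, 3)

# Only the weight is supplied; the bias key is missing.
partial = {"weight": torch.zeros(3, 3)}

# Default strict=False: the missing bias falls back to the module's own bias.
out = functional_call(model, partial, (x,))

# A non-tuple `args` is treated as a single positional argument.
out_single = functional_call(model, partial, x)

# strict=True: the missing key is reported as an error.
try:
    functional_call(model, partial, (x,), strict=True)
    strict_raised = False
except RuntimeError:
    strict_raised = True

print(torch.allclose(out, model.bias.expand(2, 3)), strict_raised)
```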

Returns

the result of calling module.

Return type

Any