
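Extending TorchScript with Custom C++ Classes

TorchScript is no longer in active development.

This tutorial is a follow-up to the custom operator tutorial. It introduces the API we built for binding C++ classes into TorchScript and Python simultaneously. The API is very similar to pybind11, and most of the concepts will transfer over if you are familiar with that system.

Implementing and Binding the Class in C++

For this tutorial, we are going to define a simple C++ class that maintains persistent state in a member variable.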

// This header is all you need to do the C++ portions of this
// tutorial
#include <torch/script.h>
// This header is what defines the custom class registration
// behavior specifically. script.h already includes this, but
// we include it here so you know it exists in case you want
// to look at the API or implementation.
#include <torch/custom_class.h>

#include <string>
#include <vector>

template <class T>
struct MyStackClass : torch::CustomClassHolder {
  std::vector<T> stack_;
  MyStackClass(std::vector<T> init) : stack_(init.begin(), init.end()) {}

  void push(T x) {
    stack_.push_back(x);
  }
  T pop() {
    auto val = stack_.back();
    stack_.pop_back();
    return val;
  }

  c10::intrusive_ptr<MyStackClass> clone() const {
    return c10::make_intrusive<MyStackClass>(stack_);
  }

  void merge(const c10::intrusive_ptr<MyStackClass>& c) {
    for (auto& elem : c->stack_) {
      push(elem);
    }
  }
};

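There are several things to note:

  • torch/custom_class.h is the header you need to include to extend TorchScript with your custom class.

  • Whenever we work with instances of the custom class, we do it via instances of c10::intrusive_ptr<>. Think of intrusive_ptr as a smart pointer like std::shared_ptr, except that the reference count is stored directly in the object rather than in a separate metadata block (as std::shared_ptr does). torch::Tensor internally uses the same pointer type, and custom classes must also use it so that different object types can be managed consistently.

  • The user-defined class must inherit from torch::CustomClassHolder. This ensures that the custom class has space to store the reference count.

Now let's take a look at how we make this class visible to TorchScript, a process called binding the class: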

// Notice a few things:
// - We pass the class to be registered as a template parameter to
//   `torch::class_`. In this instance, we've passed the
//   specialization of the MyStackClass class ``MyStackClass<std::string>``.
//   In general, you cannot register a non-specialized template
//   class. For non-templated classes, you can just pass the
//   class name directly as the template parameter.
// - The namespace passed to `TORCH_LIBRARY` and the name passed to
//   `m.class_` together make up the "qualified name" of the class.
//   In this case, the registered class will appear in Python and C++
//   as `torch.classes.my_classes.MyStackClass`: `my_classes` is the
//   "namespace" and `MyStackClass` is the actual class name.
TORCH_LIBRARY(my_classes, m) {
  m.class_<MyStackClass<std::string>>("MyStackClass")
    // The following line registers the constructor of our MyStackClass
    // class that takes a single `std::vector<std::string>` argument,
    // i.e. it exposes the C++ method `MyStackClass(std::vector<T> init)`.
    // Currently, we do not support registering overloaded
    // constructors, so for now you can only `def()` one instance of
    // `torch::init`.
    .def(torch::init<std::vector<std::string>>())
    // The next line registers a stateless (i.e. no captures) C++ lambda
    // function as a method. Note that a lambda function must take a
    // `c10::intrusive_ptr<YourClass>` (or some const/ref version of that)
    // as the first argument. Other arguments can be whatever you want.
    .def("top", [](const c10::intrusive_ptr<MyStackClass<std::string>>& self) {
      return self->stack_.back();
    })
    // The following four lines expose methods of the MyStackClass<std::string>
    // class as-is. `torch::class_` will automatically examine the
    // argument and return types of the passed-in method pointers and
    // expose these to Python and TorchScript accordingly. Finally, notice
    // that we must take the *address* of the fully-qualified method name,
    // i.e. use the unary `&` operator, due to C++ typing rules.
    .def("push", &MyStackClass<std::string>::push)
    .def("pop", &MyStackClass<std::string>::pop)
    .def("clone", &MyStackClass<std::string>::clone)
    .def("merge", &MyStackClass<std::string>::merge)
  ;
}

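Building the Example as a C++ Project With CMake

Now we are going to build the above C++ code with the CMake build system. First, take all the C++ code we have covered so far and place it in a file called class.cpp. Then write a simple CMakeLists.txt file and place it in the same directory. Here is what CMakeLists.txt should look like: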

cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
project(custom_class)

find_package(Torch REQUIRED)

# Define our library target
add_library(custom_class SHARED class.cpp)
set(CMAKE_CXX_STANDARD 14)
# Link against LibTorch
target_link_libraries(custom_class "${TORCH_LIBRARIES}")

Also, create a build directory. Your file tree should look like this:

custom_class_project/
  class.cpp
  CMakeLists.txt
  build/

We assume you have set up your environment as described in the previous tutorial. Go ahead and invoke cmake and then make to build the project:

$ cd build
$ cmake -DCMAKE_PREFIX_PATH="$(python -c 'import torch.utils; print(torch.utils.cmake_prefix_path)')" ..
-- The C compiler identification is GNU 7.3.1
-- The CXX compiler identification is GNU 7.3.1
-- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc
-- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++
-- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found torch: /torchbind_tutorial/libtorch/lib/libtorch.so
-- Configuring done
-- Generating done
-- Build files have been written to: /torchbind_tutorial/build
$ make -j
Scanning dependencies of target custom_class
[ 50%] Building CXX object CMakeFiles/custom_class.dir/class.cpp.o
[100%] Linking CXX shared library libcustom_class.so
[100%] Built target custom_class

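You will find that the build directory now contains (among other things) a dynamic library file. On Linux, this is probably named libcustom_class.so. So the file tree should look like: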

custom_class_project/
  class.cpp
  CMakeLists.txt
  build/
    libcustom_class.so
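
The file name above is the Linux default. As a rough sketch, assuming CMake's default shared-library naming (lib<name>.so on Linux, lib<name>.dylib on macOS, <name>.dll with an MSVC toolchain on Windows), a small helper can work out the expected file name on other platforms. The name shared_library_name is hypothetical and is not part of PyTorch or CMake.

```python
import sys


def shared_library_name(target: str, platform: str = sys.platform) -> str:
    """Return the default file name CMake gives a SHARED library target.

    Hypothetical helper for illustration only. It assumes default CMake
    prefixes and suffixes, and an MSVC toolchain on Windows.
    """
    if platform.startswith("win"):
        return f"{target}.dll"
    if platform == "darwin":
        return f"lib{target}.dylib"
    # Linux and other ELF-based Unix systems
    return f"lib{target}.so"


# Name expected for the `custom_class` target on the current platform
print(shared_library_name("custom_class"))
```

The returned name can be joined with the build directory to form the path of the library to load.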

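Using the C++ Class from Python and TorchScript

Now that our class and its registration are compiled into an .so file, we can load that .so into Python and try it out. Here is a script that demonstrates this: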

import torch

# `torch.classes.load_library()` allows you to pass the path to your .so file
# to load it in and make the custom C++ classes available to both Python and
# TorchScript
torch.classes.load_library("build/libcustom_class.so")
# You can query the loaded libraries like this:
print(torch.classes.loaded_libraries)
# prints {'/custom_class_project/build/libcustom_class.so'}

# We can find and instantiate our custom C++ class in python by using the
# `torch.classes` namespace:
#
# This instantiation will invoke the MyStackClass(std::vector<T> init)
# constructor we registered earlier
s = torch.classes.my_classes.MyStackClass(["foo", "bar"])

# We can call methods in Python
s.push("pushed")
assert s.pop() == "pushed"

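# Test custom operator. This assumes `manipulate_instance` has been
# registered in the same TORCH_LIBRARY block, as shown later in this tutorial.
s.push("pushed")
torch.ops.my_classes.manipulate_instance(s)  # acting as s.pop()
assert s.top() == "bar"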

# Returning and passing instances of custom classes works as you'd expect
s2 = s.clone()
s.merge(s2)
for expected in ["bar", "foo", "bar", "foo"]:
    assert s.pop() == expected

# We can also use the class in TorchScript
# For now, we need to assign the class's type to a local in order to
# annotate the type on the TorchScript function. This may change
# in the future.
MyStackClass = torch.classes.my_classes.MyStackClass


@torch.jit.script
def do_stacks(s: MyStackClass):  # We can pass a custom class instance
    # We can instantiate the class
    s2 = torch.classes.my_classes.MyStackClass(["hi", "mom"])
    s2.merge(s)  # We can call a method on the class
    # We can also return instances of the class
    # from TorchScript function/methods
    return s2.clone(), s2.top()


stack, top = do_stacks(torch.classes.my_classes.MyStackClass(["wow"]))
assert top == "wow"
for expected in ["wow", "mom", "hi"]:
    assert stack.pop() == expected
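
To follow the assertions above without building the extension, the stack's behavior can be modeled in plain Python. This is only an illustrative stand-in, not the bound class: PyStack is a hypothetical name, and it mirrors push, pop, top, clone, and merge as they are defined in the C++ code earlier.

```python
from typing import List


class PyStack:
    """Pure-Python stand-in (hypothetical) mirroring MyStackClass<std::string>."""

    def __init__(self, init: List[str]) -> None:
        self.stack_ = list(init)  # copy the input, like the C++ constructor

    def push(self, x: str) -> None:
        self.stack_.append(x)

    def pop(self) -> str:
        return self.stack_.pop()

    def top(self) -> str:
        return self.stack_[-1]

    def clone(self) -> "PyStack":
        return PyStack(self.stack_)

    def merge(self, other: "PyStack") -> None:
        # Push the other stack's elements bottom-to-top, as the C++ loop does
        for elem in list(other.stack_):
            self.push(elem)


s = PyStack(["foo", "bar"])
s.merge(s.clone())  # stack is now foo, bar, foo, bar (bottom to top)
popped = [s.pop() for _ in range(4)]
print(popped)  # ['bar', 'foo', 'bar', 'foo']
```

Because merge appends the other stack's elements in bottom-to-top order, popping afterwards yields the merged elements first, in reverse.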

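Saving, Loading, and Running TorchScript Code Using Custom Classes

We can also use custom-registered C++ classes in a C++ process using libtorch. As an example, let's define a simple nn.Module that instantiates MyStackClass and calls a method on it: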

import torch

torch.classes.load_library('build/libcustom_class.so')


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, s: str) -> str:
        stack = torch.classes.my_classes.MyStackClass(["hi", "mom"])
        return stack.pop() + s


scripted_foo = torch.jit.script(Foo())
print(scripted_foo.graph)

scripted_foo.save('foo.pt')

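foo.pt now contains the serialized TorchScript program we have just defined.

Next, we are going to define a new CMake project to show how to load this model and the .so file it requires. For a full treatment of how to do this, see the Loading a TorchScript Model in C++ tutorial.

As before, let's create a file structure containing the following: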

cpp_inference_example/
  infer.cpp
  CMakeLists.txt
  foo.pt
  build/
  custom_class_project/
    class.cpp
    CMakeLists.txt
    build/

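Notice that we have copied over the serialized foo.pt file, as well as the source tree from the custom_class_project above. We will add custom_class_project as a dependency of this C++ project so that the custom class is built into the binary.

Let's populate infer.cpp with the following: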

#include <torch/script.h>

#include <iostream>
#include <memory>

int main(int argc, const char* argv[]) {
  torch::jit::Module module;
  try {
    // Deserialize the ScriptModule from a file using torch::jit::load().
    module = torch::jit::load("foo.pt");
  }
  catch (const c10::Error& e) {
    std::cerr << "error loading the model\n";
    return -1;
  }

  std::vector<c10::IValue> inputs = {"foobarbaz"};
  auto output = module.forward(inputs).toString();
  std::cout << output->string() << std::endl;
}

Similarly, let's define our CMakeLists.txt file:

cmake_minimum_required(VERSION 3.1 FATAL_ERROR)
project(infer)

find_package(Torch REQUIRED)

add_subdirectory(custom_class_project)

# Define our executable target
add_executable(infer infer.cpp)
set(CMAKE_CXX_STANDARD 14)
# Link against LibTorch
target_link_libraries(infer "${TORCH_LIBRARIES}")
# This is where we link in our libcustom_class code, making our
# custom class available in our binary.
target_link_libraries(infer -Wl,--no-as-needed custom_class)

You know the drill: cd build, cmake, and make:

$ cd build
$ cmake -DCMAKE_PREFIX_PATH="$(python -c 'import torch.utils; print(torch.utils.cmake_prefix_path)')" ..
-- The C compiler identification is GNU 7.3.1
-- The CXX compiler identification is GNU 7.3.1
-- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc
-- Check for working C compiler: /opt/rh/devtoolset-7/root/usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++
-- Check for working CXX compiler: /opt/rh/devtoolset-7/root/usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Looking for pthread_create
-- Looking for pthread_create - not found
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - found
-- Found Threads: TRUE
-- Found torch: /local/miniconda3/lib/python3.7/site-packages/torch/lib/libtorch.so
-- Configuring done
-- Generating done
-- Build files have been written to: /cpp_inference_example/build
$ make -j
Scanning dependencies of target custom_class
[ 25%] Building CXX object custom_class_project/CMakeFiles/custom_class.dir/class.cpp.o
[ 50%] Linking CXX shared library libcustom_class.so
[ 50%] Built target custom_class
Scanning dependencies of target infer
[ 75%] Building CXX object CMakeFiles/infer.dir/infer.cpp.o
[100%] Linking CXX executable infer
[100%] Built target infer

And now we can run our exciting C++ binary:

$ ./infer
momfoobarbaz

Incredible!

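Moving Custom Classes To/From IValues

You may also need to move custom classes into or out of IValues, for example when you take or return IValues from TorchScript methods, or when you want to instantiate a custom class attribute in C++. To create an IValue from a custom C++ class instance:

  • torch::make_custom_class<T>() provides an API similar to c10::make_intrusive<T>(): it takes whatever set of arguments you provide, calls the constructor of T that matches those arguments, and wraps up the resulting instance. However, instead of returning just a pointer to the custom class object, it returns an IValue wrapping the object. You can then pass this IValue directly to TorchScript.

  • If you already have an intrusive_ptr pointing to your class, you can construct an IValue from it directly with the IValue(intrusive_ptr<T>) constructor.

To convert an IValue back to a custom class:

  • IValue::toCustomClass<T>() returns an intrusive_ptr<T> pointing to the custom class that the IValue contains. Internally, this function checks that T is registered as a custom class and that the IValue does in fact contain a custom class. You can check manually whether the IValue contains a custom class by calling isCustomClass().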

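Defining Serialization/Deserialization Methods for Custom C++ Classes

If you try to save a ScriptModule that has a custom-bound C++ class as an attribute, you will get the following error: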

# export_attr.py
import torch

torch.classes.load_library('build/libcustom_class.so')


class Foo(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.stack = torch.classes.my_classes.MyStackClass(["just", "testing"])

    def forward(self, s: str) -> str:
        return self.stack.pop() + s


scripted_foo = torch.jit.script(Foo())

scripted_foo.save('foo.pt')
loaded = torch.jit.load('foo.pt')

print(loaded.stack.pop())

$ python export_attr.py
RuntimeError: Cannot serialize custom bound C++ class __torch__.torch.classes.my_classes.MyStackClass. Please define serialization methods via def_pickle for this class. (pushIValueImpl at ../torch/csrc/jit/pickler.cpp:128)

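This is because TorchScript cannot automatically figure out what information to save from your C++ class. You must specify that manually. The way to do so is to define __getstate__ and __setstate__ methods on the class using the special def_pickle method on class_.

The semantics of __getstate__ and __setstate__ in TorchScript are equivalent to those of the Python pickle module. More detail on how TorchScript uses these methods is available in the PyTorch serialization documentation.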
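
Since the semantics follow Python's pickle module, a plain-Python analogue may help. The sketch below uses a hypothetical PyStack class (a stand-in, not the bound C++ class): __getstate__ returns the data worth saving, and __setstate__ rebuilds the object from that data.

```python
import pickle
from typing import List


class PyStack:
    """Hypothetical pure-Python stack used only to illustrate pickling."""

    def __init__(self, init: List[str]) -> None:
        self.stack_ = list(init)

    def __getstate__(self) -> List[str]:
        # Choose what to serialize: here, just the list of elements
        return self.stack_

    def __setstate__(self, state: List[str]) -> None:
        # Rebuild the instance from the serialized state
        self.stack_ = list(state)


original = PyStack(["just", "testing"])
restored = pickle.loads(pickle.dumps(original))
print(restored.stack_)  # ['just', 'testing']
```

One difference in shape is worth keeping in mind: Python's __setstate__ fills in an object that pickle has already allocated, whereas the __setstate__ function given to def_pickle must return an intrusive_ptr to a newly created instance.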

Here is an example of the def_pickle call we can add to the registration of MyStackClass to include serialization methods:

    // class_<>::def_pickle allows you to define the serialization
    // and deserialization methods for your C++ class.
    // Currently, we only support passing stateless lambda functions
    // as arguments to def_pickle
    .def_pickle(
          // __getstate__
          // This function defines what data structure should be produced
          // when we serialize an instance of this class. The function
          // must take a single `self` argument, which is an intrusive_ptr
          // to the instance of the object. The function can return
          // any type that is supported as a return value of the TorchScript
          // custom operator API. In this instance, we've chosen to return
          // a std::vector<std::string> as the salient data to preserve
          // from the class.
          [](const c10::intrusive_ptr<MyStackClass<std::string>>& self)
              -> std::vector<std::string> {
            return self->stack_;
          },
          // __setstate__
          // This function defines how to create a new instance of the C++
          // class when we are deserializing. The function must take a
          // single argument of the same type as the return value of
          // `__getstate__`. The function must return an intrusive_ptr
          // to a new instance of the C++ class, initialized however
          // you would like given the serialized state.
          [](std::vector<std::string> state)
              -> c10::intrusive_ptr<MyStackClass<std::string>> {
            // A convenient way to instantiate an object and get an
            // intrusive_ptr to it is via `make_intrusive`. We use
            // that here to allocate an instance of MyStackClass<std::string>
            // and call the single-argument std::vector<std::string>
            // constructor with the serialized state.
            return c10::make_intrusive<MyStackClass<std::string>>(std::move(state));
          });

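We take a different approach from pybind11 in the pickle API. Whereas pybind11 has a special function pybind11::pickle() that you pass into class_::def(), we provide a separate method, def_pickle, for this purpose. This is because the name torch::jit::pickle was already taken, and we did not want to cause confusion.

Once the (de)serialization behavior is defined in this way, our script runs successfully:

$ python ../export_attr.py
testing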

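Defining Custom Operators that Take or Return Bound C++ Classes

Once you have defined a custom C++ class, you can also use that class as an argument or return type of a custom operator (i.e. a free function). Suppose you have the following free function:

c10::intrusive_ptr<MyStackClass<std::string>> manipulate_instance(const c10::intrusive_ptr<MyStackClass<std::string>>& instance) {
  instance->pop();
  return instance;
}

You can register it by running the following code inside your TORCH_LIBRARY block: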

m.def(
"manipulate_instance(__torch__.torch.classes.my_classes.MyStackClass x) -> __torch__.torch.classes.my_classes.MyStackClass Y",
manipulate_instance
);

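Refer to the custom op tutorial for more details on the registration API.

Once this is done, you can use the op as in the following example:

class TryCustomOp(torch.nn.Module):
    def __init__(self):
        super(TryCustomOp, self).__init__()
        self.f = torch.classes.my_classes.MyStackClass(["foo", "bar"])

    def forward(self):
        return torch.ops.my_classes.manipulate_instance(self.f)

Registering an operator that takes a C++ class as an argument requires that the custom class has already been registered. You can enforce this by making sure the custom class registration and your free function definitions are in the same TORCH_LIBRARY block, and that the custom class registration comes first. In the future, we may relax this requirement so that they can be registered in any order.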

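Conclusion

This tutorial walked you through how to expose a C++ class to TorchScript (and by extension Python), how to register its methods, how to use that class from Python and TorchScript, and how to save and load code that uses the class and run that code in a standalone C++ process. You are now ready to extend your TorchScript models with C++ classes that interface with third-party C++ libraries, or to implement any other use case that requires the lines between Python, TorchScript, and C++ to blend smoothly.

As always, if you run into any problems or have questions, you can use our forum or GitHub issues to get in touch. Our frequently asked questions (FAQ) page may also have helpful information.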
