-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tensorrt convert init #10144
tensorrt convert init #10144
Conversation
void RegisterOpConverters(); | ||
|
||
// convert a fluid Mul op to tensorrt fc layer without bias | ||
static void ConvertMul(const framework::OpDesc& op); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Each op a static member function? better to be a normal function, or this class will have O(N) static member functions, weried.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 使用注册机制。
@@ -0,0 +1,51 @@ | |||
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is a convert
subdirectory needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
因为convert目录下会有convert_mul, convert_conv2d等多个op文件,所以采用目录形式。
} | ||
|
||
void TensorRTConverter::RegisterOpConverters() { | ||
op_registry_["mul"] = ConvertMul; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use a normal functor, just like fluid::Operator
:
struct MulOpConverter : public OpConverter {
void operator()(const OpDesc&);
};
REGISTER_TRT_OP_CONVETER("mul", MulOpConveter);
this way, it seems more natural to add more converters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done 采用注册机制。
GetOpConverter()[#op_type] = new convert_class; \ | ||
} \ | ||
}; \ | ||
convert_class##Register convert_class##reg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static convert_class##Register convert_class##reg;
namespace inference { | ||
namespace tensorrt { | ||
|
||
class ConverterBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ConvertBase
just have data, no need to be a base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
virtual void Convert(const framework::OpDesc& op) = 0; | ||
}; | ||
|
||
static std::unordered_map<std::string, OpConverter*>& GetOpConverter() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OpConverter
can be a unique class without inherit ConverterBase
,
and it has no relation with TensorRTConverter
, so no need to inherit the same class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -0,0 +1,36 @@ | |||
/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better to have the same file name with the operators.
For example, for conv2d_op.h
, there is a convert/conv2d_op.h
, so that one can see the relationship between them by search files.
}; \ | ||
convert_class##Register convert_class##reg; | ||
|
||
class TensorRTConverter : public ConverterBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TensorRTConveter
can be a unique class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
void Convert(const framework::OpDesc& op) { | ||
std::string type = op.Type(); | ||
OpConverter& op_converter = this->register_op_converter_[type]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
auto& OpConverter& op_converter = OpConveter::register_op_converter_[type];
maybe this works.
register_op_conveter_ is a static member of OpConverter
@@ -0,0 +1,2 @@ | |||
nv_library(tensorrt_convert SRCS convert.cc mul_op.cc conv2d_op.cc DEPS dynload_cuda) | |||
nv_test(test_tensorrt_convert SRCS test_convert.cc DEPS tensorrt paddle_fluid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
paddle_fluid
is quite a large dependency, and slow for compiling this test.
Just a small issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made some changes, and it seems that the static member should be defined in .cc
file again, so it can link.
I also remove the template from the static method Register
.
diff --git a/paddle/fluid/inference/tensorrt/convert/convert.cc b/paddle/fluid/inference/tensorrt/convert/convert.cc
index 78a72b1..7b6123a 100644
--- a/paddle/fluid/inference/tensorrt/convert/convert.cc
+++ b/paddle/fluid/inference/tensorrt/convert/convert.cc
@@ -26,6 +26,9 @@ void TensorRTConverter::ConvertBlock(const framework::BlockDesc& block) {
}
}
+
+std::unordered_map<std::string, OpConverter> OpConverter::register_op_converter_;
+
} // namespace tensorrt
} // namespace inference
} // namespace paddle
diff --git a/paddle/fluid/inference/tensorrt/convert/convert.h b/paddle/fluid/inference/tensorrt/convert/convert.h
index 953086a..b9a8a01 100644
--- a/paddle/fluid/inference/tensorrt/convert/convert.h
+++ b/paddle/fluid/inference/tensorrt/convert/convert.h
@@ -32,13 +32,12 @@ class OpConverter {
void Convert(const framework::OpDesc& op) {
std::string type = op.Type();
- OpConverter& op_converter = this->register_op_converter_[type];
+ OpConverter& op_converter = OpConverter::register_op_converter_[type];
op_converter.Convert(op);
}
- template <typename T>
- static void Register(const std::string key) {
- register_op_converter_[key] = T();
+ static void Register(const std::string key, const OpConverter& v) {
+ register_op_converter_[key] = v;
}
static std::unordered_map<std::string, OpConverter> register_op_converter_;
@@ -52,7 +51,7 @@ class OpConverter {
#define REGISTER_TRT_OP_CONVETER(op_type, convert_class) \
class convert_class : public OpConverter { \
public: \
- convert_class() { OpConverter::Register<convert_class>(#op_type); } \
+ convert_class() { OpConverter::Register(#op_type, convert_class()); } \
void Convert(const framework::OpDesc& op); \
}
|
||
template <typename T> | ||
static void Register(const std::string key) { | ||
register_op_converter_[key] = T(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
registered_op_converter
seems better.
std::unordered_map<std::string, OpConverter*> converters_; | ||
|
||
// fluid inference scope | ||
framework::Scope* scope_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
framework::Scope* scope_{nullptr};
/* | ||
* Convert Op from Fluid to TensorRT Engine. | ||
*/ | ||
class OpConverter { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
可以参考我pr里的写法,把单例和接口拆开。 之前王叔说过这个三种设计模式混在一起。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be refactored latter when utils/singleton.h in the convert_io PR merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
#include "paddle/fluid/platform/device_context.h" | ||
#include "paddle/fluid/platform/place.h" | ||
|
||
USE_OP(relu); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to bottom of the file, for that reader needn't care this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we move it to the bottom of the file, it seems no effect, and unit-test is a failure.
namespace inference { | ||
namespace tensorrt { | ||
|
||
void compare(float input, float expect) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
PADDLE_ENFORCE_EQ(0, buffer_sizes_.count(name), "duplicate output name %s", | ||
name); | ||
|
||
auto* output = TensorRTEngine::GetITensor(name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do all the outputs of a TensorRT layer can be retrieved by the name? Is this interface necessary given that there is already a DeclareOutput
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do all the outputs of a TensorRT layer can be retrieved by the name?
Yes, all the TensorRT layer will put its output into itensor_map_
. The reason is that other TensorRT layer can get its input from itensor_map_
directly.
engine_->SetITensor(op.Output("Out")[0], layer->getOutput(0));
Is this interface necessary given that there is already a DeclareOutput?
Yes, we can easily use engine->DeclareOutput("Out")
to DeclareOutput. Maybe this function can have another name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get
// Fill an ITensor into map itensor_map_. | ||
void SetITensor(const std::string& name, nvinfer1::ITensor* tensor); | ||
// Get an ITensor called name. | ||
nvinfer1::ITensor* GetITensor(const std::string& name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this interface necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have GetITensor
interface, we should directly call auto tensor = itensor_map_[name]
to get the itensor, which has two disadvantages:
- Each time we should call
PADDLE_ENFORCE(itensor_map_.count(name), "no itensor %s", name);
before. itensor_map_
should be public.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
A implement of #10028 (comment)
The result of
make test ARGS='-R tensorrt_convert_test -V'
:Will add
TensorRTEngine
as a member variable ofTensorRTConverter
class after #10003.