tensorrt convert init #10144

luotao1 · 2018-04-23T13:33:03Z

The result of make test ARGS='-R tensorrt_convert_test -V':

76: [==========] Running 1 test from 1 test case.
76: [----------] Global test environment set-up.
76: [----------] 1 test from tensorrt
76: [ RUN      ] tensorrt.ConvertBlock
76: I0423 13:14:39.315845 17146 convert.cc:42] convert a fluid mul op to tensorrt fc layer without bias
76: I0423 13:14:39.315907 17146 convert.cc:42] convert a fluid Conv2d op to tensorrt conv layer without bias
76: [       OK ] tensorrt.ConvertBlock (0 ms)
76: [----------] 1 test from tensorrt (0 ms total)
76: 
76: [----------] Global test environment tear-down
76: [==========] 1 test from 1 test case ran. (1 ms total)
76: [  PASSED  ] 1 test.
1/1 Test #76: tensorrt_convert_test ............   Passed    5.66 sec

I will add TensorRTEngine as a member variable of the TensorRTConverter class after #10003.

Superjomn · 2018-04-24T03:18:33Z

paddle/fluid/inference/tensorrt/convert/convert.h

+  void RegisterOpConverters();
+
+  // convert a fluid Mul op to tensorrt fc layer without bias
+  static void ConvertMul(const framework::OpDesc& op);


A static member function for each op? Better to make them normal functions; otherwise this class will have O(N) static member functions, which is weird.

Done. Uses a registration mechanism.

Superjomn · 2018-04-24T03:19:29Z

paddle/fluid/inference/tensorrt/convert/convert.cc

@@ -0,0 +1,51 @@
+/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.


Why is a convert subdirectory needed?

Because the convert directory will contain multiple op files, such as convert_mul and convert_conv2d, a directory is used.

Superjomn · 2018-04-24T03:50:16Z

paddle/fluid/inference/tensorrt/convert/convert.cc

+}
+
+void TensorRTConverter::RegisterOpConverters() {
+  op_registry_["mul"] = ConvertMul;


Use a normal functor, just like fluid::Operator:

struct MulOpConverter : public OpConverter {
  void operator()(const OpDesc&);
};
REGISTER_TRT_OP_CONVETER("mul", MulOpConverter);

This way, adding more converters seems more natural.

Done. Adopted a registration mechanism.
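
A minimal sketch of the kind of functor-plus-macro registration suggested above. It is not the code merged in this PR: OpDesc here is a stand-in for framework::OpDesc, and Registry(), ConvertOp(), MulCallCount() and the macro name REGISTER_TRT_OP_CONVERTER_SKETCH are hypothetical names chosen for illustration.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Stand-in for framework::OpDesc; only the op type is modeled here.
struct OpDesc {
  std::string type;
};

// Interface that every op converter implements.
class OpConverter {
 public:
  virtual ~OpConverter() = default;
  virtual void operator()(const OpDesc& op) = 0;
};

// A function-local static avoids static-initialization-order problems
// when registrar objects run before main().
inline std::unordered_map<std::string, std::unique_ptr<OpConverter>>&
Registry() {
  static std::unordered_map<std::string, std::unique_ptr<OpConverter>> registry;
  return registry;
}

// Hypothetical macro: defines a registrar whose constructor stores one
// converter instance under the given op type.
#define REGISTER_TRT_OP_CONVERTER_SKETCH(op_type, converter_class)            \
  struct converter_class##Registrar {                                         \
    converter_class##Registrar() {                                            \
      Registry()[op_type] = std::unique_ptr<OpConverter>(new converter_class); \
    }                                                                         \
  };                                                                          \
  static converter_class##Registrar converter_class##_registrar

// Call counter so that the effect of a conversion is observable.
inline int& MulCallCount() {
  static int count = 0;
  return count;
}

struct MulOpConverter : public OpConverter {
  void operator()(const OpDesc&) override { ++MulCallCount(); }
};
REGISTER_TRT_OP_CONVERTER_SKETCH("mul", MulOpConverter);

// Dispatches on the op type; returns false for unregistered ops.
inline bool ConvertOp(const OpDesc& op) {
  auto it = Registry().find(op.type);
  if (it == Registry().end()) return false;
  (*it->second)(op);
  return true;
}
```

With this shape, supporting a new op means adding one converter struct and one macro line; the dispatching code does not change.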

Superjomn · 2018-04-25T12:13:16Z

paddle/fluid/inference/tensorrt/convert/convert.h

+      GetOpConverter()[#op_type] = new convert_class;    \
+    }                                                    \
+  };                                                     \
+  convert_class##Register convert_class##reg;


static convert_class##Register convert_class##reg;

Superjomn · 2018-04-25T12:19:02Z

paddle/fluid/inference/tensorrt/convert/convert.h

+namespace inference {
+namespace tensorrt {
+
+class ConverterBase {


ConverterBase just holds data; there is no need for it to be a base class.

Superjomn · 2018-04-25T12:22:13Z

paddle/fluid/inference/tensorrt/convert/convert.h

+  virtual void Convert(const framework::OpDesc& op) = 0;
+};
+
+static std::unordered_map<std::string, OpConverter*>& GetOpConverter() {


OpConverter can be a standalone class that does not inherit from ConverterBase.
It has no relation to TensorRTConverter, so the two do not need to inherit from the same class.

Superjomn · 2018-04-25T12:25:01Z

paddle/fluid/inference/tensorrt/convert/convert_conv2d.h

@@ -0,0 +1,36 @@
+/* Copyright (c) 2018 PaddlePaddle Authors. All Rights Reserved.


Better to use the same file names as the operators.

For example, conv2d_op.h would have a matching convert/conv2d_op.h, so that one can see the relationship between them by searching for files.

Superjomn · 2018-04-25T12:27:48Z

paddle/fluid/inference/tensorrt/convert/convert.h

+  };                                                     \
+  convert_class##Register convert_class##reg;
+
+class TensorRTConverter : public ConverterBase {


TensorRTConverter can be a standalone class.

Superjomn · 2018-04-25T14:36:32Z

paddle/fluid/inference/tensorrt/convert/convert.h

+
+  void Convert(const framework::OpDesc& op) {
+    std::string type = op.Type();
+    OpConverter& op_converter = this->register_op_converter_[type];


OpConverter& op_converter = OpConverter::register_op_converter_[type];

Maybe this works, since register_op_converter_ is a static member of OpConverter.

Superjomn · 2018-04-26T01:44:34Z

paddle/fluid/inference/tensorrt/convert/CMakeLists.txt

@@ -0,0 +1,2 @@
+nv_library(tensorrt_convert SRCS convert.cc mul_op.cc conv2d_op.cc DEPS dynload_cuda)
+nv_test(test_tensorrt_convert SRCS test_convert.cc DEPS tensorrt paddle_fluid)


paddle_fluid is quite a large dependency, and it makes this test slow to compile.

Just a small issue.

Superjomn

I made some changes. It seems that the static member also needs to be defined in a .cc file so that it can link.

I also removed the template from the static method Register.

diff --git a/paddle/fluid/inference/tensorrt/convert/convert.cc b/paddle/fluid/inference/tensorrt/convert/convert.cc
index 78a72b1..7b6123a 100644
--- a/paddle/fluid/inference/tensorrt/convert/convert.cc
+++ b/paddle/fluid/inference/tensorrt/convert/convert.cc
@@ -26,6 +26,9 @@ void TensorRTConverter::ConvertBlock(const framework::BlockDesc& block) {
   }
 }

+
+std::unordered_map<std::string, OpConverter> OpConverter::register_op_converter_;
+
 }  // namespace tensorrt
 }  // namespace inference
 }  // namespace paddle
diff --git a/paddle/fluid/inference/tensorrt/convert/convert.h b/paddle/fluid/inference/tensorrt/convert/convert.h
index 953086a..b9a8a01 100644
--- a/paddle/fluid/inference/tensorrt/convert/convert.h
+++ b/paddle/fluid/inference/tensorrt/convert/convert.h
@@ -32,13 +32,12 @@ class OpConverter {

   void Convert(const framework::OpDesc& op) {
     std::string type = op.Type();
-    OpConverter& op_converter = this->register_op_converter_[type];
+    OpConverter& op_converter = OpConverter::register_op_converter_[type];
     op_converter.Convert(op);
   }

-  template <typename T>
-  static void Register(const std::string key) {
-    register_op_converter_[key] = T();
+  static void Register(const std::string key, const OpConverter& v) {
+    register_op_converter_[key] = v;
   }
   static std::unordered_map<std::string, OpConverter> register_op_converter_;

@@ -52,7 +51,7 @@ class OpConverter {
 #define REGISTER_TRT_OP_CONVETER(op_type, convert_class)                \
   class convert_class : public OpConverter {                            \
    public:                                                              \
-    convert_class() { OpConverter::Register<convert_class>(#op_type); } \
+    convert_class() { OpConverter::Register(#op_type, convert_class()); } \
     void Convert(const framework::OpDesc& op);                          \
   }
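
The link issue mentioned above follows from a general C++ rule: unless it is declared inline (C++17), a non-const static data member that is declared inside a class also needs exactly one out-of-class definition in a source file. A minimal sketch, using a hypothetical ConverterTable class rather than the PR's OpConverter:

```cpp
#include <string>
#include <unordered_map>

// Would normally live in a header (hypothetical converter_table.h).
class ConverterTable {
 public:
  static void Register(const std::string& key, int id) { table_[key] = id; }
  static bool Has(const std::string& key) { return table_.count(key) > 0; }
  static int Get(const std::string& key) { return table_.at(key); }

 private:
  // Declaration only: this line reserves no storage.
  static std::unordered_map<std::string, int> table_;
};

// Would normally live in exactly one .cc file. Without this definition,
// code that uses table_ compiles but fails at link time with an
// undefined reference.
std::unordered_map<std::string, int> ConverterTable::table_;
```

Calling ConverterTable::Register("mul", 1) and then ConverterTable::Get("mul") links and runs only because of the definition on the last line.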

Superjomn · 2018-04-26T01:49:27Z

paddle/fluid/inference/tensorrt/convert/convert.h

+
+  template <typename T>
+  static void Register(const std::string key) {
+    register_op_converter_[key] = T();


registered_op_converter seems better.

Superjomn · 2018-04-27T08:11:11Z

paddle/fluid/inference/tensorrt/convert/op_converter.h

+  std::unordered_map<std::string, OpConverter*> converters_;
+
+  // fluid inference scope
+  framework::Scope* scope_;


framework::Scope* scope_{nullptr};
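
A small sketch of what the suggested default member initializer buys, assuming a stand-in Scope type and a hypothetical ConverterState class: the pointer has a defined value right after construction, so it can be tested safely before a scope is attached.

```cpp
// Stand-in for framework::Scope.
struct Scope {};

// Hypothetical holder class, used only to illustrate the initializer.
class ConverterState {
 public:
  bool HasScope() const { return scope_ != nullptr; }
  void SetScope(Scope* scope) { scope_ = scope; }

 private:
  // Default member initializer: every constructor starts from nullptr
  // unless it explicitly initializes scope_ to something else.
  Scope* scope_{nullptr};
};
```

Without the initializer, a default-constructed object would hold an indeterminate pointer, and a check such as HasScope() would read an uninitialized value.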

Superjomn · 2018-05-03T13:11:06Z

paddle/fluid/inference/tensorrt/convert/op_converter.h

+/*
+ * Convert Op from Fluid to TensorRT Engine.
+ */
+class OpConverter {


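You can refer to the approach in my PR and separate the singleton from the interface. 王叔 previously pointed out that three design patterns are mixed together here.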

This can be refactored later, once utils/singleton.h from the convert_io PR is merged.
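
One way the separation could look, as a rough sketch only. The contents of utils/singleton.h are not shown in this thread, so the Singleton template below is a hypothetical stand-in, and ConverterRegistry and ReluConverter are illustrative names.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical generic holder; the actual utils/singleton.h may differ.
template <typename T>
struct Singleton {
  static T& Global() {
    static T instance;  // constructed once, on first use
    return instance;
  }
  Singleton() = delete;
};

// Pure interface: knows nothing about registration or global state.
class OpConverter {
 public:
  virtual ~OpConverter() = default;
  virtual std::string Name() const = 0;
};

// Plain registry class; it becomes global only through Singleton<>.
class ConverterRegistry {
 public:
  void Register(const std::string& op_type, OpConverter* converter) {
    converters_[op_type] = converter;
  }
  OpConverter* Lookup(const std::string& op_type) const {
    auto it = converters_.find(op_type);
    return it == converters_.end() ? nullptr : it->second;
  }

 private:
  std::unordered_map<std::string, OpConverter*> converters_;  // not owned
};

class ReluConverter : public OpConverter {
 public:
  std::string Name() const override { return "relu"; }
};
```

Code that needs the global table would go through Singleton<ConverterRegistry>::Global(), while tests can still create a local ConverterRegistry without touching global state.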

Superjomn · 2018-05-03T13:12:11Z

paddle/fluid/inference/tensorrt/convert/test_activation_op.cc

+#include "paddle/fluid/platform/device_context.h"
+#include "paddle/fluid/platform/place.h"
+
+USE_OP(relu);


Move this to the bottom of the file, since readers needn't care about it.

If we move it to the bottom of the file, it seems to have no effect, and the unit test fails.

Superjomn · 2018-05-03T13:12:24Z

paddle/fluid/inference/tensorrt/convert/test_activation_op.cc

+namespace inference {
+namespace tensorrt {
+
+void compare(float input, float expect) {


Superjomn · 2018-05-03T13:16:29Z

paddle/fluid/inference/tensorrt/engine.cc

+  PADDLE_ENFORCE_EQ(0, buffer_sizes_.count(name), "duplicate output name %s",
+                    name);
+
+  auto* output = TensorRTEngine::GetITensor(name);


Can all the outputs of a TensorRT layer be retrieved by name? Is this interface necessary given that there is already a DeclareOutput?

Can all the outputs of a TensorRT layer be retrieved by name?

Yes, every TensorRT layer puts its output into itensor_map_. The reason is that other TensorRT layers can then get their inputs from itensor_map_ directly.

For example:
https://github.com/luotao1/Paddle/blob/beb1245560b26fd198c3bdd7063334ad933f2d89/paddle/fluid/inference/tensorrt/convert/activation_op.cc#L32

engine_->SetITensor(op.Output("Out")[0], layer->getOutput(0));

Is this interface necessary given that there is already a DeclareOutput?

Yes, with it we can simply call engine->DeclareOutput("Out") to declare an output. Maybe this function could have a different name.

Superjomn · 2018-05-03T13:17:09Z

paddle/fluid/inference/tensorrt/engine.h

+  // Fill an ITensor into map itensor_map_.
+  void SetITensor(const std::string& name, nvinfer1::ITensor* tensor);
+  // Get an ITensor called name.
+  nvinfer1::ITensor* GetITensor(const std::string& name);


Is this interface necessary?

If we don't have the GetITensor interface, we would have to call auto tensor = itensor_map_[name] directly to get the itensor, which has two disadvantages:

Each time, we would first have to call PADDLE_ENFORCE(itensor_map_.count(name), "no itensor %s", name);.

itensor_map_ would have to be public.
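
A minimal sketch of that argument, assuming a stand-in ITensor type so that it builds without TensorRT, a hypothetical Engine class, and a thrown exception in place of PADDLE_ENFORCE. One more reason for the accessor: operator[] on a missing key silently inserts a null entry, whereas a checked lookup reports the error.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Stand-in for nvinfer1::ITensor.
struct ITensor {};

// Hypothetical engine holding only the name-to-tensor map.
class Engine {
 public:
  // Records the tensor produced by a layer under the given name.
  void SetITensor(const std::string& name, ITensor* tensor) {
    if (tensor == nullptr) throw std::invalid_argument("null itensor " + name);
    itensor_map_[name] = tensor;
  }

  // Checked lookup: the existence test lives here instead of at every
  // call site. The throw stands in for PADDLE_ENFORCE.
  ITensor* GetITensor(const std::string& name) const {
    auto it = itensor_map_.find(name);
    if (it == itensor_map_.end()) throw std::out_of_range("no itensor " + name);
    return it->second;
  }

 private:
  std::unordered_map<std::string, ITensor*> itensor_map_;  // stays private
};
```

A converter would then call engine.GetITensor("X") for its input and engine.SetITensor("Out", output) for its result, without ever seeing the map.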

Superjomn

LGTM

tensorrt convert init

42febfa

luotao1 requested a review from Superjomn April 23, 2018 13:33

Superjomn reviewed Apr 24, 2018

Merge branch 'develop' into tr_convert_init

48473dd

Superjomn reviewed Apr 25, 2018

auto registray op converters

d599de5

Superjomn reviewed Apr 25, 2018

use template to do registry

c4e3010

Superjomn reviewed Apr 26, 2018

Merge branch 'develop' into tr_convert_init

326221a

luotao1 added the 预测 label (Inference; formerly named Inference, covers C-API inference issues, etc.) Apr 27, 2018

update the register method

6f6f330

Superjomn reviewed Apr 27, 2018

luotao1 added 2 commits May 2, 2018 14:22

Merge branch 'develop' into tr_convert_init

9945265

add relu converter and unit-test

beb1245

Superjomn reviewed May 3, 2018

Superjomn approved these changes May 4, 2018

luotao1 merged commit 4646c0f into PaddlePaddle:develop May 4, 2018

luotao1 deleted the tr_convert_init branch May 7, 2018 07:16

luotao1 mentioned this pull request May 7, 2018

refine io_convert and op_convert #10461

Merged

luotao1 mentioned this pull request May 22, 2018

Relu op TRT Converter #10630

Closed
