[Hackthon_4th 244] Added Paddle Lite GPU Backend #1907

Merged: 10 commits into PaddlePaddle:develop from ph4_244_lite_gpu_backend, May 12, 2023

unseenme (Contributor) commented May 6, 2023

PR types

Backend

Description

This PR enables Android GPU (OpenCL) inference through the Paddle Lite backend.

config_.set_model_file(runtime_option.model_file);
config_.set_param_file(runtime_option.params_file);
if (runtime_option.device == Device::GPU) {
config_.set_model_dir(runtime_option.model_file);
Collaborator

FD needs to use the model_file and param_file form, so the original logic must be kept here. You can verify with the PaddleClas model provided by FD:

Collaborator

Please provide a description that verifies the results are correct, for example by running a classification or object detection model already integrated in FD with OpenCL enabled and showing that it produces correct results. For object detection, you can post the visualized output.

Collaborator

Also, the Lite library that FD provides by default does not support OpenCL, so cmake/paddlelite.cmake needs to be modified to provide an OpenCL build of the Lite full_publish library. We have already precompiled armv7 and armv8 Android libraries; please modify cmake/paddlelite.cmake to download the matching library according to the ABI and WITH_OPENCL.

# Android
https://bj.bcebos.com/fastdeploy/third_libs/lite-android-arm64-v8a-fp16-opencl-0.0.0.ab000121e.tgz
https://bj.bcebos.com/fastdeploy/third_libs/lite-android-armeabi-v7a-opencl-0.0.0.ab000121e.tgz
  • WITH_OPENCL needs to be added to FastDeploy.cmake.in

Contributor Author

Changed how the model is loaded.
Added downloading of the precompiled libraries.
Added the build switch.

Contributor Author

Now that the model loading has changed, verification has to be rerun with the new model. Verification is in progress.

Contributor Author

Correctness verification has been provided.

if(WITH_OPENCL)
  if(NOT ANDROID OR NOT ENABLE_LITE_BACKEND)
    message(FATAL_ERROR "Cannot enable OpenCL while compiling unless on Android with the Paddle Lite backend enabled.")
    set(WITH_GPU OFF)
Collaborator

This line is redundant; FATAL_ERROR already stops the build right there.

Contributor Author

Removed the set calls after FATAL_ERROR in three places.

@@ -49,6 +49,19 @@ void LiteBackend::ConfigureCpu(const LiteBackendOption& option) {
config_.set_valid_places(GetPlacesForCpu(option));
}

void LiteBackend::ConfigureGpu(const LiteBackendOption& option) {
config_.set_valid_places(std::vector<paddle::lite_api::Place>({
paddle::lite_api::Place{TARGET(kOpenCL), PRECISION(kFP16), DATALAYOUT(kImageDefault)},
Collaborator

Whether to add the kFP16-related settings here should depend on option.enable_fp16. FD uniformly defaults to FP32 inference, on both CPU and GPU.

Collaborator

For example:

if (option.enable_fp16) {
  valid_places.emplace_back(
      Place{TARGET(kOpenCL), PRECISION(kFP16), DATALAYOUT(kImageDefault)});
  valid_places.emplace_back(
      Place{TARGET(kOpenCL), PRECISION(kFP16), DATALAYOUT(kImageFolder)});
  config_.set_opencl_precision(CL_PRECISION_FP16);
}
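
To make the FP32-by-default behaviour concrete, here is a minimal, self-contained sketch of how the valid-place list could be assembled. The `Target`, `Precision`, `Layout`, and `Place` types and the `BuildOpenCLPlaces` and `HasFp16Place` helpers are stand-ins invented for this illustration, not the Paddle Lite API; the real code would use `paddle::lite_api::Place` with the `TARGET`/`PRECISION`/`DATALAYOUT` macros. The particular FP32 places and the ARM CPU fallback listed here are assumptions, not taken from this PR.

```cpp
#include <cassert>
#include <vector>

// Stand-in types for illustration only; the real backend uses
// paddle::lite_api::Place and its TARGET/PRECISION/DATALAYOUT macros.
enum class Target { kOpenCL, kARM };
enum class Precision { kFP16, kFloat };
enum class Layout { kImageDefault, kImageFolder, kNCHW };

struct Place {
  Target target;
  Precision precision;
  Layout layout;
};

// Hypothetical helper: FP16 OpenCL places are added only on request, so the
// default remains FP32 inference.
std::vector<Place> BuildOpenCLPlaces(bool enable_fp16) {
  std::vector<Place> places;
  if (enable_fp16) {
    places.push_back({Target::kOpenCL, Precision::kFP16, Layout::kImageDefault});
    places.push_back({Target::kOpenCL, Precision::kFP16, Layout::kImageFolder});
  }
  // Assumed FP32 OpenCL places, always present.
  places.push_back({Target::kOpenCL, Precision::kFloat, Layout::kImageDefault});
  places.push_back({Target::kOpenCL, Precision::kFloat, Layout::kNCHW});
  // Assumed CPU fallback for operators without an OpenCL kernel.
  places.push_back({Target::kARM, Precision::kFloat, Layout::kNCHW});
  assert(!places.empty());
  return places;
}

// Returns true if any place in the list uses FP16 precision.
bool HasFp16Place(const std::vector<Place>& places) {
  for (const Place& p : places) {
    if (p.precision == Precision::kFP16) return true;
  }
  return false;
}
```

With this shape, `BuildOpenCLPlaces(false)` contains no FP16 entries, and passing `true` only prepends the two FP16 image places ahead of the FP32 ones.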

Contributor Author

Added the fp16 check.

DefTruth (Collaborator) commented May 8, 2023

@unseenme Thanks for your contribution. A few places need to be revised and completed soon before this can be merged.

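unseenme (Contributor Author) commented May 8, 2023

> @unseenme Thanks for your contribution. A few places need to be revised and completed soon before this can be merged.

@DefTruth Thanks for the review. I will revise and complete it as soon as possible.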

CLAassistant commented May 9, 2023

CLA assistant check
All committers have signed the CLA.

unseenme force-pushed the ph4_244_lite_gpu_backend branch from 50591cb to ed0ff1c on May 9, 2023 13:30
DefTruth (Collaborator)

LGTM~ @unseenme Please also provide build instructions and test results in a comment on this PR.

valid_gpu_backends = {Backend::ORT, Backend::PDINFER, Backend::TRT};
}
valid_gpu_backends = {Backend::ORT, Backend::PDINFER, Backend::TRT,
Backend::LITE};
Collaborator

No need to modify UIE for now.

Contributor Author

The UIE-related changes have been reverted.

unseenme force-pushed the ph4_244_lite_gpu_backend branch from 15c5bbd to 9a94dc4 on May 11, 2023 12:18
unseenme (Contributor Author) commented May 11, 2023

Build instructions

When the build option WITH_OPENCL is set to ON, FastDeploy supports the Paddle Lite GPU backend.
Example:

cmake -DCMAKE_TOOLCHAIN_FILE=${TOOLCHAIN_FILE} -DCMAKE_BUILD_TYPE=MinSizeRel -DANDROID_ABI=${ANDROID_ABI} -DANDROID_NDK=${ANDROID_NDK} -DANDROID_PLATFORM=${ANDROID_PLATFORM} -DANDROID_STL=${ANDROID_STL} -DANDROID_TOOLCHAIN=${ANDROID_TOOLCHAIN} -DENABLE_LITE_BACKEND=ON -DENABLE_VISION=ON -DCMAKE_INSTALL_PREFIX=${FASDEPLOY_INSTALL_DIR} -Wno-dev -DWITH_OPENCL=ON ../../..

Test results

The CPU and GPU results are identical.

Android CPU

nio:/data/local/tmp/2023/2023 $ ./infer_demo ResNet50_vd_infer ILSVRC2012_val_00000010.jpeg 3
[FastDeploy][INFO] fastdeploy/runtime/runtime.cc(328)::CreateLiteBackend	Runtime initialized with Backend::PDLITE in Device::CPU.
ClassifyResult(
label_ids: 153, 
scores: 0.686230, 
)

Android GPU

nio:/data/local/tmp/2023/2023 $ ./infer_demo ResNet50_vd_infer ILSVRC2012_val_00000010.jpeg 8                                                                                                             
[FastDeploy][INFO] fastdeploy/runtime/runtime.cc(328)::CreateLiteBackend	Runtime initialized with Backend::PDLITE in Device::GPU.
ClassifyResult(
label_ids: 153, 
scores: 0.686230, 
)
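
One way to make the "CPU and GPU results match" check repeatable is to compare the two outputs programmatically. The sketch below uses a simplified `ClassifyResult` stand-in holding only the two fields printed in the logs above (`label_ids`, `scores`); the `ResultsMatch` helper and its default tolerance are hypothetical and not part of FastDeploy. A tolerance is included on the assumption that scores may differ slightly once FP16 is enabled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-in mirroring the fields printed by infer_demo.
struct ClassifyResult {
  std::vector<int32_t> label_ids;
  std::vector<float> scores;
};

// Hypothetical helper: labels must match exactly, scores within `tol`.
bool ResultsMatch(const ClassifyResult& a, const ClassifyResult& b,
                  float tol = 1e-4f) {
  assert(tol >= 0.0f);
  if (a.label_ids != b.label_ids) return false;
  if (a.scores.size() != b.scores.size()) return false;
  for (std::size_t i = 0; i < a.scores.size(); ++i) {
    if (std::fabs(a.scores[i] - b.scores[i]) > tol) return false;
  }
  return true;
}
```

Fed with the values logged above (label 153, score 0.686230 on both devices), such a check would report a match.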

DefTruth (Collaborator)

LGTM~

@@ -46,7 +46,7 @@ void RuntimeOption::SetEncryptionKey(const std::string& encryption_key) {
}

void RuntimeOption::UseGpu(int gpu_id) {
#ifdef WITH_GPU
#if defined(WITH_GPU) || defined(WITH_OPENCL)
Collaborator

This function needs to be modified so that it also sets the device on the Lite option. With only the change above, the GPU was not actually used when I tested on my test device.

void RuntimeOption::UseGpu(int gpu_id) {
#if defined(WITH_GPU) || defined(WITH_OPENCL)
  device = Device::GPU;
  device_id = gpu_id;  

#if defined(WITH_OPENCL) && defined(ENABLE_LITE_BACKEND)
  paddle_lite_option.device = device;
#endif

#else
  FDWARNING << "The FastDeploy didn't compile with GPU, will force to use CPU."
            << std::endl;
  device = Device::CPU;
#endif
}
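
The effect of this suggestion can be seen in a self-contained sketch. `Device`, `LiteBackendOption`, and `RuntimeOption` below are reduced stand-ins with only the members needed here, and the two macros are defined inline purely so the example compiles on its own (in FastDeploy they are assumed to come from the build configuration). The point, per the review comment, is that selecting the GPU on `RuntimeOption` alone is not enough: the Lite option carries its own `device` field, which also has to be set.

```cpp
#include <cassert>
#include <iostream>

// Defined here only so the sketch is self-contained; assumed to come from
// the build system in the real project.
#define WITH_OPENCL
#define ENABLE_LITE_BACKEND

enum class Device { CPU, GPU };

// Reduced stand-in for the Lite backend option.
struct LiteBackendOption {
  Device device = Device::CPU;
};

// Reduced stand-in for RuntimeOption.
struct RuntimeOption {
  Device device = Device::CPU;
  int device_id = 0;
  LiteBackendOption paddle_lite_option;

  void UseGpu(int gpu_id = 0) {
#if defined(WITH_GPU) || defined(WITH_OPENCL)
    device = Device::GPU;
    device_id = gpu_id;
#if defined(WITH_OPENCL) && defined(ENABLE_LITE_BACKEND)
    // Propagate the choice so the Lite backend option also holds Device::GPU.
    paddle_lite_option.device = device;
#endif
#else
    std::cerr << "Not compiled with GPU support; falling back to CPU.\n";
    device = Device::CPU;
#endif
  }
};
```

After `opt.UseGpu(0)` on a default-constructed `RuntimeOption opt`, both `opt.device` and `opt.paddle_lite_option.device` are `Device::GPU`; without the inner `#if` block the second would stay `Device::CPU`.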

DefTruth merged commit 3fd21c9 into PaddlePaddle:develop on May 12, 2023
DefTruth added a commit that referenced this pull request on May 18, 2023
* [Hackthon_4th 244] Added Paddle Lite GPU Backend (#1907)

* [improved] enum; ConfigureGpu();

* [improved] init()

* [improved] valid place; model dir; is valid;

* [added] WITH_OPENCL in cmake

* [improved] set model; valid place; cmake url; cmake option;

* Update runtime_option.cc

---------

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>

* [Bug Fix] Fix get_models bug (#1943)

* Add paddleclas models

* Update README_CN

* Update README_CN

* Update README

* update get_model.sh

* update get_models.sh

* update paddleseg models

* update paddle_seg models

* update paddle_seg models

* modified test resources

* update benchmark_gpu_trt.sh

* add paddle detection

* add paddledetection to benchmark

* modified benchmark cmakelists

* update benchmark scripts

* modified benchmark function calling

* modified paddledetection documents

* upadte getmodels.sh

* add PaddleDetectonModel

* reset examples/paddledetection

* resolve conflict

* update pybind

* resolve conflict

* fix bug

* delete debug mode

* update checkarch log

* update trt inputs example

* Update README.md

* add ppocr_v4

* update ppocr_v4

* update ocr_v4

* update ocr_v4

* update ocr_v4

* update ocr_v4

* update get_models.sh

---------

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>

* [benchmark] Add lite opencl gpu option support (#1944)

[benchmark] add lite opencl gpu option support

* [cmake] Support custom paddle inference url (#1939)

* [cmake] Support custom paddle inference url

* [Python] Add custom Paddle Inference URL support for python

* [Docker] Add fd serving Dockerfile for paddle2.4.2

* [Docker] Add fd serving Dockerfile for paddle2.4.2

* [Docker] Add fd serving Dockerfile for paddle2.4.2

* [Docker] Add fd serving Dockerfile for paddle2.4.2

* [Bug Fix] fixed result format string error

* rerunning the re-touch CIs

* rerunning CIs

* [XPU] Add gm_default_size -> Backend::LITE (#1934)

* add gm_default_size

* add gm_default_size

---------

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>

* [Bug Fix] Fix speech and silence state transition in VAD (#1937)

* Fix speech and silence state transition

* Fix typo

* Fix speech and silence state transition

---------

Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>

* [benchmark] support lite light api & optimize benchmark flags (#1950)

* [benchmark] support lite light api & optimize benchmark flags

* [backend] remove un-used option member

* [backend] remove un-used option member

* [python] support build paddle2onnx code for python & custom url (#1956)

Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>

---------

Co-authored-by: unseenme <41909825+unseenme@users.noreply.github.com>
Co-authored-by: linyangshi <32589678+linyangshi@users.noreply.github.com>
Co-authored-by: linkk08 <124329195+linkk08@users.noreply.github.com>
Co-authored-by: Qianhe Chen <54462604+chenqianhe@users.noreply.github.com>
Co-authored-by: root <root@bjyz-sys-gpu-kongming3.bjyz.baidu.com>
3 participants