Add cpp trainer lib and demo #10681
@@ -0,0 +1,64 @@
cmake_minimum_required(VERSION 3.0)
For a demo, would it be better to put this under fluid/demo, or under doc/fluid? That way the official website could showcase the demo later.
Done. I put it under fluid/train/demo. It relates to C++ training, so it lives in the train directory for now.

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11")

set(PADDLE_LIB "${PROJECT_SOURCE_DIR}/lib")
What is PROJECT_SOURCE_DIR here? Once inference_dist is installed there is only one directory, so PADDLE_LIB alone is enough.
Done.
set(PADDLE_LIB "${PROJECT_SOURCE_DIR}/lib")
set(MATH_TYPE $ENV{LIB_TYPE} CACHE STRING "Choose the Math library type: openblas mkl")

option(WITH_MKLDNN "Compile PaddlePaddle with MKLDNN" OFF)
Line 10 can be removed for now. MKLDNN does not perform well at the moment; how about adding it later?
I kept it, with the default set to false.
add_executable(demo_trainer demo_trainer.cc)

if(MATH_TYPE STREQUAL "mklml")
include_directories("${PADDLE_LIB}/third_party/install/mklml/include")
Is line 33 needed?
It was needed when I tested it myself.
paddle/fluid/train/test/README.md
Outdated
### step 1. build paddle lib

```
# option MATH_TYPE=mklml
```
Line 5 should also mention openblas.
Lines 5-6 are not needed to build the paddle lib; they are only needed in step 4, so they should be moved there.
Done.
paddle/fluid/train/test/README.md
Outdated
### step 2. copy lib to this dir

```
cp -r /paddle/src/dir/paddle/fluid/train/lib .
```
Step 2 can be removed. Once the library is installed there is no need to copy it; just add the library path in cmake.
Done.
paddle/fluid/train/test/README.md
Outdated

This will generate two files:
- startup_program: used to init all parameters
- main_program: main logic of the network
- Step 3 can only be run after the paddlepaddle whl package is installed. This should be stated here as well.
- Does this step print anything when it finishes? If so, document it so users know the step succeeded.
paddle/fluid/train/test/README.md
Outdated
mkdir build
cd build
cmake ..
make
The example in this step does not show any concrete cmake options. See https://github.com/luotao1/fluid_inference_example#inference-example-project for reference.
Updated.
paddle/fluid/train/test/README.md
Outdated
make
cp ../startup_program .
cp ../main_program .
./demo_trainer
What does this print when it finishes? Please document the output so users know the example ran successfully.
Done.
paddle/fluid/train/test/README.md
Outdated
### step 4. build demo_trainer and run it.

```
mkdir build
```
Please state that mkdir build is run in the current directory; otherwise users will not know where to create it.
Also state that the whole demo directory can be placed anywhere.
Done.
}

}  // namespace train
}  // namespace paddle
Since ReadBinaryFile and the load function are in the paddle/train namespace, shouldn't they live in the paddle code rather than in the demo code?
I plan to clean this up in a single pass later.
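For readers following this thread without the diff open, here is a minimal sketch of what a helper such as ReadBinaryFile typically does: load a serialized program file into a byte string. The name `ReadBinaryFileSketch` and the exception-based error handling are assumptions for illustration, not the code in this PR.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the PR's implementation): reads the whole file at
// `path` as raw bytes and returns them in a std::string.
std::string ReadBinaryFileSketch(const std::string& path) {
  // Binary mode avoids newline translation on platforms that perform it.
  std::ifstream fin(path, std::ios::in | std::ios::binary);
  if (!fin) {
    throw std::runtime_error("cannot open file: " + path);
  }
  // Copy every byte, including embedded '\0' characters, into the string.
  return std::string((std::istreambuf_iterator<char>(fin)),
                     std::istreambuf_iterator<char>());
}
```

A helper like this has no dependency on the demo itself, which is consistent with the suggestion to move it into the library during the later cleanup.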
@@ -0,0 +1,66 @@
Please note that this uses the CPU static-library build. The GPU and dynamic-library builds differ slightly but are similar.
Could we cover that in the later cleanup? The business team currently only needs the CPU part, so we can let them start using it first.
-DWITH_MKL=OFF \
-DWITH_MKLDNN=OFF
make -j8
make -j8 inference_lib_dist
@Superjomn If training uses this too, is inference_lib_dist still a good name? Should it be renamed to fluid_lib_dist?
I plan to clean that up in a single pass later; this demo can be merged as soon as possible.
```
step: 0 loss: 1069.02
step: 1 loss: 1069.02
step: 2 loss: 1069.02
```
WARNING: Logging before InitGoogleLogging() is written to STDERR
W0521 20:55:43.756713 24692 init.cc:84] 'CUDA' is not supported, Please re-compile with WITH_GPU option
W0521 20:55:43.756901 24692 init.cc:100] 'CUDA' is not supported, Please re-compile with WITH_GPU option
step: 0 loss: 58.8651
step: 1 loss: 58.8651
step: 2 loss: 58.8651
This is what I get. Is the loss supposed to be identical at every step?
Yes, because no optimize op is added. A parameter controls this: https://github.com/PaddlePaddle/Paddle/pull/10681/files#diff-7e8d0736b2aff0b2bc699d05f454e0a3R19
This is a requirement from the business side.
With the optimize op added, training converges normally.
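To illustrate why the printed loss stays constant, here is a toy sketch that does not use Paddle's API at all. It fits y = w * x with mean squared error and runs the same loop with and without a parameter update, which stands in for the optimize op. `ToyRegressor`, `RunSteps`, and the learning rate are assumptions made up for this example.

```cpp
#include <cstddef>
#include <vector>

// Toy model y = w * x, unrelated to Paddle; used only to show the effect of
// skipping the parameter update.
struct ToyRegressor {
  double w;

  // Mean squared error over the dataset (assumes xs is non-empty).
  double Loss(const std::vector<double>& xs,
              const std::vector<double>& ys) const {
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
      const double err = w * xs[i] - ys[i];
      sum += err * err;
    }
    return sum / static_cast<double>(xs.size());
  }

  // Derivative of the mean squared error with respect to w.
  double Grad(const std::vector<double>& xs,
              const std::vector<double>& ys) const {
    double sum = 0.0;
    for (std::size_t i = 0; i < xs.size(); ++i) {
      sum += 2.0 * (w * xs[i] - ys[i]) * xs[i];
    }
    return sum / static_cast<double>(xs.size());
  }
};

// Records the loss at each step. When apply_update is false the parameter is
// never changed, so every recorded loss is the same value.
std::vector<double> RunSteps(ToyRegressor model, const std::vector<double>& xs,
                             const std::vector<double>& ys, int steps,
                             bool apply_update, double learning_rate) {
  std::vector<double> losses;
  for (int step = 0; step < steps; ++step) {
    losses.push_back(model.Loss(xs, ys));
    if (apply_update) {
      model.w -= learning_rate * model.Grad(xs, ys);  // plain SGD step
    }
  }
  return losses;
}
```

With xs = {1, 2, 3}, ys = {2, 4, 6}, an initial w of 0 and a learning rate of 0.05, the run without updates repeats one loss value, while the run with updates produces a strictly decreasing sequence. That mirrors the behavior described in the replies above.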

auto loss_var = scope.Var(loss_name);

for (int i = 0; i < 100; ++i) {
Use 10 as the bound for i here; 10 iterations are enough.
LGTM. The demo can be improved further in follow-up work.
task list: #10574