2018 04 25

Lei Wang

Add issue to describe the goal of merging all the build related scripts (https://github.com/PaddlePaddle/Paddle/issues/10073)
Add "README" file to describe how to use the new build scripts (https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/scripts/README.md)

wangkuiyi

Build system: https://github.com/PaddlePaddle/Paddle/pull/10038#issuecomment-384045187
Fluid language embedding: https://github.com/PaddlePaddle/Paddle/issues/10152#issuecomment-384084262
reader.BlockingQueue: https://github.com/PaddlePaddle/Paddle/pull/10206#pullrequestreview-115331948
Build script polish: https://github.com/PaddlePaddle/Paddle/issues/10073#issuecomment-384465044

helinwang

Paddle Fluid API design with @cs2be, @abhinavarora and @varunarora
- Fluid API proposal: https://github.com/PaddlePaddle/Paddle/issues/10152 (coauthored with Thuan)
- Word2Vec under the new Fluid API: https://github.com/PaddlePaddle/Paddle/issues/10214
- Fluid data pipeline interface: https://github.com/PaddlePaddle/Paddle/issues/10102
- Related discussion: https://github.com/PaddlePaddle/Paddle/issues/10177
Reviews

tonyyang-svail

Code Clean Up:
- Clean up unused code in operator class (#10035)
- Remove duplicated ShareLoD in gru_op and sequence_conv_op (#10149)
Paddle Fluid API design
- https://github.com/PaddlePaddle/Paddle/issues/10103

weixing

Fluid API bug and dead-link fixes:
- Change paddle.v2.dataset to paddle.dataset for V2 docs: https://github.com/PaddlePaddle/Paddle/pull/10222
- Add dataset to the Fluid API documentation: https://github.com/PaddlePaddle/Paddle/pull/10172
- [WIP] Fix most of the dead links in the Fluid documentation: https://github.com/PaddlePaddle/Paddle/pull/10097
- Update some new APIs: https://github.com/PaddlePaddle/Paddle/pull/10051
PR review:
- https://github.com/PaddlePaddle/Paddle/pull/10171

abhinavarora

CPPLint Progress

| Directory | Develop | 15-Mar |
| --- | --- | --- |
| fluid/framework | 29 | 227 |
| fluid/framework/details | 0 | 5 |
| fluid/inference | 0 | 15 |
| fluid/inference/tensorrt | 14 | N/A |
| fluid/memory | 0 | 2 |
| fluid/operators | 0 | 303 |
| fluid/operators/reader | 0 | 8 |
| fluid/operators/concurrency | 0 | N/A |
| fluid/operators/math | 328 | 369 |
| fluid/operators/detail | 0 | 29 |
| fluid/operators/nccl | 2 | 2 |
| fluid/platform | 0 | 155 |
| fluid/pybind | 0 | 41 |
| fluid/recordio | 0 | 18 |
| fluid/string | 0 | 7 |

Code Cleanup
- Fix more CPPLint errors https://github.com/PaddlePaddle/Paddle/pull/10218
- Fix CPPLint errors with framework/executor https://github.com/PaddlePaddle/Paddle/pull/10212
- Fix CPPLint errors with framework/op_desc https://github.com/PaddlePaddle/Paddle/pull/10181
- Fix CPPLint issues in framework/data_transform framework/prune.cc https://github.com/PaddlePaddle/Paddle/pull/10178
- Fix CPPLint issues in init.cc, init.h and library_type.h https://github.com/PaddlePaddle/Paddle/pull/10148
- Fix CPPLint issues in framework/data_type.h and framework/feed_fetch_type.h https://github.com/PaddlePaddle/Paddle/pull/10146
- Fix CPPLint issues in tensor_util_test https://github.com/PaddlePaddle/Paddle/pull/10111
- Fix CPPLint errors in framework/details https://github.com/PaddlePaddle/Paddle/pull/10104
- Fix CPPLint issues in fluid/inference https://github.com/PaddlePaddle/Paddle/pull/10075
- Fix CPPLint issues with select_op https://github.com/PaddlePaddle/Paddle/pull/10072
- Fix more CPPLint errors https://github.com/PaddlePaddle/Paddle/pull/10069
- Fix CPPLint issues in some tests in fluid/framework https://github.com/PaddlePaddle/Paddle/pull/10068
Fluid API V4
- Participate in discussions with Helin and Thuan on V4 API design
- https://github.com/PaddlePaddle/Paddle/issues/10152#issuecomment-384030871
- Recognize Digits example with new API https://github.com/PaddlePaddle/Paddle/issues/10215
PR Reviews

Chenxi

Worked on the AWS training issue, with help from Yanxu: https://github.com/PaddlePaddle/Paddle/issues/10106
AWS tool to quickly switch an instance on/off for external users: https://github.com/putcn/aws_instance_switch
Improved the AWS tool doc: https://github.com/PaddlePaddle/Paddle/pull/10182

kexinzhao

float16 inference:
- Add float16 inference transpiler, fix a bug in prune method, and add image classification float16 inference example: https://github.com/PaddlePaddle/Paddle/pull/10109
- Add float16 inference design doc: https://github.com/PaddlePaddle/Paddle/pull/10210
- [WIP] float16 inference experiment report
PR review:

Qingsheng Li

Added kernel to beam_search_op:
- https://github.com/PaddlePaddle/Paddle/pull/10052
Adding kernel to beam_search_decode_op

Xin Pan

Debug SE-ResNeXt layer by layer
- non-deterministic elementwise-grad-op: https://github.com/PaddlePaddle/Paddle/issues/10122
- non-deterministic conv2d-grad, batch-norm-grad
Research and outline the technical difficulties of full ONNX support
- https://github.com/PaddlePaddle/Paddle/issues/10115
Debug the transformer model crash in the cuda-8-cudnn5 docker image
Follow up on the new API design

luotao

inference:
- tensorrt design doc: https://github.com/PaddlePaddle/Paddle/issues/10028
- refine tensorrt cmake and dockerfile: https://github.com/PaddlePaddle/Paddle/pull/10134
- tensorrt convert init: https://github.com/PaddlePaddle/Paddle/pull/10144
fix a bug in test_batch_norm_op.py: https://github.com/PaddlePaddle/Paddle/pull/10094
fix a cpu bug in parallel_executor.py: https://github.com/PaddlePaddle/Paddle/pull/10141
code review:
- fea/init tensorrt engine: https://github.com/PaddlePaddle/Paddle/pull/10003#pullrequestreview-113556181
- [merge] multiplication operator for MKLDNN: https://github.com/PaddlePaddle/Paddle/pull/9949
- MKLDNN implementation of batch normalization: https://github.com/PaddlePaddle/Paddle/pull/9904

wuyi

dist train accuracy/perf data updates: https://docs.google.com/spreadsheets/d/1D5Xc_TfGfMV5aKh4ZJS_b4js3Mnn06H1Po0iuECZLr4/edit#gid=0
Multi GPU dist train:
- https://github.com/PaddlePaddle/Paddle/pull/10143
- https://github.com/PaddlePaddle/Paddle/pull/10126
[WIP] some NCCL2 dist prototype: https://github.com/typhoonzero/nccl_rdma_demo
Reviews and discussions of async dist training

Baiyifan

Get familiar with the op development process, profiling, and timeline
Optimize iou_similarity_op cuda kernel:
- https://github.com/PaddlePaddle/Paddle/pull/10224

tangwei

dist train accuracy https://github.com/seiriosPlus/fluid_benchmark/tree/master/image_classification
MPI-enabled https://github.com/seiriosPlus/mpi_enabled

fengjiayi

Add synchronous TensorCopy:
- https://github.com/PaddlePaddle/Paddle/pull/10142
Fix Clang compile errors:
- https://github.com/PaddlePaddle/Paddle/pull/10165
BlockingQueue for readers
- https://github.com/PaddlePaddle/Paddle/pull/10206
Reviews:
- https://github.com/PaddlePaddle/Paddle/pull/10166

Yu Yang

Fix a critical bug in the dynamic loader (dynload)
- We use dlsym to extract function pointers from shared libraries (in the dynload namespace). We cast each pointer to a type that exactly matches the arguments at the call site, not to the actual function type declared in the header.
  - For example, if we pass (int, int) to a function declared as void(int64_t, int64_t), we cast the function symbol to void(*)(int, int) rather than void(*)(int64_t, int64_t). This causes bugs on platforms where sizeof(int) != sizeof(int64_t).
- https://github.com/PaddlePaddle/Paddle/pull/10191
- https://github.com/PaddlePaddle/Paddle/pull/10189
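The dynamic-loader pitfall described above can be sketched in a few lines. This is a minimal illustration, not the code from the linked PRs: `scaled_sum`, `lookup_symbol`, and `call_scaled_sum` are hypothetical names, and `lookup_symbol` merely stands in for a real `dlsym` call so the sketch stays self-contained. The idea is to take the pointer type from the declaration with `decltype` instead of deriving it from the call-site arguments, so that `int` arguments are converted to `int64_t` before the call.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical library function, standing in for a symbol that is declared
// in a vendor header and normally resolved at run time with dlsym.
int64_t scaled_sum(int64_t a, int64_t b) { return a * 1000000000LL + b; }

// Stand-in for dlsym(handle, "scaled_sum"): the loader only ever sees an
// untyped address.
void* lookup_symbol() { return reinterpret_cast<void*>(&scaled_sum); }

// Buggy pattern (shown as a comment only): deriving the pointer type from
// the call-site arguments, e.g. for two ints
//   using Wrong = int64_t (*)(int, int);
// which does not match the real signature.

// Safer pattern: take the pointer type from the declaration, so arguments
// are converted to the declared parameter types at the call.
template <typename... Args>
int64_t call_scaled_sum(Args&&... args) {
  using FuncPtr = decltype(&scaled_sum);
  static_assert(std::is_same<FuncPtr, int64_t (*)(int64_t, int64_t)>::value,
                "pointer type must match the declared signature");
  void* sym = lookup_symbol();
  assert(sym != nullptr);
  auto fn = reinterpret_cast<FuncPtr>(sym);
  return fn(std::forward<Args>(args)...);
}
```

Calling `call_scaled_sum(2, 3)` with plain `int` arguments goes through the `int64_t` signature, because the conversion happens at the call through the correctly typed pointer.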
Found a critical bug in the GPU memory allocator and memcpy
- We found that the stream cannot be synchronized if cudaMemcpyAsync is invoked on CPU memory that was allocated by malloc rather than cudaMallocHost. We suggest using cudaMallocHost to allocate CPU memory that is used for CPU <--> GPU communication.
- After changing malloc to cudaMallocHost, we found that many memory copies are not synchronized. This is a critical bug for Paddle and a key reason our training process is unstable.
- Currently, we add a cudaMemcpySync API to avoid the bug when feeding/fetching data. Resolving this bug thoroughly will take a week or longer.
Add a demo that uses parallel executor + reader to train and test a program
- https://github.com/PaddlePaddle/Paddle/pull/10166

Liu Yiqun

Inference Framework
- Add flush of program desc to update the proto information
  - [Merged] https://github.com/PaddlePaddle/Paddle/pull/10058
- Build the docker image paddle_manylinux_devel:cuda8.0_cudnn7 and build the latest inference library for colleagues on the image team
- Analyze why the runtime with fraction_of_gpu_memory_to_use=0 is 3~4x that of the default setting (0.92)
- Review
  - Add init interface for customized devices, https://github.com/PaddlePaddle/Paddle/pull/10167
  - init tensorrt engine, https://github.com/PaddlePaddle/Paddle/pull/10003
Mobile
- Review
  - https://github.com/PaddlePaddle/paddle-mobile/pull/140

yangyaming

Refine reader
https://github.com/guoshengCS/transformer-nist/blob/refined_data_reader/transformer/data_util.py
Refine argument naming
https://github.com/PaddlePaddle/Paddle/pull/10223
Tuning Transformer
Speed up inference: 40+m -> 10+m

qiaolongfei

Fluid support for async training
- project: https://github.com/PaddlePaddle/Paddle/projects/61
- task list: https://github.com/PaddlePaddle/Paddle/issues/9941
- Fluid support for async training
  - VariableResponse supports deserializing var into local scope https://github.com/PaddlePaddle/Paddle/pull/10060
  - Refine listen and serve op https://github.com/PaddlePaddle/Paddle/pull/10080
  - Split optimization ops on pserver into independent blocks https://github.com/PaddlePaddle/Paddle/pull/10123
  - [WIP] listen_and_serv_op supports async update https://github.com/PaddlePaddle/Paddle/pull/10042
  - Run test on text_classification of async training
Code cleanup and improvement
- fix build activation_op.cc on mac https://github.com/PaddlePaddle/Paddle/pull/10116

Todo

Do more benchmarks on async training

Yan Xu

lookup remote table
- Lookup table with nonexistent key: https://github.com/PaddlePaddle/Paddle/pull/10164
Confirm text classification model accuracy with distributed training: https://docs.google.com/spreadsheets/d/1D5Xc_TfGfMV5aKh4ZJS_b4js3Mnn06H1Po0iuECZLr4/edit#gid=1478737887
review
- https://github.com/PaddlePaddle/Paddle/pull/10049#pullrequestreview-113483433
- https://github.com/PaddlePaddle/Paddle/pull/10042#discussion_r183947675

dongzhihong

[Speed] Change the Scope variable index from string-hashed to number-hashed
- https://github.com/PaddlePaddle/Paddle/pull/10186
- https://github.com/PaddlePaddle/Paddle/pull/10190
upgrade to cuda9 cudnn 7
- https://github.com/PaddlePaddle/Paddle/pull/10140
Model CE
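- TeamCity: Model CE setup is complete, with 1 master and 2 agents
- Fixed issues with GPU memory statistics errors, sampling granularity, Model CE access, etc.
- Four models have been added (NLP Transformer, image OCR, image_classification, object_detection); their stability is being monitored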

Yibing Liu

Fluid2ONNX converter:

Add unit test framework for operators' conversion
- https://github.com/PaddlePaddle/paddle-onnx/pull/29
Resolve name conflicts, enhance operators, and add dropout, elem_mul, sigmoid ops, etc.
- https://github.com/PaddlePaddle/paddle-onnx/pull/30
Add vgg16 & resnet50 to supported models
- https://github.com/PaddlePaddle/paddle-onnx/pull/32
Add mobilenet & se_resnext to supported models
- https://github.com/PaddlePaddle/paddle-onnx/pull/33
[WIP] Add Inception_v4 config in models/fluid/image_classification
Enable the parallel training of mobilenet
- https://github.com/PaddlePaddle/models/pull/881
Merge design doc for the ONNX converter
- https://github.com/PaddlePaddle/Paddle/pull/9296

guosheng

NMT:
- Transformer code clean and data utility.
- Transformer experiments related.

Yan Chunwei

daming-lu

WIP: create an online VisualDL demo server to give users first-hand experience:
- https://github.com/daming-lu/VisualDL/tree/mock_requests
VisualDL improvement:
Reviewed PRs:

cs2be(thuan)

Paddle
- Imperative Design
  - Paddle API v4 proposal (https://github.com/PaddlePaddle/Paddle/issues/10152)
  - Paddle V4 API - Recognize Digits (https://github.com/PaddlePaddle/Paddle/issues/10215)
- Reviews
  - https://github.com/PaddlePaddle/Paddle/pull/10145
  - https://github.com/PaddlePaddle/Paddle/pull/10146

sidgoyal78

Code cleanup:
Imperative Fluid (With Helin and team):
- https://github.com/PaddlePaddle/Paddle/issues/10214
- https://github.com/PaddlePaddle/Paddle/issues/10216
ONNX: review: https://github.com/PaddlePaddle/paddle-onnx/pull/30
Working with Sharan on sentiment analysis model benchmark (using paddle)

jetfuel(Jeff)

VisualDL
- Update the VisualDL documentation structure on PPO and add new documentation to the website: https://github.com/PaddlePaddle/VisualDL/pull/416
- Update embedding search experience: https://github.com/PaddlePaddle/VisualDL/pull/420
- Only allow one embedding record per run: https://github.com/PaddlePaddle/VisualDL/pull/422
- Update embedding API documentation. Create in-house dimension reduction functions: https://github.com/PaddlePaddle/VisualDL/pull/424
PaddlePaddle.org
- Fix the issue where the doc tool can't generate documentation: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/470
- Update VisualDL doc generating setting to consume the new layout: https://github.com/PaddlePaddle/PaddlePaddle.org/pull/471/files
Reviews and issues

Nicky

Use VisualDL on Paddle Demo
- Image classification demo https://github.com/PaddlePaddle/VisualDL/pull/425
- Step by step tutorial document
  - https://github.com/PaddlePaddle/VisualDL/pull/427
  - https://github.com/PaddlePaddle/VisualDL/pull/428
PRs
