
# 【Hackathon No.95】 #61

Merged 5 commits on Apr 2, 2022.
File: rfcs/Julia/20220322_ChainRules_NeuralPDE.md (149 additions, 0 deletions)
# Paddle + NeuralPDE.jl: Solving the 2D Poisson Equation

| API name | |
|---|---|
| Author | songjhaha |
| Submitted | 2022-03-22 |
| Version | V1.0 |
| Paddle version dependency | develop |
| File name | 20220322_ChainRules_NeuralPDE.md |


# 1. Overview
## 1. Background
[Solving the 2D Poisson equation with PaddlePaddle + Julia](https://github.com/X4Science/INFINITY/issues/1)

[NeuralPDE.jl](https://neuralpde.sciml.ai/dev/) provides a number of neural-network-based algorithms for solving PDEs; its neural network module is built mainly on [Flux.jl](https://github.com/FluxML/Flux.jl).

## 2. Functional Goals

Wrap Paddle neural networks in Julia so that NeuralPDE can solve PDEs on top of Paddle's neural network module. In the [NeuralPDE example](https://github.com/SciML/NeuralPDE.jl#example-solving-2d-poisson-equation-via-physics-informed-neural-networks), the network part can be replaced directly with the wrapped Paddle module, e.g.:

```julia
# Neural network
paddlewrap = PaddleModuleWrapper(paddle_module)
chain = Chain(paddlewrap)

# Initial parameters of Neural network
initθ = Float64.(DiffEqFlux.initial_params(chain))

discretization = PhysicsInformedNN(chain, QuadratureTraining(), init_params = initθ)
```

Review discussion (on `chain = Chain(paddlewrap)`):

> **Reviewer:** If this is treated as one black box, the extra `Chain` probably won't be needed here?
>
> **Author:** This is a bit odd. When I combined PyCallChainRules.jl with NeuralPDE, calling `DiffEqFlux.initial_params(jlwrap)` on the bare wrapper returned an empty array, while wrapping it in a `Chain` yielded the expected parameter array.
>
> **Author:** However, passing `jlwrap` directly to `Optimisers.destructure` does work; I will decide later which form to use.
>
> **Reviewer:** Yes, that will need to be implemented.
>
> **Author:** Since this step only needs to produce a flattened parameter array, I lean toward calling `Optimisers.destructure` directly when implementing the NeuralPDE example.

## 3. Significance

Enable Paddle to serve as a neural network backend for [NeuralPDE.jl](https://neuralpde.sciml.ai/dev/), broadening its applications in AI + scientific computing.

# 2. Current State in Paddle

Paddle can already be used from Julia directly through [PyCall.jl](https://github.com/JuliaPy/PyCall.jl), but there is no wrapper that integrates it with the Julia ecosystem.
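
For illustration, this is what raw PyCall.jl usage looks like (assuming the `paddle` Python package is installed in the PyCall environment):

```julia
using PyCall

paddle = pyimport("paddle")               # import the Python paddle package
x = paddle.ones((2, 3))                   # a paddle Tensor, created from Julia
y = paddle.matmul(x, paddle.ones((3, 2))) # operations also return PyObjects
```

Everything here stays `PyObject`-valued, which is exactly the gap the wrapper designed in section 5 closes.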

# 3. Survey of Existing Solutions

## [PyCallChainRules.jl](https://github.com/rejuvyesh/PyCallChainRules.jl)
[PyCallChainRules.jl](https://github.com/rejuvyesh/PyCallChainRules.jl) wraps torch and jax modules, defines the ChainRules reverse-mode rules for them, and uses [DLPack.jl](https://github.com/pabloferz/DLPack.jl) to share tensor data between Python and Julia without copying, on both CPU and GPU. Its definition of `ChainRulesCore.rrule` is:

```julia
function ChainRulesCore.rrule(wrap::TorchModuleWrapper, args...; kwargs...)
    T = typeof(first(wrap.params))
    params = wrap.params
    pyparams = Tuple(map(x -> DLPack.share(x, PyObject, pyfrom_dlpack).requires_grad_(true), params))
    pyargs = fmap(x -> DLPack.share(x, PyObject, pyfrom_dlpack).requires_grad_(true), args)

    torch_primal, torch_vjpfun = functorch.vjp(py"buffer_implicit"(wrap.torch_stateless_module, wrap.buffers), pyparams, pyargs...; kwargs...)
    project = ProjectTo(args)
    function TorchModuleWrapper_pullback(Δ)
        cΔ = fmap(x -> Adapt.adapt(PyAdaptor{T}(), x), Δ)
        pycΔ = fmap(x -> DLPack.share(x, PyObject, pyfrom_dlpack), cΔ)
        torch_tangent_vals = torch_vjpfun(pycΔ)
        jlparams_tangents = map(x -> DLPack.wrap(x, pyto_dlpack), torch_tangent_vals[1])
        args_tangents = project(fmap(x -> DLPack.wrap(x, pyto_dlpack), torch_tangent_vals[2:end]))
        return (Tangent{TorchModuleWrapper}(; torch_stateless_module = NoTangent(), dtype = NoTangent(), params = jlparams_tangents, buffers = NoTangent()), args_tangents...)
    end
    res = fmap(x -> DLPack.wrap(x, pyto_dlpack), torch_primal)
    return res, TorchModuleWrapper_pullback
end
```

## [Torch.jl](https://github.com/FluxML/Torch.jl)
[Torch.jl](https://github.com/FluxML/Torch.jl) provides an end-to-end wrapper at a finer granularity, covering basic operators, broadcasting, and more.


# 4. Comparative Analysis

- [PyCallChainRules.jl](https://github.com/rejuvyesh/PyCallChainRules.jl) relies on the DLPack protocol to share tensor data between Python and Julia arrays, whereas [Torch.jl](https://github.com/FluxML/Torch.jl) wraps a Julia API around the tensor type. The former approach is simpler to implement, and Paddle also supports the [DLPack protocol](https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/utils/dlpack.py).
- [PyCallChainRules.jl](https://github.com/rejuvyesh/PyCallChainRules.jl) uses [functorch](https://github.com/pytorch/functorch) as a functional interface to torch and implements `ChainRulesCore.rrule`, while [Torch.jl](https://github.com/FluxML/Torch.jl) defines `@adjoint` rules for the relevant operations.

[PyCallChainRules.jl](https://github.com/rejuvyesh/PyCallChainRules.jl) is quicker to implement, though coarser-grained than [Torch.jl](https://github.com/FluxML/Torch.jl). Since the goal of this proposal is only to let wrapped Paddle networks drive NeuralPDE, in principle only `Zygote.gradient` support is needed, so we adopt the PyCallChainRules.jl approach first.

# 5. Design and Implementation

## Naming and API Design

Implement a `PaddleModuleWrapper` type that wraps a Paddle neural network. To make `ChainRulesCore.rrule` easy to implement, we follow the [functorch](https://github.com/pytorch/functorch) pattern and split the network into a `stateless_module` holding the structure and the corresponding parameters `params`:
```julia
struct PaddleModuleWrapper
    NN::PaddleStatelessModule
    dtype::PyObject
    params::Tuple
end
```
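
The RFC does not pin down `PaddleStatelessModule`, so the following is only a minimal sketch of what it could look like for a dense network; the type layout, the (weight, bias) parameter ordering, and the activation handling are all illustrative assumptions:

```julia
using PyCall

const paddle = pyimport("paddle")

# Hypothetical stateless MLP: stores only the topology; parameters are
# passed in explicitly on every call, in the functorch style.
struct PaddleStatelessModule
    activation::PyObject   # e.g. paddle.nn.Tanh()
    num_layers::Int
end

# Forward pass: `params` alternates (weight, bias) per layer; `x` is a
# paddle Tensor that has already been shared via DLPack.
function (m::PaddleStatelessModule)(params, x)
    for i in 1:m.num_layers
        W, b = params[2i-1], params[2i]
        x = paddle.add(paddle.matmul(x, W), b)
        i < m.num_layers && (x = m.activation(x))  # no activation after the last layer
    end
    return x
end
```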

Provide a constructor that directly builds a simple fully connected network, e.g. (a sketch of a possible implementation follows the review note below):
```julia
PaddleModuleWrapper(dim_ins,
                    dim_outs,
                    num_layers,
                    hidden_size,
                    dtype="Float32",
                    activation="tanh")
```

> **Reviewer:** This constructor does not feel strictly necessary, and since it is specialized for dense layers, a more specific name would be better.
>
> **Author:** Thanks for the suggestion; this has been revised.

Implement the forward pass:
```julia
function (wrap::PaddleModuleWrapper)(args...; kwargs...)
```
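
A possible body for this method, continuing the sketch above (`pyfrom_dlpack` is assumed to be bound to `paddle.utils.dlpack.from_dlpack`):

```julia
const pyfrom_dlpack = pyimport("paddle.utils.dlpack").from_dlpack

# Share parameters and inputs with paddle (zero-copy), run the stateless
# forward pass, and hand the resulting tensor back to Julia.
function (wrap::PaddleModuleWrapper)(args...; kwargs...)
    pyparams = map(p -> DLPack.share(p, PyObject, pyfrom_dlpack), wrap.params)
    pyargs = map(a -> DLPack.share(a, PyObject, pyfrom_dlpack), args)
    out = wrap.NN(pyparams, pyargs...; kwargs...)
    return DLPack.wrap(out, pyto_dlpack)
end
```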

Implement a functional `vjp`:
```julia
function vjp(stateless_module::PaddleStatelessModule, pyparams, pyargs...; kwargs...)
```

Implement `ChainRulesCore.rrule`:
```julia
ChainRulesCore.rrule(wrap::PaddleModuleWrapper, args...; kwargs...)
```
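
A sketch of how this rule could be assembled from the `vjp` shown below and the DLPack helpers above. For brevity it only propagates parameter tangents, mirroring that `vjp`; a full implementation would also return input tangents (and `ProjectTo` them) as PyCallChainRules.jl does:

```julia
using ChainRulesCore

function ChainRulesCore.rrule(wrap::PaddleModuleWrapper, args...; kwargs...)
    pyparams = Tuple(map(p -> DLPack.share(p, PyObject, pyfrom_dlpack), wrap.params))
    pyargs = map(a -> DLPack.share(a, PyObject, pyfrom_dlpack), args)
    res, vjp_func = vjp(wrap.NN, pyparams, pyargs...; kwargs...)
    function PaddleModuleWrapper_pullback(Δ)
        pyΔ = DLPack.share(Δ, PyObject, pyfrom_dlpack)
        params_tangents = map(g -> DLPack.wrap(g, pyto_dlpack), vjp_func(pyΔ))
        tangent = Tangent{PaddleModuleWrapper}(; NN = NoTangent(), dtype = NoTangent(),
                                               params = params_tangents)
        return (tangent, map(_ -> NoTangent(), args)...)  # input tangents omitted in this sketch
    end
    return DLPack.wrap(res, pyto_dlpack), PaddleModuleWrapper_pullback
end
```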

## API Implementation Plan

To implement the APIs above:
- Use DLPack.jl to convert between paddle tensors and Julia arrays.
- Implement the computation of `PaddleStatelessModule` with Paddle operators such as `paddle.matmul()`, `paddle.add()`, and `paddle.nn.Sigmoid()()`.
- Implement `vjp` with [`paddle.fluid.dygraph.grad(outputs, inputs, grad_outputs)`](https://github.com/PaddlePaddle/Paddle/blob/d9a41fc479009f75aa976ea18bd759504497796b/python/paddle/fluid/dygraph/base.py#L428), for example:
```julia
function vjp(stateless_module::PaddleStatelessModule, pyparams, pyargs...; kwargs...)
    # Forward pass through the stateless module
    res = stateless_module(pyparams, pyargs...; kwargs...)
    # The closure computes vector-Jacobian products with paddle's autograd
    function vjp_func(Δ)
        grad = paddle.fluid.dygraph.grad([res], [pyparams...], Δ, retain_graph=true)
        return grad
    end
    return res, vjp_func
end
```

# 6. Testing and Acceptance Criteria

> **Reviewer:** Could this section describe the final deliverable more concretely? For example, will it be a mini package like PyCallChainRules, or only demo source code?
> I personally lean toward the former, which would make it easier for others to reproduce and optimize later. 😃
>
> **Author:** Since part of the implementation overlaps with PyCallChainRules.jl, could we consider contributing it as a module inside PyCallChainRules.jl, or finish this task first and treat the integration as follow-up work?
>
> **Reviewer:** Sure. If any PyCallChainRules.jl interface turns out not to be flexible enough, feel free to open a PR and cc me; we can look at it together.

Test cases to consider (a sketch of the gradient-consistency check follows the list):
- CPU and GPU support
- support for `Zygote.gradient`
- gradient results that match those computed with Paddle's native API
- NeuralPDE.jl support: replace the neural network module in the [example](https://github.com/SciML/NeuralPDE.jl#example-solving-2d-poisson-equation-via-physics-informed-neural-networks) with the wrapped Paddle module and verify that the solver still runs
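
As a concrete illustration of the gradient-consistency item, a hypothetical test might look like this, assuming `PaddleModuleWrapper` has been made Functors.jl-compatible so that `Optimisers.destructure` can see its parameters; the reference values on the paddle side would be computed with `paddle.fluid.dygraph.grad`:

```julia
using Test, Zygote, Optimisers

# 2 inputs, 1 output, 3 dense layers of width 16 (hypothetical constructor)
wrap = PaddleModuleWrapper(2, 1, 3, 16)
θ, rebuild = Optimisers.destructure(wrap)   # flat parameter vector + rebuilder
x = randn(Float32, 8, 2)                    # (batch, features); layout details glossed over

loss(p) = sum(rebuild(p)(x))
g = Zygote.gradient(loss, θ)[1]

@test length(g) == length(θ)
# The reference gradient computed with paddle's native autograd on the same
# weights and inputs should agree with `g` to Float32 tolerance.
```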

# 7. Feasibility Analysis and Schedule

The plan follows [PyCallChainRules.jl](https://github.com/rejuvyesh/PyCallChainRules.jl). After the initial implementation, the performance overhead will be profiled and optimizations considered; overall, the work can be completed within the event timeline.

# 8. Impact

A limited wrapper around Paddle inside Julia; no impact on other modules.