Error when using view of Const to calculate view of Duplicated #1956
Comments
@hexaeder is this the code which segfaults in reverse mode? (The error message implies you should use runtime activity, so that seems like the resolution.) However, a segfault is clearly bad, so I want to make sure we fix that. |
Hi! The segfault appears in my actual code, but I wasn't able to reduce it to an MWE. If you're interested, I can try to put together a script which sets up the full objects using my packages and segfaults on the jacobian call, but I'm not sure how easy debugging in there would be. I was mainly posting because the error message said something like "report if you think Enzyme should be able to prove no runtime activity", and I don't see why the MWEs would contain runtime activity... |
Yeah, that would be helpful [it should never segfault]. It is indeed confusing, though, why it would require runtime activity here. |
Unfortunately, I was not able to reproduce the problem with a plain Additional observations:
using Pkg
@assert VERSION == v"1.10.5"
pkg"activate --temp"
pkg"add NetworkDynamics#3e99370, Enzyme, Graphs, DifferentiationInterface"
using NetworkDynamics, Graphs, Enzyme
using Enzyme: Enzyme
using DifferentiationInterface: DifferentiationInterface as DI
# we need to load some test utils from NetworkDynamics
include(joinpath(pkgdir(NetworkDynamics),"test","ComponentLibrary.jl"))
# setup of the system
g = complete_graph(4)
vf = Lib.kuramoto_second()
ef = [Lib.diffusion_odeedge(),
Lib.kuramoto_edge(),
Lib.kuramoto_edge(),
Lib.diffusion_edge_fid(),
Lib.diffusion_odeedge(),
Lib.diffusion_edge_fid()]
nw = Network(g, vf, ef)
x0 = rand(dim(nw))
dx = zeros(dim(nw))
p0 = rand(pdim(nw))
# this is the rhs we want to differentiate
# the last argument is time but it is not used in the system so I use NaN.
nw(dx, x0, p0, NaN)
# fault
backend = DI.AutoEnzyme(mode=Enzyme.set_runtime_activity(Enzyme.Reverse), function_annotation=Enzyme.Duplicated)
DI.jacobian(nw, dx, backend, x0, DI.Constant(p0), DI.Constant(NaN))
|
@gdalle re DI segfault |
In this case, |
Ah, I didn't realize that the segfault only happens in the more complex use case by DI, not by my very simple When using
using Pkg
pkg"activate --temp"
pkg"add Enzyme, DifferentiationInterface"
using Enzyme: Enzyme
using DifferentiationInterface: DifferentiationInterface as DI
struct Functor{RT}
range::RT
end
function (f::Functor)(du, u, p, t)
r = f.range
# r = 1:4 # this literal would work
_du = view(du, r)
_p = view(p, r)
_du .= _p
nothing
end
f = Functor(1:4)
# test normal function call
dx, x, p, t = zeros(4), zeros(4), collect(1.0:4.0), NaN
f(dx, x, p, t)
@assert dx == 1:4
#💣
DI.jacobian(f, dx, DI.AutoEnzyme(mode=Enzyme.set_runtime_activity(Enzyme.Reverse), function_annotation=Enzyme.Duplicated), x, DI.Constant(p), DI.Constant(NaN)) |
Thanks for the smaller MWE! So in the end, this is a tale of two runtime activities. This version errors but the error is pretty self-explanatory: you copied data from
backend_errors = DI.AutoEnzyme(; mode=Enzyme.Reverse)
DI.jacobian(f, dx, backend_errors, x, DI.Constant(p), DI.Constant(NaN))
Meanwhile, this version segfaults:
backend_segfaults = DI.AutoEnzyme(; mode=Enzyme.set_runtime_activity(Enzyme.Reverse))
DI.jacobian(f, dx, backend_segfaults, x, DI.Constant(p), DI.Constant(NaN))
My best guess is that the problem comes from the runtime activity analysis? |
@gdalle can you boil out the DI sugar to something which errs with just Enzyme calls? And paste the stack trace |
Sounds good, I'll try that tomorrow, logging out for the day! |
bump @gdalle |
Here is the failing code without DI dependency, closely mimicking what
@assert VERSION == v"1.10.5"
using Pkg
pkg"activate --temp"
pkg"add Enzyme"
using Enzyme: Enzyme
struct Functor{RT}
range::RT
end
function (f::Functor)(du, u, p, t)
r = f.range
# r = 1:4 # this literal would work
_du = view(du, r)
_p = view(p, r)
_du .= _p
nothing
end
f = Functor(1:4)
# test normal function call
y, x, p, t = zeros(4), zeros(4), collect(1.0:4.0), NaN
f(y, x, p, t)
@assert y == 1:4
# pure enzyme preparation
f!_and_df! = Enzyme.BatchDuplicated(f, ntuple(_ -> Enzyme.make_zero(f), 4))
mode = Enzyme.NoPrimal(Enzyme.set_runtime_activity(Enzyme.Reverse))
y_seeds = ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0])
y_and_dy = Enzyme.BatchDuplicated(y, y_seeds)
x_and_dx = Enzyme.BatchDuplicated(x, ntuple(_ -> Enzyme.make_zero(x), 4))
# crash
Enzyme.autodiff(mode, f!_and_df!, Enzyme.Const, y_and_dy, x_and_dx, Enzyme.Const(p), Enzyme.Const(NaN))
BTW, on 1.11 this results in an AssertionError rather than the segfault, but I guess this is just because Enzyme is not fully compatible with 1.11 yet? |
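As a reference point for the reproducer above, the Jacobian of du with respect to u can be checked without any AD package. The sketch below reuses the Functor from the thread and adds a central finite-difference helper, fd_jacobian, which is a hypothetical name introduced here and not part of Enzyme or DI. Since the functor only copies p into du, the result should be the 4x4 zero matrix, which is presumably what a working reverse-mode call would return as well.

```julia
# Functor as posted in the thread: copies a view of p into a view of du.
struct Functor{RT}
    range::RT
end

function (f::Functor)(du, u, p, t)
    r = f.range
    _du = view(du, r)
    _p = view(p, r)
    _du .= _p
    nothing
end

# Hypothetical helper (not from the thread): central finite-difference
# Jacobian of du with respect to u, assuming length(du) == length(u).
function fd_jacobian(f, u, p, t; h=1e-6)
    n = length(u)
    J = zeros(n, n)
    for j in 1:n
        up = copy(u); up[j] += h      # perturb u[j] upwards
        um = copy(u); um[j] -= h      # perturb u[j] downwards
        dup = zeros(n); f(dup, up, p, t)
        dum = zeros(n); f(dum, um, p, t)
        J[:, j] = (dup .- dum) ./ (2h)
    end
    return J
end

f = Functor(1:4)
u, p = zeros(4), collect(1.0:4.0)
J = fd_jacobian(f, u, p, NaN)
@assert J == zeros(4, 4)  # du does not depend on u at all
```

Finite differences are used here only as an independent sanity check; they say nothing about why Enzyme's activity analysis trips over this code.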
Thanks @hexaeder, sorry this flew under my radar |
Thanks to William's heroic effort to support v1.11, the segfault is now reproducible on up-to-date Julia versions too :D |
This will be fixed on 1.10 with the next jll bump via EnzymeAD/Enzyme#2210 |
I am trying to make the RHS of an ODEProblem Enzyme compatible. My function has the signature (du, u, p, t) and I try to differentiate du with respect to u for constant p and t. I hit the error for some operations which use p in a calculation for du. I am quite new to Enzyme and don't fully understand this error, but on very simple examples it isn't a problem to use Const(p) to calculate Duplicated(du).
I boiled it down to 2 MWEs. The first MWE is closer to my actual code, including loop unrolling. The second MWE seems to error because of the broadcasting but does not need the loop unrolling to fail. I am not sure whether both demonstrate the same or different problems.
Both examples have been created on Julia 1.10.5 and Enzyme 0.13.8. I am aware of set_runtime_activity, which works for forward mode in my actual example but segfaults for reverse mode...
MWE 1
MWE 2
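For contrast with the failing cases, a very simple example of the kind described above, where Const(p) enters the computation of Duplicated(du) without trouble, might look like the following sketch. The function rhs! and all values are hypothetical and not taken from the issue. The call form mirrors the plain Enzyme.autodiff usage elsewhere in this thread, but with a single seed instead of a batch, and it assumes the Enzyme package is installed.

```julia
using Enzyme

# Hypothetical right-hand side (not from the issue): du[i] = p[i] * u[i].
# A plain loop is used instead of views or broadcasting.
function rhs!(du, u, p, t)
    for i in eachindex(du)
        du[i] = p[i] * u[i]
    end
    return nothing
end

u = [1.0, 2.0, 3.0]
p = [10.0, 20.0, 30.0]
du = zeros(3)

seed = [1.0, 0.0, 0.0]   # shadow of du: selects the first row of the Jacobian
su = zeros(3)            # shadow of u: accumulates J' * seed

Enzyme.autodiff(Enzyme.Reverse, rhs!, Enzyme.Const,
                Enzyme.Duplicated(du, seed),
                Enzyme.Duplicated(u, su),
                Enzyme.Const(p),
                Enzyme.Const(0.0))

# The Jacobian of du w.r.t. u is diag(p), so J' * seed is [p[1], 0, 0].
@assert su == [10.0, 0.0, 0.0]
```

Here p is only read inside a scalar loop, so no data is copied from the constant array into the duplicated one through a view or broadcast, which is where the MWEs differ.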