You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The fix for #618 (7243989) introduced DUPN to the compilation of reduce f as $x (init; update) and foreach f as $x (init; update; extract) to address extra references being held.
However, the value that DUPN dupes (that is, . as it is passed to f) comes from before a forkpoint, which makes DUPN unsafe here; if we backtrack to the FORK (in particular, if init backtracks), then f will see . mutated to null. This is a problem, as to my understanding the mutation in DUPN (and the other -N instructions) should be non-observable optimizations.
Put another way, it should be the same as e.g. [] | . as $dot | reduce ($dot | .[]) as $x (0, 0; .), which is a workaround today.
Environment
OS: Linux (NixOS 25.05)
jq version: tested against current master (588ff18) and 1.7.1
Additional context
I noticed this issue while inspecting the bytecode, which gives some insights towards addressing it, which I've noted below.
From digging in thread history, it seems that @leonid-s-usov noticed the same issues with the bytecode while looking for a way to remove the DUPN instruction for another reason. They give a draft of potential changes in #2059, which from my testing would be sufficient to fix this bug.
foreach
This shouldn't be an issue with foreach in the first place, and should be fixable without having to compromise. Inspecting the bytecode for the example above:
Notice that the FORK 0031 backtracks directly to a BACKTRACK, which is unnecessary, as backtracks in the loop body will already backtrack properly, and we don't need to consume the stream like we do in reduce. I suspect that this is only here because 5a863bf copied gen_reduce and modified it a bit.
If the FORK ... BACKTRACK (and JUMP) structure is removed, then DUPN isn't necessary anymore; if we don't need to backtrack, then . isn't held across a forkpoint and stack_pop_will_free, meaning no unnecessary references. If we do need to backtrack, then we needed to keep the reference anyway.
reduce
reduce can't avoid the FORK ... BACKTRACK. A STOREV $dot ... FORK ... LOADVN $dot would serve a similar purpose to DUPN for avoiding unnecessary references while not causing backtracking issues; that's the only potential fix I can think of while staying within the constraints of the current instruction set.
Footnotes
This is the output from local testing of a potential fix (removing FORK from foreach, using STOREV/LOADVN instead of DUPN in reduce) ↩
The text was updated successfully, but these errors were encountered:
The fix for #618 (7243989) introduced
DUPN
to the compilation ofreduce f as $x (init; update)
andforeach f as $x (init; update; extract)
to address extra references being held.However, the value that
DUPN
dupes (that is,.
as it is passed tof
) comes from before a forkpoint, which makesDUPN
unsafe here; if we backtrack to theFORK
(in particular, ifinit
backtracks), thenf
will see.
mutated tonull
. This is a problem, as to my understanding the mutation inDUPN
(and the other-N
instructions) should be non-observable optimizations.To reproduce
The init contains a
,
, so.[]
gets[]
on the first pass, butnull
when backtracking.Expected behavior
reduce
andforeach
should not break backtracks that run back through the body. That is to say, we should see:1
Put another way, it should be the same as e.g.
[] | . as $dot | reduce ($dot | .[]) as $x (0, 0; .)
, which is a workaround today.Environment
jq
version: tested against currentmaster
(588ff18) and 1.7.1Additional context
I noticed this issue while inspecting the bytecode, which gives some insights towards addressing it, which I've noted below.
From digging in thread history, it seems that @leonid-s-usov noticed the same issues with the bytecode while looking for a way to remove the
DUPN
instruction for another reason. They give a draft of potential changes in #2059, which from my testing would be sufficient to fix this bug.foreach
This shouldn't be an issue with
foreach
in the first place, and should be fixable without having to compromise. Inspecting the bytecode for the example above:Notice that the
FORK 0031
backtracks directly to aBACKTRACK
, which is unnecessary, as backtracks in the loop body will already backtrack properly, and we don't need to consume the stream like we do inreduce
. I suspect that this is only here because 5a863bf copiedgen_reduce
and modified it a bit.If the
FORK ... BACKTRACK
(andJUMP
) structure is removed, thenDUPN
isn't necessary anymore; if we don't need to backtrack, then.
isn't held across a forkpoint andstack_pop_will_free
, meaning no unnecessary references. If we do need to backtrack, then we needed to keep the reference anyway.reduce
reduce
can't avoid theFORK ... BACKTRACK
. ASTOREV $dot ... FORK ... LOADVN $dot
would serve a similar purpose toDUPN
for avoiding unnecessary references while not causing backtracking issues; that's the only potential fix I can think of while staying within the constraints of the current instruction set.Footnotes
This is the output from local testing of a potential fix (removing
FORK
fromforeach
, usingSTOREV
/LOADVN
instead ofDUPN
inreduce
) ↩The text was updated successfully, but these errors were encountered: