-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make last(empty) empty #1869
Make last(empty) empty #1869
Conversation
It's actually empty now. (Makes last/1 even slower than before, though.)
Here's a
EDIT: Ah, similar to yours. Yours uses an array instead of an object, which is faster. There have been times when I've wondered about making a special |
Actually, that behavior is critical. It's very important that during the execution of the reduction update body, jq not hold on to any additional references to the reduction state, otherwise the memory consumption of things like assignments blow up. |
I agree it should be possible to bytecode a |
Another one like this I noticed today is |
Here's an efficient (I think)
This is efficient because, though it allocates an array at run-time, it only allocates one array, and it mutates it over and over. |
Probably any case where last/1 performance matter is also one that would be better served by just not computing out values just to throw them away. Unlike first, it doesn't even actually avoid the extra work...
A few variants I tried:
Sometimes performance can be p unintuitive... |
(I think what we're seeing is, in order: keeping [0] in a variable isn't actually an improvement over a constant array; array collection is still turns out to be quicker than three function calls; and skipping elements in the VM is quicker than doing the reduction rigmarole.)
If that does in fact help, it seems like it'd be helpful to |
|
Evaluates g too many times |
Yeah, evaluating g only once is essential, otherwise |
And |
Here's an almost correct
I say "almost" because the |
@muhmuhten looking at the disassembly helps a bit:
vs
Stripping out most of the instruction addresses and diff'ing:
|
@nicowilliams wrote:
For the record, the relevant tweak would be:
Personally, I'm more concerned about |
@pkoppstein |
@nicowilliams wrote:
Consider:
Notice that an error was raised. Similarly, |
|
Taking a page from @muhmuhten's
|
Offhand, but this seems to assume that if an evaluation of the
argument produces no results, it'll never produce any more results. Is
this really a reasonable assumption though? e.g.
`repeat(input|numbers)`
…On 3/22/19, Nico Williams ***@***.***> wrote:
Taking a page from @muhmuhten's `last/1`:
```
$ ./jq -cn '
def repeat(e):
label $out |
def _r:
foreach . as $junk ([]; (e|[.]) // break $out; .[0]), _r;
_r;
[limit(10; repeat(empty))]'
[]
$ ./jq -cn '
def repeat(e):
label $out |
def _r:
foreach . as $junk ([]; (e|[.]) // break $out; .[0]), _r;
_r;
[limit(10; repeat(1,2,3))]'
[1,2,3,1,2,3,1,2,3,1]
$ ./jq -cn '
def repeat(e):
label $out |
def _r:
foreach . as $junk ([]; (e|[.]) // break $out; .[0]), _r;
_r;
[limit(10; repeat(1,null,3))]'
[1,null,3,1,null,3,1,null,3,1]
$
```
--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
#1869 (comment)
|
Currently |
A slightly different way to think
Because to date we haven't had any impure expressions that end up being BTW, we still have to do something about C-coded jq functions returning |
The issue is with
I don't think jv_invalid() being a backtrack was a problem, just that Possibly it would be better to have this repeat as a builtin, since it has useful properties and the implementation is nonobvious, but those properties need to be documented since they are a bit surprising. Also, it the init expression of foreach needs to be
|
I agree, but why not write that sort of expression like so: Ah, but the point is this is a breaking change then. OK, fair enough :( |
Another thing like this is that the
But sometimes you want the RHS only if the LHS was The way to jq-code that is remarkably like your
As a function:
("definedor" is the internal name for the The bytecode for this is nowhere near as tight as for |
It seems like "alternative for empty", "alternative for null", and "alternative for falsey things" should be distinct. It's just the first one that's particularly awkward/nonintuitive to implement in jq. I think a bit more work is needed to get definedor right:
|
Actually, wouldn't |
Boxing happens anyways in my version. |
Oops, my |
Using |
last(f) output null for empty (is a bug jqlang/jq#1869) nth($n; f) error on negative $n and pick correct value for short f
This is a kind of unsatisfying patchset, in which I set out to make
last/1
faster, then discovered that it was buggy all along and fixing it made it even slower than it was in the first place.I noticed reduce's update semantic s(it throws away all but the last value from the update), and noticed it basically made for a faster last/1. Passed tests. Discovered
last(empty) == null
. Realized that it was always like that. I don't know what the right thing to do even is and am leaving this here to foist this headache on someone else.last/1 and first/1 are inconsistent, in that
first(empty)
is empty but currentlylast(empty) == null
. The former is fairly intentional and gives rise to a useful pattern used by the current definition ofisempty
. The latter is an accident resulting from reduce semantics.Unfortunately, it is surprisingly hard to implement a consistent last! The attempt given here unnecessarily wraps every single output of the generator in an array, then throws it out unless it's the last one. I think it's possible to bytecode without doing that nonsense, but would abusing variable guts and I don't want to write that and nobody else will understand it.
In any case, the existing documented semantics of nth is inconsistent with both and is also total nonsense, but I just complained about that elsewhere.
Probably the fundamental issue is the
reduce
shouldn't throw away all except the last update. On the other hand, its current behaviour is consistent withforeach
, where it kind of does make sense, in that all the updates get extracted on but only the last one becomes the next state. Maybe that's nonsense too.Also fixed a bug (mea culpa) and an unnecessary call in limit/2 (mea culpa).