Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

surprising bug if n > 65535 -- array slice offset/size overflow #1108

Closed
pkoppstein opened this issue Feb 25, 2016 · 12 comments
Closed

surprising bug if n > 65535 -- array slice offset/size overflow #1108

pkoppstein opened this issue Feb 25, 2016 · 12 comments
Labels
Milestone

Comments

@pkoppstein
Copy link
Contributor

$ jq --version
jq-1.5rc2-150-g1740fd0

$ jq -nc '[range(0;65537)] | .[-1:]'
[0]

$ jq -nc '[range(0;65538)] | .[-2:]'
[0,1]

$ jq -nc '[range(0;65540)] | .[65536:]'
[0,1,2,3]
$ uname -a
Darwin mac-mini 13.4.0 Darwin Kernel Version 13.4.0: Wed Mar 18 16:20:14 PDT 2015; root:xnu-2422.115.14~1/RELEASE_X86_64 x86_64
@nicowilliams
Copy link
Contributor

Oh. Wow. Fixing this one is going to be fun. Here's the problem:

/* All of the fields of this struct are private.
   Really. Do not play with them. */
typedef struct {
  unsigned char kind_flags;
  unsigned char pad_;
  unsigned short offset;  /* array offsets */
  int size;
  union {
    struct jv_refcnt* ptr;
    double number;
  } u;
} jv;

See that offset field? The idea is that to slice an array you just change the offset and size fields of the jv value (which, recall, is always passed around and returned as an immediate value). But these are smallish fields. So if you want to slice past the 65535th element of an array, boom, we overflow and you get something you didn't expect.

I can think of several fixes. One is to fix jv_array_slice() to allocate a new array when the start point of the slice is larger than 65535 and copy the slice -- this is slow if you have huge arrays. Another is to modify jvp_array and the surrounding code to allow one jvp_array to refer to another and point into the other's array of elements, in which case we can use this when the offset needs to be larger than 65535 (or we can even abandon the offset field of jv, which gives us two bytes to play with in the future). Currently I prefer the latter, though it will have more footprint and performance impact on arrays in general than the former, but if I keep the offset field then I think I can minimize that impact.

@nicowilliams nicowilliams added this to the 1.6 release milestone Feb 25, 2016
@nicowilliams
Copy link
Contributor

I'm not fond of int lengths, but @stedolan is. I'm inclined to turn all lengths into unsigned integral types -specifically size_t- where possible, though in struct jv the size has to be a 32-bit int. Thus size should be an int32_t or a uint32_t, though using int or unsigned int today would only break this on Crays :)

h6ah4i added a commit to h6ah4i/jq that referenced this issue Nov 24, 2016
Related issues:
  - Issue jqlang#1108  surprising bug if n > 65535
    jqlang#1108
@nicowilliams
Copy link
Contributor

Ah, yes, I should not forget this one.

@nicowilliams nicowilliams changed the title surprising bug if n > 65535 surprising bug if n > 65535 -- array slice offset/size overflow Mar 1, 2017
@nicowilliams
Copy link
Contributor

I've discussed this with @stedolan.

The simplest thing to do is to simply create a new array that is a copy of the requested slice when the jv .offset or .size fields would overflow. This might even be highly desirable as @stdeolan points out because the slice may be much smaller than the original array, and we might be able to free a lot of memory if we don't hold on to a reference to the original. Of course, if we can't immediately (or anytime soon) free the original then this will add to the memory footprint of jq, but it fixes the immediate issue and buys us time to consider alternatives.

It's also worth noting that @stedolan agrees that we must not enlarge jv. All suggestions of enlarging jv are off the table.

nicowilliams added a commit to nicowilliams/jq that referenced this issue Mar 1, 2017
@nicowilliams
Copy link
Contributor

Thanks for catching this @pkoppstein!

@pkoppstein
Copy link
Contributor Author

@nicowilliams - And thank you for working on this and other issues. Unfortunately, this particular issue has not yet been fully resolved:

$ ./jq --version
jq-1.5rc2-214-g65cbaac

$ ./jq -nc '[range(0;65540)] | .[65536:]'
[0,1,2,3]
 

nicowilliams added a commit that referenced this issue Mar 2, 2017
@nicowilliams
Copy link
Contributor

@pkoppstein Ah, yes, an off-by-one bug. Thanks again!

@nicowilliams
Copy link
Contributor

@pkoppstein I pushed a fix for that. Can you bang on it some more?

@pkoppstein
Copy link
Contributor Author

@nicowilliams - Banged some. As you would say, LGTM. Tx.

@nicowilliams
Copy link
Contributor

@pkoppstein Thanks!

davidfetter pushed a commit to davidfetter/jq that referenced this issue Oct 27, 2017
davidfetter pushed a commit to davidfetter/jq that referenced this issue Oct 27, 2017
@omeid
Copy link

omeid commented Apr 16, 2018

Wasted a lot of time trying to figure this out, I think you should really cut a release. RC is not good enough, this is a serious bug.

@jcbarton
Copy link

jcbarton commented May 8, 2018

Just to add another voice here as well - this caught us out, using the npm package, "node-jq". This wraps jq, and we were seeing issues trying to read very large files in chunks.

Finally figured out it was wrapping at (unsigned short) - and then found this bug.

Is there a possibility of a new release of jq?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants