Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update validator key/index URI array limits #328

Merged
merged 2 commits into from
Jul 9, 2023

Conversation

jshufro
Copy link
Contributor

@jshufro jshufro commented Jun 16, 2023

RFC-7230 recommends that the Request-Line component of an HTTP 1.1 request is limited to a minimum of 8000 bytes.

Each pubkey is 48 bytes binary, or 96 bytes hex encoded. Prefixing with 0x adds two more bytes for 98.

All but the first pubkey are preceded by a comma, so with a limit of 64 we will have consumed 64*98+63 bytes, or 6335 bytes.

Factoring the length of these paths (we will assume the state is a stateRoot with 0x prefix, ie 66 bytes, to remain adequately pessimistic) we get 107 bytes for validator_balances and 99 for validators. We increase our consumption by 107 bytes to 6442 bytes.

Adding in some unavoidable bits and pieces from the RFC section on Request-Line-

  • ?id= = 6446
  • GET = 6449
  • HTTP/1.1 = 6457
  • 2 spaces and CR LF = 6461

This leaves us with 1539 bytes to spare.

Section 5.3.2 goes on a bit about when the Request-Target may include the absolute URI (ie the hostname), so having a bit of headroom is good if we expect some http clients to respect this rule (which is specifically for proxies, and anecdotally I've never seen it be respected).

@mcdee
Copy link
Contributor

mcdee commented Jun 16, 2023

There were implementation issues when we first looked at this, which is why we settled on 30. Can't remember the details of which client/middleware had the problem, mind...

Copy link
Contributor

@arnetheduck arnetheduck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problems in nimbus - we can handle well past 1024 validators in a single call and 30 is unnecessarily low - +1 for using at existing RFC:s as base for decision, that's a reasonable bar of support in any situation

@jshufro
Copy link
Contributor Author

jshufro commented Jun 16, 2023

There were implementation issues when we first looked at this, which is why we settled on 30. Can't remember the details of which client/middleware had the problem, mind...

I'm interested to know the specifics!

Rocket Pool has been passing very large batches to the validators endpoint for some time now, so I can confirm that all of teku, lodestar, nimbus, prysm and lighthouse support at least 64 values.

@nflaig
Copy link
Member

nflaig commented Jun 17, 2023

Rocket Pool has been passing very large batches to the validators endpoint for some time now

This is true, however for Lodestar, the default max header size (16384 bytes) is increased to 65536 to support more than 500 keys. Still, 64 is far below that limit so no issues for Lodestar.

I don't think any client actually enforces the 30 items limit and if they start doing it, a lot of downstream tooling would just break.

In general, I think it would be good to advocate for batched requests even if it is not strictly enforced by the beacon node, having a well-defined max limit for this in the API spec is definitely a good thing.

@jshufro
Copy link
Contributor Author

jshufro commented Jun 17, 2023

I don't think any client actually enforces the 30 items limit and if they start doing it, a lot of downstream tooling would just break.

I don't think beacon nodes should interpret the limit as anything other than "the maximum should be at least this value".

validator clients and other applications downstream of the beacon node, however, should use it (whether it's 30 or 64) to batch requests, so they are compatible with a service that does enforce a limit of 30/64. I'd also suggest that the client teams make it configurable, as lodestar has done.

@rolfyone
Copy link
Contributor

Fairly sure originally Teku was using default jetty and more than 30 was potentially a problem.

We've overridden this now, so 64 should be fine for us, but it's very 'http server' related, depends on implementations.

I don't think we'd choose to reject things that exceed the 'maxItems', and if noone is enforcing the 30, maybe we just remove the maxItems here rather than increase it to another 'too small' value?

@jshufro
Copy link
Contributor Author

jshufro commented Jun 19, 2023

I don't think we'd choose to reject things that exceed the 'maxItems', and if noone is enforcing the 30, maybe we just remove the maxItems here rather than increase it to another 'too small' value?

Even if not enforced, I think it has positive impact as a "minimum maximum". Ultimately there is a limit, somewhere, maybe in an http library, maybe a proxy. By setting maxItems to 64 we signal to beacon nodes that they must support at least 64, and we signal to validator clients that they can safely submit up to 64 for all clients.

@rolfyone
Copy link
Contributor

Even if not enforced, I think it has positive impact as a "minimum maximum". Ultimately there is a limit, somewhere, maybe in an http library, maybe a proxy. By setting maxItems to 64 we signal to beacon nodes that they must support at least 64, and we signal to validator clients that they can safely submit up to 64 for all clients.

I don't disagree in principle, it's just that a lot of the route we can't control. It's a nice to have but it's not possible to guarantee because there's a lot of places it could fail before we even get the request. 64 as a minimum (maximum) would be ideal, but ultimately it's a route, so any proxies, load balancers etc get a say due to the URL constraints, and "actual results may vary"

@jshufro
Copy link
Contributor Author

jshufro commented Jun 19, 2023

I disagree with removing it altogether though I'm having trouble articulating why. Based on the RFC, 64 should work except in extreme cases. Without a limit, ~300 breaks for a BN behind cloudflare's "orange cloud" tls termination dns proxy thing (we found out the hard way).

Yes, it's impossible to come up with a number that is guaranteed to always work, but I think 64 is the right balance between spammy on the low end and likely-to-break on the high.

Maybe removing it and adding an admonition about Request-Line limits to the documentation is good enough.

@rolfyone
Copy link
Contributor

To me a maximum that's not enforced anywhere is basically just documentation. We can put it in, but it's not going to be actually enforced, people can go well over it, and as long as it gets to the endpoint it's likely to be processed... To me it's kind of academic, unless someone is using these values in some important way in generated apis or something...

@arnetheduck
Copy link
Contributor

arnetheduck commented Jun 19, 2023

To me a maximum that's not enforced anywhere is basically just documentation.

the point of this limit is to give validator clients a starting point that they know the server will reasonably accept - we've already hit a lighthouse/nimbus interop problem because nimbus was using a higher limit in the client (1024) than lighthouse in their server leading to user disappointment. to "guarantee" that nimbus vc will stay reasonably compatible the best thing we can do is to use maxItems in the client.

the server would do well to support any number it sees fit ("be liberal in what you accept") and we can document this separately from the actual maxItems limit.

I think maybe the contention point here is really around naming: maxItems is a poor name when looking at from the server side because "the server must accept at least N items" is implied by it's "client-side" interpretation, namely that "clients must send at most N items".

I think any method of documenting this point would be welcome and maxItems is not a bad option because it satisfies at least the client (ie one can imagine writing a client that automatically reads this limit from the spec and splits up requests accordingly)

@rolfyone
Copy link
Contributor

Remember also that the thing we seem to be so interested in is the pubkey, and it's possible to set validator id's not just pubkeys...

This is a very academic argument. if people want the number to remain, at least set 200, and lets wrap this up.

If we enforced the limit, it's an impossible problem, this is why I dislike these interfaces that pass pubkey on the URL rather than request body, we can't set a realistic limit unless we split out the keys from the indices, and set limits for both, and it comes down to url length and external dependencies as to what that limit is.

@jshufro
Copy link
Contributor Author

jshufro commented Jun 19, 2023

We're talking about pubkeys specifically to worst-case our analysis, which is important if we're trying to keep Request-Line bytes under control.

200 pubkeys would be about 20 Kb which probably breaks certain transports.

I'd like to escape from talking about enforcing this limit. Nobody wants that. Can we rename it to 'minAccepted' in a subsequent commit?

@arnetheduck
Copy link
Contributor

at least set 200,

As discussed above, the 64 number is based on recommendations established by internet RFC:s which means that infrastructure "out there" will typically support it: libraries, proxies and so on - as long as the url in general stays under 8k characters, things will just work for most everyone involved.

if we want to support more than 8k characters, POST is the way to go - then maxItems can be cranked up to any number we want. That's a much bigger change though than simply setting this number to something slightly more reasonable (as in "related to some standard) than 30 which was pulled out of thin air.

@rolfyone
Copy link
Contributor

if we want to support more than 8k characters, POST is the way to go - then maxItems can be cranked up to any number we want. That's a much bigger change though than simply setting this number to something slightly more reasonable (as in "related to some standard) than 30 which was pulled out of thin air.

Fairly sure 30 was because of the small limit teku had... it wasn't arbitrary...

Changing the name is less simple because we're following the swagger schema, there may be an option, but i think within swagger we'd be better off just not specifying maxItems at all and putting it in the description, because maxItems has a meaning and we're not following it at all.

@jshufro
Copy link
Contributor Author

jshufro commented Jun 20, 2023

After skimming the appropriate docs, changing the name doesn't seem viable.

I might be overly pedantic for saying so, but this:

   An array instance is valid against "maxItems" if its size is less
   than, or equal to, the value of this keyword.

doesn't explicitly state that an array instance with a size greater than the value of maxItems is invalid. It only explicitly states that an array instance with a size less than or equal to the value of maxItems is valid. There's no iff/if and only if here.

@rolfyone
Copy link
Contributor

doesn't explicitly state that an array instance with a size greater than the value of maxItems is invalid. It only explicitly states that an array instance with a size less than or equal to the value of maxItems is valid. There's no iff/if and only if here.

'less than or equal to' is not a suggestion...

Anyway clearly I'm in the minority here. Are we sticking with 64?

@jshufro
Copy link
Contributor Author

jshufro commented Jun 20, 2023

'less than or equal to' is not a suggestion...

It's a very literal interpretation that I don't care to defend, but formally it would look something like:

A ≔ The array's size is less than or equal to maxItems
B ≔ An array instance is valid
A → B (An array instance is valid against "maxItems" if its size is less than, or equal to, the value of this keyword.)

And the argument is that this is a fallacy:

¬A (the length is greater than 64)
∴ ¬B (the array is not a valid length)

which is fallacy of the inverse AKA denying the antecedent. It would be a different story if the language was if and only if or we were speaking Lojban or something.

Sorry for nerding out. I can't help it sometimes.

Copy link
Contributor

@rolfyone rolfyone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is ok as a starting point, and conversation has stopped.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants