Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rate limit: add computed descriptors #14448

Merged
merged 26 commits into from
Jan 10, 2021

Conversation

kyessenov
Copy link
Contributor

@kyessenov kyessenov commented Dec 16, 2020

Signed-off-by: Kuat Yessenov kuat@google.com

Commit Message: Introduce computed descriptors to rate limits. Refactors the rate limit descriptor interface to accept StreamInfo and RequestHeaderMap. As a general refactor, adds an extension point for alternative rate limit descriptors to register.
Risk Level: low, not enabled unless used
Testing: unit
Docs Changes: yes
Release Notes: yes

Signed-off-by: Kuat Yessenov <kuat@google.com>
@repokitteh-read-only
Copy link

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
API shepherd assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #14448 was opened by kyessenov.

see: more, trace.

@markdroth
Copy link
Contributor

I think I need some more context here. What is this going to be used for, and how do you expect that it will be used?

@kyessenov
Copy link
Contributor Author

kyessenov commented Dec 16, 2020

@markdroth Rate limit descriptors act as categories for requests, since token buckets are applied per category. We have a requirement to use source pod labels as a category. Our extension exposes this information in a filter state, so we need a way to extract this info from the filter state (which means traverse the filter state down to a field).

The current design relies on adding more and more oneof messages here, which we can avoid by just using a generic AST from CEL.

// The key to use in the descriptor entry.
string descriptor_key = 1 [(validate.rules).string = {min_len: 1}];

oneof expr_specifier {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a oneof?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's anticipating parsed_expr here which would introduce a dependency on expr proto. I can drop it since it's not needed right now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think we're going to need the oneof later, let's keep it for now. We can't add it later without breaking the go proto API.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I believe that we are actually already using the CEL parsed expression proto in other places in the API, so this is probably not a new external dependency. You might be able to just add it now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dependency is coming from a filter. This is a core proto. Thoughts @htuch?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't want to depend on the CEL protos for the xDS core protos, another option might be to make this more extensible by simply adding a TypedExpressionConfig. In the long term, we could maybe even migrate the existing options here to be extensions as well.

@envoyproxy/api-shepherds, thoughts? I wonder if we should add something to the API review checklist in #14399 that prompts people to consider using TypedExtensionConfig instead of a oneof?

Copy link
Member

@htuch htuch Dec 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an Envoy binary perspective, are we adding this to core router, or does it only apply to rate limit filter? Hard to tell from config. If core, @kyessenov would be nice to have clarity on when CEL can a reasonable number of the checkmarks in https://github.com/envoyproxy/envoy/blob/master/DEPENDENCY_POLICY.md#new-external-dependencies. Whatever is reasonable, but if this is going to become a core data plane dependency, it's probably good to use this opportunity to check in. CC @envoyproxy/dependency-shepherds

From an API perspective, I don't feel super strongly, CEL is a reasonably standard thing. One area to have some concern around is that using CEL is similar to regex; we're introducing dynamic evaluation of some expression, which ultimately might have security and inter-request isolation concerns. I don't know if this is in any way actionable, but certainly when communicating in comments/docs its worth pointing out what the runtime implications are. E.g. does evaluating the expression have a bounded amount of time for a given expression, or can input size affect it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the runtime change is scoped to global and local rate limit filters. So this is purely a question of a proto dependency, to avoid parsing overhead with a parsed expression-as-proto. Good point about dynamic evaluation, I'll document that.

Expression evaluation is linear in both input and expression size. The input here is request metadata.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that rate limit stuff is in the main router is a historical artifact of before we had per-filter route config, etc. None of this should be here and I think there is a tracking issue to move it all out, but I don't think that should block this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think this forces moving expr package into common from extensions, since it gets pulled in the implementation.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

The PR is ready for review.

I wonder what I should do about the layering enforcement between core and extensions. It seems that rate limit descriptors are in the core as a historic artifact, so I'd need to pull "Expr" package into source/common to match it because it is a dependency from the router. Is that a reasonable approach? Ideally, rate limit descriptors would be another extension point and separated from the router, but I don't see an easy way to achieve that right now.

@kyessenov kyessenov marked this pull request as ready for review December 23, 2020 04:26
@kyessenov kyessenov requested a review from lizan as a code owner December 23, 2020 04:26
@kyessenov
Copy link
Contributor Author

Bringing this back to attention: I am looking how to resolve the layering violation in API with rate limit descriptors being in the core. There are a couple of ways around it:

  1. Add an opaque Any extension to rate limit descriptors, register additional descriptors in rate limit extensions.
  2. Keep as-is, move expr-related packages (CEL, other transitive stuff like ANTLR) as a "data-plane" dependency.

Opinions?

@mattklein123
Copy link
Member

Opinions?

Definitely not (2). As I already mentioned, the rate limit stuff is only in the core due to legacy reasons. It should all be removed. If you want to take one for the team we can do that (deprecate, move to filter specific route config).

If you aren't willing to do that I think (1) is our best option and we can add extensions there as @rgs1 is doing currently for access log formatters. In thinking about it we probably want an extension point here anyway, so I guess you could just go ahead and add it orthogonally to moving the stuff out of core.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

kyessenov commented Jan 6, 2021

Thanks! Updated the PR to address a new extension point for rate limit descriptors and added computed descriptors as an extension. I did some basic refactoring around rate limit interfaces to make it look like other extension points.

Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #14448 (comment) was created by @kyessenov.

see: more, trace.

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@lizan lizan self-assigned this Jan 7, 2021
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
lizan
lizan previously approved these changes Jan 7, 2021
@repokitteh-read-only repokitteh-read-only bot removed the api label Jan 7, 2021
@kyessenov
Copy link
Contributor Author

The coverage checker counts the #ifdef and NOT_REACHED_GCOVR_EXCL_LINE as no coverage for some reason. Any hint to work around it?

Signed-off-by: Kuat Yessenov <kuat@google.com>
Signed-off-by: Kuat Yessenov <kuat@google.com>
@kyessenov
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Check envoy-presubmit isn't fully completed, but will still attempt retrying.
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #14448 (comment) was created by @kyessenov.

see: more, trace.

@kyessenov
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: a #14448 (comment) was created by @kyessenov.

see: more, trace.

@kyessenov
Copy link
Contributor Author

@lizan Could you please re-approve and merge? There's a merge conflict with #14588 that I need to fix.

@repokitteh-read-only repokitteh-read-only bot removed the api label Jan 8, 2021
@kyessenov
Copy link
Contributor Author

@moderation Hi, do you have a minute to approve deps change in this PR?

@moderation
Copy link
Contributor

Done @kyessenov. Metadata only deps change
/lgtm deps

@repokitteh-read-only repokitteh-read-only bot removed the deps Approval required for changes to Envoy's external dependencies label Jan 10, 2021
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just skimmed, but this is great, thanks.

@mattklein123 mattklein123 merged commit 876a363 into envoyproxy:master Jan 10, 2021
mpuncel added a commit to mpuncel/envoy that referenced this pull request Jan 12, 2021
* master:
  http: support creating filters with match tree (envoyproxy#14430)
  [tls] Expose ServerContextImpl::selectTlsContext (envoyproxy#14592)
  docs: update ext_proc docs to reflect implementation status (envoyproxy#14636)
  filter manager: drop assert (envoyproxy#14633)
  kick off v1.18.0 (envoyproxy#14637)
  1.17.0 release (envoyproxy#14624)
  Implement request header processing in ext_proc (envoyproxy#14385)
  http: expose encoded headers/trailers via callbacks (envoyproxy#14544)
  [fuzz] fix minor fuzz bugs (envoyproxy#14593)
  rate limit: add computed descriptors (envoyproxy#14448)
  tools: fill in the required args for Api::Impl (envoyproxy#14554)
  Bump envoy-build to current images (envoyproxy#14608)

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants