Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EndpointAuthentication filter #135

Merged
merged 8 commits into from
Nov 25, 2020
Merged

EndpointAuthentication filter #135

merged 8 commits into from
Nov 25, 2020

Conversation

markmandel
Copy link
Contributor

Implementation of the filter that will route packets to Endpoints that have a matching connection_id to the auth token found in the
dynamic metadata.

Closes #8

Not 100% sold on the name of "EndpointAuthentication" - so would love feedback!

@markmandel markmandel added kind/feature New feature or request area/user-experience Pertaining to developers trying to use Quilkin, e.g. cli interface, configuration, etc labels Nov 14, 2020
@google-cla google-cla bot added the cla: yes label Nov 14, 2020
Copy link
Collaborator

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation looks good! Mostly had some comments around docs and metrics

metrics: Metrics,
}

/// Factory for the EndpointAuthentication filter that only allows packets to be passed to Endpoints that have a matching
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Factory for the EndpointAuthentication filter that only allows packets to be passed to Endpoints that have a matching
/// Factory for the EndpointAuthentication filter.

The latter part of the comment is describing a different EndpointAuthentication struct rather than the factory itself

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, looks like I managed to somehow combine the two 🤦

fn on_downstream_receive(&self, mut ctx: DownstreamContext) -> Option<DownstreamResponse> {
match ctx.metadata.get(self.values_key.as_str()) {
None => {
error!(self.log, "Value key not found in DownstreamContext"; "key" => self.values_key.clone());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could log at warn rather than error, e.g with endpoint churn some packets can be dropped which wouldn't necessarily be an error.
Also can we only log periodically (e.g every 50th?) rather than on every packet that doesn't match?

Copy link
Contributor Author

@markmandel markmandel Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going round and round on this one in my head.

My thinking here (and below) is that if there is no filter setup that provides the authentication token/the wrong type of auth token, then that's a clear error, and we should be really explicit about it. It's not really a warning, it's that something has gone wrong in your config.

To play devil's advocate against myself:

  1. If that's the case, should we catch that at config set time, and fail accordingly? (i.e. have a more robust filter validation logic for Filters?) -- or maybe that mutually exclusive (i.e. we should have both). Reporting a runtime seems less ideal in this scenario though, but at least a decent solution in the medium term.
  2. Are there valid flows where not storing a value in the dynamic metadata is a valid flow, in which case, should we log at all? Or maybe then warning is enough? Counter argument, rather than not storing the token, realistically the filter that should be setting the value should be returning None from the filter to stop processing?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, error likely makes more sense given that the a value should always be present.
1 might be worth a shot, though it might not be feasible to get right all the time, today we can e.g check if there's a capturebytes earlier in the chain otherwise reject - but if we add any feature or (custom)filter that dynamically sets (or does so based on its own config in turn) then it would no longer be possible.

Also one change we can make to the log message I think would be to make it explicit that we're dropping the packet

Copy link
Contributor Author

@markmandel markmandel Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we go down the path of 1 (and I think that might be more of an "eventually" thing) - we should probably have each filter publish some kind of schema of what changes it makes to the chain, so that Filters can then backtrack through the chain's config and make sure they have what they need.

But I think that's more of an eventually thing -- I think runtime error level logging is likely fine for the moment. But I 100% agree the logging should be less implementation specific and more user understandable. I will rewrite it.

src/extensions/filters/endpoint_authentication/mod.rs Outdated Show resolved Hide resolved

struct EndpointAuthentication {
log: Logger,
values_key: String,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would become metadata_key or similar?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doh. Didn't flip it over. Thanks.

Comment on lines 122 to 124
}
fn on_upstream_receive(&self, ctx: UpstreamContext) -> Option<UpstreamResponse> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we add an extra line between the functions?

@@ -0,0 +1,70 @@
# EndpointAuthentication
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Router works fine as a name for the filter? thinking its not necessarily authentication specific and the Endpoint part is somewhat redundant.
I'm thinking of it as a filter that looks up information to figure out what endpoint to send a packet to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - Endpoint is redundant - let's drop it 👍

I am worried that Router is too generic. We could have a variety of different Routers as time goes on ( ChaosRouter for some chaos engineering 😄 ?)

AuthenticationRouter ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe TokenRouter might be less generic, if we can say it routes based on a token it gets from somewhere? I think AuthenticationRouter could imply this filter is somehow tied to authentication which would be too much of a specialization

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think AuthenticationRouter could imply this filter is somehow tied to authentication which would be too much of a specialization

Yeah, I can see that. Also means if we wanted to do a router that was more based on other types of "authentication" it could be confusing.

The only possible tweak I could think of (maybe?) is AuthTokenRouter / AuthenticationTokenRouter - since it is an authentication token that is being used here to gain access.

WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not necessarily seeing it as an auth token even though it'll mostly be used for it, e.g one can implement a loadbalancer that checks the packet for a token and routes based on that - in which case the AuthTokenRouter naming would be awkward since the use case isn't auth related

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. Okay, TokenRouter it is. I'll make the change.

That also let's me go back through the docs, as it defines some domain specific language as well. I.e. you use a "token" to route your packet. The token is matched against connection_ids (which won't be called that long term).

I should rename some variables too in the implementation 👍

@@ -0,0 +1,70 @@
# EndpointAuthentication

The `EndpointAuthentication` filter's job is to ensure only authorised clients are able to send packets to Endpoints that
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sounds like its describing the auth use case for the filter rather than the filter itself? We likely want describe it as generically as a filter that can be used to select what endpoints to forward a packet, and mention that it does so via a lookup table with tokens from the packet metadata as a key.
Then if we wanted to, we could explicitly call out in a separate section as an example use case how it could work with token for game servers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed! Rewording! Now we also have the "token" nomenclature, so this becomes easier too.

# quilkin::proxy::Builder::from(std::sync::Arc::new(config)).validate().unwrap();
```

View the [CaptureBytes](./capture_bytes.md) filter documentation for more details.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This reference seems out of place? is it left over?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved down below into a "Sample Applications" section.

type: string
default: quilkin.dev/captured_bytes
description: |
The key under which the captured bytes are stored in the Filter invocation values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the description mentions capture bytes directly, it could be more generic e.g the key under which the route token is stored? since captured bytes is only relevant in the cases where the CaptureBytes filter was used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tweaked!

Comment on lines 28 to 32
- name: quilkin.extensions.filters.capture_bytes.v1alpha1.CaptureBytes # This filter is often used in conjunction to capture the authentication token
config:
metadataKey: myapp.com/myownkey
size: 3
remove: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my idea of the examples in this section are to showcase the config for the filter being described. Ideally that shouldn't include other filters since that contains unneeded information.
If we wanted to include the capturebytes filter, it might make more sense to do that in a dedicated section (not necessarily in this filter's page, we could link to it from here) that explains the auth routing usecase and how both filters fit together and have an example like this where we also walk through how the configured values affect packet routing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a "Sample Applications" section below - PTAL!

@markmandel
Copy link
Contributor Author

Some tweaks incoming. Rename coming next with some docs changes.

@markmandel markmandel force-pushed the feature/endpoint-auth branch 2 times, most recently from 5010310 to 2786558 Compare November 19, 2020 21:19
@markmandel
Copy link
Contributor Author

Ready for a new review 👍

Copy link
Collaborator

@iffyio iffyio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a couple minor doc suggestions LGTM!

@@ -66,7 +66,8 @@ Quilkin includes several filters out of the box.
| [Debug](debug.md) | Logs every packet |
| [LocalRateLimiter](./local_rate_limit.md) | Limit the frequency of packets. |
| [ConcatenateBytes](./concatenate_bytes.md) | Add authentication tokens to packets. |
| [CaptureBytes](capture_bytes.md) | Capture bytes from a packet into the Filter Context. |
| [CaptureBytes](capture_bytes.md) | Capture specific bytes from a packet and store them in filter dynamic metadata. |
| [TokenRouter](token_router.md) | Only sends packets to Endpoints that they are authenticated to access. |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| [TokenRouter](token_router.md) | Only sends packets to Endpoints that they are authenticated to access. |
| [TokenRouter](token_router.md) | Send packets to endpoints based on metadata. |

e.g in order to avoid specifying authentication in the description of the filter

The `TokenRouter` filter's job is to ensure only authorised clients are able to send packets to Endpoints that
they have access to.

It does this via matching an authentication token found in the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It does this via matching an authentication token found in the
It does this by matching a token found in the

they have access to.

It does this via matching an authentication token found in the
[Filter dynamic metadata]`(TODO: add link to dynamic metadata docs)`, and comparing it to Endpoint's connection_id
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
[Filter dynamic metadata]`(TODO: add link to dynamic metadata docs)`, and comparing it to Endpoint's connection_id
[Filter dynamic metadata]`(TODO: add link to dynamic metadata docs)`, to an Endpoint's connection_id

Comment on lines +42 to +47
packets_dropped_no_token_found: metric
.get_metric_with_label_values(vec!["NoTokenFound"].as_slice())?,
packets_dropped_invalid_token: metric
.get_metric_with_label_values(vec!["InvalidToken"].as_slice())?,
packets_dropped_no_endpoint_match: metric
.get_metric_with_label_values(vec!["NoEndpointMatch"].as_slice())?,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works, ideally we could store only metric and retrieve the counter when needed e.g self.metrics.packets_dropped.get_metric_with_label_values(vec!["NoTokenFound"].as_slice())?.inc(); but this probably makes sense since we can fail early if the number of labels mismatch rather than try to handle it later at runtime - thinking later on might be nice to have some helper that handles that error transparently

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - I figured it made more sense to handle it here, and fail fast if it's setup incorrectly in our code. It felt weird to be constantly handling this potential error when we have a finite set of labels and values that are known at compile time.

filter_opts(
"packets_dropped",
"TokenRouter",
"Total number of packets dropped. Reason is provided via the `Reason` label.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Total number of packets dropped. Reason is provided via the `Reason` label.",
"Total number of packets dropped. labels: reason",

We can only list the labels themselves to help with querying. since e.g reason described in the doc for the metric

Comment on lines +58 to +62
* `NoEndpointMatch` - The token provided via the Filter dynamic metadata does not match any Endpoint's connection
ids.
* `NoTokenFound` - No token has been found in the Filter dynamic metadata.
* `InvalidToken` - The data found for the token in the Filter dynamic metadata is not of the correct data type
(Vec<u8>)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay nice that we can have these documented here as well 🙌

@markmandel markmandel force-pushed the feature/endpoint-auth branch 2 times, most recently from 13d72d1 to 5a760ef Compare November 25, 2020 01:36
@markmandel markmandel force-pushed the feature/endpoint-auth branch from c31dec5 to 099d79b Compare November 25, 2020 05:40
Some(token) => {
match ctx
.endpoints
.retain(|e| e.connection_ids.iter().any(|id| id.as_ref() == token))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iffyio looking at the method signature for UpstreamEndpoints.retain as it's used here, do you think we should (not this PR) change the method signature from:

pub fn retain<F>(&mut self, predicate: F) -> Result<(), AllEndpointsRemovedError>

to

pub fn retain<F>(&mut self, predicate: F) -> Option<()> ?

Or maybe something else?

Just feels like having no endpoints after filtering will be a common thing, and it doesn't feel like an error, so might not want to be handled as such?

But not sure, as if it returns None then that means retain didn't change the underlying subset. 🤔 wdyt?

@markmandel markmandel merged commit 6912706 into master Nov 25, 2020
@markmandel markmandel deleted the feature/endpoint-auth branch November 25, 2020 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/user-experience Pertaining to developers trying to use Quilkin, e.g. cli interface, configuration, etc cla: yes kind/feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filter idea: Simple routing with every packet having the connection id appended
2 participants