Extend extractors with an optional seed per endpoint. #126

HeroicKatora · 2019-01-20T21:25:30Z

Description

The Extract-trait was complemented by another trait to allow one type to provide an seed for extracting multiple others. All existing self-extracting types (Body, Json, Path, …) are provided with an impl Extract<T, Data> for (). A second impl for Endpoint has been created: Seeded. It is essentially a tuple of a standard endpoint function and a Sync + 'static seed tuple which provides the seeds for extracting the function arguments.

The new trait looks as follows (trait bounds may be stricter than necessary), fairly closely modeled after serde::de::DeserializeSeed:

/// An extractor for an app with `Data`
pub trait ExtractSeed<Param, Data>: Send + Sync + Sized + 'static 
    where Param: Send + Sized + 'static 
{
    /// The async result of `extract`.
    ///
    /// The `Err` case represents that the endpoint should not be invoked, but
    /// rather the given response should be returned immediately.
    type Fut: Future<Output = Result<Param, Response>> + Send + 'static;

    /// Attempt to extract a value from the given request.
    fn extract(&self,
        data: &mut Data,
        req: &mut Request,
        params: &Option<RouteMatch<'_>>,
        store: &Store,
    ) -> Self::Fut;
}

An example impl for this trait (except () for all Extract), uses a runtime provided header name to extract the corresponding header:

/// Wrapper for conenience of disambiguiation, can be discussed
pub struct NamedHeader(pub http::header::HeaderName);

impl<T: 'static, S: 'static> ExtractSeed<Header<T>, S> for NamedHeader 
    where T: From<http::header::HeaderValue> + Send
{
    type Fut = future::Ready<Result<Header<T>, Response>>;
    fn extract(&self,
        data: &mut S,
        req: &mut Request,
        params: &Option<RouteMatch<'_>>,
        store: &Store,
    ) -> Self::Fut {
        let header = req.headers().get(&self.0);
        match header {
            Some(value) => future::ok(Header(value.clone().into())),
            None => future::err(http::status::StatusCode::BAD_REQUEST.into_response()),
        }
    }
}

Similar to the other Endpoint impl for closures, for Seeded we also use a closure to provide implementations up to 9 arguments. It follows the other implementation closely, only extending the required trait parameters by new ones and replacing the extraction logic with the following:

// within the endpoint macro, from
let ($($Y),*) = &self.1;
$(let $X = <$Y as ExtractSeed<$X, Data>>::extract($Y, &mut data, &mut req, &params, store);)*

Motivation and Context

This allows an extracted value to depend on information only available at runtime. Such behaviour was previously only (nicely, without static Mutex<T>) possible by creating a new generic type wrapping the actually intended parameter type. This generic would then take a type argument implementing a trait. However, this trait implementation must typically be deferred to an enduser who has to customize it according to their own app state type. Futhermore, this needs one type and impl per endpoint with different configuration.

One problem where this would be applicable is authentication/authorization. Different endpoints may require different security realms or permissions, different user names etc. Encoding this all into types with static functions to derive different runtime values from app state quickly becomes a hassle of writing the same implementation with just slightly different constants. However, the outcome parameter into endpoint functions should mostly be very similar. A single value containing the name of the authorized entity. By implementing this via type-level generics though, the type of this value quickly becomes much much more complex. Moving the necessary information into runtime instead allows one to use a single type to guarantee the necessary invariant.

Programmers may still use newtype structs to type such values more securely but these can be expressed more succinctly without generics.

This could be a solution for #108 as NamedComponent may gain a seed which gets supplied with the name at runtime. This has been exemplified in the example.

This solution is not intended to encourage arbitrary reconfiguration of extractor seeds at runtime although it could allow it. The seeds are hidden behind a BoxedEndpoint as soon as they are provided to the router and may only inspect themselves immutably while requiring Send + Sync.

How Has This Been Tested?

There is an example that extracts the user-agent header based on a runtime constructed HeaderName. Another route extracts the path component num where num was supplied via a static string instead of an associated item on a type parameter.

Other tests in cfg(test) will follow as implementation goes beyond Poc. No other tests had to be modified for these changes to work as the App interface remained unchanged.

Other TODOs

Provide missing wrappers like Resource::get_with
Consistently create dynamic seeds for extracted types where applicable (in the style of SegmentName, NamedHeader)
Use the same naming scheme for all seeds and types

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation.
I have updated the documentation accordingly. (Not yet)
I have read the CONTRIBUTING document.
I have added tests to cover my changes. (Not yet)
All new and existing tests passed. (All existing, new tests to come)

edit 2019-01-21 (yw): updated link to example, and added syntax highlighting.

yoshuawuyts · 2019-01-21T09:56:54Z

ll existing self-extracting types (Body, Json, Path, …) provide impl Extract<T, Data> for () and the Endpoint implementation of Fn(…) was changed to use () to extract these.

While reading the source I was a bit confused by how this works. I think explaining this in the source itself would be helpful!

Pinging @aturon @tirr-c -- this PR seems pretty involved with the design of Tide, I think it'd be good if you could take a look.

HeroicKatora · 2019-01-21T15:01:50Z

@yoshuawuyts I expanded the code to that section underneath, condensed to the really relevant lines, I hope that is what you meant.

tirr-c

I briefly looked through the code, and this looks quite a large change!

tirr-c · 2019-01-22T15:16:42Z

examples/seeded_extractor.rs

+
+fn main() {
+    let mut app = tide::App::new(());
+    app.at("/").get_with(display_header, NamedHeader(HeaderName::from_static("user-agent")));


This get_with call would require ExtractSeed<fn(Header<HeaderValue>) -> String, NamedHeader>: Endpoint<_, _>, but NamedHeader is not a tuple. I think this should be (NamedHeader(...),), but maybe I've got something wrong.

No, rather I missed something because in the macro expansion for one argument, this let binding gets created: let ($($X),*) = …. Which of course then expands to let (T0) = and not a tuple as in let (T0,) = …. And the same for the right side of that assignment.

I must however say it felt more ergonomic to write, it is much less confusing with one less pair of brackets and comma.

src/body.rs

tirr-c · 2019-01-22T15:22:19Z

src/extract.rs

@@ -3,15 +3,15 @@ use futures::prelude::*;
 use crate::{configuration::Store, Request, Response, RouteMatch};

 /// An extractor for an app with `Data`
-pub trait Extract<Data>: Send + Sized + 'static {
+pub trait Extract<Param, Data>: Send + Sync + Sized + 'static where Param: Send + Sized + 'static {


I feel like the trait signature should be Extract<Seed, Data>. This looks quite reversed; for instance, Json is a extractor, not ().

() is intended as an extractor for Json. That sounds more plausible even, () extracts Param from Data. If we go the way of adding a trait rather than replacing one, this could of course be solved with another name. I'm open to bikeshedding.

However, Extract<(), Data> for Json would mean that the natural &self argument becomes &Seed. Then the trait can no longer be boxed to create a Box<Extract<Json, Data, Fut=Box<Future<…>>>> that can extract Json.

tirr-c · 2019-01-22T15:24:09Z

src/head.rs

+
+pub struct Header<T>(pub T);
+
+impl<T: From<http::header::HeaderValue> + Send + 'static, S: 'static> Extract<Header<T>, S> for NamedHeader {


A seed doesn't have to be a newtype; impl Extract<Header<T>, S> for http::header::HeaderName would work fine here! (or reversed, as of the review above.)

I gravitated to that choice as well, but consider String may get confusing if used as the extractor for too many argument types. Then, consistency of providing a newtype may be more important than not nesting in a newtype. It has its pros and cons, feel free to disagree.

aturon · 2019-01-23T20:56:59Z

Thanks for the PR, @HeroicKatora! There are definitely some interesting ideas here. I want to mull on it a bit, but here are some immediate thoughts:

It seems like there are two problems you're taking on here. One is extraction that depends on runtime values. Another is avoiding having to encode too much into the type system. (I do think these may be related issues, but still worth separating out). Does that seem right?
AIUI, seeding for endpoints is "all or nothing": either you don't pass in any seeds (and everything has to work with ()), or else you have to pass a tuple with all of the seeds explicit (including passing dummy () values). While it's possible to do type-level computation to give more flexibility, that route entails a lot of complexity (and usually baffling error messages).
Seeding spreads out the logic of extraction a bit: rather than it being wholly contained in the endpoint definition, now some important parts of provided in router definition. That's a bit unfortunate.
For comparison, I had in mind a couple of other ways of addressing the core problems, IIUC:
- Extractors don't have to be transparent newtypes. So, for example, we could provide a Headers extractor that then provides a method for extracting particular headers. That method would be invoked within the endpoint, thereby centralizing extraction logic there. Being a method, it can take a runtime parameter for which header is desired.
- Extractors should be able to use Tide's configuration system. This could possibly be used to provide parameters to a given extractor instance, though we haven't worked out a full, ergonomic way for doing that.

In short, I definitely think we should have a solid story for the concerns you're raising in this PR, but I'm wondering if we can find alternative approaches that stick with Tide's original strict separation between routing concerns and endpoint/extraction concerns.

HeroicKatora · 2019-01-23T22:14:02Z

One is extraction that depends on runtime values. Another is avoiding having to encode too much into the type system. (I do think these may be related issues, but still worth separating out). Does that seem right?

I suppose these are related but intended was mostly a solution to the second (with a specific issue faced by third-party library authors. I'll expand on it below). The design specifically asserts all extractor seeds to be sync, and only gives immutable access. It does not assert Clone as the configuration was intended to be endpoint specific. Once there are runtime values in place, I don't see how one could prevent them from being used as dynamic configuration, how could one stop injecting Arc<Mutex<Anything>> or a Once<_> of this when a static reference is required? So, instead I cut that effort and focused on making it clear that this was not intended while making it useful.

AIUI, seeding for endpoints is "all or nothing":

You are correct, and this was intentional. In the tuple model, every parameter maps very clearly to its seed. By using the standard unit type and making this the standard seed, I hoped to make get_with seem like a natural extension of an endpoint whose seed is only (). Type level computation would not only make the implementation hard but also the usage.

Seeding spreads out the logic of extraction a bit: rather than it being wholly contained in the endpoint definition, now some important parts of provided in router definition. That's a bit unfortunate.

This could definitely be improved. I was just not sure where to integrate seeds idiomatically. So, I chose to put them/the one next to the convenience methods in Resource so that there is no need to name the otherwise clunky ExtractSeed-endpoint. I'm open to other suggestions of where and how to construct this Endpoint implementation wrapper. The current place just felt fairly ergonomic when writing the example. For a first time user, that is.

Extractors should be able to use Tide's configuration system. This could possibly be used to provide parameters to a given extractor instance, though we haven't worked out a full, ergonomic way for doing that.

I'm not sure in which way this might improve the situation. Extractors seem to then need to rely on a specific type to extract from a loose typemap but that type is no longer type checked and needs to be unique to this extractor at the same time.

Extractors don't have to be transparent newtypes. So, for example, we could provide a Headers extractor that then provides a method for extracting particular headers.

This seems to be at conflict with the idea of using extraced values as an easily pluggable guard to BadRequest and the sort. Another usage are authorization tokens or in the simplest case passwords. This is just one of the cases where a third-party library would be very welcome. But as a library writer, this approach is flawed when constrained to the type system alone.

Requiring the all values to be derived from the user's AppState works in the end but makes it impossible to provide drop-in guards. Every part needs to be amamalgamated into the AppState (and that also requires Sync etc.).
Deriving the values from a type system has similar problems. Coupling values loosely through dynamic typemaps runs into configuration problems as soon as there should be more than one value or that value should be routing of endpoint dependent, which is the common case for authorization. Coupling them via a associated items on traits results in the stacking of type-parameters in most places and in most libraries only couples dynamic user values by having AppState derive this trait. This also provides no integration into routing or endpoint.

Tl;dr: I agree that we should try to separate endpoint and routing information more cleanly in the interface, and not just its implementation. But I think that dynamic values provide the most idiomatic and most usable coupling, especially when considering how typical configurable components might be implemented in libraries.

Lastly, let me note that this could solve the problem of #63 . If you want a proof of concept for this, I will gladly try to provide one.

aturon · 2019-01-23T23:53:19Z

@HeroicKatora thanks for the detailed response! I'll try to respond fully later, but a quick question:

This seems to be at conflict with the idea of using extraced values as an easily pluggable guard to BadRequest and the sort.

Can you elaborate on this a bit? One core design aspect of Tide is that routing happens fully prior to extraction, and hence extraction cannot influence which endpoint is invoked. Bailing out for an endpoint has to happen within the endpoint body (and we eventually intend to provide redirects or other similar functionality).

So to clarify what I meant a bit, we could have a Segments extractor that did not need to have a name statically set up, but where you do the extraction in the body:

fn my_endpoint(segments: Segments) {
    let id: u64 = segments.get("id");
    let name: String = segments.get("name");
    // ...
}

HeroicKatora · 2019-01-23T23:59:59Z

Can you elaborate on this a bit? One core design aspect of Tide is that routing happens fully prior to extraction, and hence extraction cannot influence which endpoint is invoked.

Sure. In case of a denied authorization for example you might want to respond with a fully static 'Denied' error with correct Authorization header. That error response is possible for an extractor because it does not need to influence routing. And for a library, the header should most likely be set by the extractor since it contains information about the authorization method that should be used for example. That is not to say that it should not be possible to do this in the endpoint body but that for the most common cases the extractor would be much more ergonomic.

HeroicKatora · 2019-01-24T00:02:23Z

Bailing out for an endpoint has to happen within the endpoint body (and we eventually intend to provide redirects or other similar functionality).

Can I get some clarification on this? If I understood the blog post correctly and recall it, then this was sometime intended to occur not via the path, i.e. by not going through the router. Is there a reason why such a redirection would be possible for endpoints but not for extractors (especially if seeded such that the intended target can be configured at construction)?

HeroicKatora · 2019-03-01T00:20:42Z

To address the biggest concern of @tirr-c I think I'll do a rebase of this where I leave out the changes to Extract itself and instead introduce a new trait ExtractSeed with a blanket impl for () (or Phantom<T> paralleling closely serde::de::DeserializeSeed).

HeroicKatora · 2019-03-01T00:22:12Z

Is it okay to rebase this in progress? Or should we close and reopen, or something else entirely?

The result looks more clunky for binding it to a route but at least the concerns are separate more strictly. This also removes the need for duplicating the router utilities for methods.

HeroicKatora · 2019-03-01T22:51:47Z

In order to makes this much more convincing, I've written a sample add-on to provide sessions:

https://github.com/HeroicKatora/tide/tree/session
or direct link to example app

This is exactly the use case, where the third-party library wants to provide additional extractors without the configuration living inside generic type arguments (and it should not leak into AppData).

HeroicKatora · 2019-03-02T02:10:49Z

And another one, providing Basic authentication. For this use case, the desire of endpoint specific configuration is probably greater. Don't use this in production but it's more or less okayish and feature complete.

https://github.com/HeroicKatora/tide/tree/authorize
https://github.com/HeroicKatora/tide/blob/authorize/examples/simple.rs

HeroicKatora · 2019-03-04T04:40:12Z

Found no further issues after the rebase, so I've updated the documentation in the pull request. @aturon @tirr-c This completes the rework to address the main weaknesses, so I've also removed the [WIP] tag in the title.

HeroicKatora · 2019-04-10T21:25:00Z

Made irrelevant by #156 which removes the concept to be extended while simultaneously providing a much cleaner approach with core Rust semantics. Third-party crates can add parameter to the extension traits created by them to replace the concept of seeds or pass the Context<_> to the seed.

You could still find the example code for the authentication crates useful.

HeroicKatora mentioned this pull request Jan 20, 2019

Reduce the amount of boilerplate needed to declare partials in the routes #108

Closed

yoshuawuyts added feature A feature that's ready for implementation design Open design question labels Jan 21, 2019

yoshuawuyts requested review from aturon and tirr-c and removed request for aturon January 21, 2019 09:53

HeroicKatora changed the title ~~Extend extractors with an optional seed per endpoint.~~ [WIP] Extend extractors with an optional seed per endpoint. Jan 21, 2019

tirr-c requested changes Jan 22, 2019

View reviewed changes

HeroicKatora added 5 commits March 1, 2019 01:41

Relaxed Extract with uncoupled Self from Output

99a6f84

Example exemplifying seeded extraction

6db9af3

Extraction of segments by runtime name

1099a87

Expand endpoint macros to up to 9 seeded arguments

cd48e8f

Uncouple endpoint construction from routing

3294cc6

The result looks more clunky for binding it to a route but at least the concerns are separate more strictly. This also removes the need for duplicating the router utilities for methods.

HeroicKatora force-pushed the master branch from 8b1a556 to 3294cc6 Compare March 1, 2019 18:07

HeroicKatora changed the title ~~[WIP] Extend extractors with an optional seed per endpoint.~~ Extend extractors with an optional seed per endpoint. Mar 4, 2019

bIgBV mentioned this pull request Mar 6, 2019

Session management #9

Closed

HeroicKatora closed this Apr 10, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend extractors with an optional seed per endpoint. #126

Extend extractors with an optional seed per endpoint. #126

HeroicKatora commented Jan 20, 2019 •

edited

Loading

yoshuawuyts commented Jan 21, 2019

HeroicKatora commented Jan 21, 2019 •

edited

Loading

tirr-c left a comment

tirr-c Jan 22, 2019

HeroicKatora Jan 22, 2019 •

edited

Loading

tirr-c Jan 22, 2019

HeroicKatora Jan 22, 2019 •

edited

Loading

tirr-c Jan 22, 2019

HeroicKatora Jan 22, 2019

aturon commented Jan 23, 2019

HeroicKatora commented Jan 23, 2019 •

edited

Loading

aturon commented Jan 23, 2019

HeroicKatora commented Jan 23, 2019

HeroicKatora commented Jan 24, 2019 •

edited

Loading

HeroicKatora commented Mar 1, 2019 •

edited

Loading

HeroicKatora commented Mar 1, 2019

HeroicKatora commented Mar 1, 2019 •

edited

Loading

HeroicKatora commented Mar 2, 2019

HeroicKatora commented Mar 4, 2019

HeroicKatora commented Apr 10, 2019 •

edited

Loading


		pub struct Header<T>(pub T);

		impl<T: From<http::header::HeaderValue> + Send + 'static, S: 'static> Extract<Header<T>, S> for NamedHeader {

Extend extractors with an optional seed per endpoint. #126

Extend extractors with an optional seed per endpoint. #126

Conversation

HeroicKatora commented Jan 20, 2019 • edited Loading

Description

Motivation and Context

How Has This Been Tested?

Other TODOs

Types of changes

Checklist:

yoshuawuyts commented Jan 21, 2019

HeroicKatora commented Jan 21, 2019 • edited Loading

tirr-c left a comment

Choose a reason for hiding this comment

tirr-c Jan 22, 2019

Choose a reason for hiding this comment

HeroicKatora Jan 22, 2019 • edited Loading

Choose a reason for hiding this comment

tirr-c Jan 22, 2019

Choose a reason for hiding this comment

HeroicKatora Jan 22, 2019 • edited Loading

Choose a reason for hiding this comment

tirr-c Jan 22, 2019

Choose a reason for hiding this comment

HeroicKatora Jan 22, 2019

Choose a reason for hiding this comment

aturon commented Jan 23, 2019

HeroicKatora commented Jan 23, 2019 • edited Loading

aturon commented Jan 23, 2019

HeroicKatora commented Jan 23, 2019

HeroicKatora commented Jan 24, 2019 • edited Loading

HeroicKatora commented Mar 1, 2019 • edited Loading

HeroicKatora commented Mar 1, 2019

HeroicKatora commented Mar 1, 2019 • edited Loading

HeroicKatora commented Mar 2, 2019

HeroicKatora commented Mar 4, 2019

HeroicKatora commented Apr 10, 2019 • edited Loading

HeroicKatora commented Jan 20, 2019 •

edited

Loading

HeroicKatora commented Jan 21, 2019 •

edited

Loading

HeroicKatora Jan 22, 2019 •

edited

Loading

HeroicKatora Jan 22, 2019 •

edited

Loading

HeroicKatora commented Jan 23, 2019 •

edited

Loading

HeroicKatora commented Jan 24, 2019 •

edited

Loading

HeroicKatora commented Mar 1, 2019 •

edited

Loading

HeroicKatora commented Mar 1, 2019 •

edited

Loading

HeroicKatora commented Apr 10, 2019 •

edited

Loading