Add shadow-traffic / traffic mirroring feature #105

Open
softeering opened this issue Apr 25, 2020 · 13 comments
Labels
Type: Idea This issue is a high-level idea for discussion.
Milestone

Comments

@softeering

What should we add or change to make your life better?

Is there a plan to support request mirroring to a second endpoint?
Let's say the reverse-proxy receives a request on 8080. In addition to forwarding it to the configured endpoint, it would "duplicate" the request and send it to a second endpoint (most probably in a fire-and-forget manner).
Configuration could define the percentage of requests to mirror, any request modifications, etc.
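As a rough illustration of the "percentage of requests to mirror" idea, a sampling decision could look something like the sketch below. `MirrorSampler` and `ShouldMirror` are hypothetical names made up for this example; nothing like this exists in the proxy today.

```csharp
using System;

// Hypothetical helper: decides per request whether it should be mirrored.
public sealed class MirrorSampler
{
    private readonly double _fraction;
    private readonly Random _random;

    public MirrorSampler(double percentage) : this(percentage, new Random())
    {
    }

    public MirrorSampler(double percentage, Random random)
    {
        if (percentage < 0 || percentage > 100)
        {
            throw new ArgumentOutOfRangeException(nameof(percentage));
        }

        _fraction = percentage / 100.0;
        _random = random ?? throw new ArgumentNullException(nameof(random));
    }

    public bool ShouldMirror()
    {
        // Random is not thread-safe, so serialize access to it.
        lock (_random)
        {
            // NextDouble() is in [0, 1): 0% never mirrors, 100% always mirrors.
            return _random.NextDouble() < _fraction;
        }
    }
}
```

A proxy would call `ShouldMirror()` once per incoming request and only build the mirror request when it returns true.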

Why is this important to you?

In a production system, when testing a new version, it is very useful to be able to feed it real PROD traffic without impacting the production environment. Being able to shadow some traffic to another fleet of boxes running the new version helps us a lot when releasing high-impact changes to high-throughput services (1M+ requests per second).

@softeering softeering added the Type: Idea This issue is a high-level idea for discussion. label Apr 25, 2020
@Tratcher
Member

Tratcher commented Apr 26, 2020

It's doable in theory. A few things to be careful about:
A) Be very careful mirroring the request body. We stream the request body today (if any), and streaming that to two concurrent destinations has some challenges.
B) Consider when/where we do transformations (e.g. headers, URL, etc.). Make sure the two copies of the request aren't trying to transform the same underlying objects.
C) We need to work out our A/B testing story in general. A/B testing seems like it would be built onto the routing platform (route 20% to route A, 80% to route B), but mirroring complicates that. Maybe mirroring is a separate feature that kicks in after routing, and configured on a specific route, copies everything over to a fake request, and executes it against the mirrored backend. There's some risk that the fake request wouldn't be a 100% accurate copy, especially around the IFeature infrastructure.

@softeering
Author

Interesting points. I agree about the advanced / edge cases. Maybe the mirroring feature would have some limitations (e.g. no modification to the request except the URL).
Regarding C), I think mirroring happening post-routing would make perfect sense for the use case I have in mind at the moment, but it wouldn't work in the case of load balancing, for example (which, again, could just be a known limitation of this specific feature).
Thoughts?

@Tratcher
Member

Load balancing shouldn't be a problem. I'd expect the mirror target to be a separate backend group rather than a specific endpoint instance, and load balancing is per group.

@Tratcher
Member

As for the modifications and such, it's just a matter of caution. We need similar caution for other reasons, such as if we had a retry-on-failure feature.

@samsp-msft
Contributor

Mirroring is something we have talked about a bit as interesting in principle, but more complex in practice.
From the scenario description, I am assuming that responses from the mirror are to be ignored (this is purely about generating load / testing) and that the mirror will handle the data collisions discussed in B).
Would A/B testing with a retry/failover switch be a better way to handle testing a new version? You would assign a percentage of the traffic to the new service and monitor the results. In the case of errors, the proxy would retry against the primary backend(s).

@softeering
Author

In our case, A/B testing with retry / failover wouldn't fit the purpose of what we call the shadow stack, especially because a failure would double the latency from the caller's perspective and add load on the boxes running the app.
Not everyone has the same use case, obviously; I'm just describing ours to give you as many details as possible (some of our apps get ~1M RPS with an SLA usually under 10ms).

@samsp-msft
Contributor

Thank you @softeering, that helps clarify the scenario / use case.

@JackPoint

As mentioned in the linked issue, I'm also looking for something like this to distribute messages to multiple environments. I hope this feature will be considered.

@Tratcher
Member

Tratcher commented Apr 23, 2021

We received out-of-band feedback requesting mirroring support specifically for IHttpProxy. Most of the above concerns apply, but there are fewer components to contend with.

@Towmeykaw

I have been using IHttpProxy and would love a way to have mirroring out of the box. Currently I set up a second HttpMessageInvoker and just copy everything I need onto that, which has worked for the GET requests I'm currently using it for. But I will be using it for POSTs in the future, so if there are pitfalls, a proper way of doing it would be great.

@Tratcher
Member

"Currently I set up a second HttpMessageInvoker and just copy everything I need onto that"

@Towmeykaw can you show a rough outline of your code for that? And where is that called in relation to IHttpProxy?

Here's an outline for how mirroring could be implemented as a DelegatingHandler. It's careful to avoid either request from affecting the other.

        // using System;
        // using System.Net.Http;
        // using System.Threading;
        // using System.Threading.Tasks;

        private class MirrorHandler : DelegatingHandler
        {
            public MirrorHandler(HttpMessageHandler innerHandler) : base(innerHandler)
            {
            }

            protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
            {
                // Build the mirror request up front so the two requests share no mutable state.
                var mirrorRequest = CopyRequest(request);
                RetargetRequest(mirrorRequest);

                // Dispatch the real one so even if it throws we'll still send the mirror request.
                var realTask = Task.Run(() => base.SendAsync(request, cancellationToken));

                // Fire and forget: the mirror's outcome must never affect the real response.
                _ = Task.Run(async () =>
                {
                    try
                    {
                        // TODO: Provide a different cancellation token here:
                        using var mirrorResult = await base.SendAsync(mirrorRequest, cancellationToken);
                        // Report
                    }
                    catch (Exception)
                    {
                        // Report
                    }
                });

                return realTask;
            }

            private HttpRequestMessage CopyRequest(HttpRequestMessage request)
            {
                // Outline only: copy the method, version, headers, and (if any) content.
                throw new NotImplementedException();
            }

            private void RetargetRequest(HttpRequestMessage mirrorRequest)
            {
                // Update RequestUri and the Host header as needed
                throw new NotImplementedException();
            }
        }

I'd only recommend this for use with IHttpProxy. When using the full proxy model there are a lot of other considerations like load balancing, health checks, etc., so mirroring would be implemented as middleware instead. The CopyRequest step is a lot more complicated in middleware because HttpContext has a lot more state.
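For body-less requests (e.g. GET), the copy and retarget steps might be sketched roughly as follows. This is only an illustration, not the proxy's implementation: `MirrorRequestHelper`, `CopyWithoutBody`, and the `mirrorBase` parameter are hypothetical names, and the sketch assumes the original request has an absolute RequestUri.

```csharp
using System;
using System.Net.Http;

public static class MirrorRequestHelper
{
    // Copies method, version, and headers onto a new message aimed at mirrorBase.
    // Content is deliberately NOT copied, so this only suits body-less requests.
    public static HttpRequestMessage CopyWithoutBody(HttpRequestMessage request, Uri mirrorBase)
    {
        if (request.RequestUri == null || !request.RequestUri.IsAbsoluteUri)
        {
            throw new ArgumentException("An absolute RequestUri is required.", nameof(request));
        }

        // Keep the original path and query, but swap in the mirror's scheme/host/port.
        var target = new Uri(mirrorBase, request.RequestUri.PathAndQuery);

        var copy = new HttpRequestMessage(request.Method, target)
        {
            Version = request.Version
        };

        foreach (var header in request.Headers)
        {
            // Skip Host so it is derived from the new target URI instead.
            if (string.Equals(header.Key, "Host", StringComparison.OrdinalIgnoreCase))
            {
                continue;
            }

            copy.Headers.TryAddWithoutValidation(header.Key, header.Value);
        }

        return copy;
    }
}
```

Because the copy is a separate HttpRequestMessage with its own header collection, later changes to either request do not leak into the other.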

As for POSTs, that's where things get hard. The simplest approach would be to pre-buffer the body and attach a copy to each request. This is problematic for a couple of reasons:

  • It consumes a lot of memory.
  • It adds latency to the request: you can't forward the request until you've received the whole body.
  • It breaks 100-continue logic: the destination may refuse the body (e.g. with a 401), but by then it has already been read from the client.

These can be mitigated with some complicated streams that stream content to both destinations, but I don't think you can fully insulate one request from the other in this scenario. For example, if the real request is rejected or fails, the client may abort sending the body, causing the mirror request to fail differently.
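A minimal sketch of the simple pre-buffering approach, with the drawbacks listed above still applying, might look like this. `BodyBuffering`, `BufferForBothAsync`, and the `maxBytes` cap are hypothetical names introduced for illustration; the cap is just one way to bound the memory cost.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class BodyBuffering
{
    // Reads the whole body into memory (up to maxBytes) and returns two
    // separate HttpContent instances, one for each outgoing request.
    public static async Task<(HttpContent Real, HttpContent Mirror)> BufferForBothAsync(Stream body, int maxBytes)
    {
        using var buffer = new MemoryStream();
        var chunk = new byte[8192];
        int read;

        while ((read = await body.ReadAsync(chunk, 0, chunk.Length)) > 0)
        {
            if (buffer.Length + read > maxBytes)
            {
                throw new InvalidOperationException("Request body exceeds the mirroring buffer limit.");
            }

            buffer.Write(chunk, 0, read);
        }

        // Both contents read from the same byte array; neither one modifies it.
        var bytes = buffer.ToArray();
        return (new ByteArrayContent(bytes), new ByteArrayContent(bytes));
    }
}
```

Nothing is forwarded until the loop finishes, which is exactly the added latency described above, so this is probably only reasonable for small bodies such as short JSON messages.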

@Towmeykaw

Towmeykaw commented May 3, 2021

@Tratcher For my first attempt I was very careful not to affect the proxy, as I was running against production data. So I just created a separate HttpMessageInvoker and called it after the real request. This is a very basic setup which was just to get one feature tested, but in a few weeks I will probably have to set it up for POSTs, so I will run some tests with the DelegatingHandler and pre-buffering. It might work for my use case, as the POST bodies are usually just small JSON messages.

var httpClient = new HttpMessageInvoker(new SocketsHttpHandler()
{
    UseProxy = false,
    AllowAutoRedirect = false,
    AutomaticDecompression = DecompressionMethods.None,
    UseCookies = false
});

var mirrorHttpClient = new HttpMessageInvoker(new SocketsHttpHandler()
{
    UseProxy = false,
    AllowAutoRedirect = false,
    AutomaticDecompression = DecompressionMethods.None,
    UseCookies = false
});

app.UseEndpoints(endpoints =>
{
    endpoints.Map("/{**catch-all}", async httpContext =>
    {
        await httpProxy.ProxyAsync(
            httpContext, "http://" + GetRouteFromDomain(ParseDomain(httpContext)), httpClient,
            new RequestProxyOptions {Timeout = TimeSpan.FromSeconds(100)});

        if (unleash.IsEnabled("MirrorTraffic"))
        {
            await mirrorHttpClient.SendAsync(new HttpRequestMessage(new HttpMethod(httpContext.Request.Method), GetMirrorUrl() + httpContext.Request.QueryString), CancellationToken.None);
        }                    
    });
});

@Tratcher
Member

Tratcher commented May 3, 2021

@Towmeykaw

  • There are still some possible side-effects there.
    • Chunked responses aren't completed until the pipeline unwinds or you call httpContext.Response.CompleteAsync(). The client may end up waiting for the end of the response while the mirror request executes. If the mirror request throws, it could cause the original response to be aborted.
    • That said, you can't use Task.Run at this layer because it would cause multi-threaded access to HttpContext, which is not safe.
    • You should pass the httpContext.RequestAborted cancellation token to ProxyAsync to cancel proxy operations if the client disconnects.
    • If ProxyAsync throws then the mirror request won't execute. (However, ProxyAsync doesn't throw in most situations).
  • For the mirror request itself, by creating your own request you can end up with significantly different output from what the proxy sends.
    • You're not copying request headers
    • IHttpProxy defaults to HTTP/2, but HttpRequestMessage defaults to HTTP/1.1.
    • It needs its own timeout
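To illustrate the "own timeout" and "mirror failures must not affect the original response" points, here is one possible sketch. `MirrorSender` and `SendIsolatedAsync` are hypothetical names; the `send` delegate is assumed to be something like `mirrorHttpClient.SendAsync` from the snippet above.

```csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class MirrorSender
{
    // Sends the mirror request with its own timeout and never lets a failure
    // escape to the caller. Returns true if a response was received.
    public static async Task<bool> SendIsolatedAsync(
        Func<HttpRequestMessage, CancellationToken, Task<HttpResponseMessage>> send,
        HttpRequestMessage mirrorRequest,
        TimeSpan timeout)
    {
        // Independent token: not tied to the client's RequestAborted token.
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            using var response = await send(mirrorRequest, cts.Token);
            // Report the status code here if desired.
            return true;
        }
        catch (Exception)
        {
            // Report; timeouts and network errors are swallowed on purpose.
            return false;
        }
    }
}
```

It could be called as `await MirrorSender.SendIsolatedAsync(mirrorHttpClient.SendAsync, mirrorRequest, TimeSpan.FromSeconds(5))`. Awaiting it inline still delays completion of a chunked response, so completing the original response first (httpContext.Response.CompleteAsync()) would likely still be needed.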

Projects
Status: 📋 Backlog
Development

No branches or pull requests

8 participants