Retry of "Safe" HTTP requests #56
A big thing to consider here is request "safety" and idempotency. A "safe" request is one which does not modify server state (other than diagnostic/tracing data), such as GET, HEAD, and OPTIONS. An idempotent request is one which can be made several times but will only ever have the effect once. All safe requests are idempotent, as are properly-written PUT and DELETE requests. We'll want our retry to be conservative by default and likely only retry safe requests. When we get an error response from a backend we don't know if the request was partially completed, and state may be corrupted such that it's unsafe to retry. Safe requests don't modify state and are thus safe to retry indefinitely. We could have options in the future to allow retry on other kinds of request or on certain response errors, but I think those are out of scope for v1. So the v1 feature would be retrying of safe requests.
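The method-based safety rule described above could be illustrated with a small helper. This is a hypothetical sketch, not YARP code; the type and member names are made up for illustration:

```csharp
using System;

static class RetrySafety
{
    // HTTP methods defined as "safe" by RFC 7231: they must not modify
    // server state, so retrying them cannot cause duplicate side effects.
    private static readonly string[] SafeMethods = { "GET", "HEAD", "OPTIONS", "TRACE" };

    public static bool IsSafeToRetry(string method) =>
        Array.Exists(SafeMethods, m =>
            string.Equals(m, method, StringComparison.OrdinalIgnoreCase));
}
```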
Open questions:
Triage: We decided to not implement it for 1.0, moving to Future.
Quick, high level idea
{
"Clusters": {
"cluster1": {
"RetryPolicy": {
"Handler": "SomeNiceHandlerName",
"RetryOnStatusCode": ["401", "5xx"],
"RetryOnException": ["System.Net.WebException", "*"],
"RetryAfter": ["00:00:01", "00:00:03", "00:00:12"],
"When": {
"Method": ["GET", "HEAD", "OPTIONS", "*"]
}
}
}
}
}

Where would this hook in? ProxyInvokerMiddleware.Invoke, the code around "await _httpProxy.ProxyAsync"?
I expect this will be in its own middleware, like load balancing. That allows you to position it before or after load balancing and potentially retry on a different destination (naturally or forcefully selected).
That's already handled by the passive health checks middleware. With retry as its own middleware we can ensure passive health checks see each failure before we retry it.
Sorry for the late reply. I think you are correct; the current architecture seems very good and it should be possible to achieve a retry policy with a special "retry middleware" (leaving the ProxyInvokerMiddleware and HttpProxy components as-is).
@vdjuric any chance you could add a sample with retry logic? I'm just getting started with this framework and I have no idea where to add the retry logic. A big plus would be if it was doable with the
Hi @AnderssonPeter, in your Startup.cs Configure method you would have to register a custom "RetryMiddleware" as follows:

app.UseEndpoints(endpoints =>
{
endpoints.MapReverseProxy(app =>
{
app.UseAffinitizedDestinationLookup();
app.UseProxyLoadBalancing();
app.UsePassiveHealthChecks();
app.UseMiddleware<RetryMiddleware>(); //add custom retry logic to chain
});
});

where RetryMiddleware is a standard ASP.NET Core middleware. The implementation is still on my TODO list :) Kind regards,
@vdjuric I don't think this approach will work because:
It would require buffering the request body. It should be noted that requests with bodies are generally considered unsafe to retry anyway, given they are likely to cause side effects and you don't know how far their processing got the first time. E.g. you wouldn't want to double-charge someone's credit card.
I think the overhead and unsafe retries are acceptable if the user explicitly enables this behavior from configuration for a specific endpoint. This behavior would be disabled by default. If the user enables it, possible duplicated requests should be properly handled by the server (for example, you could introduce a unique request id (sent by the client) and the server would know if the request was already processed).
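The unique-request-id idea above is commonly implemented with an idempotency key. A minimal server-side sketch, where the class, key format, and in-memory cache are all illustrative assumptions rather than anything YARP provides:

```csharp
using System;
using System.Collections.Concurrent;

// Caches the first response produced for each client-supplied key, so a
// retried request carrying the same key returns the stored result instead
// of being processed a second time.
class IdempotencyCache
{
    private readonly ConcurrentDictionary<string, string> _responses = new();

    public string GetOrProcess(string idempotencyKey, Func<string> process) =>
        _responses.GetOrAdd(idempotencyKey, _ => process());
}
```

A real implementation would also need expiry and durable storage; this only shows the basic shape of the deduplication.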
I agree with @vdjuric, it should be off by default. We want to retry requests that result in intermittent failures. Our downstream services return a custom HTTP code when an intermittent failure occurs, in which case the client is responsible for retrying the request. Also, buffering can be avoided if the HTTP request object can be made available to the retry middleware. HttpProxy is passing the request object to the HTTP client here:
How will that help avoid buffering the request body?
You are right, it's not possible. I just read through
Had to roll my own middleware for my specific use case. Not sure if this is the optimal solution; any feedback is appreciated :)

Use case
Implementation
Code

public class Startup
{
public Startup(IConfiguration configuration)
{
Configuration = configuration;
}
public IConfiguration Configuration { get; }
public void ConfigureServices(IServiceCollection services)
{
var proxyBuilder = services.AddReverseProxy();
proxyBuilder.LoadFromConfig(Configuration.GetSection("ReverseProxy"));
}
// This method gets called by the runtime. Use this method to configure the HTTP request
// pipeline that handles requests
public void Configure(IApplicationBuilder app, IWebHostEnvironment env)
{
app.UseRouting();
app.UseEndpoints(endpoints =>
{
endpoints.MapReverseProxy(m =>
{
m.UsePassiveHealthChecks();
m.UseMiddleware<CustomForwarderMiddleware>();
});
});
}
}

CustomForwarderMiddleware

public class CustomForwarderMiddleware
{
private readonly RequestDelegate _next;
private const string RedirectToSecondaryDestination = "x-redirect-secondary-destination";
public CustomForwarderMiddleware(RequestDelegate next)
{
_next = next;
}
public async Task InvokeAsync(HttpContext context, IDestinationHealthUpdater healthUpdater)
{
#region Prepare
var route = context.GetRouteModel();
var cluster = route.Cluster!;
#endregion
#region Read Request Content
// We have to use the request content twice so parse and store.
var reader = new StreamReader(context.Request.Body, Encoding.UTF8, bufferSize: 1024);
var requestPayload = await reader.ReadToEndAsync();
reader.Dispose();
#endregion
#region Find Healthy Destination
var reverseProxyFeature = context.GetReverseProxyFeature();
// Find destination which is not marked as unhealthy.
var healthyDestination = reverseProxyFeature
.AllDestinations
.Where(m => m.Health.Passive != DestinationHealth.Unhealthy)
.OrderBy(m => m.DestinationId)
.ToList();
// Get the first destination; we should always have at least one healthy destination.
var destinationConfig = healthyDestination[0];
#endregion
#region Prepare Cancellation Token
var requestAborted = context.RequestAborted;
var requestTimeoutSource = CancellationTokenSource.CreateLinkedTokenSource(requestAborted);
requestTimeoutSource.CancelAfter(TimeSpan.FromSeconds(50));
var requestTimeoutToken = requestTimeoutSource.Token;
#endregion
try
{
// Build the first request.
var request = await GetRequestMessage(requestPayload, route, context, destinationConfig);
// Call downstream service.
var responseOne = await reverseProxyFeature.Cluster.HttpClient.SendAsync(request, requestTimeoutToken);
// If downstream service returns http 500 or response contains RedirectToSecondaryDestination header.
// Then we have to mark destination_1 as unhealthy and retry request using secondary destination.
var redirectToDestination2 = responseOne.Headers.Contains(RedirectToSecondaryDestination);
if (((int)responseOne.StatusCode) >= 500 || redirectToDestination2)
{
// Step 1 - Mark first destination as unhealthy for 16 minutes.
if (destinationConfig.DestinationId.Equals("destination_1"))
healthUpdater.SetPassive(cluster, destinationConfig, DestinationHealth.Unhealthy, TimeSpan.FromMinutes(16));
// Step 2 - Build request and try
var secondaryDestination = healthyDestination.FirstOrDefault(m => m.DestinationId.Equals("destination_2"));
if (secondaryDestination != null)
{
// Call downstream service using secondary destination.
var httpRequest = await GetRequestMessage(requestPayload, route, context, secondaryDestination);
var secondaryResponse = await reverseProxyFeature.Cluster.HttpClient.SendAsync(httpRequest, requestTimeoutToken);
// Transform request and return to client.
context.Response.StatusCode = (int)secondaryResponse.StatusCode;
await route.Transformer.TransformResponseAsync(context, secondaryResponse);
await secondaryResponse.Content.CopyToAsync(context.Response.Body, requestTimeoutToken);
}
else
{
context.Response.StatusCode = StatusCodes.Status500InternalServerError;
}
}
else
{
context.Response.StatusCode = (int)responseOne.StatusCode;
await route.Transformer.TransformResponseAsync(context, responseOne);
await responseOne.Content.CopyToAsync(context.Response.Body, requestTimeoutToken);
}
}
catch (OperationCanceledException canceledException)
{
if (!requestAborted.IsCancellationRequested && requestTimeoutToken.IsCancellationRequested)
{
context.Response.Clear();
context.Response.StatusCode = StatusCodes.Status504GatewayTimeout;
return;
}
context.Response.StatusCode = StatusCodes.Status502BadGateway;
}
catch (Exception requestException)
{
//TODO
}
finally
{
requestTimeoutSource.Dispose();
}
}
public async Task<HttpRequestMessage> GetRequestMessage(string payload, RouteModel route, HttpContext context, DestinationState destinationConfig)
{
var httpRequest = new HttpRequestMessage
{
Method = GetMethod(context.Request.Method),
Version = HttpVersion.Version11,
VersionPolicy = HttpVersionPolicy.RequestVersionOrLower,
Content = new StringContent(payload, Encoding.UTF8)
};
// Builds RequestUri and Copies Headers
await route.Transformer.TransformRequestAsync(context, httpRequest, destinationConfig.Model.Config.Address);
return httpRequest;
}
#region GetMethod
private static HttpMethod GetMethod(string method)
{
if (HttpMethods.IsDelete(method)) return HttpMethod.Delete;
if (HttpMethods.IsGet(method)) return HttpMethod.Get;
if (HttpMethods.IsHead(method)) return HttpMethod.Head;
if (HttpMethods.IsOptions(method)) return HttpMethod.Options;
if (HttpMethods.IsPost(method)) return HttpMethod.Post;
if (HttpMethods.IsPut(method)) return HttpMethod.Put;
if (HttpMethods.IsTrace(method)) return HttpMethod.Trace;
return new HttpMethod(method);
}
#endregion
}

appsettings.json

{
"Logging": {
"LogLevel": {
"Default": "Debug",
"Microsoft": "Debug",
"Microsoft.Hosting.Lifetime": "Debug",
"Yarp": "Debug"
}
},
"AllowedHosts": "*",
"ReverseProxy": {
"Routes": {
"Route8": {
"ClusterId": "api1",
"Match": {
"Methods": [ "Post", "GET" ],
"Path": "/api1/v1/{*s}"
},
"Transforms": [
{
"PathSet": "/posts"
},
{
"ResponseHeader": "Access-Control-Expose-Headers",
"Append": "IsSuccessful",
"When": "Always"
},
{
"ResponseHeader": "Access-Control-Max-Age",
"Append": "86400",
"When": "Always"
}
]
}
},
"Clusters": {
"api1": {
"Destinations": {
"destination_1": {
// this destination generates http 500
"Address": "https://webhook.site/5166e19c-9665-4db9-a05a-a7fb800a1cf4"
},
"destination_2": {
"Address": "https://myapi1.net"
}
}
}
}
}
}
@Henery309 glad you got it working. Some feedback:
@Tratcher thanks for your feedback, really appreciate it. Based on the feedback I am trying to simplify my solution, but when retrying on the second destination I am getting an error. Here is my second approach.

1 - Retry middleware is added at the beginning of all middleware.

public void Configure(IApplicationBuilder app)
{
app.UseMiddleware<RetryMiddleware>();
app.UseRouting();
app.UseEndpoints(endpoints =>
{
endpoints.MapReverseProxy();
});
}

2 - SuppressResponseBody is set to true if an error occurs.

public void ConfigureServices(IServiceCollection services)
{
var proxyBuilder = services.AddReverseProxy().AddTransforms(m =>
{
m.AddResponseTransform(context =>
{
// Suppress the response body from errors.
// The status code was already copied.
if ((int)context.ProxyResponse.StatusCode >= 500)
context.SuppressResponseBody = true;
return default;
});
});
proxyBuilder.LoadFromConfig(Configuration.GetSection("ReverseProxy"));
}

3 - I rely on the health check middleware to mark the failed destination as unhealthy.

"Clusters": {
"api1": {
"HttpClient": {
"DangerousAcceptAnyServerCertificate": true
},
"HttpRequest": {
"Timeout": "00:01:00",
"Version": "1.1"
},
"HealthCheck": {
"Passive": {
"Enabled": "true",
"Policy": "TransportFailureRate",
"ReactivationPeriod": "00:16:00"
}
},
"Metadata": {
"TransportFailureRateHealthPolicy.MinimalTotalCountThreshold": "1",
"TransportFailureRateHealthPolicy.DefaultFailureRateLimit": "0.1",
"TransportFailureRateHealthPolicy.DetectionWindowSize": "15"
},
"LoadBalancingPolicy": "FirstAlphabetical",
"Destinations": {
"destination_1": {
// this destination generates http 500
"Address": "https://webhook.site/5166e19c-9665-4db9-a05a-a7fb800a1cf4"
},
"destination_2": {
"Address": "https://localhost:5001"
}
}
}
}

4 - Finally, in my retry middleware, if the downstream returns HTTP 500 or above I am using IHttpForwarder to rerun the same request. Retrying is resulting in an error.

public async Task InvokeAsync(HttpContext context)
{
context.Request.EnableBuffering();
await _next(context);
var statusCode = context.Response.StatusCode;
if (statusCode >= 500)
{
var reverseProxyFeature = context.GetReverseProxyFeature();
// Get available destinations
var healthyDestination = reverseProxyFeature
.AvailableDestinations
.ToList();
// Only retry if we have a destination available.
if (healthyDestination.Count > 0)
{
// Get configs of healthy destination
var destinationConfig = healthyDestination[0];
// Snapshot the request so we can replay it.
var clusterConfig = reverseProxyFeature.Cluster;
// Current route config needed to replay the request.
var routeConfig = context.GetRouteModel();
// Finally forward request to secondary destination. If all goes well the request will be successful.
await _httpForwarder.SendAsync(context, destinationConfig.Model.Config.Address, clusterConfig.HttpClient, clusterConfig.Config.HttpRequest, routeConfig.Transformer);
}
}
}
Thanks @Tratcher, it worked. I actually did not know that you can retry middleware by just calling _next again; I thought pipelines only run in sequence in one direction. Not sure if this should be added to https://docs.microsoft.com/en-us/aspnet/core/fundamentals/middleware/?view=aspnetcore-5.0#middleware-order. As for the retry middleware, this is my final code.

1 - Startup.cs

public void Configure(IApplicationBuilder app)
{
app.UseRouting();
app.UseEndpoints(endpoints =>
{
endpoints.MapReverseProxy(m =>
{
m.UseMiddleware<RetryMiddleware>();
m.UseLoadBalancing();
m.UsePassiveHealthChecks();
});
});
}

2 - RetryMiddleware

public class RetryMiddleware
{
private readonly RequestDelegate _next;
public RetryMiddleware(RequestDelegate next)
{
_next = next;
}
public async Task InvokeAsync(HttpContext context)
{
context.Request.EnableBuffering();
// Proceed to next middleware
await _next(context);
var statusCode = context.Response.StatusCode;
if (statusCode >= 500)
{
var reverseProxyFeature = context.GetReverseProxyFeature();
// Get available destinations
var healthyDestination = reverseProxyFeature
.AllDestinations
.Where(m => m.Health.Passive != DestinationHealth.Unhealthy)
.ToList();
// Set the available destination to available healthy destination.
reverseProxyFeature.AvailableDestinations = healthyDestination;
context.Request.Body.Position = 0;
// Rerun middleware, this will re-run LoadBalancingMiddleware PassiveHealthCheckMiddleware and ForwarderMiddleware
await _next(context);
}
}
}
Much better. A few more suggestions (coding in the browser, may not compile 😁):

public class RetryMiddleware
{
private readonly RequestDelegate _next;
public RetryMiddleware(RequestDelegate next)
{
_next = next;
}
public async Task InvokeAsync(HttpContext context)
{
context.Request.EnableBuffering();
var reverseProxyFeature = context.GetReverseProxyFeature();
var available = reverseProxyFeature.AvailableDestinations;
// Proceed to next middleware
await _next(context);
var statusCode = context.Response.StatusCode;
if (statusCode >= 500)
{
// Update available destinations to exclude the one we just tried.
var healthyDestinations = available
.Where(m => m != reverseProxyFeature.ProxiedDestination)
.ToList();
if (healthyDestinations.Count == 0)
{
return;
}
// Set the available destination to available healthy destination.
reverseProxyFeature.AvailableDestinations = healthyDestinations;
context.Request.Body.Position = 0;
reverseProxyFeature.ProxiedDestination = null;
// Rerun middleware, this will re-run LoadBalancingMiddleware PassiveHealthCheckMiddleware and ForwarderMiddleware
await _next(context);
}
}
}
Hi @Tratcher,
After this line the RouteValues are cleared.
As a workaround I am storing the route values in a Dictionary<string, string> and then re-populating them in context.Request.RouteValues just before the

Following is my config:

"Routes": {
"Route8": {
"ClusterId": "api1",
"Match": {
"Methods": [ "Post" ],
"Path": "myctrl/v1/{*routeName}"
},
"Transforms": [
{
"PathPattern": "/{routeName}"
}
]
}
}
Oh, I didn't know the binder removed matched values from the collection. That's not great. Can you please make a separate issue for that? We don't want the transforms to have side effects on the original request. Making a backup copy of the route values is your best mitigation for now.
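The backup-copy mitigation suggested above could be sketched like this inside a custom middleware. This is a hypothetical snippet under the thread's assumptions (the transform clears RouteValues during the first attempt), not an official workaround API:

```csharp
// Snapshot the route values before the first proxy attempt...
var backup = new Dictionary<string, object?>(context.Request.RouteValues);

await _next(context); // the PathPattern transform may clear RouteValues as a side effect

// ...and restore them before replaying the request.
foreach (var (key, value) in backup)
{
    context.Request.RouteValues[key] = value;
}
```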
Triage: We should limit it to safe HTTP methods without a body.
@karelz GraphQL has a use case where POSTs can be safe requests. Any request body without a mutation would be considered safe. I think it would be good to try to support GraphQL along with traditional REST services.
@divinebovine well, POST is in general NOT safe, right? How would one distinguish a GraphQL-safe POST from one that isn't?
The other reason we'd avoid POST is that it would require buffering every request body. That's more overhead than we'd want to introduce. Since most aren't safe anyway, that shouldn't be a major loss.
@karelz I agree that POST is generally NOT safe per the HTTP specifications. However, GraphQL-over-HTTP implementations like Apollo require all queries (safe) and mutations (unsafe) with variables to be sent as a POST. I understand not wanting to do it like @Tratcher mentioned, though it would be nice to be able to specify a retry middleware for certain routes that could inspect the bodies and determine whether a retry would be safe.
While we don't have a built-in retry feature, I'm also trying to make an HTTP POST body scenario work by using the RetryMiddleware posted by @Tratcher together with the SuppressResponseBody transform code (originally posted by @Henery309).
The problem is that when we call the middleware again in case of errors (HTTP 500):
It throws an exception saying the response already started.
Shouldn't context.SuppressResponseBody = true; prevent that response from being written? Any insights here will be appreciated.
Here's the actual check for that error: reverse-proxy/src/ReverseProxy/Forwarder/RequestUtilities.cs, lines 419 to 423 in c664310.
I suggest calling HttpResponse.Clear() before retrying. SuppressResponseBody prevents the body from being proxied, but the status and headers are still copied and should be cleared out. |
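The suggestion above could look like this inside the retry middleware. A sketch only, assuming the response body was suppressed by a transform so the response has not actually started streaming:

```csharp
if (context.Response.StatusCode >= 500)
{
    // Discard the status code and headers copied from the failed attempt;
    // Clear() only works while the response has not started.
    context.Response.Clear();

    // Rewind the buffered request body so it can be resent.
    context.Request.Body.Position = 0;

    await _next(context);
}
```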
Hi @andredewes, have you fixed your problem? My problem looks the same. Here is my code:
public async Task InvokeAsync(HttpContext context)
{
context.Request.EnableBuffering();
await _next(context); // returns 500
context.Response.Clear();
await _forwarder.SendAsync(context, "http://localhost:5002", _client); // calls an external service which returns 200
}
[HttpGet]
public ActionResult Get()
{
return new StatusCodeResult(500);
}
builder.Services.AddReverseProxy().AddTransforms(m =>
{
m.AddResponseTransform(context =>
{
context.SuppressResponseBody = true;
return default;
});
});
Another thought: my case is slightly different, because the first attempt is processed in a local controller, and it only forwards to the external service if that fails. If that is the reason, may I ask what the correct practice is then?
Hi @Elderry and @Tratcher, I realized the issue in the code @Elderry posted is that YARP will suppress all responses in the way the transform is currently configured. You will need to add some logic to stop it when the response YARP receives is the one you want to send back to your client. Something like:

builder.Services.AddReverseProxy()
.LoadFromConfig(builder.Configuration.GetSection("ReverseProxy"))
.AddTransforms(builderContext =>
{
builderContext.AddResponseTransform(transformContext =>
{
transformContext.SuppressResponseBody = transformContext.HttpContext.Response.StatusCode == 429;
return default;
});
});

In this case only 429s are suppressed and retried, or you could add some more detailed logic to stop within a certain retry count.
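The bounded-retry idea mentioned above could be sketched as a loop in the middleware. This is a hypothetical middleware body with the status handling simplified; the attempt limit and the 500-or-above condition are assumptions, not thread code:

```csharp
public async Task InvokeAsync(HttpContext context)
{
    const int MaxAttempts = 3; // illustrative cap

    // Buffer the body so it can be replayed on each attempt.
    context.Request.EnableBuffering();

    for (var attempt = 1; attempt <= MaxAttempts; attempt++)
    {
        await _next(context);

        // Stop when the response is acceptable or we are out of attempts.
        if (context.Response.StatusCode < 500 || attempt == MaxAttempts)
            break;

        // Assumes the body was suppressed, so the response has not started.
        context.Response.Clear();
        context.Request.Body.Position = 0;
    }
}
```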
when a request fails - retry against a different server