Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add information to indicate a SignalR request is long running #29699

Closed
shirhatti opened this issue Jan 27, 2021 · 8 comments · Fixed by #32084
Closed

Add information to indicate a SignalR request is long running #29699

shirhatti opened this issue Jan 27, 2021 · 8 comments · Fixed by #32084
Assignees
Labels
area-signalr Includes: SignalR clients and servers
Milestone

Comments

@shirhatti
Copy link
Contributor

shirhatti commented Jan 27, 2021

To allow APMs (e.g., Application Insights) to treat long running SignalR requests differently from the their request statistics, we should set some information (I presume via an Activity.Tag?) on current Activity to indicate a request is long running.

Edit: After more discussion with App Insights, we think headers will work.

cc @cijothomas

@shirhatti shirhatti added the area-signalr Includes: SignalR clients and servers label Jan 27, 2021
@BrennanConroy BrennanConroy added this to the Next sprint planning milestone Jan 27, 2021
@ghost
Copy link

ghost commented Jan 27, 2021

Thanks for contacting us.
We're moving this issue to the Next sprint planning milestone for future evaluation / consideration. We will evaluate the request when we are planning the work for the next milestone. To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@halter73 halter73 changed the title Add information to current Activity to indicate a SignalR request is long running Add information to current Activity or Response headers to indicate a SignalR request is long running Apr 12, 2021
@halter73
Copy link
Member

halter73 commented Apr 12, 2021

We're also considering using response headers (and possibly request headers if we want to wait to flush response headers for long polling or something) to indicate that a request is expected to be long-running. An upside to using a header instead of (or maybe in addition to) adding info to the current Activity is that it could also be used by reverse proxies.

Long-Running: true

@halter73 halter73 changed the title Add information to current Activity or Response headers to indicate a SignalR request is long running Add information to Response headers to indicate a SignalR request is long running Apr 12, 2021
@BrennanConroy BrennanConroy self-assigned this Apr 21, 2021
@shirhatti
Copy link
Contributor Author

My current proposal is that we don't take the current PR to add a response header. Given that Application Insights didn't find it useful and I know of nobody else clamoring for it, I don't think it's worth it.

I have two proposals on how to proceed here listed in the order of my preference:

Close the span prematurely

For example if /websocket is an endpoint that results in a connection upgrade, I'm now considering the HTTP request completed after the response headers indicating the upgrade have been flushed. Any subsequent data sent over the websocket wouldn't be part of the duration of the initial span (Activity) created for the HTTP request.

When SignalR now uses the established WebSocket as a channel, it can create a new Span for each RPC invocation (with possibly a Span link to the original associated with the incoming HTTP request). While this bit this optional for now, this is how envision it working if we went all the way and made SignalR span-aware.

endpoints.MapGet("/", async context =>
{
    using var webSocket = await context.WebSockets.AcceptWebSocketAsync();
    context.Features.Get<IActivityFeature>().Activity.Stop();
    await webSocket.ReceiveAsync(...);
    // do more stuff
});

For non-websocket transports, close the span whenever the determination is made that this request will be long-running.

While this helps ActivityListener based instrumentation, Application Insights may not benefit. While they will see shortened (in duration) spans, they don't subscribe to the ActivityListener based Activity stopped callback. They'll only see it when the DiagnosticsListener callback is fired when the long-lived WebSocket is closed/dies.

Add a new attribute (Activity.AddTag())

Invent a new attribute, not present in the HTTP Semantic conventions.
This may just result in crud since:

  • No APM vendor will ever elect to use this; OR
  • The HTTP semantic convention decides to actually address this and create a new attribute at which point we'll be stuck emitting both
endpoints.MapGet("/", async context =>
{
    using var webSocket = await context.WebSockets.AcceptWebSocketAsync();
    context.Features.Get<IActivityFeature>().Activity.AddTag("http.long_running", true);
    await webSocket.ReceiveAsync(...);
    // do more stuff
});

NOTE: The Semantic convention specification does define attributes for RPC, but these are for a spans created per-RPC invocation. There is nothing defined for the span associated with creating the channel used by the RPC framework.

@shirhatti
Copy link
Contributor Author

@noahfalk Do you know the GitHub aliases of other APM vendors? Can we tag them on here for feedback?

@noahfalk
Copy link
Member

noahfalk commented May 5, 2021

@davmason would know the best : )

@davmason
Copy link
Member

davmason commented May 5, 2021

Off the top of my head @pjanotti @macrogreg @zacharycmontoya @discostu105 @d-schneider are all people I engage with frequently. It would make sense to add a post at dotnet/runtime#9305 too if you want more APM vendors to see it.

@shirhatti
Copy link
Contributor Author

shirhatti commented May 11, 2021

@halter73, @BrennanConroy, @davidfowl, and I met to discuss this.

We've decided:

  • Not to emit a response header
  • Not to close the span prematurely. It leads to loss of fidelity in data. Though we agreed that in context of a RPC, the request that establishes the underlying channel for RPCs isn't the causal parent, there's no need to nerf it. If and when we have per-RPC call instrumentation, we'll change the causal parent for such Activities to the remote parent.
  • We'll be adding a new Activity Tag (i.e., span attribute) to the Activity to indicate that the request is long running. Given that the OpenTelemetry semantic conventions don't have a defined tag, we'll be inventing a new tag (we welcome feedback here!)
  • We'll add documentation/samples showcasing how to write a OpenTelemetry filtering processor to drop Activities with the specified tag.

👀 @cijothomas @rajkumar-rangaraj

@cijothomas
Copy link

@noahfalk Do you know the GitHub aliases of other APM vendors? Can we tag them on here for feedback?

@alanwest from NewRelic.

@BrennanConroy BrennanConroy changed the title Add information to Response headers to indicate a SignalR request is long running Add information to indicate a SignalR request is long running Jun 7, 2021
@ghost ghost locked as resolved and limited conversation to collaborators Jul 7, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-signalr Includes: SignalR clients and servers
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants