-
Notifications
You must be signed in to change notification settings - Fork 863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Telemetry Consumption package #571
Conversation
src/Microsoft.ReverseProxy.TelemetryConsumption/IReverseProxyBuilderTelemetryExtensions.cs
Outdated
Show resolved
Hide resolved
return; | ||
} | ||
|
||
foreach (var consumer in _serviceProvider.GetServices<IProxyMetricsConsumer>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we consider for consuming in parallel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These callbacks execute inline within the request, so the expected scenario is to have very few (just 1) consumer and for it to be very cheap and execute quickly.
I don't believe we would gain anything from parallelizing consumers (and the cost of it would likely outweigh any benefit).
If you were only referring to the metrics part, we could consider pushing that work onto the ThreadPool instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments.
|
||
const int TotalEventCounters = 10; | ||
|
||
if (++_eventCountersCount == TotalEventCounters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please move the increment out of if
expression? It seems a bit unusual.
return; | ||
} | ||
|
||
foreach (var consumer in _serviceProvider.GetServices<IKestrelMetricsConsumer>()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understood it correctly, IKestrelMetricsConsumer
will be resolved for each EventWrittenEventArgs
which seems a lot. Would it be better to resolve it once and store on a field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the metrics, we're resolving the consumers on each EventCounters update - so once a second, which doesn't seem problematic.
For events, we are resolving it for each event.
Since we're relying on DI for lifetime management, we have to do it each time. If we documented that we disallow transient consumer services (since they don't make much sense to use for events anyway), we could avoid the cost on each event and instead use an AsyncLocal
to cache the resolved services.
src/ReverseProxy.TelemetryConsumption/NameResolution/NameResolutionEventListenerService.cs
Show resolved
Hide resolved
src/ReverseProxy.TelemetryConsumption/NameResolution/NameResolutionEventListenerService.cs
Show resolved
Hide resolved
src/ReverseProxy.TelemetryConsumption/NameResolution/NameResolutionEventListenerService.cs
Outdated
Show resolved
Hide resolved
src/ReverseProxy.TelemetryConsumption/NameResolution/NameResolutionMetrics.cs
Show resolved
Hide resolved
src/ReverseProxy.TelemetryConsumption/NetSecurity/INetSecurityMetricsConsumer.cs
Show resolved
Hide resolved
src/ReverseProxy.TelemetryConsumption/NetSecurity/INetSecurityTelemetryConsumer.cs
Show resolved
Hide resolved
@davidni Could you please look at this as well to make sure the public APIs are matching your expectations/needs |
src/ReverseProxy.TelemetryConsumption/Microsoft.ReverseProxy.Telemetry.Consumption.csproj
Outdated
Show resolved
Hide resolved
src/ReverseProxy.TelemetryConsumption/Microsoft.ReverseProxy.Telemetry.Consumption.csproj
Outdated
Show resolved
Hide resolved
} | ||
} | ||
|
||
private sealed class MetricsConsumer : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How closely do these Telemetry and Metrics tests reflect the actual expected usage? If this isn't how we expect it to be used then please add a sample that shows the expected usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented only on the first set of consumer classes but the comments apply to all of them.
@@ -0,0 +1,257 @@ | |||
// Copyright (c) Microsoft Corporation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're adding a new library ReverseProxy.TelemetryConsumption
shouldn't the tests be also in their own test project?
src/ReverseProxy.TelemetryConsumption/Microsoft.ReverseProxy.Telemetry.Consumption.csproj
Show resolved
Hide resolved
|
||
protected override void OnEventWritten(EventWrittenEventArgs eventData) | ||
{ | ||
if (eventData.EventId < 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we throw an exception from OnEventWritten
method, it'll bubble up into whatever place generated the event, e.g. the call of SendAsync
in HttpProxy
. Shouldn't the whole method body be in try catch block then? Or do we want the errors from events consumption affect the proxy and thus get noticed more easily?
|
||
var expected = new[] | ||
{ | ||
"OnRequestStart-Kestrel", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these be created with nameof(...)
as well? Or are they like this on purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these could be nameof
, but some are explicitly different since some names repeat.
Specifically both Runtime Http and Kestrel include RequestStart
and RequestStop
events.
|
||
namespace Microsoft.ReverseProxy.Common | ||
{ | ||
public class TestEnvironment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for extracting this.
} | ||
} | ||
|
||
protected override void OnEventWritten(EventWrittenEventArgs eventData) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OnEventWritten
can run before the derived constructor on an EventListener runs.- dotnet/runtime#38804 (comment)
You may need to guard against it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added more checks to avoid this in d11d635.
From what I can tell, the linked example has a race condition:
After setting _initialized
but before looping over the queue in the ctor, new EvnetSource
s may still be added to the queue and enabled in OnEventSourceCreated
, and will thus be enabled twice (don't know if EventSource
internals have any checks guarding against this).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EnableEvents
is safe to call twice.
@@ -3,7 +3,7 @@ | |||
|
|||
namespace Microsoft.ReverseProxy.Telemetry | |||
{ | |||
internal enum ProxyStage : int | |||
public enum ProxyStage : int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought everything was supposed to be decoupled to the point where you didn't even need a dependency on the core package? Is this type the only dependency? Would duplicating it in the telemetry package work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, technically this is the only type we depend on.
As long as the values are kept in sync, we can duplicate it in the Telemetry package and remove a dependency on the core package entirely.
The only other thing the telemetry package depends on is that consumers of the library target a preview >= 7, otherwise they will not observe proxy events (wouldn't break anything tho).
We're currently relying on the package reference in 3 places:
The extension methods could be swapped to use But we would have to duplicate the already public I think we should:
@Tratcher thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That plan makes sense, but see my comment below about using IReverseProxyBuilder.
return builder; | ||
} | ||
|
||
public static IReverseProxyBuilder AddKestrelTelemetryListener(this IReverseProxyBuilder builder) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kestrel, HttpClient, DNS, etc. specific extensions don't belong on IReverseProxyBuilder, move these to IServiceCollection. At that point I'm not sure if it makes sense to still have the others on IReverseProxyBuilder.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed all to use IServiceCollection instead.
@Tratcher is there anything else that should be done here before merging the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only outstanding question is https://github.com/microsoft/reverse-proxy/pull/571/files#r535719286
I added a sample telemetry & metrics consumer to Code.Sample. More examples can be added as part of #676. I also wrapped the metrics consumers in a try/catch, passing any exceptions to ILogger instead of letting them propagate to the CounterGroup thread and crashing the process. |
Opening for feedback on the design and implementation.
Includes APIs for 6 EventSources:
For each of those, providing 4 components:
I{Name}TelemetryConsumer
- interface with methods mirroring EventSource's Events that are relevant within the scope of an HTTP request{Name}Metrics
- Immutable POCO containing properties mirroring EventSource's EventCountersI{Name}MetricsConsumer
- interface with a single methodvoid On{Name}Metrics(Metrics oldMetrics, Metrics newMetrics)
{Name}EventListenerService
- internal infrastructure for dispatching events and metrics, exposed as an extension method onIReverseProxyBuilder
:Add{Name}TelemetryListener
The added tests are not exhaustive for every event and parameter.