-
Notifications
You must be signed in to change notification settings - Fork 529
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Plugin telemetry events #894
Conversation
Hey @akoutmos thanks for taking a stab at this! Here's the challenge we need to solve: Dataloader is an optional dependency within Absinthe. I don't think it's appropriate for Absinthe to emit dataloader specific events. I do think that the resolution phase should emit events for when a plugin callback has started, and when a plugin callback has ended though, and the plugin module would be part of the event emitted. For Dataloader specific events, I think we need to have those fire within Dataloader itself. |
|
||
emit_start_event(start_time, exec, loader) | ||
updated_loader = Dataloader.run(loader) | ||
emit_stop_event(start_time, exec, loader) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Specifically, I think we should move these start and end events into https://github.com/absinthe-graphql/absinthe/blob/master/lib/absinthe/phase/document/execution/resolution.ex#L78 and it would be something like:
defp run_callbacks(plugins, callback, acc, true) do
Enum.reduce(plugins, acc, fn plugin, acc ->
start_time = System.monotonic_time()
:telemetry.execute(
[:absinthe, :plugin, callback, :start]
%{start_time: start_time},
%{plugin: plugin, acc: acc}
)
acc = apply(plugin, callback, [acc])
# stop event
acc
end)
end
@benwilson512 I think my latest commits address the comments that you brought up. I really like this approach compared to my original solution as it is more general purpose now and covers all the plugins. On that same note, is a dataloader specific event even needed? You can match on {
:telemetry_event,
[:absinthe, :plugin, :callback, :start],
%{start_time: _},
%{loader:Absinthe.Middleware.Dataloader}
} to get your dataloader (or any plugin for that matter) specific events. Thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvements!
Enum.reduce(plugins, acc, fn plugin, acc -> | ||
start_time = System.monotonic_time() | ||
|
||
emit_start_event(start_time, acc, plugin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very close! So, callback
here is either the atom :before_resolution
or :after_resolution
, and I was thinking we could use that to dispatch more finely tuned events. Specifically this would get us:
[:absinthe, :plugin, :before_resolution, :start]
[:absinthe, :plugin, :before_resolution, :stop]
[:absinthe, :plugin, :after_resolution, :start]
[:absinthe, :plugin, :after_resolution, :stop]
:telemetry.execute( | ||
@callback_stop_event, | ||
%{duration: System.monotonic_time() - start_time}, | ||
%{acc: acc, loader: loader} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loader
variable here is I think the plugin
right? We probably want to call it plugin: plugin
My bad. I read your callback line as an atom and not the function variable. Third time is a charm I think I addressed all your feedback :D. Thoughts on the separate dataloader telemetry events? Thanks again!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me!
Thoughts on the separate dataloader telemetry events?
Let's do these within the Dataloader project itself: https://github.com/absinthe-graphql/dataloader. PR welcome. We probably want to emit events for each run_batch
call basically.
@binaryseed you did the original telemetry stuff, this look good to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My only concern would be how frequently these will happen. With field-level events, we made sure that the telemetry only triggered when a resolver function was explicitly called. We need to make sure that this instrumentation won't cause there to be a huge increase in calls to execute
, since they are not free (an ETS lookup so cheap, but not free)
# This test emits a total of 24 events, but there we are only validating the | ||
# dataloader events in this test | ||
results = | ||
Enum.reduce(1..24, %{dataloader_starts: 0, dataloader_stops: 0}, fn _, acc -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear why the reduce
is needed, the other telemetry tests managed to work just fine with assert_receive
...
assert_receive {[:absinthe, :execute, :operation, :start], _, %{id: id}, _config} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the reduce is so that I can count only the specific events that I want to fetch. Since the telemetry events emit on all plugins, I was also getting Async and Batch events. But only wanted to assert on the Dataloader specific events.
Also, we have a pattern of sticking an
|
Is this ready for re-review? |
I think I addressed everything on my end. The only open question from what I gather is the number of events being generated...but I defer to you guys on that one. |
So, I looked into this locally on my API, and the problem is real. We've got a few plugins installed for standard stuff like auth & monitoring, and this addition calls That means when using stuff like For example, I tried it out with a really simple query, and there were 6 resolution phases. Each phase added 14 calls to execute (we have 7 plugins). That means a total of 84 additional calls to I'm a little worried about this... It's pretty unlikely that the raw data here is going to be leveraged, and that's a good bit of work to do for something that's rarely going to be used. I"m wondering if directly instrumenting Dataloader would be a better way to go about this vs instrumenting every single plugin interaction? |
@binaryseed This is a good point, I didn't think about the fact that this will run N*M times, where N is the number of plugins, and M is the number of resolution phases that happen (roughly proportional to the depth of the GraphQL query). Perhaps an alternative strategy would be to emit events only within the functions within the |
No worries at all @benwilson512! That sounds good to me in regards to moving the telemetry events to the plugins directly. If that sounds good to you as well @binaryseed then I can get going on this approach and update this PR + open up a new one for the dataloader repo. |
Instrumenting |
@akoutmos do note the naming conventions mentioned here: #901 (review) |
Working on cleaning up this PR to reflect the conversation that we had. But managed to wrap up the Dataloader changes: absinthe-graphql/dataloader#89 Will update this PR over the weekend. Thanks! |
I think I addressed everything that we discussed and also updated the documentation. Question for you guys. Should I remove the telemetry tests in the |
Yeah, I'd much rather we keep the tests for dataloader telemetry events inside the library itself... |
I went ahead and removed the test @binaryseed |
Thanks! I'm going to merge this and followup with a small commit to move the |
Here it is: 38016db |
Thanks @binaryseed! Let me know what you think of the dataloader changes and if you require any changes there absinthe-graphql/dataloader#89. |
This PR addresses some of my thoughts that I raised in #892 and emits start+stop events for dataloader events where there are
pending_batches
to process. Those two additional Telemetry events are:[:absinthe, :dataloader, :resolve, :start]
when a dataloader with pending batches starts[:absinthe, :dataloader, :resolve, :stop]
when a dataloader with pending batches finishesThis is just an initial stab at the problem, so feel free to comment. Thanks :)!