Agent Configuration in Kibana, GA #138
Comments
Regarding which config options to support, I just want to mention that of the ones listed above, […]. Whether or not all agents support a given config option might of course not be a dealbreaker for using it in the central configuration. But if we do, do we know how it would look if, say, the Node.js agent tries and fails to set […]?
Agreed, but given the range of options available, it is probably better to maximize utility by choosing settings that all agents support. Thanks!
I'd opt for: […]

Not sure about […]. Also, not sure about […].
I could see users eventually using only central configuration to push agent configurations, so the more options available, the better IMO, regardless of whether they seem useful to change often.
After discussion this morning with @sqren, we concluded that it makes sense for Kibana to keep track of Etags instead (right now they are generated by apm-server). As for the suggested fields, is there any not in my initial proposal that makes more sense to consider now?
Any configuration which is marked as reloadable would be interesting: https://docs.google.com/spreadsheets/d/1JJjZotapacA3FkHc2sv_0wiChILi3uKnkwLTjtBmxwU |
For anyone interested: I created an issue for targeting multiple services/environments with a single configuration: elastic/kibana#44475.
@elastic/apm-agent-devs you can link your implementation issue above |
Don't we need to first solve #92 before we can continue with this one? |
Sure. What I meant is that when you have an implementation issue, you can link it above. |
I'm a bit unclear what needs to be done. Is there a final list of configuration options that we should implement? |
The Java agent has generic support for any configuration option marked as dynamic, including […].

Given that […]
That looks mightily confusing for our users. I think we first need to harmonize on a common name and semantics. Should I open a new issue for that?
@beniwohli 😩 yes please |
@formgeist we decided in yesterday's meeting on […]. Note that […]
@beniwohli Excellent! Will we need any specific validation on the […] field?
Hah, that's where there are some discrepancies amongst the agents. Some have special meanings for […]. Can we go with […]?
Sure, it was only to get some ideas around how we should validate that field. Thanks! I'll move forward with this and get the design in order for implementation. We can sort out the validation details when you folks have made up your minds 😉
Just a note: neither of these options applies to RUM.
@formgeist the deathmatch happened, and this is where we landed: all backend agents will treat […].
@axw Thanks for the update, I'll make sure to update the description.
I had a chat with @sqren and we concluded that now is a good time to allow users to select "all services" for which to apply a configuration. This config will be stored with an empty service name in Elasticsearch. When querying config, if there is a conflict, the most explicit one (i.e. the one with a non-empty service name) takes precedence. When new services are added, they will be affected by the "all services" config (if it exists) right away. No changes are required for agents.
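As a minimal sketch of the precedence rule just described, here is what the lookup could look like. All names (`resolve_config`, the document shape, the service names, the sample setting) are illustrative, not Kibana's actual code:

```python
# Hypothetical sketch: a config stored with an empty service name acts as
# an "all services" default; a config with an explicit service name wins
# on conflict. Not Kibana's actual implementation.

def resolve_config(configs, service_name):
    """Return the config doc applying to service_name, or None."""
    specific = [c for c in configs if c["service"] == service_name]
    if specific:
        return specific[0]  # most explicit match takes precedence
    fallback = [c for c in configs if c["service"] == ""]
    return fallback[0] if fallback else None

configs = [
    {"service": "", "settings": {"transaction_sample_rate": "0.5"}},
    {"service": "opbeans-node", "settings": {"transaction_sample_rate": "0.1"}},
]

# The explicit config wins for its service:
assert resolve_config(configs, "opbeans-node")["settings"]["transaction_sample_rate"] == "0.1"
# A brand-new service immediately picks up the "all services" config:
assert resolve_config(configs, "opbeans-java")["settings"]["transaction_sample_rate"] == "0.5"
```

Because resolution happens at query time, no agent-side changes are needed: each agent simply receives whichever document wins for its service name.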
Have you thought about merging configurations? Example:

service: […]
service: […]

Effective configuration:

service: […]

A while ago we also discussed allowing the creation of different configuration blocks for specific […]. All of that is ofc. something we can add later as well. I'm just curious 🙂
@felixbarny Have you thought about merging configurations? Yes, we did talk about it. We need to think more about it to avoid it becoming very confusing to users - so it won't happen this release. I did play around with it though and one thing I couldn't figure out how to handle was the "applied flag". In this version we are adding the applied flag, so that agents can report back which configuration they have applied, and the user will be able to see in the UI which configurations have been applied, and which have not. However, if we merge configurations the agent will not apply a single configuration but multiple. Which configuration should the UI then indicate as being applied? One or all of them? Or should the apply mechanism not be on a configuration level but on a field level? (that would require the agent to report back which configuration id it got each field from...) |
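The field-level alternative raised at the end of the comment above can be made concrete with a small sketch. Everything here (function name, document shape, config ids) is hypothetical, purely to illustrate the open question:

```python
# Illustrative sketch: if configs were merged, "applied" feedback might
# have to move from config level to field level, with the agent reporting
# which config id each effective field came from. All names are invented.

def merge_with_provenance(general, specific):
    """Merge settings; the specific config wins. Track source id per field."""
    effective, source = {}, {}
    for cfg in (general, specific):  # later (more specific) overwrites
        for key, value in cfg["settings"].items():
            effective[key] = value
            source[key] = cfg["id"]
    return effective, source

general = {"id": "cfg-all", "settings": {"capture_body": "errors",
                                         "transaction_max_spans": "500"}}
specific = {"id": "cfg-node", "settings": {"capture_body": "all"}}

effective, source = merge_with_provenance(general, specific)
assert effective == {"capture_body": "all", "transaction_max_spans": "500"}
# The agent would have to acknowledge two config ids, not one:
assert set(source.values()) == {"cfg-all", "cfg-node"}
```

The last assertion is exactly the UI problem described: a single "applied" checkmark no longer maps to a single configuration.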
@sqren by "this version" you mean 7.5? Also, has the method by which the agents report back which configuration they applied already been spec'd out somewhere? I couldn't find it in a cursory search.
Yes, 7.5.
Yes, it's in the description of this issue. Agents don't have to do anything if they already send back the etag they receive from apm-server.
Regarding […]
The Ruby agent doesn't have […].
Not really an implementation detail; it is just a whitelist of content types which is made configurable (not required; it has a default).
To be honest, given the experience we had with just agreeing on name and behavior of two config options, I'm against a free-text configuration system. Before that can happen, we need a systematic review of all common config options, and ensure that all agents agree on common behavior. Otherwise, the user experience will be terrible for people who use more than one language. I suggest that we set up a working group for this, with 3 people from 3 different agents. For each config option, the group members write up the exact behavior of the option, and compare notes. If all three agree, it's a good indicator that it's the correct behavior, and should be put in the spec. If not, the members can then discuss how a common spec for that option should look, and suggest a spec to the wider agent team. For future config options, a spec should be written from the beginning. WDYT?
Is your feature request related to a problem? Please describe.
APM delivered Agent Configuration in Kibana as beta in 7.3, intentionally deferring some aspects for later phases. The goal of this issue is to pick up where we left off and deliver a more compelling feature towards GA. We are aiming for 7.5.
Two lines of work stand out:
Provide support for more settings.
Provide feedback to the user in Kibana.
Describe the solution you'd like
On the first point, ideally we add support for 2-3 more fields. Some reasonable candidates are:
- `ACTIVE`/`RECORDING` (depends on [agents] definition of ACTIVE/DISABLED_INSTRUMENTATION #92)
- `CAPTURE_BODY`
- `METRICS_INTERVAL`
- `IGNORE_URLS`
- `TRANSACTION_MAX_SPANS`
- `SPAN_FRAMES_MIN_DURATION`
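The Etag mechanism this issue relies on follows standard HTTP conditional-request semantics: the agent polls for config, remembers the Etag, and sends it back on the next poll so the server can answer 304 when nothing changed. The sketch below simulates one such cycle; `fetch` is a stand-in for a real HTTP client, and the settings shown are illustrative:

```python
# Sketch of Etag-based config polling (illustrative, not any agent's code).
# fetch(headers) -> (status, response_headers, body)

def poll_config(fetch, current_etag):
    """One poll cycle: returns (settings, etag); reuses cached config on 304."""
    status, headers, body = fetch({"If-None-Match": current_etag})
    if status == 304:            # config unchanged; keep what we have
        return None, current_etag
    return body, headers["Etag"]

def fake_server(request_headers):
    """Stand-in for apm-server: returns 304 when the agent's Etag matches."""
    server_etag = "abc123"
    if request_headers.get("If-None-Match") == server_etag:
        return 304, {}, None
    return 200, {"Etag": server_etag}, {"capture_body": "errors"}

settings, etag = poll_config(fake_server, current_etag=None)
assert settings == {"capture_body": "errors"} and etag == "abc123"

settings, etag = poll_config(fake_server, current_etag=etag)
assert settings is None  # 304: nothing new to apply
```

The `If-None-Match` the agent sends is also what tells the server which configuration the agent last saw, which is what the feedback mechanism below builds on.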
On the second point, a good start would be to show in Kibana whether some configuration was applied by some agent or not. For this, we could use the Etags sent from the agents to know what is the last good value they have. More precisely: Kibana would store an applied flag in the configuration document, initially `false`, and flip it to `true` when it receives an Etag from any one agent matching the Etag in the document.

This approach entails a feedback delay of up to 2 times the `POLL_INTERVAL`, which is acceptable.

One downside is that if some agents succeed and others fail, those failures would be silently ignored. This is (hopefully!) an unlikely scenario.
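The feedback flow can be sketched in a few lines. The `applied` field name is taken from the discussion in this thread; the function and document shape are hypothetical:

```python
# Minimal sketch of the feedback mechanism: the config document starts
# with applied=False, and it flips to True once any agent reports back
# an Etag matching the document's Etag. Names are illustrative.

def on_agent_etag(config_doc, reported_etag):
    """Mark the config as applied once one agent echoes the current Etag."""
    if reported_etag == config_doc["etag"]:
        config_doc["applied"] = True
    return config_doc

doc = {"etag": "abc123", "applied": False}
on_agent_etag(doc, "stale-etag")
assert doc["applied"] is False   # old Etag: not applied yet
on_agent_etag(doc, "abc123")
assert doc["applied"] is True    # matching Etag flips the flag
```

Note that the flag only ever flips forward on a match; a stale Etag from a lagging agent does not un-apply the config, which is consistent with the "any one agent" wording above.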
Another downside is that it is not possible for Server/Kibana to distinguish failure from missing (an agent never querying) unless we keep track of how many agents are around, which would add significant complexity. A workaround is through documentation, warning users that if they don't see feedback within 2x `POLL_INTERVAL` seconds, it is probably because something went wrong.

Another option is that agents send the `ephemeral_id` upstream, and Kibana shows a count of agents that applied the last configuration based on the ephemeral ids (or maybe even shows the ids themselves). This requires agents to calculate/generate their `ephemeral_id`. Alternatively, agents can use their IP address.

Describe alternatives you've considered

We could come up with a way to aggregate data coming from different agents, keyed for instance by `ephemeral_id`, IP address, or similar. This would require agents to send data to apm-server (probably to the same endpoint) and define a new schema. A new schema would also allow us to show more information than a boolean, e.g. timestamp of config application, error messages, etc.
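Purely as an illustration of what such a richer schema could carry, a per-agent feedback document might look like the following. Every field name here is invented for the sketch; no such schema exists yet:

```python
# Hypothetical per-agent feedback document (all field names invented).
# A document like this would let the UI show more than a boolean.
feedback_doc = {
    "service.name": "opbeans-node",
    "agent.ephemeral_id": "f4c0…",           # or IP address, or similar
    "config.etag": "abc123",
    "config.applied_at": "2019-09-12T10:00:00Z",
    "config.error": None,                     # populated when applying failed
}
assert feedback_doc["config.error"] is None   # this agent applied the config
```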
I suggest not introducing a new data model until we know more about how the feature is used, what problems users run into, and how/why exactly agents might fail to apply configuration, etc.
We also need to decide if we want to support RUM in GA, later, or never. User feedback is probably not practical for the RUM agent.
Note that at the moment Agent Configuration in Kibana is unusable for customers with a Distributed Tracing setup, as the only available setting is `SAMPLING_RATE`, which in the case of DT will (almost always) be dictated by the RUM agent.

RUM status is tracked in elastic/apm-agent-rum-js#253
Finally, during the first design we briefly touched on auditing. Some sort of audit log could be achieved e.g. by creating one Elasticsearch document per update and adding information about the user that made the change. However, if we do not plan to add a UI for it, it might be enough to simply log configuration updates in Kibana.
Implementation issues
Kibana
Server
Agents
See background in #4 and #76