Allow configuring a mapping from endpoint to a service name #555
Comments
Warming back up to having the config as well - while it will miss some cases, it's true that we should get what we can without app modifications before resorting to them. Though another realization I had is that it's common for clients like Elasticsearch and Cassandra to take only 1-2 seed IPs as input and discover the rest from the cluster, making the static mapping problem even more challenging ><
First, I am not sure I quite understand "With manual instrumentation, users would generally set CatCache, DogCache as service names when registering each client with a tracer". Can you point me to the corresponding API method? Second, I am not sure this functionality should live in the agent as opposed to the Collector. The Collector could use the whole span to determine the desired service name. E.g. in the case of path-based routing, such as Spring Gateway, the service name does not depend on the IP, but on the portion of the path requested.
TBH I don't know if there is a great API in otel Java yet since we usually use auto instrumentation, but with Brave there's always a way to set the service name, like here. Services that use a path - gRPC is a famous one - are relatively easy to deal with, because the service can be extracted from the path in a consistent way. It's not so simple for databases, where often there is nothing except an IP address. The Collector is an option, but I think it is the job of the instrumentation to give a user the ability to specify these names; since there is nothing else in the span that can help for these databases, it is completely up to the user. Requiring the Collector for something as important as being able to differentiate between different database clients in the same app seems too strict to me. And while I can think of programmatic approaches to solve the dynamic endpoint case, I think that would be basically impossible in the Collector.
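To illustrate the "service can be extracted from the path" point for gRPC: a full method name has the form `package.Service/Method`, so the service portion is everything before the slash. A minimal sketch - the class and method names here are illustrative, not part of any OpenTelemetry or Brave API:

```java
// Hypothetical helper: derive a peer service name from a gRPC full method
// name of the form "package.Service/Method".
final class GrpcPeerService {
    static String extractService(String fullMethodName) {
        int slash = fullMethodName.indexOf('/');
        // Fall back to the whole string if there is no method portion.
        return slash < 0 ? fullMethodName : fullMethodName.substring(0, slash);
    }
}
```

For a database client that only reports `net.peer.ip`, no such consistent extraction exists, which is why the user has to supply the name somehow.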
If there is no such functionality in manual instrumentation (meaning otel-java), then IMO it is wrong to add this here but not there. Users should be able to get everything without being required to use byte code manipulation if they so choose.
Well, I don't think something like that would live in the SDK itself but in instrumentation. I would say the closest example we have is the customization of kind here. If we had more manual instrumentation, we could imagine it exposing a builder or factory with customization options, similar to most of the Brave instrumentation libraries. But right now we're still basically auto-instrumentation-only. Another point to keep in mind is that with manual instrumentation it is easy to expose a knob, but not with auto. So a solution for auto will almost assuredly be dramatically different - and often have tradeoffs - compared to one for manual instrumentation, where it's very straightforward.
Still not convinced :) @trask @tylerbenson @jkwatson what do you think? |
@iNikem are you not convinced about the attribute itself, or about the mapping configuration?
I'd like the instrumentation group to explore the possibility of having auto-instrumentation that also allows programmatic, runtime configuration added to it, via callbacks, or instrumentation hooks (or something akin to those). I'd rather not bake that sort of thing into the SDK itself unless we have to. |
Attribute is good, no objection here. I just don't think the agent is the best place to configure an "ip -> service name" mapping.
I also feel an ip -> service mapping is not great. It seems difficult to configure and maintain in any moderately sized environment.
Yeah, I mentioned my issues with the mapping approach too ;) But it's the only approach I could find that holds to the principle of agents: no modification to user apps. It doesn't solve complex cases, but it does fine with simpler ones. That said, I'm happy to focus on a programmatic approach if that principle isn't necessary - it was my first suggestion :)
It's nice to have a codeless option for people who don't want to modify their application. We can still layer a programmatic approach on top of that for people who outgrow the codeless option. |
Yup except #657 was definitely lacking, sorry about that. Will follow up with that issue. |
It is generally important for a user to be able to specify a peer service name for each client they use in an app. For example (this example will apply to many different databases), a user may access two caches with Redis, `CatCache` and `DogCache`. In Redis, it is not uncommon for an endpoint to be defined as an IP address. This means that spans will have `net.peer.ip` set to this IP address and no other identifying information at all - this makes traces far less readable, especially since Redis itself will not have server spans showing up, so the client spans must have as much utility as possible.

With manual instrumentation, users would generally set `CatCache` and `DogCache` as service names when registering each client with a tracer. But with auto instrumentation, we can't expect user apps to do this. A way to work around this would be for us to have a configuration which maps endpoint -> service name. The endpoint could be matched against `net.peer.name` and `net.peer.ip`, and for any span that matches, we populate `peer.service` - a new convention which I'm pretty sure will be approved, since this use case is too important - with the name from the config.

Some points to ponder:

- The config is a `Map` type. Format? Maybe `PEER_ENDPOINT_NAME_MAPPING=1.2.3.4=CatCache,dogs.com=DogCache` as an environment variable, or `peer.endpoint.name.mapping=1.2.3.4=CatCache,dogs.com=DogCache` as a system property (though I think there's some convention to use `[key]` for a map in properties, I'd probably stick with the simpler form).
- For sharded clusters, the config has to map every endpoint to the same name. Can't think of a way to avoid this with auto instrumentation, though.
- Endpoint groups are often dynamically updatable, but this configuration isn't. Probably have to give up on this point.
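To make the proposed format concrete, here is a minimal sketch of parsing such a mapping and resolving a span's peer attributes against it. The property format (`endpoint=name` pairs separated by commas) is taken from this proposal; the class and method names are illustrative only, not an existing API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: parse "1.2.3.4=CatCache,dogs.com=DogCache" into a map and look up
// a service name by net.peer.name / net.peer.ip attribute values.
final class PeerServiceNameMapping {
    private final Map<String, String> mapping = new HashMap<>();

    PeerServiceNameMapping(String config) {
        for (String entry : config.split(",")) {
            String[] kv = entry.split("=", 2);
            if (kv.length == 2) {
                mapping.put(kv[0].trim(), kv[1].trim());
            }
        }
    }

    // Try net.peer.name first, then net.peer.ip; null means no match,
    // in which case peer.service would simply not be set.
    String resolve(String peerName, String peerIp) {
        String byName = peerName == null ? null : mapping.get(peerName);
        if (byName != null) {
            return byName;
        }
        return peerIp == null ? null : mapping.get(peerIp);
    }
}
```

Anything more elaborate (wildcards, CIDR ranges) would complicate both the format and the matching, so a flat exact-match map seems like the reasonable starting point.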
For the harder cases where sharding is involved, we could additionally provide an entrypoint for a user app to use. It gives up "total auto instrumentation", but maybe it is still an OK compromise.
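Such an entrypoint does not exist; purely as a sketch of what it could look like, a user app that discovers shard endpoints dynamically might register them at runtime, with the agent consulting the registry when populating `peer.service`:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical programmatic entrypoint (not an existing API). An app that
// learns cluster endpoints dynamically registers them here; auto
// instrumentation would consult lookup() when it sees a matching endpoint.
final class PeerServiceRegistry {
    private static final Map<String, String> ENDPOINT_TO_SERVICE =
        new ConcurrentHashMap<>();

    // Called by the user app, e.g. from a cluster-topology listener.
    static void register(String endpoint, String serviceName) {
        ENDPOINT_TO_SERVICE.put(endpoint, serviceName);
    }

    // Called by instrumentation; null means no registration for this endpoint.
    static String lookup(String endpoint) {
        return ENDPOINT_TO_SERVICE.get(endpoint);
    }
}
```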
Let me know any thoughts. After thinking about the sharded case, I'm starting to lean towards not bothering with a static config, since it seems like it could miss a lot of use cases. If it were trivial, sure, but there will be real infrastructure needed to support a map configuration.