DB Convention does not cover batch/multi/envelope operations #712
Comments
Hey! Thanks for your question. |
Anything is possible ;) Do you mean that the server-side of the
Why do you say that a span for the |
The database semantic conventions were designed only for client-side calls, not for the server end. If you could share, in a separate issue, some details on what your server instrumentation looks like and what you would expect from a semantic convention for calls from the server's perspective, we can look into crafting such conventions based on the client ones. Ah I did not consider
Alternatively, if your database client library accepts the operations individually and only combines them into a batch later on, you could make each individual operation span the child of its "actual" causal parent and use span links to link the batch span to them instead.
The batch going over the wire in one RPC call would be modeled as a child of the batch span in either case. |
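The span-link layout described above can be sketched with plain data classes. This is only an illustration under stated assumptions: `Span`, `model_batch`, and the span names are hypothetical stand-ins, not the OpenTelemetry API.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

_ids = count(1)  # simple id generator for the illustration


@dataclass
class Span:
    """Hypothetical stand-in for a tracing span (not the OpenTelemetry API)."""
    name: str
    parent: Optional["Span"] = None
    links: List["Span"] = field(default_factory=list)
    span_id: int = field(default_factory=lambda: next(_ids))


def model_batch(requests):
    """requests: list of (causal_parent_span, operation_name) pairs."""
    # Each operation span stays a child of the code path that issued it.
    op_spans = [Span(name=op, parent=parent) for parent, op in requests]
    # The batch span has no single causal parent; it links to the operations.
    batch = Span(name="batch", links=list(op_spans))
    # The single RPC that carries the batch over the wire is a child of the batch span.
    rpc = Span(name="batch RPC", parent=batch)
    return op_spans, batch, rpc


handler_a = Span("handle request A")
handler_b = Span("handle request B")
ops, batch, rpc = model_batch([(handler_a, "put"), (handler_b, "delete")])
```

With this shape, a trace viewer can follow each operation from its originating request, and follow the links from the batch span to see which operations travelled together.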
In HBase, we apply the batch on the server side (almost) as a whole. The workflow is like this: RPC server receives a batch -> grab all the row locks -> build the WAL edit for all the operations -> write out the WAL edit -> apply all the operations to memstore -> advance the MVCC number -> return. So typically I do not think it is possible to use different spans to trace the different operations in a batch. As you can see, although within each step we will likely process the operations one by one, at a higher level each step processes all the operations before moving on to the next step. It would be very strange to create a span for each operation and switch between them all the time... Thanks. |
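To make the "switching" concern concrete, here is a simplified sketch of a phase-major batch loop. The phase list is abbreviated from the workflow above, and `process_batch` and `span_switches` are hypothetical helpers, not HBase code.

```python
# Simplified, assumed phase list; the real server also advances MVCC and returns.
PHASES = [
    "acquire row locks",
    "build WAL edit",
    "write WAL edit",
    "apply to memstore",
]


def process_batch(operations):
    """Return the order in which (phase, operation) work items run."""
    timeline = []
    for phase in PHASES:          # the whole batch finishes one phase...
        for op in operations:     # ...before any operation enters the next phase
            timeline.append((phase, op))
    return timeline


def span_switches(timeline):
    """Count how often the 'current operation' changes along the timeline."""
    return sum(1 for prev, cur in zip(timeline, timeline[1:]) if prev[1] != cur[1])


timeline = process_batch(["put row1", "delete row2", "put row3"])
print(span_switches(timeline))
```

Because each operation's work is spread across every phase, a per-operation span would have to be re-entered once per phase, which is the awkwardness the comment points out.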
@arminru @bogdandrutu I wonder if you have any thoughts about the PR linked here. The idea is to expose a summary of the content of a batch operation as an additional attribute that is implementation-specific. Specifically, I hope that a span storage/query system would be able to make use of that attribute to enable operators to find all spans that execute a given operation, whether that operation is executed at the top level or it is a part of a batch operation. |
Assuming there is just one bulk operation that deals with the batch as a whole, I can think of the following solutions. E.g. bulk operation consists of
Option 1. Attributes with array values
Cons:
Option 2. Events/logs
We do something similar in messaging (with links though):
DB operations don't have an individual trace context, so links are not suitable here, but events could work. It should then also be easier to enable or disable sub-operation reporting depending on the needs. The drawback is that events/logs could go to a different backend.
Cons:
Option 3. Creating artificial spans per sub-operation
Cons:
Additional things to consider:
|
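As a rough comparison of Option 1 and Option 2, the sketch below builds both representations from the same batch using plain dictionaries. The operation shape, the function names, and every attribute and event name are assumptions made for illustration, not agreed semantic conventions.

```python
from typing import Dict, List, Sequence, Tuple

Operation = Tuple[str, str]  # (operation name, target table): illustrative shape


def as_array_attributes(batch: Sequence[Operation]) -> Dict[str, List[str]]:
    """Option 1: parallel arrays stored on the single batch span."""
    return {
        "db.batch.operations": [name for name, _ in batch],  # hypothetical key
        "db.batch.tables": [table for _, table in batch],    # hypothetical key
    }


def as_events(batch: Sequence[Operation]) -> List[Dict[str, object]]:
    """Option 2: one event per sub-operation, each with its own attributes."""
    return [
        {
            "name": "db.batch.operation",  # hypothetical event name
            "attributes": {"operation": name, "table": table, "index": i},
        }
        for i, (name, table) in enumerate(batch)
    ]


batch = [("insert", "orders"), ("delete", "carts"), ("insert", "orders")]
attributes = as_array_attributes(batch)
events = as_events(batch)
```

The array form keeps everything on one span but ties related values together only by position; the event form keeps each sub-operation's details together at the cost of more records.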
Cosmos DB is currently creating a string attribute We should be able to capture:
I prefer the simplicity of Option 1. Attributes with array values, but agree it creates a challenge for additional information about each batch operation. The most important piece of additional information for Cosmos DB is operation count, so maybe something like the following could work: |
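A minimal sketch of how per-operation counts could ride along with array-valued attributes. This is not the commenter's proposal; `summarize` and the attribute keys are hypothetical.

```python
from collections import Counter


def summarize(operations):
    """Collapse a batch into distinct operation names plus a count for each."""
    counts = Counter(operations)
    names = sorted(counts)  # stable order so both arrays line up by position
    return {
        "db.batch.operation.names": names,                         # hypothetical key
        "db.batch.operation.counts": [counts[n] for n in names],   # hypothetical key
        "db.batch.size": len(operations),                          # hypothetical key
    }


summary = summarize(["upsert", "read", "upsert"])
```

Collapsing to distinct names keeps attribute size bounded for large batches, while the counts preserve the one extra detail called out above.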
@lmolkova isn't that conflating "batch" with "bulk", by proposing Regardless, the commands contained in a batch are typically the same in every aspect as a standalone command not executed in a batch; each command has a SQL (so Note that it's true that certain attributes must be the same across all commands in the batch, e.g. the hostname, network info, etc. So these attributes could optionally be lifted up to the span representing the batch, leaving on the command only attributes which can vary (e.g. SQL, parameter info). |
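The idea of lifting attributes that are identical across all commands up to the batch span can be sketched as follows. `lift_shared` and the attribute keys in the example are hypothetical and only illustrate the split.

```python
def lift_shared(commands):
    """Split per-command attribute dicts into (shared, per_command) parts.

    `shared` holds key/value pairs identical in every command; each entry of
    `per_command` keeps only what varies (for example the SQL text).
    """
    if not commands:
        return {}, []
    shared = dict(commands[0])
    for attrs in commands[1:]:
        shared = {k: v for k, v in shared.items() if k in attrs and attrs[k] == v}
    per_command = [
        {k: v for k, v in attrs.items() if k not in shared} for attrs in commands
    ]
    return shared, per_command


shared, per_command = lift_shared([
    {"host": "db1", "port": 5432, "statement": "INSERT INTO t VALUES (1)"},
    {"host": "db1", "port": 5432, "statement": "DELETE FROM t WHERE id = 2"},
])
```

Here `shared` would go on the span representing the batch, and each `per_command` entry on whatever represents the individual command.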
@roji one difference between batch operations and standalone operations is the duration. If you add each operation in a batch as its own span, is the duration of each sub-operation the same as the parent's? Is it 0? This could also create confusion, because it may make those operations appear either abnormally quick or abnormally long. |
@jcocchi that's true indeed... I don't know if there are other OTel cases where a larger "logical container" span wraps nested spans as in this case, and how that's best represented... |
Our API has a small alphabet of relatively simple key-value operations (`get`, `put`, `delete`, &c.). For these, the operation names seem clear. We also have a set of operations that support bulk/batching of operations. These can be homogeneous or heterogeneous. For example, `batch` can accept a set of any combination of `get`, `put`, `delete`, &c. We also support a generic system for server-side compare-and-mutate, where some predicate based on a query over existing data is provided, and when the predicate returns true, some operation is applied; that operation can be a simple or a batch operation. For these collections of heterogeneous operations, how should we annotate the span?