Add attributes to GenAI message events indicating position within list of messages #1912

alexmojaki · 2025-02-17T11:50:09Z

Area(s)

area:gen-ai

What's missing?

If users want to count the number of times that a user message contains some keyword, their query has to account for the fact that the same user message can appear many times in the logs, since the whole message history is resent and relogged with each request in a back-and-forth conversation. Conceptually this means they need to filter down to user messages which are the last message in the message history, because if other messages come after a user message, then it was already sent and logged by an earlier request.

Currently this is at least difficult, maybe impossible. A query has to do something like get the child event with the latest timestamp in each parent span. But as pointed out in #1883, a single message can contain both text from the user and tool call responses, generating multiple events. Getting the last gen_ai.user.message event wouldn't work either, e.g. this would double count in the case where gen_ai.user.message is followed by a tool call and response with no further gen_ai.user.message.

Describe the solution you'd like

I propose that the gen_ai.*.message events generated for each request message as in https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/ should have two additional attributes representing:

- The total number of messages sent in the API request, i.e. the length of the array representing the message history
- The position/index within that array of messages of the message corresponding to the event in question

A single message can produce multiple events, so multiple events can have the same value for the position attribute. This can be used to reconstruct the actual API request based on the full list of events as requested in #1883.
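To make the reconstruction idea concrete, here is a minimal sketch of how an instrumentation might tag events and how a consumer could group them back into messages. The attribute names `gen_ai.message.index` and `gen_ai.request.message_count` are placeholders invented for illustration, not part of any spec, and events are modelled as plain dicts rather than real OpenTelemetry objects.

```python
from collections import defaultdict

# Hypothetical attribute names; the proposal does not fix them.
INDEX_ATTR = "gen_ai.message.index"          # 0-based position (assumed)
COUNT_ATTR = "gen_ai.request.message_count"  # length of the message array


def events_for_request(messages):
    """Yield one event per message part, tagged with position and total.

    `messages` is a list of messages; each message is a list of
    (event_name, content) pairs, since one message may produce several events.
    """
    total = len(messages)
    for index, parts in enumerate(messages):
        for event_name, content in parts:
            yield {
                "name": event_name,
                "content": content,
                "attributes": {INDEX_ATTR: index, COUNT_ATTR: total},
            }


def group_by_position(events):
    """Rebuild the message list by grouping event contents on the index attribute."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["attributes"][INDEX_ATTR]].append(event["content"])
    return [grouped[i] for i in sorted(grouped)]
```

Events that share an index value are treated as parts of the same message, which is what lets a mixed message (user text plus tool call responses) be put back together.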

Filtering for the last user message then means checking that the position is equal to the total, or the total minus one if the position attribute is a 0-based index.
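As a rough illustration of that filter, the sketch below counts keyword hits only in user messages that are last in their request. It assumes a 0-based index and uses made-up attribute names (`gen_ai.message.index`, `gen_ai.request.message_count`); events are plain dicts standing in for whatever a real query backend would expose.

```python
# Hypothetical attribute names; the proposal does not fix them.
INDEX_ATTR = "gen_ai.message.index"          # 0-based position (assumed)
COUNT_ATTR = "gen_ai.request.message_count"  # length of the message array


def make_event(name, content, index, total):
    """Build a dict standing in for a logged gen_ai.*.message event."""
    return {
        "name": name,
        "content": content,
        "attributes": {INDEX_ATTR: index, COUNT_ATTR: total},
    }


def is_last_in_request(event):
    """True if the event belongs to the final message of its request."""
    attrs = event["attributes"]
    return attrs[INDEX_ATTR] == attrs[COUNT_ATTR] - 1


def count_keyword(events, keyword):
    """Count user-message events that are last in their request and contain keyword."""
    return sum(
        1
        for e in events
        if e["name"] == "gen_ai.user.message"
        and is_last_in_request(e)
        and keyword in e["content"]
    )


# Two requests from one conversation; the second resends the first user message.
logged = [
    make_event("gen_ai.user.message", "refund please", 0, 1),
    make_event("gen_ai.user.message", "refund please", 0, 3),
    make_event("gen_ai.assistant.message", "Sure, done.", 1, 3),
    make_event("gen_ai.user.message", "thanks", 2, 3),
]
naive = sum(
    "refund" in e["content"] for e in logged if e["name"] == "gen_ai.user.message"
)
deduplicated = count_keyword(logged, "refund")
```

Here the naive count sees the resent message twice, while the filtered count sees it once.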

alexmojaki · 2025-02-18T11:10:06Z

Thinking about this more, clients can send multiple user messages at the end of the list, even if that's weird. So maybe it would be better to have a boolean which is set to true for user messages that have no non-user messages after them.
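One way such a boolean could be computed on the instrumentation side is sketched below. The function name and the role strings are assumptions for illustration; the idea is simply to walk the list from the end and flag user messages until the first non-user message is seen.

```python
def trailing_user_flags(roles):
    """Return one bool per message: True for user messages with no
    non-user message after them, False for everything else."""
    flags = [False] * len(roles)
    for i in range(len(roles) - 1, -1, -1):
        if roles[i] != "user":
            break  # every earlier message has a non-user message after it
        flags[i] = True
    return flags
```

For a history ending in two consecutive user messages, both are flagged; for a history ending in a tool response, none are.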

michaelsafyan · 2025-02-27T18:07:31Z

+1 to this. It is strange that an index exists on the response event but not on the request event. It would be good to be able to identify the index/position for the prompt, too.

alexmojaki mentioned this issue Feb 17, 2025

Model AI chat events as a list of request/response messages, with each message containing a list of parts #1913

Open
