Add the beginnings of AI Semantic conventions #483

Closed. Wants to merge 25 commits.
32 changes: 32 additions & 0 deletions docs/ai/README.md
@@ -0,0 +1,32 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: AI
path_base_for_github_subdir:
from: content/en/docs/specs/semconv/ai/_index.md
to: ai/README.md
--->

# Semantic Conventions for AI systems

**Status**: [Experimental][DocumentStatus]

This document defines semantic conventions for the following kinds of AI systems:

* LLMs
* LLM Chains and Agents
* LLM Frameworks (e.g., LangChain, LlamaIndex)
* Vector Embeddings
* Vector Databases (e.g., Pinecone, Milvus)

Semantic conventions for LLM operations are defined for the following signals:

* [LLM Spans](llm-spans.md): Semantic Conventions for LLM requests - *spans*.
* [LLM Chains and Agents](llm-chains-agents.md): Semantic Conventions for LLM chains and agents - *spans*.

Technology-specific semantic conventions are defined for the following LLM providers:

* [OpenAI](openai.md): Semantic Conventions for *OpenAI*.
* [Anthropic](anthropic.md): Semantic Conventions for *Anthropic*.
* [Cohere](cohere.md): Semantic Conventions for *Cohere*.
* [Replicate](replicate.md): Semantic Conventions for *Replicate*.

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md
35 changes: 35 additions & 0 deletions docs/ai/anthropic.md
@@ -0,0 +1,35 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: Anthropic
--->

# Semantic Conventions for Anthropic

**Status**: [Experimental][DocumentStatus]

The Semantic Conventions for [Anthropic](https://docs.anthropic.com/claude/docs) extend the [LLM Semantic Conventions](llm-spans.md):
the common LLM request attributes described there apply in addition to the Anthropic-specific
conventions described on this page.

## Anthropic LLM request attributes

These are additional attributes when instrumenting Anthropic LLM requests.

<!-- semconv llm.anthropic(tag=llm-request-tech-specific) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `llm.anthropic.top_k` | int | If present, represents the value used to only sample from the top K options for each subsequent token. | `5` | Required |
| `llm.anthropic.metadata.user_id` | string | If present, the `user_id` used in an Anthropic request. | `bob` | Required |

## Anthropic LLM response attributes

These are additional attributes when instrumenting Anthropic LLM responses.

### Chat completion attributes

These are the attributes for a full chat completion (no streaming).

<!-- semconv llm.anthropic(tag=llm-response-tech-specific) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `llm.anthropic.stop_reason` | string | The reason why the model stopped sampling. | `stop_sequence` | Required |
| `llm.anthropic.model` | string | The name of the model used for the completion. | `claude-instant-1` | Recommended |
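
As an illustration only, the sketch below shows how a manual instrumentation might attach these Anthropic-specific attributes to a span with the OpenTelemetry Python API. The span name and the `call_anthropic` helper are hypothetical; only the attribute names come from the tables above.

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def instrumented_completion(prompt: str, top_k: int, user_id: str) -> dict:
    # Span name is illustrative; see llm-spans.md for naming guidance.
    with tracer.start_as_current_span("anthropic.completion") as span:
        # Anthropic-specific request attributes (table above).
        span.set_attribute("llm.anthropic.top_k", top_k)
        span.set_attribute("llm.anthropic.metadata.user_id", user_id)

        # `call_anthropic` is a hypothetical wrapper around the Anthropic client.
        response = call_anthropic(prompt=prompt, top_k=top_k, user_id=user_id)

        # Anthropic-specific response attributes (table above).
        span.set_attribute("llm.anthropic.stop_reason", response["stop_reason"])
        span.set_attribute("llm.anthropic.model", response["model"])
        return response
```
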
1 change: 1 addition & 0 deletions docs/ai/cohere.md
@@ -0,0 +1 @@
todo
1 change: 1 addition & 0 deletions docs/ai/embeddings.md
@@ -0,0 +1 @@
todo
49 changes: 49 additions & 0 deletions docs/ai/llm-chains-agents.md
@@ -0,0 +1,49 @@
# Semantic Conventions for LLM requests in Chains or Agents

**Status**: [Experimental][DocumentStatus]

<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` -->

<!-- toc -->

- [LLM Chain attributes](#llm-chain-attributes)
- [LLM Agent Step attributes](#llm-agent-step-attributes)

<!-- tocstop -->

A chain is defined as a program-controlled sequence of actions, some of which may involve a call to an LLM. Some requests are made in parallel, following a map-reduce pattern, and some are sequential. Crucially, requests to an LLM are initiated programmatically.

An agent is defined as an executable that, given instructions, performs any number of actions, some of which may involve requests to an LLM or other services, until certain criteria are satisfied (such as a known end state being reached, an error, or an output evaluating a certain way). Although similar to a chain, an agent is distinguished by the ability to make requests to an LLM on behalf of a program: requests to an LLM are not controlled by the program, but rather by the agent itself.

In both cases, traces model the behavior of a chain or an agent. As such, spans in a chain or agent should follow the guidance in [llm-spans](llm-spans.md).

However, a key conceptual difference between traces used to model LLM behavior and distributed traces is that a group of one or more spans may represent a *step* of a chain or an agent. In simpler applications, such as directly chaining a fixed number of LLM requests together, a single span can adequately represent each step in the chain. However, more complex applications often require a group of spans per step.

For example, consider an agent that continuously reads data from a knowledge base, makes a request to an LLM to summarize the data, and evaluates the effectiveness of that summarization, repeating the process until its success criteria are met:

- One or more spans that track retrieving a subset of the knowledge base
- One or more spans that track one or more requests to an LLM (perhaps in parallel)
- One or more spans that track parsing, validation, and/or merging of results from LLM requests
- One or more spans that track an evaluation of the final result

Each of the above groups of spans may represent a single *step* of a chain or agent, indicating a need to distinguish each *step*.

## LLM Chain attributes

Although chain attributes are similar to agent attributes, they are kept distinct so that chains and agents can be told apart, especially when the two are mixed together.

<!-- semconv ai(tag=llm-chain-step) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `llm.chain.name` | string | The name of the chain. | `answer-question` | Required |
| `llm.chain.step` | int | Denotes the current step or iteration of an LLM chain. | `0` | Required |

## LLM Agent Step attributes

Although agent attributes are similar to chain attributes, they are kept distinct so that chains and agents can be told apart, especially when the two are mixed together.

<!-- semconv ai(tag=llm-agent-step) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `llm.agent.name` | string | The name of the agent. | `document-system-analyzer` | Required |
| `llm.agent.step` | int | Indicates the current step or iteration in which an agent is performing one or more tasks. | `0` | Required |
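
To make the step model concrete, here is a minimal sketch (assuming hypothetical helper functions and span names) in which each agent iteration is a parent span carrying the step attributes above, with child spans grouping the retrieval, LLM request, and evaluation work that make up that step.

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def run_agent(question: str, max_steps: int = 5):
    for step in range(max_steps):
        # One parent span per agent step; its children form the group of spans for that step.
        with tracer.start_as_current_span("agent.step") as step_span:
            step_span.set_attribute("llm.agent.name", "document-system-analyzer")
            step_span.set_attribute("llm.agent.step", step)

            with tracer.start_as_current_span("knowledge_base.read"):
                context = read_knowledge_base(question)   # hypothetical helper

            with tracer.start_as_current_span("llm.summarize"):
                summary = summarize_with_llm(context)      # hypothetical helper

            with tracer.start_as_current_span("evaluate.summary"):
                if evaluation_passes(summary):             # hypothetical helper
                    return summary
    return None
```
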
76 changes: 76 additions & 0 deletions docs/ai/llm-spans.md
@@ -0,0 +1,76 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: LLM Calls
--->

# Semantic Conventions for LLM requests

**Status**: [Experimental][DocumentStatus]

<!-- Re-generate TOC with `markdown-toc --no-first-h1 -i` -->

<!-- toc -->

- [Configuration](#configuration)
- [LLM Request attributes](#llm-request-attributes)
- [LLM Response attributes](#llm-response-attributes)
- [Semantic Conventions for specific LLM technologies](#semantic-conventions-for-specific-llm-technologies)

<!-- tocstop -->

A request to an LLM is modeled as a span in a trace.

The **span name** SHOULD be set to a low cardinality value representing the request made to an LLM.
It MAY be a name of the API endpoint for the LLM being called.

## Configuration

Instrumentations for LLMs MUST offer the ability to turn off capture of raw inputs to LLM requests and the completion response text for LLM responses. This is for two primary reasons:
> **Contributor:** In other semconvs we control it with the Opt-In requirement level. Opt-In attributes are always off by default and instrumentations MAY provide configuration. Given the privacy, verbosity, and consistency reasons, I believe we should do the same here.

1. Data privacy concerns. End users of LLM applications may input sensitive information or personally identifiable information (PII) that they do not wish to be sent to a telemetry backend.
2. Data size concerns. Although there is no specified limit to the size of an attribute, there are practical limitations in programming languages and telemetry systems. Some LLMs allow for extremely large context windows that end users may take full advantage of.

By default, these configurations SHOULD capture inputs and outputs.
> **Member:** Should these inputs and outputs be added as Events instead of directly to the span? They aren't directly used for query, and Events in some systems have higher limits on attribute size.
>
> **Contributor (author):** I would disagree with that. Inputs and outputs are definitely used for querying, such as:
>
> - "For a system doing text -> json, show me all groups of inputs and outputs where we failed to parse a json response"
> - "Group inputs by feedback responses"
> - "For input , show all grouped outputs"
>
> While a backend could in theory assemble these from span events, I think it's far more likely that a tracing backend would just look for this data directly on the spans. I also don't think it fits the conceptual model for span events, as there's not really a meaningful timestamp to assign to this data - it'd have to be contrived or zeroed out.
>
> **Contributor:** It's common for backends to have limitations on attribute length. In addition to backend limitations, attribute values will stay in memory until spans are exported and may significantly increase otel memory consumption. Events have the same limitations, so logs seem the only reasonable option given verbosity and the ability to export them right away. It's still possible to query logs/events (as long as they are in the same backend).
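
To make the capture requirement from the Configuration section concrete, here is a minimal sketch of such a toggle. The environment variable name and helper are hypothetical and not defined by this convention; the attribute names anticipate the request and response tables below.

```python
import os

from opentelemetry import trace

# Hypothetical switch; the variable name and its default are illustrative only.
CAPTURE_CONTENT = os.environ.get("OTEL_LLM_CAPTURE_CONTENT", "true").lower() == "true"

def record_content(span: trace.Span, prompt: str, completion: str) -> None:
    # Only attach raw inputs/outputs when content capture is enabled.
    if CAPTURE_CONTENT:
        span.set_attribute("llm.prompt", prompt)
        span.set_attribute("llm.completion", completion)
```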


## LLM Request attributes

These attributes track input data and metadata for a request to an LLM. Each attribute represents a concept that is common to most LLMs.

<!-- semconv ai(tag=llm-request) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `llm.model` | string | The name of the LLM a request is being made to. If the LLM is supplied by a vendor, then the value must be the exact name of the model used. If the LLM is a fine-tuned custom model, the value SHOULD have a more specific name than the base model that's been fine-tuned. | `gpt-4` | Required |
| `llm.prompt` | string | The full prompt string sent to an LLM in a request. If the LLM accepts a more complex input like a JSON object made up of several pieces (such as OpenAI's different message types), this field is that entire JSON object encoded as a string. | `\n\nHuman:You are an AI assistant that tells jokes. Can you tell me a joke about OpenTelemetry?\n\nAssistant:` | Required |
| `llm.max_tokens` | int | The maximum number of tokens the LLM generates for a request. | `100` | Recommended |
| `llm.temperature` | float | The temperature setting for the LLM request. | `0.0` | Recommended |
| `llm.top_p` | float | The top_p sampling setting for the LLM request. | `1.0` | Recommended |
| `llm.stream` | bool | Whether the LLM responds with a stream. | `false` | Recommended |
| `llm.stop_sequences` | array | Array of strings the LLM uses as a stop sequence. | `["stop1"]` | Recommended |

> **Contributor** (on `llm.prompt`): Given the verbosity and the fact that it contains sensitive and private data, this attribute should be opt-in.

`llm.model` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `gpt-4` | GPT-4 |
| `gpt-4-32k` | GPT-4 with 32k context window |
| `gpt-3.5-turbo` | GPT-3.5-turbo |
| `gpt-3.5-turbo-16k` | GPT-3.5-turbo with 16k context window |
| `claude-instant-1` | Claude Instant (latest version) |
| `claude-2` | Claude 2 (latest version) |
| `other-llm` | Any LLM not listed in this table. Use for any fine-tuned version of a model. |
<!-- endsemconv -->
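
For illustration, a minimal sketch of recording the common request attributes (and the `llm.completion` response attribute defined below) on a span; the `chat` helper, span name, and literal values are hypothetical.

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def traced_chat(prompt: str) -> str:
    with tracer.start_as_current_span("llm.request") as span:
        # Common request attributes from the table above.
        span.set_attributes({
            "llm.model": "gpt-4",
            "llm.prompt": prompt,  # subject to the capture toggle in the Configuration section
            "llm.max_tokens": 100,
            "llm.temperature": 0.0,
            "llm.top_p": 1.0,
            "llm.stream": False,
            "llm.stop_sequences": ["stop1"],
        })

        completion = chat(prompt)  # hypothetical LLM client call

        # Common response attribute (see the response table below).
        span.set_attribute("llm.completion", completion)
        return completion
```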

## LLM Response attributes

These attributes track output data and metadata for a response from an LLM. Each attribute represents a concept that is common to most LLMs.

<!-- semconv ai(tag=llm-response) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| `llm.completion` | string | The full response string from an LLM. If the LLM responds with a more complex output like a JSON object made up of several pieces (such as OpenAI's message choices), this field is the content of the response. If the LLM produces multiple responses, then this field is left blank, and each response is instead captured in an attribute determined by the specific LLM technology semantic convention for responses.| `Why did the developer stop using OpenTelemetry? Because they couldn't trace their steps!` | Required |
> **Member:** In OpenAI, you have completion_tokens, prompt_tokens, etc. Is that not generally applicable here? On multiple responses from an LLM, if these are captured as events (see my earlier suggestion) then this could be handled by adding multiple events to the Span.
>
> **Contributor (author):** Unfortunately not every LLM supports this in their response. For example, in Anthropic's client SDK there is a separate count_tokens function that you pass your prompt and/or response to in order to get this information. Perhaps this could be done as an optional attribute, since the reality is that most people are using OpenAI.
>
> **Contributor:** For the same reasons as prompt, this should be opt-in (and probably an event/log).

## Semantic Conventions for specific LLM technologies

More specific Semantic Conventions are defined for the following LLM technologies:

* [OpenAI](openai.md): Semantic Conventions for *OpenAI*.

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md