
adding inference trace injection #36890

Closed
Changes from 17 commits

43 commits
06cef91
adding inference trace injection
Aug 14, 2024
9dc2cf9
changing the interface based on feedback
Aug 16, 2024
58a032b
updates
Aug 16, 2024
ec1cd16
changing name of environment variable
Aug 20, 2024
3270076
changes based on review comments and some other changes
Sep 6, 2024
7cbbc0b
file name change
Sep 6, 2024
941a9ae
fixing exception handling
Sep 10, 2024
bcc6e74
relocating inference trace instrumentation
Sep 10, 2024
709923c
reverting change in azure core tracing
Sep 10, 2024
baac83f
Merge branch 'main' into mhietala/inference_genai_tracing
Sep 16, 2024
a64d870
fixes
Sep 16, 2024
198b9cd
changing span and model name for cases when model info not available
Sep 17, 2024
cd8bba2
some fixes
Sep 17, 2024
b28a3fe
adding sync trace tests
Sep 20, 2024
b549b38
fix and async trace test
Sep 23, 2024
469d32c
updating readme and setup
Sep 23, 2024
f1424a1
adding tracing sample
Sep 23, 2024
92da09a
changes based on review comments
Sep 25, 2024
d9652f5
changed to readme based on review comments
Sep 26, 2024
6da2a7d
removed distributed_trace and some other updates
Sep 26, 2024
521f7f0
fixing pre python v3.10 issue
Sep 26, 2024
814f87f
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Sep 26, 2024
8c80099
test fixes
Sep 26, 2024
514dea4
Fix some of the non-trace tests
dargilco Sep 26, 2024
83f85d6
fixing issues reported by tools
Sep 27, 2024
79ea9b3
Merge branch 'mhietala/inference_genai_tracing' of https://github.com…
Sep 27, 2024
e8dd67d
adding uninstrumentation to the beginning of tracing tests
Sep 27, 2024
0c286c3
updating readme and sample
Sep 27, 2024
1aaf87c
adding ignore related to tool issue
Sep 27, 2024
a1b1f13
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Sep 30, 2024
510a6ca
updating code snippet in readme
Sep 30, 2024
04da0e6
Merge branch 'mhietala/inference_genai_tracing' of https://github.com…
Sep 30, 2024
fa8e8b0
Add missing `@recorded_by_proxy` decorators to new tracing tests
dargilco Oct 1, 2024
e410c31
Push new recordings
dargilco Oct 1, 2024
18b3d92
fixing issues reported by tools
Oct 2, 2024
200ab61
Merge branch 'mhietala/inference_genai_tracing' of https://github.com…
Oct 2, 2024
4a56354
adding inference to shared requirements
Oct 2, 2024
3113e35
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Oct 2, 2024
58a754f
remove inference from setup
Oct 2, 2024
4ed67dc
adding comma to setup
Oct 3, 2024
5a0aa71
updating version requirement for core
Oct 3, 2024
1214978
changes based on review comments
Oct 7, 2024
1350293
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Oct 10, 2024
91 changes: 91 additions & 0 deletions sdk/ai/azure-ai-inference/README.md
@@ -57,6 +57,14 @@ To update an existing installation of the package, use:
pip install --upgrade azure-ai-inference
```

To install the Azure AI Inference package with support for OpenTelemetry-based tracing, use the following command:

```bash
pip install azure-ai-inference[trace]
```
## Key concepts

### Create and authenticate a client directly, using API key or GitHub token
@@ -451,6 +459,89 @@ TBD
To generate embeddings for additional phrases, simply call `client.embed` multiple times using the same `client`.
-->

## Tracing

The Azure AI Inference Tracing library provides tracing for the Azure AI Inference client library for Python. Refer to the Installation chapter above for installation instructions.

### Setup

The environment variable `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` controls whether the actual message contents are included in the traces. By default, message contents are not included as part of the trace. Set the value of the environment variable to `true` (case insensitive) to include message contents in the trace. Any other value causes the message contents not to be traced.
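The on/off semantics described above can be sketched with a small standalone helper (hypothetical, for illustration only; it is not part of the package):

```python
import os

def content_recording_enabled() -> bool:
    # Only the literal string "true" (any casing) enables content recording;
    # any other value, or an unset variable, keeps message contents out of traces.
    value = os.environ.get("AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED", "")
    return value.lower() == "true"

os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "True"
print(content_recording_enabled())  # → True
```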

You also need to configure the tracing implementation in your code, like so:

```python
from azure.core.settings import settings

settings.tracing_implementation = "opentelemetry"
```

### Trace Exporter(s)

In order for the traces to be captured, you need to set up the applicable trace exporters. Which exporter you choose depends on where you want the traces to be output; you can also implement your own exporter. The first example below shows how to set up an exporter to Azure Monitor. Please refer to [this](https://learn.microsoft.com/en-us/azure/azure-monitor/app/create-workspace-resource?tabs=bicep) documentation for more information about how to create an Azure Monitor resource.

Configure the `APPLICATIONINSIGHTS_CONNECTION_STRING` environment variable based on your Azure Monitor resource.

```python
# Setup tracing to Azure Monitor
import os

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
span_processor = BatchSpanProcessor(
    AzureMonitorTraceExporter.from_connection_string(
        os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"]
    )
)
trace.get_tracer_provider().add_span_processor(span_processor)
```

The following example shows how to set up tracing to console output.

```python
# Setup tracing to console
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

exporter = ConsoleSpanExporter()
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(exporter))
```

### Instrumentation

Use the `AIInferenceApiInstrumentor` to instrument the Azure AI Inferencing API for LLM tracing; once instrumented, LLM traces are emitted by the Azure AI Inferencing API.

```python
from azure.core.tracing.ai.inference import AIInferenceApiInstrumentor

# Instrument AI Inference API
AIInferenceApiInstrumentor().instrument()
```

It is also possible to uninstrument the Azure AI Inferencing API with the `uninstrument()` call. After this call, LLM traces are no longer emitted by the Azure AI Inferencing API until `instrument()` is called again.

```python
AIInferenceApiInstrumentor().uninstrument()
```

### Tracing Your Own Functions
The `@tracer.start_as_current_span` decorator can be used to trace your own functions. It traces the function parameters and their values. You can also add further attributes to the span in the function implementation, as demonstrated below. Note that you must set up the tracer in your code before using the decorator.

```python
# The @tracer.start_as_current_span decorator will
# trace the function call and enable adding additional attributes
# to the span in the function implementation.
@tracer.start_as_current_span("get_temperature")
def get_temperature(city: str) -> str:
    # Adding attributes to the current span
    span = trace.get_current_span()
    span.set_attribute("requested_city", city)

    if city == "Seattle":
        return "75"
    elif city == "New York City":
        return "80"
    else:
        return "Unavailable"
```

## Troubleshooting

### Exceptions
4 changes: 3 additions & 1 deletion sdk/ai/azure-ai-inference/dev_requirements.txt
@@ -1,3 +1,5 @@
-e ../../../tools/azure-sdk-tools
../../core/azure-core
aiohttp
../../core/azure-core-tracing-opentelemetry
aiohttp
opentelemetry-sdk
@@ -0,0 +1,168 @@
# ---------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# ---------------------------------------------------------

import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.export import ConsoleSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason
from azure.core.credentials import AzureKeyCredential
from azure.core.tracing.ai.inference import AIInferenceApiInstrumentor
from azure.core.settings import settings


# Setup tracing to console
exporter = ConsoleSpanExporter()
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(exporter))

# Use the following code to setup tracing to Application Insights
# from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
# trace.set_tracer_provider(TracerProvider())
# tracer = trace.get_tracer(__name__)
# span_processor = BatchSpanProcessor(
# AzureMonitorTraceExporter.from_connection_string(
# os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"]
# )
# )
# trace.get_tracer_provider().add_span_processor(span_processor)


def chat_completion_streaming(key, endpoint, model_name):
    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    response = client.complete(
        stream=True,
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="Tell me about software engineering in five sentences."),
        ],
        model=model_name,
    )
    for update in response:
        if update.choices:
            print(update.choices[0].delta.content or "", end="")
    client.close()


# The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes
# to the span in the function implementation. Note that this will trace the function parameters and their values.
@tracer.start_as_current_span("get_temperature")
def get_temperature(city: str) -> str:
    # Adding attributes to the current span
    span = trace.get_current_span()
    span.set_attribute("requested_city", city)

    if city == "Seattle":
        return "75"
    elif city == "New York City":
        return "80"
    else:
        return "Unavailable"


def get_weather(city: str) -> str:
    if city == "Seattle":
        return "Nice weather"
    elif city == "New York City":
        return "Good weather"
    else:
        return "Unavailable"


def chat_completion_with_function_call(key, endpoint, model_name):
    import json
    from azure.ai.inference.models import (
        ToolMessage,
        AssistantMessage,
        ChatCompletionsToolCall,
        ChatCompletionsToolDefinition,
        FunctionDefinition,
    )

    weather_description = ChatCompletionsToolDefinition(
        function=FunctionDefinition(
            name="get_weather",
            description="Returns description of the weather in the specified city",
            parameters={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city for which weather info is requested",
                    },
                },
                "required": ["city"],
            },
        )
    )

    temperature_in_city = ChatCompletionsToolDefinition(
        function=FunctionDefinition(
            name="get_temperature",
            description="Returns the current temperature for the specified city",
            parameters={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city for which temperature info is requested",
                    },
                },
                "required": ["city"],
            },
        )
    )

    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    messages = [
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the weather and temperature in Seattle?"),
    ]

    response = client.complete(messages=messages, model=model_name, tools=[weather_description, temperature_in_city])

    if response.choices[0].finish_reason == CompletionsFinishReason.TOOL_CALLS:
        # Append the previous model response to the chat history
        messages.append(AssistantMessage(tool_calls=response.choices[0].message.tool_calls))
        # The tool should be of type function call.
        if response.choices[0].message.tool_calls is not None and len(response.choices[0].message.tool_calls) > 0:
            for tool_call in response.choices[0].message.tool_calls:
                if type(tool_call) is ChatCompletionsToolCall:
                    function_args = json.loads(tool_call.function.arguments.replace("'", '"'))
                    print(f"Calling function `{tool_call.function.name}` with arguments {function_args}")
                    callable_func = globals()[tool_call.function.name]
                    function_response = callable_func(**function_args)
                    print(f"Function response = {function_response}")
                    # Provide the tool response to the model, by appending it to the chat history
                    messages.append(ToolMessage(tool_call_id=tool_call.id, content=function_response))
            # With the additional tools information on hand, get another response from the model
            response = client.complete(messages=messages, model=model_name, tools=[weather_description, temperature_in_city])

    print(f"Model response = {response.choices[0].message.content}")


def main():
    # Setup Azure Core settings to use OpenTelemetry tracing
    settings.tracing_implementation = "OpenTelemetry"

    # Instrument AI Inference API
    AIInferenceApiInstrumentor().instrument()

    # Read AI Inference API configuration
    endpoint = os.environ.get("AZUREAI_ENDPOINT_URL")
    key = os.environ.get("AZUREAI_ENDPOINT_KEY")
    model_name = os.environ.get("AZUREAI_MODEL_NAME")

    print("===== starting chat_completion_streaming() =====")
    chat_completion_streaming(key, endpoint, model_name)
    print("===== chat_completion_streaming() done =====")

    print("===== starting chat_completion_with_function_call() =====")
    chat_completion_with_function_call(key, endpoint, model_name)
    print("===== chat_completion_with_function_call() done =====")

    AIInferenceApiInstrumentor().uninstrument()


if __name__ == "__main__":
    main()
3 changes: 3 additions & 0 deletions sdk/ai/azure-ai-inference/setup.py
@@ -68,4 +68,7 @@
"typing-extensions>=4.6.0",
],
python_requires=">=3.8",
extras_require={
'trace': ['azure-core-tracing-opentelemetry', 'opentelemetry-sdk', 'azure-monitor-opentelemetry-exporter']
M-Hietala marked this conversation as resolved.
Show resolved Hide resolved
}
)
103 changes: 103 additions & 0 deletions sdk/ai/azure-ai-inference/tests/gen_ai_trace_verifier.py
@@ -0,0 +1,103 @@
# ------------------------------------
# Copyright (c) Microsoft Corporation.
# ------------------------------------
import json
from opentelemetry.sdk.trace import Span


class GenAiTraceVerifier:

    def check_span_attributes(self, span, attributes):
        # Convert the list of tuples to a dictionary for easier lookup
        attribute_dict = dict(attributes)

        for attribute_name in span.attributes.keys():
            # Check if the attribute name exists in the input attributes
            if attribute_name not in attribute_dict:
                return False

            attribute_value = attribute_dict[attribute_name]
            if isinstance(attribute_value, (list, tuple)):
                # Check if the attribute value in the span matches the provided sequence
                if span.attributes[attribute_name] != attribute_value:
                    return False
            else:
                # Check if the attribute value matches the provided value
                if attribute_value != "" and span.attributes[attribute_name] != attribute_value:
                    return False
                # Check that the attribute value in the span is not empty when the provided value is ""
                elif attribute_value == "" and not span.attributes[attribute_name]:
                    return False

        return True

    def is_valid_json(self, my_string):
        try:
            json.loads(my_string)
        except (ValueError, TypeError):
            return False
        return True

    def check_json_string(self, expected_json, actual_json):
        if self.is_valid_json(expected_json) and self.is_valid_json(actual_json):
            return self.check_event_attributes(json.loads(expected_json), json.loads(actual_json))
        else:
            return False

    def check_event_attributes(self, expected_dict, actual_dict):
        if set(expected_dict.keys()) != set(actual_dict.keys()):
            return False
        for key, expected_val in expected_dict.items():
            if key not in actual_dict:
                return False
            actual_val = actual_dict[key]

            if self.is_valid_json(expected_val):
                if not self.is_valid_json(actual_val):
                    return False
                if not self.check_json_string(expected_val, actual_val):
                    return False
            elif isinstance(expected_val, dict):
                if not isinstance(actual_val, dict):
                    return False
                if not self.check_event_attributes(expected_val, actual_val):
                    return False
            elif isinstance(expected_val, list):
                if not isinstance(actual_val, list):
                    return False
                if len(expected_val) != len(actual_val):
                    return False
                for expected_item, actual_item in zip(expected_val, actual_val):
                    if not self.check_event_attributes(expected_item, actual_item):
                        return False
            elif isinstance(expected_val, str) and expected_val == "*":
                # "*" is a wildcard that accepts any non-empty value
                if actual_val == "":
                    return False
            elif expected_val != actual_val:
                return False
        return True

    def check_span_events(self, span, expected_events):
        span_events = list(span.events)  # Create a list of events from the span

        for expected_event in expected_events:
            for actual_event in span_events:
                if expected_event["name"] == actual_event.name:
                    if not self.check_event_attributes(expected_event["attributes"], actual_event.attributes):
                        return False
                    span_events.remove(actual_event)  # Remove the matched event from span_events
                    break
            else:
                return False  # No match found for an expected event

        if len(span_events) > 0:  # Any unexpected events remain in the span
            return False

        return True
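The scalar comparison rule used by the verifier above can be distilled into a small standalone sketch (simplified: dicts, lists, and nested JSON strings are handled recursively in the class itself):

```python
def attribute_matches(expected, actual) -> bool:
    # "*" is a wildcard that accepts any non-empty value;
    # anything else must compare equal.
    if expected == "*":
        return actual != ""
    return expected == actual

print(attribute_matches("*", "gpt-4o"))   # → True
print(attribute_matches("*", ""))         # → False
print(attribute_matches("stop", "stop"))  # → True
```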