
adding inference trace injection #36890

Closed
Changes from 17 commits

43 commits
06cef91
adding inference trace injection
Aug 14, 2024
9dc2cf9
changing the interface based on feedback
Aug 16, 2024
58a032b
updates
Aug 16, 2024
ec1cd16
changing name of environment variable
Aug 20, 2024
3270076
changes based on review comments and some other changes
Sep 6, 2024
7cbbc0b
file name change
Sep 6, 2024
941a9ae
fixing exception handling
Sep 10, 2024
bcc6e74
relocating inference trace instrumentation
Sep 10, 2024
709923c
reverting change in azure core tracing
Sep 10, 2024
baac83f
Merge branch 'main' into mhietala/inference_genai_tracing
Sep 16, 2024
a64d870
fixes
Sep 16, 2024
198b9cd
changing span and model name for cases when model info not available
Sep 17, 2024
cd8bba2
some fixes
Sep 17, 2024
b28a3fe
adding sync trace tests
Sep 20, 2024
b549b38
fix and async trace test
Sep 23, 2024
469d32c
updating readme and setup
Sep 23, 2024
f1424a1
adding tracing sample
Sep 23, 2024
92da09a
changes based on review comments
Sep 25, 2024
d9652f5
changed to readme based on review comments
Sep 26, 2024
6da2a7d
removed distributed_trace and some other updates
Sep 26, 2024
521f7f0
fixing pre python v3.10 issue
Sep 26, 2024
814f87f
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Sep 26, 2024
8c80099
test fixes
Sep 26, 2024
514dea4
Fix some of the non-trace tests
dargilco Sep 26, 2024
83f85d6
fixing issues reported by tools
Sep 27, 2024
79ea9b3
Merge branch 'mhietala/inference_genai_tracing' of https://github.com…
Sep 27, 2024
e8dd67d
adding uninstrumentation to the beginning of tracing tests
Sep 27, 2024
0c286c3
updating readme and sample
Sep 27, 2024
1aaf87c
adding ignore related to tool issue
Sep 27, 2024
a1b1f13
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Sep 30, 2024
510a6ca
updating code snippet in readme
Sep 30, 2024
04da0e6
Merge branch 'mhietala/inference_genai_tracing' of https://github.com…
Sep 30, 2024
fa8e8b0
Add missing `@recorded_by_proxy` decorators to new tracing tests
dargilco Oct 1, 2024
e410c31
Push new recordings
dargilco Oct 1, 2024
18b3d92
fixing issues reported by tools
Oct 2, 2024
200ab61
Merge branch 'mhietala/inference_genai_tracing' of https://github.com…
Oct 2, 2024
4a56354
adding inference to shared requirements
Oct 2, 2024
3113e35
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Oct 2, 2024
58a754f
remove inference from setup
Oct 2, 2024
4ed67dc
adding comma to setup
Oct 3, 2024
5a0aa71
updating version requirement for core
Oct 3, 2024
1214978
changes based on review comments
Oct 7, 2024
1350293
Merge branch 'Azure:main' into mhietala/inference_genai_tracing
M-Hietala Oct 10, 2024
91 changes: 91 additions & 0 deletions sdk/ai/azure-ai-inference/README.md
@@ -57,6 +57,14 @@ To update an existing installation of the package, use:
pip install --upgrade azure-ai-inference
```

To install the Azure AI Inference package with support for OpenTelemetry-based tracing, use the following command:

```bash
pip install azure-ai-inference[trace]
```
## Key concepts

### Create and authenticate a client directly, using API key or GitHub token
@@ -451,6 +459,89 @@ TBD
To generate embeddings for additional phrases, simply call `client.embed` multiple times using the same `client`.
-->

## Tracing

The Azure AI Inference Tracing library provides tracing for the Azure AI Inference client library for Python. Refer to the Installation chapter above for installation instructions.

### Setup

The environment variable `AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED` controls whether the actual message contents are included in the traces. By default, message contents are not included as part of the trace. Set the value of the environment variable to `true` (case insensitive) to include message contents in the trace. Any other value causes the message contents not to be traced.
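The on/off semantics described above can be sketched with a small standalone helper (hypothetical, for illustration only; it is not part of the package):

```python
import os

def content_recording_enabled() -> bool:
    # Only the literal string "true" (any casing) enables content recording;
    # any other value, or an unset variable, keeps message contents out of traces.
    value = os.environ.get("AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED", "")
    return value.lower() == "true"

os.environ["AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED"] = "True"
print(content_recording_enabled())  # → True
```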

You also need to configure the tracing implementation in your code, like so:

```python
from azure.core.settings import settings

settings.tracing_implementation = "opentelemetry"
```

### Trace Exporter(s)

In order for the traces to be captured, you need to set up the applicable trace exporters. Which exporter you choose depends on where you want the traces to be output; you can also implement your own exporter. The first example below shows how to set up an exporter to Azure Monitor. Please refer to [this](https://learn.microsoft.com/en-us/azure/azure-monitor/app/create-workspace-resource?tabs=bicep) documentation for more information about how to create an Azure Monitor resource.

Configure the `APPLICATIONINSIGHTS_CONNECTION_STRING` environment variable based on your Azure Monitor resource.

```python
# Setup tracing to Azure Monitor
import os

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
span_processor = BatchSpanProcessor(
    AzureMonitorTraceExporter.from_connection_string(
        os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"]
    )
)
trace.get_tracer_provider().add_span_processor(span_processor)
```

The following example shows how to set up tracing to console output.

```python
# Setup tracing to console
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

exporter = ConsoleSpanExporter()
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(exporter))
```

### Instrumentation

Use the `AIInferenceApiInstrumentor` to instrument the Azure AI Inferencing API for LLM tracing; once instrumented, LLM traces are emitted by the Azure AI Inferencing API.

```python
from azure.core.tracing.ai.inference import AIInferenceApiInstrumentor

# Instrument AI Inference API
AIInferenceApiInstrumentor().instrument()
```

It is also possible to uninstrument the Azure AI Inferencing API with the `uninstrument()` call. After this call, LLM traces are no longer emitted by the Azure AI Inferencing API until `instrument()` is called again.

```python
AIInferenceApiInstrumentor().uninstrument()
```

### Tracing Your Own Functions
The `@tracer.start_as_current_span` decorator can be used to trace your own functions. It traces the function parameters and their values. You can also add further attributes to the span in the function implementation, as demonstrated below. Note that you must set up the tracer in your code before using the decorator.

```python
# The @tracer.start_as_current_span decorator will
# trace the function call and enable adding additional attributes
# to the span in the function implementation.
@tracer.start_as_current_span("get_temperature")
def get_temperature(city: str) -> str:
    # Adding attributes to the current span
    span = trace.get_current_span()
    span.set_attribute("requested_city", city)

    if city == "Seattle":
        return "75"
    elif city == "New York City":
        return "80"
    else:
        return "Unavailable"
```

## Troubleshooting

### Exceptions
4 changes: 3 additions & 1 deletion sdk/ai/azure-ai-inference/dev_requirements.txt
@@ -1,3 +1,5 @@
-e ../../../tools/azure-sdk-tools
../../core/azure-core
aiohttp
../../core/azure-core-tracing-opentelemetry
aiohttp
opentelemetry-sdk
@@ -0,0 +1,168 @@
# ---------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# ---------------------------------------------------------

import os
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.trace.export import ConsoleSpanExporter
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage, CompletionsFinishReason
from azure.core.credentials import AzureKeyCredential
from azure.core.tracing.ai.inference import AIInferenceApiInstrumentor
from azure.core.settings import settings


# Setup tracing to console
exporter = ConsoleSpanExporter()
trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)
trace.get_tracer_provider().add_span_processor(SimpleSpanProcessor(exporter))

# Use the following code to setup tracing to Application Insights
# from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
# trace.set_tracer_provider(TracerProvider())
# tracer = trace.get_tracer(__name__)
# span_processor = BatchSpanProcessor(
# AzureMonitorTraceExporter.from_connection_string(
# os.environ["APPLICATIONINSIGHTS_CONNECTION_STRING"]
# )
# )
# trace.get_tracer_provider().add_span_processor(span_processor)


def chat_completion_streaming(key, endpoint, model_name):
    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    response = client.complete(
        stream=True,
        messages=[
            SystemMessage(content="You are a helpful assistant."),
            UserMessage(content="Tell me about software engineering in five sentences."),
        ],
        model=model_name,
    )
    for update in response:
        if update.choices:
            print(update.choices[0].delta.content or "", end="")
    client.close()


# The tracer.start_as_current_span decorator will trace the function call and enable adding additional attributes
# to the span in the function implementation. Note that this will trace the function parameters and their values.
@tracer.start_as_current_span("get_temperature")
def get_temperature(city: str) -> str:
    # Adding attributes to the current span
    span = trace.get_current_span()
    span.set_attribute("requested_city", city)

    if city == "Seattle":
        return "75"
    elif city == "New York City":
        return "80"
    else:
        return "Unavailable"


def get_weather(city: str) -> str:
    if city == "Seattle":
        return "Nice weather"
    elif city == "New York City":
        return "Good weather"
    else:
        return "Unavailable"


def chat_completion_with_function_call(key, endpoint, model_name):
    import json
    from azure.ai.inference.models import (
        ToolMessage,
        AssistantMessage,
        ChatCompletionsToolCall,
        ChatCompletionsToolDefinition,
        FunctionDefinition,
    )

    weather_description = ChatCompletionsToolDefinition(
        function=FunctionDefinition(
            name="get_weather",
            description="Returns description of the weather in the specified city",
            parameters={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city for which weather info is requested",
                    },
                },
                "required": ["city"],
            },
        )
    )

    temperature_in_city = ChatCompletionsToolDefinition(
        function=FunctionDefinition(
            name="get_temperature",
            description="Returns the current temperature for the specified city",
            parameters={
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The name of the city for which temperature info is requested",
                    },
                },
                "required": ["city"],
            },
        )
    )

    client = ChatCompletionsClient(endpoint=endpoint, credential=AzureKeyCredential(key))
    messages = [
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="What is the weather and temperature in Seattle?"),
    ]

    response = client.complete(messages=messages, model=model_name, tools=[weather_description, temperature_in_city])

    if response.choices[0].finish_reason == CompletionsFinishReason.TOOL_CALLS:
        # Append the previous model response to the chat history
        messages.append(AssistantMessage(tool_calls=response.choices[0].message.tool_calls))
        # The tool should be of type function call.
        if response.choices[0].message.tool_calls is not None and len(response.choices[0].message.tool_calls) > 0:
            for tool_call in response.choices[0].message.tool_calls:
                if type(tool_call) is ChatCompletionsToolCall:
                    function_args = json.loads(tool_call.function.arguments.replace("'", '"'))
                    print(f"Calling function `{tool_call.function.name}` with arguments {function_args}")
                    callable_func = globals()[tool_call.function.name]
                    function_response = callable_func(**function_args)
                    print(f"Function response = {function_response}")
                    # Provide the tool response to the model, by appending it to the chat history
                    messages.append(ToolMessage(tool_call_id=tool_call.id, content=function_response))
            # With the additional tools information on hand, get another response from the model
            response = client.complete(messages=messages, model=model_name, tools=[weather_description, temperature_in_city])

    print(f"Model response = {response.choices[0].message.content}")


def main():
    # Setup Azure Core settings to use OpenTelemetry tracing
    settings.tracing_implementation = "OpenTelemetry"

    # Instrument AI Inference API
    AIInferenceApiInstrumentor().instrument()

    # Read AI Inference API configuration
    endpoint = os.environ.get("AZUREAI_ENDPOINT_URL")
    key = os.environ.get("AZUREAI_ENDPOINT_KEY")
    model_name = os.environ.get("AZUREAI_MODEL_NAME")

    print("===== starting chat_completion_streaming() =====")
    chat_completion_streaming(key, endpoint, model_name)
    print("===== chat_completion_streaming() done =====")

    print("===== starting chat_completion_with_function_call() =====")
    chat_completion_with_function_call(key, endpoint, model_name)
    print("===== chat_completion_with_function_call() done =====")

    AIInferenceApiInstrumentor().uninstrument()


if __name__ == "__main__":
    main()
3 changes: 3 additions & 0 deletions sdk/ai/azure-ai-inference/setup.py
@@ -68,4 +68,7 @@
"typing-extensions>=4.6.0",
],
python_requires=">=3.8",
extras_require={
'trace': ['azure-core-tracing-opentelemetry', 'opentelemetry-sdk', 'azure-monitor-opentelemetry-exporter']
M-Hietala marked this conversation as resolved.
Show resolved Hide resolved
}
)
103 changes: 103 additions & 0 deletions sdk/ai/azure-ai-inference/tests/gen_ai_trace_verifier.py
@@ -0,0 +1,103 @@
# ------------------------------------
# Copyright (c) Microsoft Corporation.
# ------------------------------------
import json
from opentelemetry.sdk.trace import Span


class GenAiTraceVerifier:

    def check_span_attributes(self, span, attributes):
        # Convert the list of tuples to a dictionary for easier lookup
        attribute_dict = dict(attributes)

        for attribute_name in span.attributes.keys():
            # Check if the attribute name exists in the input attributes
            if attribute_name not in attribute_dict:
                return False

            attribute_value = attribute_dict[attribute_name]
            if isinstance(attribute_value, (list, tuple)):
                # Check if the attribute value in the span matches the provided sequence
                if span.attributes[attribute_name] != attribute_value:
                    return False
            else:
                # Check if the attribute value matches the provided value
                if attribute_value != "" and span.attributes[attribute_name] != attribute_value:
                    return False
                # Check that the attribute value in the span is not empty when the provided value is ""
                elif attribute_value == "" and not span.attributes[attribute_name]:
                    return False

        return True

    def is_valid_json(self, my_string):
        try:
            json.loads(my_string)
        except (ValueError, TypeError):
            return False
        return True

    def check_json_string(self, expected_json, actual_json):
        if self.is_valid_json(expected_json) and self.is_valid_json(actual_json):
            return self.check_event_attributes(json.loads(expected_json), json.loads(actual_json))
        else:
            return False

    def check_event_attributes(self, expected_dict, actual_dict):
        if set(expected_dict.keys()) != set(actual_dict.keys()):
            return False
        for key, expected_val in expected_dict.items():
            if key not in actual_dict:
                return False
            actual_val = actual_dict[key]

            if self.is_valid_json(expected_val):
                if not self.is_valid_json(actual_val):
                    return False
                if not self.check_json_string(expected_val, actual_val):
                    return False
            elif isinstance(expected_val, dict):
                if not isinstance(actual_val, dict):
                    return False
                if not self.check_event_attributes(expected_val, actual_val):
                    return False
            elif isinstance(expected_val, list):
                if not isinstance(actual_val, list):
                    return False
                if len(expected_val) != len(actual_val):
                    return False
                for expected_item, actual_item in zip(expected_val, actual_val):
                    if not self.check_event_attributes(expected_item, actual_item):
                        return False
            elif isinstance(expected_val, str) and expected_val == "*":
                # "*" is a wildcard that accepts any non-empty value
                if actual_val == "":
                    return False
            elif expected_val != actual_val:
                return False
        return True

    def check_span_events(self, span, expected_events):
        span_events = list(span.events)  # Create a list of events from the span

        for expected_event in expected_events:
            for actual_event in span_events:
                if expected_event["name"] == actual_event.name:
                    if not self.check_event_attributes(expected_event["attributes"], actual_event.attributes):
                        return False
                    span_events.remove(actual_event)  # Remove the matched event from span_events
                    break
            else:
                return False  # No match found for an expected event

        if len(span_events) > 0:  # Any unexpected events remain in the span
            return False

        return True
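The scalar comparison rule used by the verifier above can be distilled into a small standalone sketch (simplified: dicts, lists, and nested JSON strings are handled recursively in the class itself):

```python
def attribute_matches(expected, actual) -> bool:
    # "*" is a wildcard that accepts any non-empty value;
    # anything else must compare equal.
    if expected == "*":
        return actual != ""
    return expected == actual

print(attribute_matches("*", "gpt-4o"))   # → True
print(attribute_matches("*", ""))         # → False
print(attribute_matches("stop", "stop"))  # → True
```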