Repository detailing an advanced logging pattern for the Azure OpenAI Service.
- **Supports models with larger token sizes.** The advanced logging pattern supports capturing events up to 200 KB, while the basic logging pattern supports a maximum size of 8,192 bytes. This allows the pattern to capture prompts and responses from models that support larger token sizes, such as GPT-4.
- **Enforce strong identity controls and audit logging.** Authentication to the Azure OpenAI Service resource is restricted to Azure Active Directory identities such as service principals and managed identities. This is accomplished through an Azure API Management custom policy. The identity of the application making the request is captured in the logs streamed to the Azure Event Hub.
- **Log the information important to you.** Azure API Management custom policies can be used to filter the information captured in each event to what is important to your organization. This can include prompts and responses, the number of tokens used, the Azure Active Directory identity making the call, or the model response time. This information can be used for both compliance and chargeback purposes.
- **Deliver events to a wide variety of data stores.** Events in this pattern are streamed to an Azure Event Hub. These events can be further processed through the integration with Azure Stream Analytics and then delivered to data stores such as Azure SQL Database, Azure Cosmos DB, or a Power BI dataset.
This project framework provides the following features:
- Enterprise logging of OpenAI usage metrics:
  - Prompt Input
  - Prompt Response
  - Token Usage
  - Model Usage
  - Application Usage
  - Model Response Times
- High availability of the OpenAI service with region failover.
- Integration with the latest OpenAI libraries.
To get started, provision the solution artifacts listed below:
(Optional)
- Next-Gen Firewall Appliance
- Azure Virtual Network
- To begin, provision an Azure OpenAI Service resource in your preferred region. Note that the current primary region is East US; new models and capacity are provisioned in this location before other regions: Provision resource
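If you prefer the Azure CLI over the portal, a sketch of the equivalent command is below. The resource name, resource group, and region are placeholders; adjust them for your environment. The custom subdomain is required for Azure Active Directory authentication against the resource.

```bash
# Hypothetical names; substitute your own resource group, region, and resource name.
az cognitiveservices account create \
  --name my-openai \
  --resource-group my-openai-rg \
  --location eastus \
  --kind OpenAI \
  --sku S0 \
  --custom-domain my-openai
```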
- Once the resource is provisioned, create a deployment with the model of your choice: Deploy Model
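As a sketch, the same deployment can also be created with the Azure CLI. The model name and version below are examples, and the flag names can vary slightly between CLI versions.

```bash
# Hypothetical deployment of gpt-35-turbo; use the model and version you need.
az cognitiveservices account deployment create \
  --name my-openai \
  --resource-group my-openai-rg \
  --deployment-name gpt-35-turbo \
  --model-name gpt-35-turbo \
  --model-version "0301" \
  --model-format OpenAI \
  --sku-capacity 1 \
  --sku-name Standard
```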
- After the model has been deployed, go to the Azure OpenAI Studio to test your newly created model with the studio playground: oai.azure.com/portal
- API Management can be provisioned through the Azure Portal using the instructions at this link.
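A CLI alternative is sketched below; the instance name, publisher name, and email are placeholders, and provisioning an API Management instance can take a while.

```bash
# Hypothetical values; a Developer SKU is sufficient for testing.
az apim create \
  --name my-openai-apim \
  --resource-group my-openai-rg \
  --publisher-name "Contoso" \
  --publisher-email "apim-admin@contoso.com" \
  --sku-name Developer
```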
- Once the API Management service has been provisioned, you must import your OpenAI API layer using the OpenAPI specification for the service.
- Open the APIM - API blade and select the Import option for an existing API.
- Select the Update option to update the API to the current OpenAI specifications. The Completions OpenAPI specification is found at https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/2023-05-15/inference.json.
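The import can also be done from the CLI, as sketched below. The API id and path are assumptions, and the specification format flag may need to be OpenApi, OpenApiJson, or Swagger depending on the format of the referenced file.

```bash
# Hypothetical API id and path; point --specification-url at the spec referenced above.
az apim api import \
  --resource-group my-openai-rg \
  --service-name my-openai-apim \
  --api-id azure-openai \
  --path openai \
  --specification-format OpenApiJson \
  --specification-url "https://raw.githubusercontent.com/Azure/azure-rest-api-specs/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/stable/2023-05-15/inference.json"
```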
- Test the endpoint to validate that the API Management instance can communicate with the Azure OpenAI Service resource. Provide the "deployment-id", "api-version", and a sample prompt as seen in the screenshot below. The deployment-id is the name of the model deployment you created in the Azure OpenAI Service resource.
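For reference, a request made outside the portal test console has roughly the shape below. The hostname and path are assumptions based on the import step above, and the example uses an APIM subscription key; once the custom policy later in this guide is applied, an Azure Active Directory token is required instead.

```bash
# Hypothetical hostname, deployment name, and subscription key.
curl -s -X POST \
  "https://my-openai-apim.azure-api.net/openai/deployments/gpt-35-turbo/completions?api-version=2023-05-15" \
  -H "Content-Type: application/json" \
  -H "Ocp-Apim-Subscription-Key: <apim-subscription-key>" \
  -d '{"prompt": "Say hello to the logging pattern.", "max_tokens": 50}'
```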
- You must deploy an Event Hub Namespace resource and an Event Hub. This can be done in the Azure Portal.
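A CLI sketch for both resources follows; the namespace and hub names are placeholders.

```bash
# Hypothetical names; the namespace name must be globally unique.
az eventhubs namespace create \
  --name my-openai-logging-ns \
  --resource-group my-openai-rg \
  --location eastus \
  --sku Standard

az eventhubs eventhub create \
  --name openai-logging \
  --namespace-name my-openai-logging-ns \
  --resource-group my-openai-rg
```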
- Record the information listed below. It will be required when creating the API Management Logger.
  - Event Hub Namespace FQDN, for example mynamespace.servicebus.windows.net.
  - Event Hub name. This is the name of the Event Hub you created within the Event Hub Namespace.
  - Resource Id of the Event Hub Namespace.
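If you created the namespace with the CLI, you can read these values back as shown below (hypothetical names again); the FQDN is the host portion of the serviceBusEndpoint value.

```bash
# Returns the namespace endpoint and resource id; strip the scheme and port from the endpoint to get the FQDN.
az eventhubs namespace show \
  --name my-openai-logging-ns \
  --resource-group my-openai-rg \
  --query "{endpoint:serviceBusEndpoint, resourceId:id}" \
  --output json
```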
- It is recommended to use a user-assigned managed identity to authenticate the API Management resource to the Event Hub. You can create the user-assigned managed identity in the Azure Portal using these instructions.
- Record the client id of the user-assigned managed identity. It will be required when creating the API Management Logger.
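The identity can also be created and its client id read back with the CLI; the identity name below is a placeholder.

```bash
# Create the user-assigned managed identity and capture its client id.
az identity create \
  --name apim-openai-logger \
  --resource-group my-openai-rg

az identity show \
  --name apim-openai-logger \
  --resource-group my-openai-rg \
  --query clientId \
  --output tsv
```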
- Assign the user-assigned managed identity you created in the earlier step to the Azure Event Hubs Data Sender Azure RBAC role.
- You can assign the role using the Azure Portal using these instructions.
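A CLI sketch of the role assignment is below; it scopes the Azure Event Hubs Data Sender role to the Event Hub created earlier and uses the identity's principal id (object id) as the assignee. Resource names are placeholders.

```bash
# Look up the identity's object id and the Event Hub's resource id, then assign the role.
PRINCIPAL_ID=$(az identity show \
  --name apim-openai-logger \
  --resource-group my-openai-rg \
  --query principalId --output tsv)

EVENTHUB_ID=$(az eventhubs eventhub show \
  --name openai-logging \
  --namespace-name my-openai-logging-ns \
  --resource-group my-openai-rg \
  --query id --output tsv)

az role assignment create \
  --assignee "$PRINCIPAL_ID" \
  --role "Azure Event Hubs Data Sender" \
  --scope "$EVENTHUB_ID"
```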
- Wait at least 15 minutes for the role assignment to propagate throughout Azure. If you do not wait, you may encounter an error when creating the API Management Logger.
- You must add the user-assigned managed identity you created to the API Management resource. You can do this through the Azure Portal using these instructions.
- The API Management Logger can only be created through the CLI or an ARM template. A sample ARM template is provided in this repository. You can deploy the ARM template using the Azure Portal using these instructions. You must provide the relevant information collected in the previous steps. Take care to provide the exact information detailed in the ARM template; if you do not, you will encounter non-descriptive errors.
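The template can also be deployed with the CLI, as sketched below. The file and parameter file names are placeholders; use the ARM template from this repository and supply the values you recorded earlier exactly as the template expects.

```bash
# Hypothetical file names; the parameters file must match the template in this repository.
az deployment group create \
  --resource-group my-openai-rg \
  --template-file apim-logger.json \
  --parameters @apim-logger.parameters.json
```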
- Record the name that you assign to the logger. This will be required in the next step.
- In the Design section of the API in the API Management resource, select the </> link in the Inbound policies section as seen in the screenshot below.
- Copy and paste the custom Azure API Management policy provided in this repository. You must modify the variables in the comment section of the policy with the values that match your implementation. The policy creates two events, one for the request and one for the response. The events are correlated with the message-id property, which is a unique GUID generated for each message to the API.
- When complete, click Save to commit the policy. If you receive any errors, it is likely that you missed a variable or added a stray character.
Test the configuration to ensure it is working as intended. Recall that the API Management policy restricts access to Azure Active Directory identities, so you must pass a valid access token to the API Management instance. The identity (such as a service principal) represented in the access token must have appropriate Azure RBAC permissions on the Azure OpenAI Service resource.
Sample code in Python using a service principal can be found at https://github.com/mattfeltonma/demo-openai-python. You should provide the API Management FQDN for the API as the OPENAI_BASE variable.
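As an alternative quick check from the command line, the sketch below signs in as a service principal, obtains an access token for the Cognitive Services resource, and calls the API through API Management. The hostname, deployment name, and credential values are placeholders.

```bash
# Hypothetical service principal credentials and APIM hostname.
az login --service-principal \
  --username "<app-id>" \
  --password "<client-secret>" \
  --tenant "<tenant-id>"

# Token scoped to the Cognitive Services resource; the identity needs RBAC on the Azure OpenAI resource.
TOKEN=$(az account get-access-token \
  --resource https://cognitiveservices.azure.com \
  --query accessToken --output tsv)

curl -s -X POST \
  "https://my-openai-apim.azure-api.net/openai/deployments/gpt-35-turbo/completions?api-version=2023-05-15" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${TOKEN}" \
  -d '{"prompt": "Confirm that logging works.", "max_tokens": 50}'
```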
If you receive errors, double-check that the service principal has the appropriate permissions on the Azure OpenAI Service resource.
You can also verify that messages are being received by the Azure Event Hub using the Azure Event Hub Explorer Visual Studio Code extension.
After you have verified that requests and responses are being captured by the Azure Event Hub, you can process those events in a number of ways. The integration with Azure Stream Analytics provides a number of simple ways to extract, transform, and load events from an Azure Event Hub to a data store for further analytics.
Review the documentation and select the option that works best for your organization.