AI Model Inference Readme and Samples Issue #37045

zimuli157 · 2024-08-27T08:10:46Z

Section link1, link2, link3, link4, link5, link6, link7, link8, link9, link10, link11, link12, link13, link14, link15, link16, link17, link18:

Reason:
Unauthorized. Access token is missing, invalid, audience is incorrect, or have expired.

Section link1

Reason:
Indent error.

Suggestion:
Remove whitespace.

@rohit-ganguly , @lmazuel , @achandmsft , @mayurid , @dargilco for notification.

dargilco · 2024-08-27T14:16:32Z

Thank you @zimuli157 for opening this issue! I'm the owner of this SDK.

Issue #2: I removed the white space in my current PR. Thanks!

Issue #1: Please make sure your key is valid. Is this a Serverless API endpoint (aka MaaS)? Is this a chat completion model? Can you share your endpoint URL? If so, please try this simple cURL command to make sure your key is valid before running the samples (replace the two environment variables with the endpoint and key you see in Azure AI Studio). Note that this example is for API key authentication, not Entra ID authentication. Please follow up with me over IM to continue discussing this. Thanks!

curl  -v "%AZURE_AI_CHAT_ENDPOINT%/chat/completions" -H "Content-Type: application/json" -H "Authorization: Bearer %AZURE_AI_CHAT_KEY%" -d "{\"messages\":[{\"role\":\"user\",\"content\":\"how many feet in a mile?\"}]}"

jerryshia · 2024-08-30T06:32:18Z

@dargilco For Issue #1:
I ran the command as you suggested, and the result was the same as when I ran the sample before. The results are shown below.

We think there is a mistake in this command. When we replaced Authorization: Bearer %AZURE_AI_CHAT_KEY% with api-key: %AZURE_AI_CHAT_KEY%, the authentication was successful. The results are as follows.

The complete command at this point is

curl  -v "https://***.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2023-03-15-preview" -H "Content-Type: application/json" -H "api-key: {api-key}" -d '{"messages":[{"role":"user","content":"how many feet in a mile?"}]}'

dargilco · 2024-09-03T15:55:38Z

@jerryshia Yes, if your model is an OpenAI model hosted on Azure OpenAI, you need the "api-key: " header. Please see new package docs here, showing how to create a ChatCompletionsClient for Azure OpenAI endpoint: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference#create-and-authenticate-a-client-directly-using-api-key-or-github-token

There are also a few samples (with file names containing azure_openai) in the samples folder: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/ai/azure-ai-inference/samples
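To make the header difference concrete, here is a minimal Python sketch of the choice discussed above: `api-key` for an OpenAI model hosted on Azure OpenAI, `Authorization: Bearer` for the key-authenticated endpoint in the earlier cURL example. `build_auth_headers` is a hypothetical helper, not part of the SDK, and recognizing Azure OpenAI by the `.openai.azure.com` host suffix is an assumption made only for illustration.

```python
from urllib.parse import urlparse


def build_auth_headers(endpoint: str, key: str) -> dict:
    """Return request headers with the key in the header the endpoint expects.

    Hypothetical helper. Assumption: hosts ending in ".openai.azure.com"
    are Azure OpenAI endpoints and expect "api-key"; any other host is
    treated as expecting "Authorization: Bearer <key>".
    """
    host = (urlparse(endpoint).hostname or "").lower()
    headers = {"Content-Type": "application/json"}
    if host.endswith(".openai.azure.com"):
        headers["api-key"] = key
    else:
        headers["Authorization"] = f"Bearer {key}"
    return headers
```

The returned dictionary can be passed to any HTTP client to reproduce either cURL command from this thread.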

jerryshia · 2024-09-04T02:42:04Z

@dargilco When I use the same endpoint as the endpoint URL to run the sample file, the above issue occurs. I believe the aforementioned issue also exists in the SDK, which can lead to similar errors. Please check the way requests are sent in the SDK's code.

v-xuto · 2024-09-26T07:42:56Z

@dargilco Any ideas?

dargilco · 2024-09-26T15:48:32Z

@v-xuto please provide full details about your issue, including source code and SDK logs from your failed run.

To enable SDK logging, add this at the top of your code:

import sys
import logging

# "azure" is the parent logger for the Azure SDK client libraries.
logger = logging.getLogger("azure")
logger.setLevel(logging.DEBUG)
# Write log output to stdout.
logger.addHandler(logging.StreamHandler(stream=sys.stdout))

And add this additional input parameter to the constructor for ChatCompletionsClient:

logging_enable=True

Please make sure to remove any secrets when sharing here (api-key).
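Since DEBUG-level output can include header values in full, a small scrubbing step before pasting logs may help. The following is only a sketch: `redact_secrets` and `SECRET_HEADERS` are hypothetical names, and the pattern assumes the quoted `'Header': 'value'` layout that the SDK logs in this thread use.

```python
import re

# Header names whose values should not be shared; extend as needed.
SECRET_HEADERS = ("api-key", "Authorization")

_NAMES = "|".join(re.escape(name) for name in SECRET_HEADERS)
# Group 1: the (optionally quoted) header name plus the colon.
# Group 2: the quote character that opens the value.
_PATTERN = re.compile(rf"""(['"]?(?:{_NAMES})['"]?\s*:\s*)(['"]).*?\2""", re.IGNORECASE)


def redact_secrets(log_text: str) -> str:
    """Replace the quoted value of each secret header with REDACTED."""
    return _PATTERN.sub(
        lambda m: f"{m.group(1)}{m.group(2)}REDACTED{m.group(2)}", log_text
    )
```

Running the captured log text through `redact_secrets` before posting leaves other headers, such as `Content-Type`, untouched.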

jerryshia · 2024-09-27T08:49:23Z

@dargilco Here is the log when the run failed:

XXXX-09-27 11:19:38,316 - azure.core.pipeline.policies._universal - DEBUG - Request URL: 'https://{}.openai.azure.com/openai/deployments/gpt-35-turbo/chat/completions?api-version={}'
Request method: 'POST'
Request headers:
    'Content-Type': 'application/json'
    'Content-Length': '138'
    'Accept': 'application/json'
    'x-ms-client-request-id': '53f70d41-7c7f-11ef-b41a-cc96e5348815'
    'User-Agent': 'azsdk-python-ai-inference/1.0.0b4 Python/3.11.9 (Windows-10-10.0.22631-SP0)'
    'Authorization': 'Bearer {key}'
Request body:
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "How many feet are in a mile?"}]}
XXXX-09-27 11:19:38,316 - azure.core.pipeline.policies.http_logging_policy - INFO - Request URL: 'https://{}.openai.azure.com/openai/deployments/gpt-35-turbo/chat/completions?api-version=REDACTED'
Request method: 'POST'
Request headers:
    'Content-Type': 'application/json'
    'Content-Length': '138'
    'Accept': 'application/json'
    'x-ms-client-request-id': '53f70d41-7c7f-11ef-b41a-cc96e5348815'
    'User-Agent': 'azsdk-python-ai-inference/1.0.0b4 Python/3.11.9 (Windows-10-10.0.22631-SP0)'
    'Authorization': 'REDACTED'
A body is sent with the request
XXXX-09-27 11:19:38,319 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): {}.openai.azure.com:443
XXXX-09-27 11:19:39,584 - urllib3.connectionpool - DEBUG - https://{}.openai.azure.com:443 "POST /openai/deployments/gpt-35-turbo/chat/completions?api-version={} HTTP/11" 4XX 161
XXXX-09-27 11:19:39,588 - azure.core.pipeline.policies.http_logging_policy - INFO - Response status: 4XX
Response headers:
    'Content-Length': '161'
    'Content-Type': 'application/json'
    'x-ms-client-request-id': '53f70d41-7c7f-11ef-b41a-cc96e5348815'
    'apim-request-id': 'REDACTED'
    'Strict-Transport-Security': 'REDACTED'
    'x-content-type-options': 'REDACTED'
    'Date': 'Fri, 27 Sep XXXX 03:19:38 GMT'
XXXX-09-27 11:19:39,588 - azure.core.pipeline.policies._universal - DEBUG - Response status: '4XX'
Response headers:
    'Content-Length': '161'
    'Content-Type': 'application/json'
    'x-ms-client-request-id': '53f70d41-7c7f-11ef-b41a-cc96e5348815'
    'apim-request-id': '89300787-b1f4-41d4-8365-ba9dd21072ee'
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
    'x-content-type-options': 'nosniff'
    'Date': 'Fri, 27 Sep XXXX 03:19:38 GMT'
Response content:
{ "statusCode": 4XX, "message": "Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or have expired." }

and the file run was sample_chat_completions.py.

We found that when we added the parameter headers={'api-key': {api-key}} to ChatCompletionsClient, the run was successful. We believe that the credential parameter may not be functional here, as it does not pass the api-key; the run succeeded only once the api-key was added through headers. Please check the code. Here is the log when the run passed.

2024-09-27 15:40:50,282 - azure.core.pipeline.policies._universal - DEBUG - Request URL: 'https://{}.openai.azure.com/openai/deployments/gpt-35-turbo/chat/completions?api-version=2024-05-01-preview'
Request method: 'POST'
Request headers:
    'Content-Type': 'application/json'
    'Content-Length': '138'
    'api-key': '{}'
    'Accept': 'application/json'
    'x-ms-client-request-id': 'd12f43c4-7ca3-11ef-8e38-a8b13b79b46f'
    'User-Agent': 'azsdk-python-ai-inference/1.0.0b4 Python/3.10.11 (Windows-10-10.0.22631-SP0)'
    'Authorization': 'Bearer 123'
Request body:
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "How many feet are in a mile?"}]}
2024-09-27 15:40:50,282 - azure.core.pipeline.policies.http_logging_policy - INFO - Request URL: 'https://{}.openai.azure.com/openai/deployments/gpt-35-turbo/chat/completions?api-version=REDACTED'
Request method: 'POST'
Request headers:
    'Content-Type': 'application/json'
    'Content-Length': '138'
    'api-key': 'REDACTED'
    'Accept': 'application/json'
    'x-ms-client-request-id': 'd12f43c4-7ca3-11ef-8e38-a8b13b79b46f'
    'User-Agent': 'azsdk-python-ai-inference/1.0.0b4 Python/3.10.11 (Windows-10-10.0.22631-SP0)'
    'Authorization': 'REDACTED'
A body is sent with the request
2024-09-27 15:40:50,282 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): {}.openai.azure.com:443
2024-09-27 15:40:51,785 - urllib3.connectionpool - DEBUG - https://{}.openai.azure.com:443 "POST /openai/deployments/gpt-35-turbo/chat/completions?api-version=2024-05-01-preview HTTP/1.1" 200 997
2024-09-27 15:40:51,785 - azure.core.pipeline.policies.http_logging_policy - INFO - Response status: 200
Response headers:
    'Cache-Control': 'no-cache, must-revalidate'
    'Content-Length': '997'
    'Content-Type': 'application/json'
    'access-control-allow-origin': 'REDACTED'
    'apim-request-id': 'REDACTED'
    'Strict-Transport-Security': 'REDACTED'
    'x-content-type-options': 'REDACTED'
    'x-ms-region': 'REDACTED'
    'x-ratelimit-remaining-requests': 'REDACTED'
    'x-ratelimit-remaining-tokens': 'REDACTED'
    'x-accel-buffering': 'REDACTED'
    'x-ms-rai-invoked': 'REDACTED'
    'x-request-id': 'REDACTED'
    'x-ms-client-request-id': 'd12f43c4-7ca3-11ef-8e38-a8b13b79b46f'
    'azureml-model-session': 'REDACTED'
    'Date': 'Fri, 27 Sep 2024 07:40:51 GMT'
2024-09-27 15:40:51,796 - azure.core.pipeline.policies._universal - DEBUG - Response status: '200'
Response headers:
    'Cache-Control': 'no-cache, must-revalidate'
    'Content-Length': '997'
    'Content-Type': 'application/json'
    'access-control-allow-origin': '*'
    'apim-request-id': '7bab0079-9491-49ee-a467-8f409de0e96a'
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
    'x-content-type-options': 'nosniff'
    'x-ms-region': 'East US'
    'x-ratelimit-remaining-requests': '119'
    'x-ratelimit-remaining-tokens': '119984'
    'x-accel-buffering': 'no'
    'x-ms-rai-invoked': 'true'
    'x-request-id': 'c9f033a4-2537-4367-aed5-e1545d83dee8'
    'x-ms-client-request-id': 'd12f43c4-7ca3-11ef-8e38-a8b13b79b46f'
    'azureml-model-session': 'turbo-0301-42727a61'
    'Date': 'Fri, 27 Sep 2024 07:40:51 GMT'
Response content:
{"choices":[{"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"protected_material_code":{"filtered":false,"detected":false},"protected_material_text":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}},"finish_reason":"stop","index":0,"logprobs":null,"message":{"content":"There are 5280 feet in a mile.","role":"assistant"}}],"created":1727422851,"id":"chatcmpl-ABzspR3IFAcXtfdu0SKJUuw5dTSqI","model":"gpt-35-turbo","object":"chat.completion","prompt_filter_results":[{"prompt_index":0,"content_filter_results":{"hate":{"filtered":false,"severity":"safe"},"jailbreak":{"filtered":false,"detected":false},"self_harm":{"filtered":false,"severity":"safe"},"sexual":{"filtered":false,"severity":"safe"},"violence":{"filtered":false,"severity":"safe"}}}],"system_fingerprint":null,"usage":{"completion_tokens":10,"prompt_tokens":27,"total_tokens":37}}
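For reference, the workaround that produced the passing run (sending the key through `headers`) might be sketched as follows. The helper names are hypothetical. `ChatCompletionsClient` and `AzureKeyCredential` are the SDK classes discussed in this thread, imported lazily so the first helper works without the SDK installed. The placeholder credential value is an assumption, based on the passing log showing a dummy bearer value next to the real `api-key` header.

```python
def azure_openai_client_kwargs(endpoint: str, key: str, api_version: str) -> dict:
    """Keyword arguments that send the key in the 'api-key' header."""
    return {
        "endpoint": endpoint,
        "headers": {"api-key": key},
        "api_version": api_version,
    }


def make_client(endpoint: str, key: str, api_version: str):
    """Build a ChatCompletionsClient that uses the header workaround."""
    # Local imports: azure-ai-inference is only needed when this is called.
    from azure.ai.inference import ChatCompletionsClient
    from azure.core.credentials import AzureKeyCredential

    # Assumption: the credential is a placeholder; the api-key header
    # carries the real key, as in the passing log above.
    return ChatCompletionsClient(
        credential=AzureKeyCredential("placeholder"),
        **azure_openai_client_kwargs(endpoint, key, api_version),
    )
```

With the endpoint from the logs, a call would look like `make_client("https://<resource>.openai.azure.com/openai/deployments/gpt-35-turbo", key, "2024-05-01-preview")`, where `<resource>` stands for the redacted resource name.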

dargilco self-assigned this Aug 27, 2024

kristapratico removed the needs-team-triage Workflow: This issue needs the team to triage. label Aug 27, 2024

dargilco closed this as completed Sep 3, 2024

ChenxiJiang333 reopened this Sep 4, 2024
