
[Feature]: Allow user defined extra request args to be logged in OpenAI compatible server #5467

Open
davidgxue opened this issue Jun 12, 2024 · 8 comments


@davidgxue

🚀 The feature, motivation and pitch

Motivation
Using a universal request ID/UUID when logging is common practice for production systems involving multiple components. My team wanted to use a UUID from upstream to trace logs produced by vLLM's OpenAI-compatible web server, but this doesn't seem to be supported. Currently, vLLM generates its own UUID for each request, but (1) it is a separate ID, which makes tracing harder, and (2) the request ID is not always returned to the API-calling client, such as when errors occur.

Solution
Allow extra args defined by the user to be passed in via a request to the OpenAI-compatible frontend server, so that they can be propagated and logged by the logger.

  • Currently, the server throws an error for extra args passed via the request body.
  • The current proposal is to allow any extra args to be passed in via the request body; extra args that do not match supported args would simply be logged.
  • See below for alternatives

Alternatives

  • Alternatively, allow the user to pass extra args via the request header instead of the request body (a client-side sketch of this option is shown after this list).
  • Alternatively, allow the user to pass all extra args via a single field (e.g. "extra_args": {"field1": "val1", "field2": "val2"}) so we don't disrupt the current behavior of throwing an error when unknown request fields are passed in.
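
As a rough client-side sketch of the header-based alternative (the `X-Request-Id` header name, the localhost URL, and the model name below are placeholders for illustration, not something vLLM defines today), the OpenAI Python SDK's `extra_headers` option could carry the upstream ID:

```python
# Hypothetical sketch: pass an upstream request ID to vLLM via a custom header.
# "X-Request-Id" is an assumed header name; vLLM does not define it today.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    # The server would read this header and include it in its log lines.
    extra_headers={"X-Request-Id": "upstream-uuid-1234"},
)
```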

Additional context

If this is approved and details are finalized, I can work on this and make a PR!

@simon-mo
Collaborator

simon-mo commented Jun 12, 2024

I think a header is a nice option as it doesn't break OpenAI API compatibility. Thoughts? @njhill @rkooo567

@rkooo567
Collaborator

Yeah, that sounds great. We don't currently have any extra args passed via headers, right?

@davidgxue
Author

I don't think so, but someone correct me if I am wrong. I tried passing random extra args in the header; they don't get output in the logs. I also don't see an option to switch this on or off.

If I were to work on this, should we log all headers other than Authorization (including the model name, etc.), or only the header args already defined by vLLM?
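
To make the question concrete, here is a minimal sketch (not vLLM's actual code) of what header logging could look like in a FastAPI app; the logger name and the choice to exclude only Authorization are assumptions:

```python
# Minimal sketch of header logging in a FastAPI middleware (illustrative only).
import logging

from fastapi import FastAPI, Request

logger = logging.getLogger("server.request_headers")  # assumed logger name

app = FastAPI()

@app.middleware("http")
async def log_request_headers(request: Request, call_next):
    # Log every incoming header except sensitive ones such as Authorization.
    safe_headers = {
        name: value
        for name, value in request.headers.items()
        if name.lower() != "authorization"
    }
    logger.info("Request headers: %s", safe_headers)
    return await call_next(request)
```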

@njhill
Member

njhill commented Jun 17, 2024

I agree that doing this via a header would be better than in the body/payload.

Also cc @ronensc who has a PR for OpenTelemetry integration here: #4687.

@davidgxue
Author

Given that the OpenTelemetry integration has been merged, am I still okay to proceed with this issue? Just checking.

@ronensc
Contributor

ronensc commented Jul 29, 2024

The OpenTelemetry integration uses the standard trace-id.
I believe it can serve as a UUID.
However:

  1. It is limited to 16 bytes.
  2. Currently, it doesn't get logged to stdout.
  3. It's extracted only when the --otlp-traces-endpoint flag is set, which ties this functionality to sending traces to an OTel collector.
  4. It is not included in the response because, according to a comment from one of the spec authors, "for correlating events it's enough to only send headers forward in the requests. However, it has been the topic of discussion of the last W3C Trace Context working group face-to-face meeting in Seattle."

I think there is room to address points 2 and 3. What do you think?

@davidgxue
Author

Hi @ronensc. Thanks for your response.
I am not super familiar with OpenTelemetry, so I may be a bit off here... but I was mainly checking whether my original feature request would conflict with the OpenTelemetry integration that was added. It's more meant for the case when someone does not want to use OpenTelemetry with vLLM (such as myself).

This original feature request was just trying to allow users to add an upstream UUID in the request header, and allow vLLM to propagate it to its STDOUT so that every vLLM log entry can be easily correlated with an upstream service's request.

To clarify:
Right now the standalone vLLM generates a request UUID for each request, and the request UUID is both logged and returned to the client.

However, there is no way of correlating an upstream service's UUID with an individual log/request on vLLM's end. E.g., Client A has request ID 1 and sends a request to vLLM, where vLLM generates an ID of 2. There is currently no easy way to link a log line on vLLM's side with a request from Client A without changing Client A's code to extract vLLM's unique request ID and log it on Client A's side, which is very much an anti-pattern...

I'm certainly happy to help address the two points you mentioned, though perhaps in a slightly separate issue? Or am I confusing something...?

@ronensc
Contributor

ronensc commented Jul 31, 2024

@davidgxue I totally agree that it’s reasonable for vLLM to obtain a UUID from an external source and include it in its logs for tracing purposes. Additionally, I have no objections to adding custom headers to the log.
However, I believe your original tracing use-case aligns perfectly with the official standard HTTP traceparent header, which is widely used for distributed tracing. OpenTelemetry simply uses this header as part of its protocol.
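
For reference, a minimal sketch of what sending the standard traceparent header to the server could look like from a Python client; the URL and model name are placeholders, and the header value is the illustrative example from the W3C Trace Context spec:

```python
# Illustrative only: W3C Trace Context "traceparent" header,
# format: <version>-<trace-id, 16 bytes hex>-<parent-id, 8 bytes hex>-<trace-flags>.
import requests

headers = {
    "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
}

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # placeholder server address
    json={
        "model": "meta-llama/Llama-3-8b-instruct",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    headers=headers,
)
print(resp.status_code)
```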
