
[Feature]: Allow user defined extra request args to be logged in OpenAI compatible server #5467

Open
davidgxue opened this issue Jun 12, 2024 · 8 comments


@davidgxue

🚀 The feature, motivation and pitch

Motivation
Using a universal request ID/UUID when logging is common practice for production systems involving multiple components. My team wanted to use a UUID from upstream to trace logs produced by vLLM's OpenAI-compatible web server, but this doesn't seem to be supported. Currently, vLLM generates its own UUID for each request, but (1) it is a separate ID, which makes tracing harder, and (2) the request ID is not always returned to the API-calling client, such as when errors occur.

Solution
Allow extra args defined by the user to be passed in via a request to the OpenAI-compatible frontend server, so that they can be propagated and logged by the logger.

  • Currently, the server throws an error for extra args passed via the request body.
  • The current proposal is to allow any extra args to be passed in via the request body; extra args that do not match supported args would simply be logged.
  • See below for alternatives

Alternatives

  • Alternatively, allow the user to pass extra args via the request header instead of the request body (a client-side sketch of this option is shown after this list).
  • Alternatively, allow the user to pass all extra args via a single field (e.g. "extra_args": {"field1": "val1", "field2": "val2"}) so we don't disrupt the current behavior of throwing an error when unknown request fields are passed in.
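
As a rough client-side sketch of the header-based alternative (the `X-Request-Id` header name, the localhost URL, and the model name below are placeholders for illustration, not something vLLM defines today), the OpenAI Python SDK's `extra_headers` option could carry the upstream ID:

```python
# Hypothetical sketch: pass an upstream request ID to vLLM via a custom header.
# "X-Request-Id" is an assumed header name; vLLM does not define it today.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3-8b-instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
    # The server would read this header and include it in its log lines.
    extra_headers={"X-Request-Id": "upstream-uuid-1234"},
)
```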

Additional context

If this is approved and details are finalized, I can work on this and make a PR!

@simon-mo
Collaborator

simon-mo commented Jun 12, 2024

I think a header is a nice option as it doesn't break OpenAI API compatibility. Thoughts? @njhill @rkooo567

@rkooo567
Collaborator

Yeah, that sounds great. We don't currently have any extra args passed via headers, right?

@davidgxue
Author

I don't think so, but someone correct me if I am wrong. I tried passing random extra args in the header; they don't get output in the logs. I also don't see an option to switch this on or off.

If I were to work on this, should we log all headers other than Authorization (including the model name, etc.), or only the header args already defined by vLLM?
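
To make the question concrete, here is a minimal sketch (not vLLM's actual code) of what header logging could look like in a FastAPI app; the logger name and the choice to exclude only Authorization are assumptions:

```python
# Minimal sketch of header logging in a FastAPI middleware (illustrative only).
import logging

from fastapi import FastAPI, Request

logger = logging.getLogger("server.request_headers")  # assumed logger name

app = FastAPI()

@app.middleware("http")
async def log_request_headers(request: Request, call_next):
    # Log every incoming header except sensitive ones such as Authorization.
    safe_headers = {
        name: value
        for name, value in request.headers.items()
        if name.lower() != "authorization"
    }
    logger.info("Request headers: %s", safe_headers)
    return await call_next(request)
```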

@njhill
Member

njhill commented Jun 17, 2024

I agree that doing this via a header would be better than in the body/payload.

Also cc @ronensc who has a PR for OpenTelemetry integration here: #4687.

@davidgxue
Author

Given that the OpenTelemetry integration has been merged, am I still okay to proceed with this issue? Just checking.

@ronensc
Contributor

ronensc commented Jul 29, 2024

The OpenTelemetry integration uses the standard trace-id.
I believe it can serve as a UUID.
However:

  1. It is limited to 16 bytes.
  2. Currently, it doesn't get logged to stdout.
  3. It's extracted only when the --otlp-traces-endpoint flag is set, which ties this functionality to sending traces to an OTel collector.
  4. It is not included in the response because, according to a comment from one of the spec authors, "for correlating events it's enough to only send headers forward in the requests. However, it has been the topic of discussion of the last W3C Trace Context working group face-to-face meeting in Seattle."

I think there is room to address points 2 and 3. What do you think?

@davidgxue
Author

Hi @ronensc. Thanks for your response.
I am not super familiar with OpenTelemetry, so I may be a bit off here... but I was mainly checking whether my original feature request would conflict with the OpenTelemetry integration that was added. It's more meant for the case when someone does not want to use OpenTelemetry with vLLM (such as myself).

This original feature request was just trying to allow users to add an upstream UUID in the request header, and allow vLLM to propagate it to its STDOUT so that every vLLM log entry can be easily correlated with an upstream service's request.

To clarify:
Right now the standalone vLLM generates a request UUID for each request, and the request UUID is both logged and returned to the client.

However, there is no way of correlating an upstream service's UUID with an individual log/request on vLLM's end. E.g., Client A has request ID 1 and sends a request to vLLM, where vLLM generates an ID of 2. There is currently no easy way to link a log line on vLLM's side with a request from Client A without changing Client A's code to extract vLLM's unique request ID and log it on Client A's side, which is very much an anti-pattern...

I'm certainly happy to help address the two points you mentioned, though perhaps in a slightly separate issue? Or am I confusing something...?

@ronensc
Contributor

ronensc commented Jul 31, 2024

@davidgxue I totally agree that it’s reasonable for vLLM to obtain a UUID from an external source and include it in its logs for tracing purposes. Additionally, I have no objections to adding custom headers to the log.
However, I believe your original tracing use-case aligns perfectly with the official standard HTTP traceparent header, which is widely used for distributed tracing. OpenTelemetry simply uses this header as part of its protocol.
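
For reference, a minimal sketch of what sending the standard traceparent header to the server could look like from a Python client; the URL and model name are placeholders, and the header value is the illustrative example from the W3C Trace Context spec:

```python
# Illustrative only: W3C Trace Context "traceparent" header,
# format: <version>-<trace-id, 16 bytes hex>-<parent-id, 8 bytes hex>-<trace-flags>.
import requests

headers = {
    "traceparent": "00-0af7651916cd43dd8448eb211c80319c-b7ad6b7169203331-01",
}

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # placeholder server address
    json={
        "model": "meta-llama/Llama-3-8b-instruct",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
    headers=headers,
)
print(resp.status_code)
```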
