Contributing an OpenAI instrumentor. #1796
Comments
Hey @jxnl, here's a draft of an instrumentor for the OpenAI package: https://github.com/cartermp/opentelemetry-instrument-openai-py. The author (@cartermp) tells me it's mostly done and he's just looking for some feedback. Maybe it's worth trying it out and sharing yours?
Thanks for the ping! @jxnl, if you're happy to try it out and contribute, I'm happy to take contributions. It'd also be great if the instrumentations here included one for the OpenAI library. I'm happy to contribute mine, work with others to get it into the right shape, and be on the hook to maintain it.
I wrote https://github.com/jxnl/llamatry/blob/main/llamatry/openai.py without thinking about contributing. It works, but I'm not sure about best practices. @cartermp, I wrap the kwargs rather than recording each attribute individually. Moreover, in practice I don't know if I always want to save messages, since the data might have privacy issues, and for gpt-4-32k the text is massive and may add latency.
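(Not the llamatry code itself, just a minimal sketch of the "wrap the kwargs" approach described above; the helper name `traced_create` and the `record_messages` flag are illustrative.)

```python
import json

from opentelemetry import trace

tracer = trace.get_tracer("openai-wrapper-sketch")


def traced_create(create_fn, record_messages=True, **kwargs):
    """Call an OpenAI completion function (e.g. openai.ChatCompletion.create)
    inside a span, recording the request kwargs wholesale."""
    with tracer.start_as_current_span("openai.ChatCompletion.create") as span:
        for key, value in kwargs.items():
            if key == "messages" and not record_messages:
                continue  # skip prompt content for privacy / payload size
            # Span attribute values must be primitives (or sequences of them),
            # so JSON-encode anything structured, such as the messages list.
            if isinstance(value, (str, bool, int, float)):
                span.set_attribute(f"openai.request.{key}", value)
            else:
                span.set_attribute(f"openai.request.{key}", json.dumps(value))
        response = create_fn(**kwargs)
        # Pre-1.0 openai responses are dict-like, so .get() works here.
        usage = response.get("usage") or {}
        if "total_tokens" in usage:
            span.set_attribute("openai.usage.total_tokens", usage["total_tokens"])
        return response
```

A caller would then use it as, for example, `traced_create(openai.ChatCompletion.create, record_messages=False, model="gpt-4", messages=[...])`.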
So I think capturing messages by default is crucial for observability. Otherwise, you have no way of seeing patterns in user behavior and acting on them (either directly or by using them as input to an evaluation system for offline testing). But yes, that should be something someone can turn off for data privacy reasons.
@open-telemetry/opentelemetry-python-contrib-approvers, @slobsv and I have been maintaining https://github.com/cartermp/opentelemetry-instrument-openai-py and think it might actually be worth contributing into this repo. There's some context in Slack. How might we go about getting this considered for contribution to the repo? We're up for maintaining it.
I agree with @cartermp that including the messages would likely be pretty crucial for most observability use cases. But it's interesting to hear, @jxnl, that you don't really care to (or want to) see them. I'm curious what your main motivation / use case is for tracing your OpenAI calls if you're not interested in the messages? That'd be great context to hear. If this ends up being requested often, I guess we could add an optional initialization parameter of sorts to not include the messages (although we'd probably default to including them). I think I'd want to see more requests for this before we implement it, though. Otherwise, this other issue on the golang otel repo might provide some inspiration for how to best sanitize risky spans.
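(For what such an opt-out might look like, here's a hypothetical sketch; the `capture_messages` flag and the attribute names are invented for illustration and aren't part of any existing instrumentation.)

```python
from opentelemetry import trace

tracer = trace.get_tracer("openai-instrumentation-sketch")

# Default to capturing message content, as argued above, with an explicit
# opt-out for privacy-sensitive deployments.
_capture_messages = True


def configure(capture_messages: bool = True) -> None:
    """Hypothetical init-time switch, e.g. set from an instrumentor's instrument() call."""
    global _capture_messages
    _capture_messages = capture_messages


def record_request(span: trace.Span, request_kwargs: dict) -> None:
    """Record per-attribute request data; include prompt content only if enabled."""
    span.set_attribute("openai.request.model", request_kwargs.get("model", ""))
    if not _capture_messages:
        return
    for i, message in enumerate(request_kwargs.get("messages", [])):
        span.set_attribute(f"openai.messages.{i}.role", message.get("role", ""))
        span.set_attribute(f"openai.messages.{i}.content", message.get("content", ""))
```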
Yeah, imagine a diary app, or some system where you don't want to retain a user's data. Moreover, as context length increases, a long conversation ends up costing roughly n^2 tokens; at maybe a 32k token context it gets wild (imagine Anthropic's 100k).
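(One way to read the n^2 point, with made-up turn counts and token sizes, assuming every request resends the full conversation history:)

```python
def cumulative_prompt_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total prompt tokens sent over a conversation when every request
    resends the full history (turn i carries ~i * tokens_per_turn tokens)."""
    return sum(i * tokens_per_turn for i in range(1, turns + 1))


# 64 turns of ~500 tokens each: every individual request still fits in a
# 32k context, but the traffic you'd attach to spans totals ~1M tokens.
print(cumulative_prompt_tokens(64, 500))  # 1040000
```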
Describe the solution you'd like
I noticed that Datadog has an instrumentor for the OpenAI package in Python here: https://ddtrace.readthedocs.io/en/stable/integrations.html#openai
I was building one for personal use, and I think it might make sense to try to contribute it; however, I'm not sure about best practices or how to test this kind of stuff (I'm mostly a data scientist). Here is what I have so far: https://github.com/jxnl/llamatry/blob/main/llamatry/openai.py
What the Datadog one does seems much more sophisticated.
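(For reference, instrumentations in this repo typically follow the BaseInstrumentor pattern. Below is a rough sketch of what an OpenAI one might look like, assuming the pre-1.0 `openai.ChatCompletion.create` API; it is not the actual proposed contribution.)

```python
from typing import Collection

from wrapt import wrap_function_wrapper

from opentelemetry import trace
from opentelemetry.instrumentation.instrumentor import BaseInstrumentor
from opentelemetry.instrumentation.utils import unwrap


class OpenAIInstrumentor(BaseInstrumentor):
    """Sketch of a contrib-style instrumentor for the openai package."""

    def instrumentation_dependencies(self) -> Collection[str]:
        return ("openai >= 0.27, < 1.0",)

    def _instrument(self, **kwargs):
        tracer = trace.get_tracer(__name__, tracer_provider=kwargs.get("tracer_provider"))

        def _traced_create(wrapped, instance, args, call_kwargs):
            # Wrap the original create() call in a span and record basic request data.
            with tracer.start_as_current_span("openai.ChatCompletion.create") as span:
                span.set_attribute("openai.request.model", call_kwargs.get("model", ""))
                return wrapped(*args, **call_kwargs)

        wrap_function_wrapper("openai", "ChatCompletion.create", _traced_create)

    def _uninstrument(self, **kwargs):
        import openai

        unwrap(openai.ChatCompletion, "create")
```

Applications would then enable it with `OpenAIInstrumentor().instrument()`, the same way the other instrumentations in this repo are enabled.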