feat: allow users to configure a base_url for the vectorizer OpenAI embedder #351
Conversation
Force-pushed from dddc80b to a7d3164 (Compare)
How did you get this content into the cassette file? I don't see how I would easily be able to reproduce this test if I need to at a later date.
I changed it back to what I used when creating this file. I ran mitmproxy in my local environment (localhost:8000), rewriting the requests to the OpenAI API endpoint. I did this with the following one-liner:
mitmproxy --listen-port 8000 --mode reverse:https://api.openai.com
Do you think it's worth adding a comment about this in the test? Or do you see a better alternative? I considered creating the reverse proxy in Python code instead, but since this is only a one-time step for creating the cassette, I discarded the idea.
I think at the very least we should comment it, but ideally we would programmatically set up the proxy in the test, so that we don't need manual setup when re-generating the cassette.
I tried to write a simple reverse proxy in the conftest.py file and, even though it works, VCR intercepts the call and the cassette ends up recording the OpenAI URL as the final one. It also entered an infinite loop, but that is most likely because I was running the proxy in a separate thread (I didn't spend more time on it).
Unless you know a simpler alternative, I will go with adding a comment with the mitmproxy command as the way to do it.
I managed to make it work by integrating mitmproxy into the test code. The test now runs its own proxy, so the cassette is reproducible again.
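For future reference when re-recording the cassette, the general shape of that approach is sketched below. This is not the exact code from the PR: the fixture name, the port, and the Options/DumpMaster calls are assumptions, and the mitmproxy Python API changes between releases, so it may need adjusting.

```python
# Hypothetical pytest fixture: run a mitmproxy reverse proxy towards the real
# OpenAI API in a background thread while the cassette is being (re)recorded.
# The mitmproxy Python API (Options, DumpMaster) varies between releases, so
# treat this as a starting point rather than a drop-in snippet.
import asyncio
import threading
import time

import pytest
from mitmproxy.options import Options
from mitmproxy.tools.dump import DumpMaster


@pytest.fixture
def openai_proxy_base_url():
    loop = asyncio.new_event_loop()
    holder = {}

    async def serve():
        # Reverse-proxy localhost:8000 to api.openai.com, mirroring the
        # mitmproxy one-liner mentioned earlier in this thread.
        opts = Options(listen_port=8000, mode=["reverse:https://api.openai.com"])
        master = DumpMaster(opts, with_termlog=False, with_dumper=False)
        holder["master"] = master
        await master.run()

    def run():
        asyncio.set_event_loop(loop)
        loop.run_until_complete(serve())

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    time.sleep(1)  # crude wait for the proxy to start listening

    # The embedder under test points its base_url here instead of api.openai.com.
    yield "http://localhost:8000/v1"

    loop.call_soon_threadsafe(holder["master"].shutdown)
    thread.join(timeout=5)
```

The test can then pass the yielded URL as the embedder's base_url while VCR records the traffic going through the local proxy.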
Force-pushed from a7d3164 to bf9dd5a (Compare)
Force-pushed from 4cff968 to 58f0a43 (Compare)
Force-pushed from 58f0a43 to 7931136 (Compare)
Force-pushed from a1e6ce3 to 98f0537 (Compare)
Force-pushed from 98f0537 to 999a70a (Compare)
Looks good, left a couple of comments
@@ -0,0 +1,3 @@
-- dropping in favour of the new signature (adding base_url param)
drop function if exists ai.embedding_openai(text,integer,text,text);
nit: missing newline at end of file
Just a couple of questions/comments in case they make sense or can help 🙂
Force-pushed from 5e1efae to d26603b (Compare)
LGTM! Thanks!
(Not a formal "approve", as I am missing context.)
This PR introduces the ability to configure the base_url of the OpenAI embedder for vectorizers. This enables greater flexibility for connecting to custom OpenAI API endpoints. The change supports not only private API deployments and testing environments, but also future embedders that use the same API as OpenAI, such as Azure OpenAI.
The changes include adding a new BaseURLMixin class, which can be used in any other compatible embedder.
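As a rough illustration of the idea (not the actual code from this PR: only the BaseURLMixin name comes from the description above, while the OpenAI dataclass and the client_kwargs helper below are invented for the example), such a mixin could look roughly like this:

```python
# Illustrative sketch only; the real implementation in the PR may differ.
from dataclasses import dataclass
from typing import Optional


@dataclass
class BaseURLMixin:
    """Carries an optional base_url for embedders that speak the OpenAI API."""

    base_url: Optional[str] = None  # e.g. "http://localhost:8000/v1"


@dataclass
class OpenAI(BaseURLMixin):
    """Hypothetical embedder that reuses the mixin."""

    model: str = "text-embedding-3-small"

    def client_kwargs(self) -> dict:
        # Only forward base_url when the user configured one, so the default
        # api.openai.com endpoint keeps working unchanged.
        kwargs: dict = {}
        if self.base_url is not None:
            kwargs["base_url"] = self.base_url
        return kwargs
```

An OpenAI-compatible service (or the mitmproxy reverse proxy from the test discussion above) can then be targeted simply by setting base_url on the embedder configuration.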