Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

community: upgrade openai>=1.30 for LocalAIEmbedings #22399

Conversation

mkhludnev
Copy link
Contributor

Thank you for contributing to LangChain!

  • PR title: "package: description"

  • Description: migrate LocalAIEmbeddings code to OpenAI 1.30

  • Issue: LocalAIEmbeddings fails due to old OpenAI 0.28 code openai.error see module 'openai' has no attribute 'error' #13368

  • Dependencies: OpenAI is already bumped 1.30

  • Add tests and docs: integration test with vcr

  • Lint and test:

Additional guidelines:

  • Make sure optional dependencies are imported within a function.
  • Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests.
  • Most PRs should not touch more than one package.
  • Changes should be backwards compatible.
  • If you are adding something to community, do not re-import it in langchain.

If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Copy link

vercel bot commented Jun 2, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
langchain ⬜️ Ignored (Inspect) Visit Preview Sep 22, 2024 2:46pm

@mkhludnev mkhludnev marked this pull request as ready for review June 3, 2024 13:05
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. Ɑ: embeddings Related to text embedding models module 🔌: openai Primarily related to OpenAI integrations 🤖:improvement Medium size change to existing code to handle new use-cases labels Jun 3, 2024
@mkhludnev mkhludnev force-pushed the adjust-localai-embeds-for-openai-1.x branch from 951d59b to 42e1049 Compare June 3, 2024 15:27
import httpx
openai = LocalAIEmbeddings(
openai_api_key="random-string",
client=openai.OpenAI(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't need to pass in client directly

Copy link
Contributor Author

@mkhludnev mkhludnev Jun 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't need to pass in client directly

I copied openai_proxy handling from OpenAIEmbedding

@mkhludnev mkhludnev mentioned this pull request Jun 7, 2024
4 tasks
@ccurme ccurme added the community Related to langchain-community label Jun 18, 2024
@mkhludnev
Copy link
Contributor Author

I got back to the problem.
Let me ask very basic question: which version of openai we expect for langchain_community 1.x or 0.x?
We can assume that 90% of users have both langchain_community and langchain-openai in their dependencies.
Although I don't know whether langchain-openai is able to work with openai 0.x.

@mkhludnev mkhludnev force-pushed the adjust-localai-embeds-for-openai-1.x branch from 402231a to 5f1c46b Compare September 14, 2024 20:14
@dosubot dosubot bot removed the size:L This PR changes 100-499 lines, ignoring generated files. label Sep 20, 2024
@mkhludnev
Copy link
Contributor Author

mkhludnev commented Sep 20, 2024

@baskaryan kindly asking to check this PR out. I resolved the question regarding proxy. Note, after this PR is merged. I want to improve #22666 localai embeddings performance.

@mkhludnev
Copy link
Contributor Author

mkhludnev commented Sep 20, 2024

ahhhh it's a bummer

       we can conclude that friendli-client>=1.2.4,<2 depends on
      httpx>=0.24.1,<0.25.0.
      And because you require friendli-client>=1.2.4,<2 and httpx>=0.27.0, we
      can conclude that your requirements are unsatisfiable.

UPD managed to keep up with old httpx

@mkhludnev mkhludnev changed the title community:Adjust LocalAI embedings for OpenAI 1.30.x community: upgrade LocalAIEmbedings for OpenAI 1.x Sep 23, 2024
@mkhludnev mkhludnev changed the title community: upgrade LocalAIEmbedings for OpenAI 1.x community: upgrade openai>=1.30 for LocalAIEmbedings Sep 23, 2024
@mkhludnev
Copy link
Contributor Author

ping @efriis pleaase

@dawu415
Copy link

dawu415 commented Nov 1, 2024

Curious if there's any update on this or alternative solutions in the meantime?

@mkhludnev
Copy link
Contributor Author

Curious if there's any update on this or alternative solutions in the meantime?

I can only rebase it from time to time. Let me know if it's necessary. I'll refresh.

@efriis
Copy link
Member

efriis commented Dec 11, 2024

Unfortunately there isn't a great way to update these old community integrations that rely on openai sdk v0 in a way that's both backwards compatible + supports v1 without being a huge mess to maintain. This is one of the main motivations behind moving towards dedicated integration packages, which have replaced the community package as the right path forward. I'll close this PR, and would recommend reopening with just docs updates, as well as registering your package in libs/packages.yml! We'll be able to review simple PRs that only modify these two things much faster :)

Would also love to @deprecated this integration pointing to your new integration package, which can even lock to openai v1 sdk to avoid support for v0!

Here's the guide, and if you have questions, feel free to leave them in the comments on those pages so others can see them! https://python.langchain.com/docs/contributing/how_to/integrations/

@efriis efriis closed this Dec 11, 2024
@mkhludnev
Copy link
Contributor Author

mkhludnev commented Dec 12, 2024

Thanks @efriis for checking stale PRs during the holiday season!
I read it as a need to copy LocalAI Embeddings to the separate repo that I did https://github.com/mkhludnev/langchain-localai/

The first question is, which account this integration package will belong to:

UPD: I've fixed the build.

PS. The next (real target) is a performance via #22666

@efriis
Copy link
Member

efriis commented Dec 13, 2024

we won't create one under langchain-ai, and one under mudler would probably be best as an integration package.

However, that being said, because it relies on the openai sdk anyways, I would strongly recommend just rewriting the Local AI docs page in langchain to use the ChatOpenAI integration instead. That way it benefits from all fixes folks make against that package, and we can deprecate ChatLocalAI in favor of just using that.

@mkhludnev
Copy link
Contributor Author

However, that being said, because it relies on the openai sdk anyways, I would strongly recommend just rewriting the Local AI docs page

aaaghh I've got your point after all. Let me check if embeddings work this way!

@mkhludnev
Copy link
Contributor Author

@efriis
I tried to invoke LocalAI embeddings via OpenAIEmbeddings as far as I've got your suggestion.

embeddings = OpenAIEmbeddings(
    model="bge-m3",
    openai_api_base="https://localai.my/v1",
    api_key="foo bar")
vectors = embeddings.embed_documents(["lorem ipsum","dolor sit amet"])

I noticed that underneath OpenAIEmbeddings tokenizes text by cl100k_base and submits int array per the given string. LocalAI handles it somehow and yields vectors. Probably they are meaningful, but their values are different from what I've got from LocalAIEmbeddings, which submits texts as-is.
So, even if we got confirmation that "intput":[[1,2,3,..],[4,5,6]] a valid way to call https://localai.io/features/embeddings/ I don't think users were encouraged by need to reindex all vectors just due to client upgrade.
So, wdyt? Is there a country for LocalAIEmbeddings?

@mkhludnev
Copy link
Contributor Author

ChatLocalAI

Let me clarify. There's no chat model for LocalAI here, there's only LocalAIEmbeddings

@efriis
Copy link
Member

efriis commented Dec 15, 2024

I see makes sense. Feel free to publish a langchain-localai then!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
community Related to langchain-community Ɑ: embeddings Related to text embedding models module 🤖:improvement Medium size change to existing code to handle new use-cases 🔌: openai Primarily related to OpenAI integrations size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants