
llm.get_async_model(), llm.AsyncModel base class and OpenAI async models #613

Merged: 30 commits into main from asyncio, Nov 14, 2024

Conversation

simonw (Owner) commented Nov 6, 2024

Refs:

Still to figure out:

  • How do plugins register their async versions? Might need a register_async_models() hook (one possible shape is sketched after this list).
  • Get another plugin working - Anthropic or Gemini or Ollama would be good
  • How does logging to the DB work? That's currently a method on Model; it needs to move out, and an async version needs to be considered.
  • How about loading FROM the database? Will that work for things like attachments too?
  • Make mypy happy
  • Refactor to avoid generics
  • Testing for all of this
  • Python API documentation
  • Documentation for writing async model plugins
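
To make the first item concrete, here is one possible shape for that hook, mirroring how llm plugins already use @llm.hookimpl with register_models. Note that register_async_models() is hypothetical at this point, not a real hook, and the model classes are made up for illustration:

import llm

class MyModel(llm.Model):
    model_id = "my-model"

    def execute(self, prompt, stream, response, conversation):
        yield "hello"

class MyAsyncModel(llm.AsyncModel):
    model_id = "my-model"

    async def execute(self, prompt, stream, response, conversation):
        yield "hello"

@llm.hookimpl
def register_models(register):
    register(MyModel())

# Hypothetical hook, not yet part of llm:
@llm.hookimpl
def register_async_models(register):
    register(MyAsyncModel())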

Punted on these:

  • Are there any other weird knock-on impacts of this I haven't considered yet?
  • How should embeddings work? async support for embeddings #628
  • Maybe try getting a llm-gguf plugin to work with this? Might help confirm the API design further by showing it working with a blocking plugin (maybe via a run-in-thread hack). SmolLM2 might be good for this.
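
A minimal sketch of that run-in-thread hack, assuming a blocking execute() generator from a plugin. Here execute_in_thread is a made-up helper, not part of llm:

import asyncio

async def execute_in_thread(blocking_execute, *args, **kwargs):
    # Run the blocking plugin's execute() to completion in a worker
    # thread, then replay its chunks. This loses incremental streaming,
    # but is enough to exercise the async API against a blocking backend.
    chunks = await asyncio.to_thread(
        lambda: list(blocking_execute(*args, **kwargs))
    )
    for chunk in chunks:
        yield chunk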

simonw marked this pull request as draft November 6, 2024 07:47

simonw commented Nov 6, 2024

Turns out Claude can help resolve merge conflicts: https://gist.github.com/simonw/84104386e788e2797a93581c7985ea32

simonw added the enhancement label Nov 6, 2024

simonw commented Nov 6, 2024

Nasty test failure:

FAILED tests/test_aliases.py::test_set_alias[gpt-3.5-turbo] - RecursionError: maximum recursion depth exceeded while calling a Python object

simonw commented Nov 7, 2024

Got the existing tests passing again!

  • It does not matter that this is a blocking call, since it is a classmethod
  • I am unhappy with this, had to duplicate some code.
simonw commented Nov 7, 2024

I broke this; I found out while trying to write tests for it:

>>> import asyncio
>>> import llm
>>> model = llm.get_async_model("gpt-4o-mini")
>>> model
<AsyncChat 'gpt-4o-mini'>
>>> async for token in model.prompt("describe a good dog in french"):
...     print(token, end="", flush=True)
... 
Traceback (most recent call last):
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/asyncio/__main__.py", line 58, in runcode
    return future.result()
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "<console>", line 1, in <module>
  File "/Users/simon/Dropbox/Development/llm/llm/models.py", line 376, in __aiter__
    async for chunk in await self.model.execute(
TypeError: object async_generator can't be used in 'await' expression

simonw commented Nov 7, 2024

Found the fix:

diff --git a/llm/models.py b/llm/models.py
index 25a016b..1e6c165 100644
--- a/llm/models.py
+++ b/llm/models.py
@@ -373,7 +373,7 @@ class AsyncResponse(_BaseResponse["AsyncModel", Optional["AsyncConversation"]]):
                 yield chunk
             return
 
-        async for chunk in await self.model.execute(
+        async for chunk in self.model.execute(
             self.prompt,
             stream=self.stream,
             response=self,
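
The underlying issue: calling an async generator function returns an async generator directly, so there is no coroutine to await, hence the TypeError. A minimal standalone repro:

import asyncio

async def gen():
    yield "chunk"

async def main():
    # gen() is already an async generator, not an awaitable;
    # `async for chunk in await gen()` raises the TypeError above.
    async for chunk in gen():
        print(chunk)

asyncio.run(main())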

simonw commented Nov 13, 2024

One last mypy problem to figure out:

  mypy
llm/models.py:390: error: "Coroutine[Any, Any, AsyncIterator[str]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
llm/models.py:390: note: Maybe you forgot to use "await"?
Found 1 error in 1 file (checked 15 source files)
error: Recipe `lint` failed on line 22 with exit code 1

simonw commented Nov 13, 2024

Trying this:

files-to-prompt llm/models.py -c | llm -m o1-preview 'llm/models.py:390: error: "Coroutine[Any, Any, AsyncIterator[str]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
llm/models.py:390: note: Maybe you forgot to use "await"?'

Wow that cost 20 cents!

[Screenshot: CleanShot 2024-11-12 at 21 13 17@2x]

https://gist.github.com/simonw/8447d433e5924bfa11999f445af9010f

simonw commented Nov 13, 2024

The fix to all of my horrible mypy errors turned out to be doing this:

class AsyncModel(_BaseModel["AsyncResponse", "AsyncConversation"]):
    def conversation(self) -> "AsyncConversation":
        return AsyncConversation(model=self)

    @abstractmethod
    async def execute(
        self,
        prompt: Prompt,
        stream: bool,
        response: "AsyncResponse",
        conversation: Optional["AsyncConversation"],
    ) -> AsyncGenerator[str, None]:
        """
        Returns an async generator that executes the prompt and yields chunks of text,
        or yields a single big chunk.
        """
        yield ""

Note the yield "" in the abstract method, which made it match that AsyncGenerator[str, None] signature.

AsyncGenerator[str, None] means it yields strings and does not have any special return when it finishes.
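
For contrast, a small sketch of the definition-site difference (the class names here are made up):

from abc import ABC, abstractmethod
from typing import AsyncGenerator

class Broken(ABC):
    @abstractmethod
    async def execute(self) -> AsyncGenerator[str, None]:
        # No yield: mypy treats this as a coroutine function, so calling
        # execute() has type Coroutine[Any, Any, AsyncGenerator[str, None]]
        # -- exactly the "not async iterable" error above.
        ...

class Fixed(ABC):
    @abstractmethod
    async def execute(self) -> AsyncGenerator[str, None]:
        # The yield makes this an async generator function; calling
        # execute() returns an AsyncGenerator that `async for` can consume.
        yield ""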

simonw commented Nov 13, 2024

I might make Response.log_to_db() a class method.

llm/models.py Outdated
Comment on lines 29 to 33
ModelT = TypeVar("ModelT", bound=Union["Model", "AsyncModel"])
ConversationT = TypeVar(
"ConversationT", bound=Optional[Union["Conversation", "AsyncConversation"]]
)
ResponseT = TypeVar("ResponseT")

I'm not happy about this at all; it's way too hard to understand.

llm/models.py Outdated

class Response(_BaseResponse["Model", Optional["Conversation"]]):

Also this - I'm going to try refactoring to not use generics like this.

Comment on lines +480 to +484
class Options(SharedOptions):
json_object: Optional[bool] = Field(
description="Output a valid JSON object {...}. Prompt must mention JSON.",
default=None,
)

Can I refactor this to avoid duplication?

simonw commented Nov 13, 2024

... and after all of this, I'm thinking maybe a single model class with an aexecute() method might be better after all? I might try it just to see how it feels.

simonw commented Nov 13, 2024

I got this working:

>>> await llm.get_async_model("gpt-4o-mini").prompt("hi")
'Hello! How can I assist you today?'

Using this:

diff --git a/llm/models.py b/llm/models.py
index 3ed61bf..8539d08 100644
--- a/llm/models.py
+++ b/llm/models.py
@@ -416,6 +416,9 @@ class AsyncResponse(_BaseResponse):
         await self._force()
         return self._start_utcnow.isoformat() if self._start_utcnow else ""
 
+    def __await__(self):
+        return self.text().__await__()
+
     @classmethod
     def fake(
         cls,
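
The mechanism in isolation: delegating __await__ to a coroutine's own __await__ is what lets `await obj` stand in for `await obj.text()`. A standalone sketch, with Lazy as a made-up class for illustration:

import asyncio

class Lazy:
    async def text(self) -> str:
        await asyncio.sleep(0)  # stand-in for the real model call
        return "Hello! How can I assist you today?"

    def __await__(self):
        # `await Lazy()` now behaves exactly like `await Lazy().text()`.
        return self.text().__await__()

async def main():
    print(await Lazy())

asyncio.run(main())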

simonw commented Nov 13, 2024

I'm going to add an llm models --async option for listing all async models.

simonw commented Nov 14, 2024

>>> await llm.get_async_model("gpt-4o-mini").prompt("hi")
'Hello! How can I assist you today?'

This is an API design mistake. It means you can't get the response object, which is necessary for things like looking up usage tokens (when I implement #610).

I think awaiting this should return a fully resolved Response that has executed and completed.

simonw commented Nov 14, 2024

This is better:

>>> import asyncio
>>> import llm
>>> m = llm.get_async_model("gpt-4o-mini")
>>> response = await m.prompt("say hi in spanish")
>>> response
<Response prompt='say hi in spanish' text='¡Hola!'>
>>> await response.text()
'¡Hola!'
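
A sketch of what the revised __await__ could look like, reusing the _force() helper from the earlier diff. This is hedged; the final implementation may differ:

def __await__(self):
    async def _resolve():
        # Fully execute the prompt, then hand back the response object
        # itself so callers can still reach usage tokens, logging, etc.
        await self._force()
        return self
    return _resolve().__await__()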

simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 14, 2024
simonw marked this pull request as ready for review November 14, 2024 01:49
simonw changed the title from "WIP: asyncio support" to "llm.get_async_model(), llm.AsyncModel base class and OpenAI async models" Nov 14, 2024
simonw merged commit ba75c67 into main Nov 14, 2024 (62 checks passed)
simonw deleted the asyncio branch November 14, 2024 01:51
simonw added a commit that referenced this pull request Nov 14, 2024
simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 14, 2024
* Tip about pytest --record-mode once

Plus mechanism for setting API key during tests with PYTEST_ANTHROPIC_API_KEY

* Async support for Claude models

Closes #25
Refs simonw/llm#507
Refs simonw/llm#613

* Depend on llm>=0.18a0, refs #25
simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 14, 2024
simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 17, 2024