
llm.get_async_model(), llm.AsyncModel base class and OpenAI async models #613

Merged: 30 commits into main from asyncio, Nov 14, 2024

Conversation

simonw (Owner) commented Nov 6, 2024

Refs:

Still to figure out:

  • How do plugins register their async versions? Might need a register_async_models() hook (one possible shape is sketched after this list).
  • Get another plugin working - Anthropic or Gemini or Ollama would be good
  • How does logging to the DB work? That's currently a method on Model; it needs to move out, and an async version needs to be considered.
  • How about loading FROM the database? Will that work for things like attachments too?
  • Make mypy happy
  • Refactor to avoid generics
  • Testing for all of this
  • Python API documentation
  • Documentation for writing async model plugins
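
To make the first item concrete, here is one possible shape for that hook, mirroring how llm plugins already use @llm.hookimpl with register_models. Note that register_async_models() is hypothetical at this point, not a real hook, and the model classes are made up for illustration:

import llm

class MyModel(llm.Model):
    model_id = "my-model"

    def execute(self, prompt, stream, response, conversation):
        yield "hello"

class MyAsyncModel(llm.AsyncModel):
    model_id = "my-model"

    async def execute(self, prompt, stream, response, conversation):
        yield "hello"

@llm.hookimpl
def register_models(register):
    register(MyModel())

# Hypothetical hook, not yet part of llm:
@llm.hookimpl
def register_async_models(register):
    register(MyAsyncModel())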

Punted on these:

  • Are there any other weird knock-on impacts of this I haven't considered yet?
  • How should embeddings work? async support for embeddings #628
  • Maybe try getting a llm-gguf plugin to work with this? Might help confirm the API design further by showing it working with a blocking plugin (maybe via a run-in-thread hack). SmolLM2 might be good for this.
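
A minimal sketch of that run-in-thread hack, assuming a blocking execute() generator from a plugin. Here execute_in_thread is a made-up helper, not part of llm:

import asyncio

async def execute_in_thread(blocking_execute, *args, **kwargs):
    # Run the blocking plugin's execute() to completion in a worker
    # thread, then replay its chunks. This loses incremental streaming,
    # but is enough to exercise the async API against a blocking backend.
    chunks = await asyncio.to_thread(
        lambda: list(blocking_execute(*args, **kwargs))
    )
    for chunk in chunks:
        yield chunk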

simonw marked this pull request as draft November 6, 2024 07:47

simonw commented Nov 6, 2024

Turns out Claude can help resolve merge conflicts: https://gist.github.com/simonw/84104386e788e2797a93581c7985ea32

simonw added the enhancement label Nov 6, 2024

simonw commented Nov 6, 2024

Nasty test failure:

FAILED tests/test_aliases.py::test_set_alias[gpt-3.5-turbo] - RecursionError: maximum recursion depth exceeded while calling a Python object

simonw commented Nov 7, 2024

Got the existing tests passing again!

  • It does not matter that this is a blocking call, since it is a classmethod
  • I am unhappy with this, had to duplicate some code.
simonw commented Nov 7, 2024

I broke this; I found out while trying to write tests for it:

>>> import asyncio
>>> import llm
>>> model = llm.get_async_model("gpt-4o-mini")
>>> model
<AsyncChat 'gpt-4o-mini'>
>>> async for token in model.prompt("describe a good dog in french"):
...     print(token, end="", flush=True)
... 
Traceback (most recent call last):
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/asyncio/__main__.py", line 58, in runcode
    return future.result()
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/_base.py", line 446, in result
    return self.__get_result()
  File "/Users/simon/.pyenv/versions/3.10.4/lib/python3.10/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "<console>", line 1, in <module>
  File "/Users/simon/Dropbox/Development/llm/llm/models.py", line 376, in __aiter__
    async for chunk in await self.model.execute(
TypeError: object async_generator can't be used in 'await' expression

simonw commented Nov 7, 2024

Found the fix:

diff --git a/llm/models.py b/llm/models.py
index 25a016b..1e6c165 100644
--- a/llm/models.py
+++ b/llm/models.py
@@ -373,7 +373,7 @@ class AsyncResponse(_BaseResponse["AsyncModel", Optional["AsyncConversation"]]):
                 yield chunk
             return
 
-        async for chunk in await self.model.execute(
+        async for chunk in self.model.execute(
             self.prompt,
             stream=self.stream,
             response=self,
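
The underlying issue: calling an async generator function returns an async generator directly, so there is no coroutine to await, hence the TypeError. A minimal standalone repro:

import asyncio

async def gen():
    yield "chunk"

async def main():
    # gen() is already an async generator, not an awaitable;
    # `async for chunk in await gen()` raises the TypeError above.
    async for chunk in gen():
        print(chunk)

asyncio.run(main())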

simonw commented Nov 13, 2024

One last mypy problem to figure out:

  mypy
llm/models.py:390: error: "Coroutine[Any, Any, AsyncIterator[str]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
llm/models.py:390: note: Maybe you forgot to use "await"?
Found 1 error in 1 file (checked 15 source files)
error: Recipe `lint` failed on line 22 with exit code 1

simonw commented Nov 13, 2024

Trying this:

files-to-prompt llm/models.py -c | llm -m o1-preview 'llm/models.py:390: error: "Coroutine[Any, Any, AsyncIterator[str]]" has no attribute "__aiter__" (not async iterable)  [attr-defined]
llm/models.py:390: note: Maybe you forgot to use "await"?'

Wow that cost 20 cents!

[Screenshot: CleanShot 2024-11-12 at 21 13 17@2x]

https://gist.github.com/simonw/8447d433e5924bfa11999f445af9010f

simonw commented Nov 13, 2024

The fix to all of my horrible mypy errors turned out to be doing this:

class AsyncModel(_BaseModel["AsyncResponse", "AsyncConversation"]):
    def conversation(self) -> "AsyncConversation":
        return AsyncConversation(model=self)

    @abstractmethod
    async def execute(
        self,
        prompt: Prompt,
        stream: bool,
        response: "AsyncResponse",
        conversation: Optional["AsyncConversation"],
    ) -> AsyncGenerator[str, None]:
        """
        Returns an async generator that executes the prompt and yields chunks of text,
        or yields a single big chunk.
        """
        yield ""

Note the yield "" in the abstract method, which made it match that AsyncGenerator[str, None] signature.

AsyncGenerator[str, None] means it yields strings and does not have any special return when it finishes.
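
For contrast, a small sketch of the definition-site difference (the class names here are made up):

from abc import ABC, abstractmethod
from typing import AsyncGenerator

class Broken(ABC):
    @abstractmethod
    async def execute(self) -> AsyncGenerator[str, None]:
        # No yield: mypy treats this as a coroutine function, so calling
        # execute() has type Coroutine[Any, Any, AsyncGenerator[str, None]]
        # -- exactly the "not async iterable" error above.
        ...

class Fixed(ABC):
    @abstractmethod
    async def execute(self) -> AsyncGenerator[str, None]:
        # The yield makes this an async generator function; calling
        # execute() returns an AsyncGenerator that `async for` can consume.
        yield ""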

simonw commented Nov 13, 2024

I might make Response.log_to_db() a class method.

llm/models.py Outdated
Comment on lines 29 to 33
ModelT = TypeVar("ModelT", bound=Union["Model", "AsyncModel"])
ConversationT = TypeVar(
"ConversationT", bound=Optional[Union["Conversation", "AsyncConversation"]]
)
ResponseT = TypeVar("ResponseT")

I'm not happy about this at all; it's way too hard to understand.

llm/models.py Outdated

class Response(_BaseResponse["Model", Optional["Conversation"]]):

Also this - I'm going to try refactoring to not use generics like this.

Comment on lines +480 to +484
class Options(SharedOptions):
json_object: Optional[bool] = Field(
description="Output a valid JSON object {...}. Prompt must mention JSON.",
default=None,
)

Can I refactor this to avoid duplication?

simonw commented Nov 13, 2024

... and after all of this, I'm thinking maybe a single model class with an aexecute() method might be better after all? I might try it just to see how it feels.

simonw commented Nov 13, 2024

I got this working:

>>> await llm.get_async_model("gpt-4o-mini").prompt("hi")
'Hello! How can I assist you today?'

Using this:

diff --git a/llm/models.py b/llm/models.py
index 3ed61bf..8539d08 100644
--- a/llm/models.py
+++ b/llm/models.py
@@ -416,6 +416,9 @@ class AsyncResponse(_BaseResponse):
         await self._force()
         return self._start_utcnow.isoformat() if self._start_utcnow else ""
 
+    def __await__(self):
+        return self.text().__await__()
+
     @classmethod
     def fake(
         cls,
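
The mechanism in isolation: delegating __await__ to a coroutine's own __await__ is what lets `await obj` stand in for `await obj.text()`. A standalone sketch, with Lazy as a made-up class for illustration:

import asyncio

class Lazy:
    async def text(self) -> str:
        await asyncio.sleep(0)  # stand-in for the real model call
        return "Hello! How can I assist you today?"

    def __await__(self):
        # `await Lazy()` now behaves exactly like `await Lazy().text()`.
        return self.text().__await__()

async def main():
    print(await Lazy())

asyncio.run(main())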

simonw commented Nov 13, 2024

I'm going to add an llm models --async option for listing all async models.

simonw commented Nov 14, 2024

>>> await llm.get_async_model("gpt-4o-mini").prompt("hi")
'Hello! How can I assist you today?'

This is an API design mistake. It means you can't get the response object, which is necessary for things like looking up usage tokens (when I implement #610).

I think awaiting this should return a fully resolved Response that has executed and completed.

simonw commented Nov 14, 2024

This is better:

>>> import asyncio
>>> import llm
>>> m = llm.get_async_model("gpt-4o-mini")
>>> response = await m.prompt("say hi in spanish")
>>> response
<Response prompt='say hi in spanish' text='¡Hola!'>
>>> await response.text()
'¡Hola!'
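
A sketch of what the revised __await__ could look like, reusing the _force() helper from the earlier diff. This is hedged; the final implementation may differ:

def __await__(self):
    async def _resolve():
        # Fully execute the prompt, then hand back the response object
        # itself so callers can still reach usage tokens, logging, etc.
        await self._force()
        return self
    return _resolve().__await__()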

simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 14, 2024
simonw marked this pull request as ready for review November 14, 2024 01:49
simonw changed the title from "WIP: asyncio support" to "llm.get_async_model(), llm.AsyncModel base class and OpenAI async models" Nov 14, 2024
simonw merged commit ba75c67 into main Nov 14, 2024 (62 checks passed)
simonw deleted the asyncio branch November 14, 2024 01:51
simonw added a commit that referenced this pull request Nov 14, 2024
simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 14, 2024
* Tip about pytest --record-mode once

Plus mechanism for setting API key during tests with PYTEST_ANTHROPIC_API_KEY

* Async support for Claude models

Closes #25
Refs simonw/llm#507
Refs simonw/llm#613

* Depend on llm>=0.18a0, refs #25
simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 14, 2024
simonw added a commit to simonw/llm-claude-3 that referenced this pull request Nov 17, 2024