
Allow users to stop message streaming #1022

Merged: 12 commits merged into jupyterlab:main on Oct 16, 2024

Conversation

@krassowski (Member) commented Oct 9, 2024

[demo recording: stop-streaming]

This PR not only stops the stream from being sent to the user, but also propagates the information up to the provider/model, making it possible for custom deployments to actually kill the generating process and thus save costs.
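For custom deployments, the useful consequence is that the provider can catch the propagated interruption and cancel the generation job upstream. A minimal sketch of that idea, assuming the interruption is delivered by throwing GenerationInterrupted into the streaming generator; the client and job objects are hypothetical stand-ins for a deployment's API:

    class GenerationInterrupted(Exception):
        """Stand-in for the exception this PR raises into the stream on stop."""

    def stream_completion(client, prompt: str):
        job = client.start_generation(prompt)  # hypothetical deployment API
        try:
            for token in job.tokens():  # hypothetical token iterator
                yield token
        except GenerationInterrupted:
            # The user pressed "Stop": cancel the remote generating process
            # so the deployment stops paying for tokens nobody will read.
            client.cancel(job.id)  # hypothetical deployment API
            raise

A consumer stops the stream with generator.throw(GenerationInterrupted()), which surfaces at the suspended yield inside the try block.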

@krassowski added the enhancement (New feature or request) label on Oct 9, 2024
@michaelchia (Collaborator)

Does this remove the ability to send multiple messages before receiving a reply? I personally don't think sending multiple messages is useful, but I just want to understand if that is a change.

Also, creating a new chat or deleting the message while it is streaming should similarly stop the stream. Currently it stops the message from showing in the UI but does not interrupt the provider. This may not need to be within the same PR, but it is something that could be added eventually.

@krassowski (Member, Author)

> Does this remove the ability to send multiple messages before receiving a reply? I personally don't think sending multiple messages is useful, but I just want to understand if that is a change.

Great question! Technically it does not - it is still possible to send multiple messages using Enter or Shift + Enter:

[screenshot: chat input showing Enter / Shift + Enter send options]

@michaelchia (Collaborator)

So it looks like the button will stop all pending messages?

@krassowski (Member, Author)

> So it looks like the button will stop all pending messages?

Currently, yes. What do you think it should do in that case? I think user expectations are loosely defined (if defined at all), because other applications often do not support concurrent messages.

@michaelchia (Collaborator) commented Oct 9, 2024

I don't have any strong opinions on this; stopping all pending messages seems fine.

Any idea how LLM memory is affected? If GenerationInterrupted is not handled (the default), can I assume the LLM chain will just continue to run? If that is the case, the LLM memory would contain the fully completed AIMessage while the UI shows the partial message. If the chain is actually interrupted, the exchange (the HumanMessage and AIMessage pair) would not be added to memory.

I am not sure which would be preferable in this case; I would lean toward the former.

EDIT: I am actually not sure what the default behaviour would be. I will need to look into RunnableWithMessageHistory to figure it out.

@krassowski (Member, Author)

> Any idea how LLM memory is affected? If GenerationInterrupted is not handled (the default), can I assume the LLM chain will just continue to run? If that is the case, the LLM memory would contain the fully completed AIMessage while the UI shows the partial message. If the chain is actually interrupted, the exchange (the HumanMessage and AIMessage pair) would not be added to memory.

No, I think it should not contain the complete message. I believe the chain itself is stateless, with RunnableWithMessageHistory taking care of the history provisioning.
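For context, the pattern looks roughly like this in LangChain (a generic sketch, not jupyter-ai's actual wiring; llm stands for whatever chat model is configured):

    from langchain_core.chat_history import InMemoryChatMessageHistory
    from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
    from langchain_core.runnables.history import RunnableWithMessageHistory

    prompt = ChatPromptTemplate.from_messages([
        MessagesPlaceholder(variable_name="history"),
        ("human", "{input}"),
    ])
    chain = prompt | llm  # the chain itself holds no state

    store: dict[str, InMemoryChatMessageHistory] = {}

    def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
        # The wrapper reads past messages from here before each run and
        # appends the new exchange once the run finishes.
        return store.setdefault(session_id, InMemoryChatMessageHistory())

    chain_with_history = RunnableWithMessageHistory(
        chain,
        get_session_history,
        input_messages_key="input",
        history_messages_key="history",
    )

Since the history wrapper, rather than the model, owns the memory, an interrupted run should not silently leave a "completed" AIMessage behind.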

> Also, creating a new chat or deleting the message while it is streaming should similarly stop the stream. Currently it stops the message from showing in the UI but does not interrupt the provider.

Done in 1d8b3f9

@krassowski (Member, Author)

> I don't have any strong opinions on this; stopping all pending messages seems fine.

I think so. It feels like the main button should act as "stop all", but maybe there should be a menu entry under the three dots next to each message allowing users to stop streaming of specific messages. I think that should be implemented in a future PR.

@krassowski changed the title from "Allow to stop streaming of agent messages" to "Allow to stop streaming agent messages" on Oct 9, 2024
@dlqqq (Member) left a comment

@krassowski Thank you for opening this PR and for documenting your original issue so clearly! It is really helpful to see examples from other AI assistant applications showing the UX you're proposing here.

Jupyter AI is intended to support collaboration, and has integrations with jupyter-collaboration to allow for multiple users to share the same chat UI and use the same configured LLM. My concerns with this PR are that:

  1. Whether the stop button is shown is determined by whether any streaming messages are incomplete, and
  2. Clicking the "Stop" button halts streaming on all replies.

I consider these blockers, because they lead to two consequences:

  1. When one user sends a message, it blocks all other users from sending one through the UI until the response is finished streaming. (Yes, they can use the keyboard, but the UI suggests they can't send a message.)
  2. Users can halt streaming of responses addressed to other users.

I suggest that we re-define how stopping messages works. Specifically:

  • Whether the "Stop" button is shown should be based on whether the response to the user's last question is still streaming.
  • Pressing the "Stop" button should only stop streaming of the last message asked by that user.
  • If a user rapidly sends 2 messages in serial, pressing the "Stop" button only needs to stop the last message from streaming.
    • This is an engineering shortcut I recommend taking so we don't need to store and manage an array of all the agent responses that are still streaming; we just need a reference to the most recent one.
    • I think this is fine given that there's not much of a use-case for a single user to be sending multiple messages. A user with multiple questions would expect to just ask them all at once in a single message.
  • StopRequest should be modified to include an id field, which should contain an agent message ID (see the sketch after this list). On receipt, the backend should stop only that message from streaming further.
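To illustrate the last bullet, the request model could look roughly like this (a sketch only; jupyter-ai's actual message models live in jupyter_ai/models.py, and the field name here is illustrative):

    from typing import Literal

    from pydantic import BaseModel

    class StopRequest(BaseModel):
        """Frontend request to stop streaming a specific agent reply."""

        type: Literal["stop"] = "stop"
        # ID of the agent message whose stream should be interrupted.
        target: str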

Here are some extra details about the codebase that I thought would help with this proposed implementation:

  • You can modify props.onSend() in send-button.tsx to return the message ID of the human message that was just sent. props.onSend() is defined in chat-input.tsx and calls ChatHandler.sendMessage(), which returns a Promise that resolves to the ID of the HumanChatMessage the user just created by sending the message.

  • ChatHandler.replyFor() accepts a human message ID and returns a Promise that resolves to the AgentChatMessage object that is replying to that human message. This method only works when not streaming, and is unused in the codebase today. You can modify the implementation to return a promise that resolves to an AgentStreamMessage object as soon as the stream starts. It should be as easy as changing this:

    // resolve promise from `replyFor()` if it exists
    if (
      newMessage.type === 'agent' &&
      newMessage.reply_to in this._replyForResolverDict
    ) {

    to:

    // resolve promise from `replyFor()` if it exists
    if (
      (newMessage.type === 'agent' || newMessage.type === 'agent-stream') &&
      newMessage.reply_to in this._replyForResolverDict
    ) {

    Then, update the relevant types from AgentChatMessage to AgentChatMessage | AgentStreamMessage.

Let me know your thoughts on the changes I'm proposing above. Again, thanks for contributing this!

@dlqqq (Member) commented Oct 10, 2024

One extra comment: The "Stop" button should revert to a "Send" button if the user types while the previous response is still streaming. This preserves the current UX for stopping a stream, while also having the UI correctly reflect that it is still possible to send a message while a response is still streaming.

@krassowski (Member, Author)

Thank you for the feedback @dlqqq, this should be ready for another round of review :)

@brichet mentioned this pull request Oct 11, 2024
@dlqqq (Member) left a comment

@krassowski The new revision of this PR looks fantastic! I left a couple of minor callouts and would also like to add further changes to this PR to minimize the performance impact. Since the additional changes I'm proposing can be lengthy, I think it would be best to open them as a separate PR, possibly targeting your fork's branch. See below.

Review threads on packages/jupyter-ai/src/components/chat.tsx and packages/jupyter-ai/jupyter_ai/handlers.py (outdated, resolved)
@dlqqq (Member) commented Oct 15, 2024

@krassowski I pushed a couple of commits that:

  • Changed StopRequest to stop streaming all messages from the current user, without requiring a target field.
    • After looking at the code more, I realized that my original suggestion of stopping only one message at a time was not the best approach (sorry!). What's actually important is that only messages from the user who issued the stop request are stopped. This PR is now similar to the original one you opened, except the backend only interrupts the messages from the current user, based on self.current_user.username.
  • Dropped the StopMessage model and removed interrupted: boolean from AgentStreamMessage's frontend model, as they were both unused. These can easily be re-implemented in a future PR that requires them.
  • Added a message tombstone to interrupted messages for the purposes of future chat memory passed to the LLM. A partial response in the history could confuse the LLM, so interrupted messages now always end in "(AI response stopped by user)"; see the sketch after this list.
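For context, the tombstone amounts to something like this in the streaming loop (a sketch; the exception and the message object stand in for jupyter-ai internals):

    STOP_TOMBSTONE = "(AI response stopped by user)"

    class GenerationInterrupted(Exception):
        """Stand-in for the interruption exception introduced in this PR."""

    async def stream_reply(chain, inputs: dict, message) -> None:
        try:
            async for chunk in chain.astream(inputs):
                message.body += chunk.content  # append tokens as they arrive
        except GenerationInterrupted:
            # Keep the partial text, but end it with an explicit marker so
            # the chat memory later passed to the LLM shows the reply was
            # cut off instead of a confusing half-finished answer.
            message.body += f"\n\n{STOP_TOMBSTONE}"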

Here's a demo video:

Screen.Recording.2024-10-15.at.1.08.32.PM.mov

This PR looks good to me as-is (no need to split it if you're too busy to do so). However, I'll leave it open for your feedback or additional code changes.

(p.s. In the future, if I have more complex suggestions, I will be more inclined to push to your branch and ask for your feedback. It would be more respectful of your time for me to write the code myself and be accountable for my suggestions working/not working.)

@dlqqq changed the title from "Allow to stop streaming agent messages" to "Allow users to stop message streaming" on Oct 15, 2024
@krassowski (Member, Author)

> Dropped the StopMessage model and removed interrupted: boolean from AgentStreamMessage's frontend model, as they were both unused. These can easily be re-implemented in a future PR that requires them.

Agreed, this was not necessary. I was thinking about adding some kind of UI indicator for interrupted messages (but most other chat interfaces do not have one), and also about letting telemetry capture that information, but since you added the tombstone this is not a dire need.

@krassowski (Member, Author)

Looks good to me. I am not a fan of "here" in the streamingReplyHere name; I know what you want to express, but I do not have better suggestions, so I guess it stays.

I opened PRs for the sub-tasks:

@dlqqq (Member) left a comment

Awesome, thank you @krassowski! Really looking forward to using this feature myself. 🎉

After this, I need to lift some of the more complex streaming logic out of DefaultChatHandler into BaseChatHandler so authors of custom slash commands don't need to constantly duplicate our implementation every minor release. With this PR as-is, the stop button would show but not work for custom slash commands that implement streaming.
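In rough terms, that could expose a shared helper like the following (method names are hypothetical; this sketches the intent, not the eventual implementation):

    class GenerationInterrupted(Exception):
        """Stand-in for the interruption exception from this PR."""

    class BaseChatHandler:
        # Hypothetical primitives that DefaultChatHandler and custom slash
        # commands would share once the streaming logic is lifted up:
        def _start_stream(self): ...
        def _append_chunk(self, message, chunk): ...
        def _close_stream(self, message, interrupted: bool): ...

        async def stream_reply(self, chain, inputs: dict) -> None:
            """Stream a reply while honoring the user's stop requests."""
            message = self._start_stream()
            try:
                async for chunk in chain.astream(inputs):
                    self._append_chunk(message, chunk)
            except GenerationInterrupted:
                self._close_stream(message, interrupted=True)
            else:
                self._close_stream(message, interrupted=False)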

I'm targeting this feature to become available in a minor release next Monday (10/21). 💪

@dlqqq enabled auto-merge (squash) October 16, 2024 15:54
@dlqqq merged commit 1eb3b55 into jupyterlab:main Oct 16, 2024 (10 checks passed)
@krassowski deleted the feat/stop-streaming branch October 16, 2024 16:40
@krassowski (Member, Author)

Thanks for the help with review @dlqqq! I agree it would be great if streaming/stopping streaming worked for slash commands too.

Labels: enhancement (New feature or request)

Successfully merging this pull request may close these issues:

  • Check for collaborative extension needs updating to work with v3
  • Allow to interrupt/stop streaming

3 participants