
Stream responses #39

Merged
merged 2 commits into jupyterlite:main from stream
Feb 13, 2025

Conversation

@jtpio (Member) commented Feb 13, 2025

Fixes #36

[video: jupyterlite-ai-stream.mp4]

The tokens do seem to be correctly streamed when looking at the request in the dev tools:

[image: the streamed request shown in the browser dev tools]

For reference, langchain handles "auto streaming": https://js.langchain.com/docs/concepts/streaming/#auto-streaming-chat-models
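
For context, here is a minimal sketch of consuming that streaming API in langchain.js (the ChatOpenAI provider and model name are illustrative assumptions, not necessarily what this extension uses):

import { ChatOpenAI } from '@langchain/openai';

async function streamReply(prompt: string): Promise<string> {
  // Any langchain.js chat model with streaming support exposes stream().
  const model = new ChatOpenAI({ model: 'gpt-4o-mini' });
  let content = '';
  // stream() resolves to an async iterable of message chunks, each
  // carrying the next token(s) of the reply.
  for await (const chunk of await model.stream(prompt)) {
    // chunk.content can also be a complex array for multimodal replies;
    // plain text is assumed here.
    if (typeof chunk.content === 'string') {
      content += chunk.content;
    }
  }
  return content;
}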

@jtpio added the enhancement (New feature or request) label on Feb 13, 2025
  time: Date.now(),
  type: 'msg'
};
this.messageAdded(errorMsg);
jtpio (Member Author):

This seems to be doing the job, although there may be some future optimizations to be done in jupyter-chat.

Currently, jupyter-chat seems to remove the previous message before adding the updated one here:

https://github.com/jupyterlab/jupyter-chat/blob/73512c16a6c602b81f873ecf9a3c20cb73cefd94/packages/jupyter-chat/src/model.ts#L435-L449

Maybe it should also compare the timestamps like the comment suggests?
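
For illustration, a hypothetical version of that comparison (simplified message shape, not jupyter-chat's actual code):

interface IMessage {
  id: string;
  body: string;
  time: number; // epoch milliseconds
}

// Replace an existing message only if the incoming update is not older,
// so a stale update cannot clobber newer content; otherwise append.
function upsertMessage(messages: IMessage[], incoming: IMessage): void {
  const index = messages.findIndex(m => m.id === incoming.id);
  if (index === -1) {
    messages.push(incoming);
  } else if (incoming.time >= messages[index].time) {
    messages[index] = incoming;
  }
}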

brichet (Collaborator):

I think the comment does not refer to the correct line; the message is sent/updated above:

this.messageAdded(botMsg);

In your implementation, the timestamp is not updated, so the position of the message should not change. And it is probably better this way, to avoid confusion.
For comparison, streaming in jupyter-ai leads to update_message in the back-end, which does not update the timestamp.

We should probably homogenize this in jupyter-chat, and allow consumer extensions to choose whether or not to update the timestamp...

jtpio (Member Author):

Yes, right, it was about the line above (which GitHub shows as not modified).

> the timestamp is not updated, so the position of the message should not change. And it is probably better this way, to avoid confusion.

Yes, in terms of UX this seems fine.

jtpio (Member Author):

On another note, for a different issue: after using the jupyter-chat API, I was wondering if changing to the following could make sense:

  • messageAdded -> addMessage
  • messagesDeleted -> deleteMessage

In general when working with extensions, properties ending in -ed tend to refer to signals. It may be more of a personal opinion, but addMessage also sounds more natural than messageAdded when interacting with the chat model.
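
For illustration, the Lumino convention that motivates this (the names and shapes below are hypothetical, not jupyter-chat's actual API):

import { ISignal, Signal } from '@lumino/signaling';

class ChatModel {
  // Imperative name: a method that performs the action...
  addMessage(body: string): void {
    this._messages.push(body);
    this._messageAdded.emit(body);
  }

  // ...past-tense name: a signal that consumers connect to.
  get messageAdded(): ISignal<this, string> {
    return this._messageAdded;
  }

  private _messages: string[] = [];
  private _messageAdded = new Signal<this, string>(this);
}

// Usage: react to messages added elsewhere.
// model.messageAdded.connect((_, body) => console.log(body));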

brichet (Collaborator):

I agree, but in this case it could also be confusing with sendMessage and deleteMessage, which already exist.

This is the distinction between:

  • the actions the user performs in their chat (sendMessage and deleteMessage), which must be propagated to the other users
  • the actions done by remote users (messageAdded and messagesDeleted), which must update the chat view...

In the case of jupyterlite chats, it is more confusing because there is no back-end: sendMessage itself propagates the messages from both the user and the AI.

It may be worth opening an issue in jupyter-chat to discuss the naming.

jtpio (Member Author):

Yeah, happy to open an issue on https://github.com/jupyterlab/jupyter-chat to discuss this 👍

@jtpio (Member Author) commented Feb 13, 2025

Do we want to let users choose between streaming and not streaming responses?

It's probably fine to default to streaming for now if we can, since this is what all AI chats are doing at the moment. But if we want to let users choose, there could be a new option in the settings.
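
If we go that route, a rough sketch of reading such a setting (the plugin id and setting name below are made up for illustration):

import { ISettingRegistry } from '@jupyterlab/settingregistry';

async function shouldStream(registry: ISettingRegistry): Promise<boolean> {
  // '@jupyterlite/ai:plugin' and 'streamResponses' are assumed names.
  const settings = await registry.load('@jupyterlite/ai:plugin');
  // Default to streaming when the setting is not present.
  return (settings.get('streamResponses').composite as boolean) ?? true;
}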

@jtpio marked this pull request as ready for review on February 13, 2025 at 07:58
@jtpio requested a review from brichet on February 13, 2025 at 11:50
// Append each streamed chunk to the message content as it arrives
for await (const chunk of await this._aiProvider.chatModel.stream(
  messages
)) {
  content += chunk.content;
brichet (Collaborator) commented Feb 13, 2025:

I'm wondering if the stream chunks are always additive, or whether they could modify a previous part of the message.
I don't know if there is a setting for that, or whether it is worth looking into that feature in the future.

jtpio (Member Author):

Guessing they should be additive, as otherwise it would be difficult to tell whether that's the case or not?

jtpio (Member Author):

Or do you mean that some providers would return the complete updated message each time?

For example:

  • He
  • Hello
  • Hello,
  • Hello, world

Instead of:

  • He
  • llo
  • ,
  • world

The second one is more efficient in terms of bandwidth and data transferred, so there may not be many advantages to the first approach?
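
The two shapes as tiny reducers, for illustration (a space is added to the last chunk for readability):

// Delta chunks: each chunk carries only the new text, so we append.
const applyDelta = (current: string, chunk: string): string => current + chunk;

// Cumulative chunks: each chunk is the whole message so far, so we replace.
const applyCumulative = (_current: string, chunk: string): string => chunk;

const deltas = ['He', 'llo', ',', ' world'];
console.log(deltas.reduce(applyDelta, '')); // 'Hello, world'

const cumulative = ['He', 'Hello', 'Hello,', 'Hello, world'];
console.log(cumulative.reduce(applyCumulative, '')); // 'Hello, world'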

brichet (Collaborator):

In this case I agree that additive is the best approach.

> Guessing they should be additive, as otherwise it would be difficult to tell whether that's the case or not?

Indeed, it must always act in the same way.

To explain my question: it looks like it's writing word by word, like a human. But sometimes you realize in the middle of a sentence that you could have written it differently for more consistency, and want to reformat the whole sentence. I was wondering if some models could act similarly to humans. But as you said, it would be difficult to know with this API.

brichet (Collaborator) left a review:

Looks great, thanks @jtpio

@brichet (Collaborator) commented Feb 13, 2025

> Do we want to let users choose between streaming and not streaming responses?
>
> It's probably fine to default to streaming for now if we can, since this is what all AI chats are doing at the moment. But if we want to let users choose, there could be a new option in the settings.

We can think about it in a follow-up PR; I agree that having it as the default seems good for now.

@jtpio (Member Author) commented Feb 13, 2025

Thanks @brichet for the review!

Merging to be able to use this change in other PRs. Happy to track questions and improvements separately.

@jtpio merged commit fb52aa8 into jupyterlite:main on Feb 13, 2025
7 checks passed
@jtpio deleted the stream branch on February 13, 2025 at 19:29
Closes #36: Support streaming responses.