Stream responses #39
Conversation
```ts
  time: Date.now(),
  type: 'msg'
};
this.messageAdded(errorMsg);
```
This seems to do the job, although there may be some future optimizations to be done in jupyter-chat.
Currently jupyter-chat seems to remove the previous message before adding the one with the update here:
Maybe it should also compare the timestamps, like the comment suggests?
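For context, a minimal sketch of the streaming pattern this PR uses (the message shape and `messageAdded` are inferred from the diff; the exact jupyter-chat types are assumptions):

```ts
import { UUID } from '@lumino/coreutils';

// Sketch: re-emit one message with a stable id and timestamp so that
// jupyter-chat replaces it in place instead of appending a new bubble.
// The message shape and `messageAdded` signature are assumptions.
async function streamIntoChat(
  model: { messageAdded(msg: object): void },
  chunks: AsyncIterable<{ content: string }>
): Promise<void> {
  const botMsg = {
    id: UUID.uuid4(), // stable id: every update targets the same message
    time: Date.now(), // fixed timestamp: the message keeps its position
    type: 'msg',
    body: '',
    sender: { username: 'AI' }
  };
  for await (const chunk of chunks) {
    botMsg.body += chunk.content; // chunks are additive deltas
    model.messageAdded({ ...botMsg }); // replaces the previous version
  }
}
```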
I think the comment does not refer to the correct line; the message is sent/updated above: `this.messageAdded(botMsg);`
In your implementation, the timestamp is not updated, so the position of the message should not change. And it is probably better this way, to avoid confusion.
For comparison, streaming in jupyter-ai leads to `update_message` in the back-end, which does not update the timestamp.
We should probably homogenize this in jupyter-chat, and allow consumer extensions to choose whether or not to update the timestamp...
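One possible shape for that option (purely hypothetical; nothing like this exists in jupyter-chat today):

```ts
// Hypothetical extension of the jupyter-chat model API: let the consumer
// decide whether an updated message keeps or refreshes its timestamp.
interface IMessageUpdateOptions {
  updateTimestamp?: boolean; // default false: the message keeps its position
}

// messageAdded(message: IChatMessage, options?: IMessageUpdateOptions): void;
```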
Yes, right, it was about the line above (which GitHub shows as unmodified).

> the timestamp is not updated, so the position of the message should not change. And it is probably better this way, to avoid confusion.
Yes, in terms of UX this seems fine.
Also, on another note and for a different issue: after using the jupyter-chat API, I was wondering if the following renames could make sense:

- `messageAdded` -> `addMessage`
- `messagesDeleted` -> `deleteMessage`

In general when working with extensions, properties ending in *ed* tend to refer to signals. Also, it may be more of a personal opinion, but `addMessage` sounds more natural than `messageAdded` when interacting with the chat model.
I agree, but in this case it could also be confusing with `sendMessage` and `deleteMessage`, which already exist.
This is the distinction between:

- the actions the user performs in their own chat (`sendMessage` and `deleteMessage`), which must be propagated to the other users
- the actions done by remote users (`messageAdded` and `messagesDeleted`), which must update the chat view...

In the case of jupyterlite chats, it is even more confusing because there is no back-end: `sendMessage` itself propagates the messages from both the user and the AI.
It may be worth opening an issue in jupyter-chat to discuss the naming.
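To make the distinction concrete, a hypothetical sketch of the two call surfaces (the signatures are illustrative, not the actual jupyter-chat API):

```ts
// Illustrative message shape for the sketch below.
interface IChatMessage {
  id: string;
  body: string;
  time: number;
}

// Local user actions: initiated in this client, propagated to other users.
interface IChatActions {
  sendMessage(body: string): Promise<void>;
  deleteMessage(id: string): Promise<void>;
}

// Remote updates: received from other users, applied to the local view.
interface IChatUpdates {
  messageAdded(message: IChatMessage): void;
  messagesDeleted(ids: string[]): void;
}
```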
Yeah, happy to open an issue on https://github.com/jupyterlab/jupyter-chat to discuss this 👍

Do we want to let users choose between streaming and non-streaming responses? It's probably fine to default to streaming for now if we can, since this is what all AI chats are doing at the moment. But if we want to let users choose, there could be a new option in the settings.
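If such an option were added, a sketch of how the consuming code could branch on it (the `streamResponses` setting name and the `updateDisplayedMessage` helper are hypothetical, and `settings`, `chatModel`, and `messages` are assumed to be in scope; `stream()` and `invoke()` are the standard LangChain chat-model methods):

```ts
// Sketch: choose between streaming and single-shot completion based on a
// hypothetical `streamResponses` setting (default: streaming).
if (settings.streamResponses ?? true) {
  let content = '';
  for await (const chunk of await chatModel.stream(messages)) {
    content += chunk.content; // accumulate delta chunks as they arrive
    updateDisplayedMessage(content); // hypothetical UI update helper
  }
} else {
  const response = await chatModel.invoke(messages); // full reply at once
  updateDisplayedMessage(response.content as string);
}
```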
```ts
for await (const chunk of await this._aiProvider.chatModel.stream(
  messages
)) {
  content += chunk.content;
```
I'm wondering if the stream chunks are always additive, or whether a model could try to modify a previous part of the message.
I don't know if there is a setting for that, nor whether it's worth looking into that feature in the future.
Guessing they should be additive, as otherwise it would be difficult to tell whether that's the case or not?
Or do you mean that some providers would return the complete updated message each time?
For example:
- He
- Hello
- Hello,
- Hello, world
Instead of:
- He
- llo
- ,
- world
The second one is more efficient in terms of bandwidth and the amount of data transferred, so there may not be many advantages to the first approach?
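For illustration, a defensive accumulator that would cope with either shape. LangChain chunks are deltas, so this is not needed here; the prefix heuristic is an assumption and can misfire if a delta happens to repeat the current prefix:

```ts
// Sketch: handle chunks that may be deltas ("llo") or cumulative snapshots
// ("Hello"). If a chunk starts with everything received so far, treat it as
// a snapshot; otherwise, append it as a delta. Note the ambiguity: a delta
// that repeats the current prefix would be misdetected as a snapshot.
function accumulate(current: string, chunk: string): string {
  return chunk.startsWith(current) ? chunk : current + chunk;
}

let delta = '';
for (const c of ['He', 'llo', ',', ' world']) delta = accumulate(delta, c);
// delta === "Hello, world"

let snapshot = '';
for (const c of ['He', 'Hello', 'Hello,', 'Hello, world']) {
  snapshot = accumulate(snapshot, c);
}
// snapshot === "Hello, world"
```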
In this case I agree that additive is the best approach.

> Guessing they should be additive, as otherwise it would be difficult to tell whether that's the case or not?

Indeed, it must always act in the same way.
To explain my question: it looks like the model writes word by word, like a human. But sometimes you realise in the middle of a sentence that you could have written it differently for more consistency, and want to reformat the whole sentence. I was wondering if some models could act similarly to a human. But as you said, it would be difficult to know with this API.
Looks great, thanks @jtpio
We can think about it in a follow-up PR. I agree that having it as the default seems good for now.
Thanks @brichet for the review! Merging to be able to use this change in other PRs. Happy to track questions and improvements separately.
Fixes #36
jupyterlite-ai-stream.mp4
The tokens do seem to be correctly streamed when looking at the request in the dev tools:
For reference, LangChain handles "auto-streaming": https://js.langchain.com/docs/concepts/streaming/#auto-streaming-chat-models
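A minimal, self-contained example of the LangChain streaming API used in this PR (the `ChatOpenAI` provider and model name are just one possible choice; with auto-streaming, models without native streaming still yield the full reply as a single chunk):

```ts
import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage } from '@langchain/core/messages';

const model = new ChatOpenAI({ model: 'gpt-4o-mini' }); // example model

let content = '';
// Each chunk is an AIMessageChunk carrying a delta of the reply.
for await (const chunk of await model.stream([new HumanMessage('Hello!')])) {
  content += chunk.content;
  console.log(content); // progressively: "He", "Hello", ...
}
```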