webui: remove client-side context pre-check and rely on backend for limits #16506
Conversation
Just a few cosmetic changes 😄 Also, could you add screenshots/videos to the PR description comparing before and after the changes? That would be great context for looking back later.
I’d love to make a temporary mini version of the model selector: just a simple field in Settings to declare the model in the JSON request. That way my llama-swap setup would work on master, and I could record videos of the master branch more easily!
I’ve added two videos, running on my Raspberry Pi 5 (16 GB) with Qwen3 30B A3B, fully synced with the master branch. You can see the bug where I got stuck: once the context overflows, the interface is completely blocked until you hit F5. With the current PR build, it’s much better: if a message block is too large, it can still slip into the context and needs to be deleted manually, but since the backend decides, the UI never fully blocks. We could still improve it a bit by preventing oversized messages from being sent into the context in the first place.
Toolcall testing (Node.js proxy): Google.what.the.weather-AVC-750kbps.mp4
@ServeurpersoCom Curious: are you doing some OCR in the last video to detect text elements in the screenshots? I would love to learn more, but maybe after the PR is reviewed, to avoid getting off-topic.
@ServeurpersoCom let's just rebuild the webui static output fresh and we're good to go :)
…imits

Removed the client-side context window pre-check and now simply sends messages while keeping the dialog imports limited to core components, eliminating the maximum-context alert path.

Simplified streaming and non-streaming chat error handling to surface a generic 'No response received from server' error whenever the backend returns no content.

Removed the obsolete maxContextError plumbing from the chat store so state management now focuses on the core message flow without special context-limit cases.
Co-authored-by: Aleksander Grygier <[email protected]>
Co-authored-by: Aleksander Grygier <[email protected]>
…Screen.svelte Co-authored-by: Aleksander Grygier <[email protected]>
…Screen.svelte Co-authored-by: Aleksander Grygier <[email protected]>
28badc5 to be85c24
@ServeurpersoCom actually I will improve the UI/UX of the new Alert Dialog in a separate PR so that we don't block this change :)
Not OCR: the proxy just parses streamed text and DOM elements in real time. The model actually sees the entire page: it can analyze the full DOM and reach elements outside the viewport through an abstraction layer that simulates human actions (scroll, click, type).
Awesome, can’t wait to see your pure Svelte touch on that dialog 😄
Nice. So this seems like some sort of ingenious way to control a (headless?) browser with an LLM, and the images in the WebUI are just progress reports from the browser. It's a bit over my head, but it definitely looks interesting.
Exactly, but not headless: a full real browser with GPU capability (inside a software box)! The goal is to convert the DOM (with all bounding boxes) into labeled text tokens for the LLM. Idea: we could add a small module in llama.cpp that exposes every ToolCall event through a user-defined HTTP hook; that would let anyone easily connect their model to external actions or systems!
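Purely to illustrate the idea (no such hook exists in llama.cpp today), a receiver for this hypothetical ToolCall HTTP hook could be as small as the Node.js/TypeScript sketch below; the endpoint path, payload shape, and field names are all assumptions.

```ts
// Hypothetical receiver for a user-defined ToolCall HTTP hook.
// Assumption: llama.cpp (or a proxy in front of it) would POST a JSON body
// like { "name": "get_weather", "arguments": { "city": "Paris" } } to /toolcall.
// None of this payload shape is defined by llama.cpp today.
import { createServer } from "node:http";

type ToolCallEvent = {
  name: string;
  arguments: Record<string, unknown>;
};

const server = createServer((req, res) => {
  if (req.method !== "POST" || req.url !== "/toolcall") {
    res.writeHead(404).end();
    return;
  }

  let body = "";
  req.on("data", (chunk) => (body += chunk));
  req.on("end", () => {
    try {
      const event = JSON.parse(body) as ToolCallEvent;
      // Dispatch the event to whatever external action or system you want.
      console.log(`tool call: ${event.name}`, event.arguments);
      res.writeHead(200, { "Content-Type": "application/json" });
      res.end(JSON.stringify({ status: "ok" }));
    } catch {
      res.writeHead(400).end("invalid JSON");
    }
  });
});

server.listen(8080, () => console.log("toolcall hook listening on :8080"));
```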
webui: remove client-side context pre-check and rely on backend for limits
Removed the client-side context window pre-check; the webui now simply sends messages while keeping the dialog imports limited to core components, eliminating the maximum-context alert path.

Simplified streaming and non-streaming chat error handling to surface a generic 'No response received from server' error whenever the backend returns no content.

Removed the obsolete maxContextError plumbing from the chat store, so state management now focuses on the core message flow without special context-limit cases.
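A rough sketch of the resulting flow, using illustrative names (sendChatCompletion is a placeholder, not the exact identifier in the webui code) against llama.cpp's OpenAI-compatible endpoint:

```ts
// Illustrative sketch only: no client-side context-window pre-check; the
// request is sent as-is and the backend enforces its own limits.
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

async function sendChatCompletion(messages: ChatMessage[]): Promise<string> {
  const response = await fetch("/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ messages, stream: false }),
  });

  if (!response.ok) {
    // Backend-reported errors (including context overflow) surface as-is.
    const error = await response.text();
    throw new Error(error || `Request failed with status ${response.status}`);
  }

  const data = await response.json();
  const content = data?.choices?.[0]?.message?.content;

  if (!content) {
    // Generic fallback when the backend returns no content at all.
    throw new Error("No response received from server");
  }

  return content;
}
```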
Closes #16437
Master branch:
https://github.com/user-attachments/assets/edc1337d-2e19-4f99-a7ba-78f40146022f
This PR (ignore the Model Selector):
https://github.com/user-attachments/assets/e9952e04-e189-434f-8536-84184193d704