Commit ff990fd

Merge pull request #160 from esokullu/main
15.2
2 parents f689cd8 + 2e99771 commit ff990fd

28 files changed

Lines changed: 346 additions & 58 deletions

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -6,6 +6,19 @@ This changelog was generated from the repository Git history and release tags. V

 ## [Unreleased]

+## [15.2.0] - 2026-06-22
+
+### Added
+- Jan, vLLM, and SGLang as built-in local providers (Chrome + Firefox). All three use OpenAI-compatible `/v1` endpoints (Jan on port 1337, vLLM on port 8000, SGLang on port 30000), support model listing via `/v1/models`, accept an optional API key for auth-enabled servers, and default to enabled with vision on and a 16 K context window.
+
+### Changed
+- Onboarding local-model detection copy now lists Jan, vLLM, and SGLang alongside LM Studio, Ollama, and llama.cpp.
+- LLM request-timeout settings description and provider info panel updated to cover all six local backends.
+- Updated documentation (README, architecture docs, providers guide) to reflect the expanded local-provider lineup.
+
+### Tests
+- Added coverage for `categoryFor` and `listProviderModels` with Jan, vLLM, and SGLang — including auth header forwarding and model-list deduplication — and for `_defaultConfigs` asserting all three new providers are present, enabled, local-categorized, and localhost-defaulted.
+
 ## [15.1.1] - 2026-06-22

 ### Changed
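
Reviewer note: the 15.2.0 entry mentions model listing via `/v1/models`, an optional API key, and deduplication of the returned list. The sketch below shows one way those three pieces could fit together. It is not the project's `listProviderModels`; the function names `buildModelsRequest` and `uniqueModelIds` are hypothetical, and it assumes the server returns the OpenAI-style `{ data: [{ id }] }` payload and accepts the key as a bearer token.

```javascript
// Hypothetical helper: describe the GET request for an OpenAI-compatible
// model list. Assumes baseUrl already ends in /v1, as in the changelog entry.
function buildModelsRequest(baseUrl, apiKey) {
  const url = baseUrl.replace(/\/+$/, "") + "/models";
  const headers = { Accept: "application/json" };
  // Only send Authorization when a key is configured (auth-enabled servers).
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`;
  return { url, headers };
}

// Hypothetical helper: pull unique model ids out of an OpenAI-style payload,
// keeping first-seen order.
function uniqueModelIds(payload) {
  const ids = (payload?.data ?? []).map((m) => m.id).filter(Boolean);
  return [...new Set(ids)];
}
```

A caller would pass `url` and `headers` to `fetch` and feed the parsed JSON to `uniqueModelIds`; keeping the request builder free of network calls makes the auth-header behaviour easy to test on its own.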

README.md

Lines changed: 12 additions & 2 deletions
@@ -18,7 +18,7 @@ Open-source AI browser agent for Chrome and Firefox. Chat with any web page, aut
 - **Continue from Limit** — When the agent hits the step limit, click Continue to keep going
 - **Multi-Provider LLM** — Supports local and cloud models:
 - **WebBrain Cloud 1.0** (cloud, default) — Built-in managed cloud option; no local setup required
-- **llama.cpp** (local) — No API key needed. Also **Ollama** and **LM Studio**
+- **llama.cpp** (local) — No API key needed. Also **Ollama**, **LM Studio**, **Jan**, **vLLM**, and **SGLang**
 - **OpenAI** (GPT-5.5, etc.)
 - **Anthropic Claude** (native API)
 - **Google Gemini**, **Mistral AI**, **DeepSeek**, **xAI Grok**, **Groq**
@@ -68,6 +68,13 @@ llama-server -m your-model.gguf --port 8080
 # Or using Ollama (OpenAI-compatible)
 ollama serve
 # Then set base URL to http://localhost:11434/v1 in settings
+
+# Or using Jan (OpenAI-compatible)
+# Start Jan's local API server and use http://localhost:1337/v1
+
+# Or using vLLM / SGLang (OpenAI-compatible)
+vllm serve your-model --port 8000
+python -m sglang.launch_server --model-path your-model --port 30000
 ```

 > **Context window:** For reliable agent runs, load a local model with **at least a 16k-token context window** (the usable minimum). 8k can work with **Compact mode** enabled (Settings → per-provider checkbox); 4k is too small to hold the system prompt + tool schemas. WebBrain auto-compacts the conversation as it nears the window — it assumes 16k for local models unless you set an explicit context size, so give the model server (e.g. `llama-server -c 16384`) enough room.
@@ -97,6 +104,9 @@ Click the gear icon or go to the extension's Options page to configure:
 | llama.cpp | `http://localhost:8080` | Not needed | (your loaded model) |
 | Ollama | `http://localhost:11434/v1` | Not needed | (your loaded model) |
 | LM Studio | `http://localhost:1234/v1` | Not needed | (your loaded model) |
+| Jan | `http://localhost:1337/v1` | Not needed | (your loaded model) |
+| vLLM | `http://localhost:8000/v1` | Optional | (your served model) |
+| SGLang | `http://localhost:30000/v1` | Optional | (your served model) |
 | OpenAI | `https://api.openai.com/v1` | Required | gpt-5.5 |
 | Anthropic Claude | `https://api.anthropic.com` | Required | claude-sonnet-4-6 |
 | Google Gemini | `https://generativelanguage.googleapis.com/v1beta/openai` | Required | gemini-3.1-flash |
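
Reviewer note: each row in this settings table supplies a base URL, an optional key, and a model name. For the OpenAI-compatible rows, those three values are roughly what a chat request needs. The following is a minimal sketch under that assumption; `buildChatRequest` and the `settings` shape are hypothetical, it assumes a base URL that already ends in `/v1` (as in the Jan, vLLM, and SGLang rows), and it does not apply to the Anthropic row, which uses a different API.

```javascript
// Hypothetical helper: turn one settings row into a fetch()-ready request
// for an OpenAI-compatible chat completions endpoint.
function buildChatRequest(settings, messages) {
  const { baseUrl, apiKey, model } = settings;
  const headers = { "Content-Type": "application/json" };
  // "Optional" in the table: the header is sent only when a key is set.
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Example: a vLLM row with no key configured.
const example = buildChatRequest(
  { baseUrl: "http://localhost:8000/v1", apiKey: "", model: "your-model" },
  [{ role: "user", content: "Hello" }]
);
// `await fetch(example.url, example.init)` would then send the request.
```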
@@ -180,7 +190,7 @@ Deeper docs live in [`docs/`](docs/): [architecture](docs/architecture.md), [sit
 | `solve_captcha` | -- | Yes | Yes | Solve CAPTCHAs via CapSolver API (optional, requires API key) |
 | `done` | Yes | Yes | Yes | Signal task completion |

-**Compact mode** is a reduced tool set + shorter system prompt designed for small local models (2B-8B). In both Chrome and Firefox builds, it cuts the Act-mode schema from 40+ tools to about 20, reducing decision surface and hallucination. Enable it per-provider in Settings (checkbox on llama.cpp, Ollama, LM Studio; off by default).
+**Compact mode** is a reduced tool set + shorter system prompt designed for small local models (2B-8B). In both Chrome and Firefox builds, it cuts the Act-mode schema from 40+ tools to about 20, reducing decision surface and hallucination. Enable it per-provider in Settings (checkbox on local providers; off by default).

 > **Shadow DOM note:** The accessibility tree only traverses light DOM. On Web Component-heavy pages (Stripe, Salesforce, Shopify), use `get_interactive_elements` (pierces open shadow roots) or `get_shadow_dom` / `shadow_dom_query` for targeted reads.

docs/THREAT-MODEL.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ So the question this document answers is: **what is the agent equivalent of the
 ## 2. System overview & trust boundaries

 - **Extension (Manifest V3).** The agent loop, prompt assembly, and tool dispatch run in the extension's standard MV3 sandbox.
-- **Local model process.** llama.cpp (via LM Studio / Ollama) runs as a *separate* process and is reached over `localhost` HTTP. No custom binaries, no elevated privileges; the model itself has only the extension's permissions, indirectly.
+- **Local model process.** llama.cpp, Ollama, LM Studio, Jan, vLLM, or SGLang runs as a *separate* process and is reached over `localhost` HTTP. No custom binaries, no elevated privileges; the model itself has only the extension's permissions, indirectly.
 - **Automation surface.** Page reads and actions are performed through the extension APIs and, for richer control, CDP/debugger automation.
 - **Cloud option.** The same agent can target a cloud model instead of the local one.

docs/privacy-and-data-flow.md

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ The user's message, the current page content (AX tree, screenshot, or extracted
 The user chooses their provider in Settings. Options include:

 - **Cloud providers**: OpenAI, Anthropic, Google Gemini, Mistral, DeepSeek, xAI, Groq, OpenRouter, etc. — data leaves the user's machine for these
-- **Local providers**: llama.cpp, Ollama, LM Studio — data stays on the user's machine
+- **Local providers**: llama.cpp, Ollama, LM Studio, Jan, vLLM, SGLang — data stays on the user's machine

 The extension itself never receives or stores user data on any remote server.
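
Reviewer note: the "stays on the user's machine" wording depends on the provider's base URL pointing at the local machine, which is the default for all six local providers but is a user-editable setting. A check along the following lines could make that condition explicit. This is a sketch only; `isLoopbackUrl` is a hypothetical name, and nothing in this commit indicates the extension performs such a check.

```javascript
// Hypothetical check: does a provider base URL point at this machine?
// Covers "localhost", IPv4 127.x.x.x, and the IPv6 loopback address.
function isLoopbackUrl(baseUrl) {
  let host;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    return false; // not a parseable absolute URL
  }
  return (
    host === "localhost" ||
    host === "[::1]" || // URL.hostname keeps the brackets for IPv6 literals
    /^127(\.\d{1,3}){3}$/.test(host)
  );
}
```

An exact match on `localhost` (rather than a prefix test) keeps hosts such as `localhost.example.com` from being treated as local.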

docs/providers-and-models.md

Lines changed: 10 additions & 3 deletions
@@ -38,6 +38,9 @@ class BaseLLMProvider {
 | `llamacpp` | `llamacpp` | local | (loaded model) | Yes (default on) |
 | `ollama` | `openai` | local | (loaded model) | Yes (default on) |
 | `lmstudio` | `openai` | local | (loaded model) | Yes (default on) |
+| `jan` | `openai` | local | (loaded model) | Yes (default on) |
+| `vllm` | `openai` | local | (loaded model) | Yes (default on) |
+| `sglang` | `openai` | local | (loaded model) | Yes (default on) |
 | `openai` | `openai` | cloud | `gpt-5.5` | Model-name regex |
 | `anthropic` | `anthropic` | cloud | `claude-sonnet-4-6` | Model-name regex |
 | `claude_subscription` | `anthropic_oauth` | cloud | `claude-sonnet-4-6` | Yes |
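
Reviewer note: the three added rows reuse the `openai` provider type and differ from the existing local rows only in id and port. Read as data, the local rows might look like the sketch below. The object shape, the constant `LOCAL_PROVIDERS`, and the function `sketchDefaultConfig` are hypothetical and stand in for whatever `_defaultConfigs` actually returns; the base URLs are the ones documented elsewhere in this commit.

```javascript
// Local rows from the provider table, paired with their documented base URLs.
// The shape is illustrative, not the extension's real config schema.
const LOCAL_PROVIDERS = {
  llamacpp: { type: "llamacpp", baseUrl: "http://localhost:8080" },
  ollama: { type: "openai", baseUrl: "http://localhost:11434/v1" },
  lmstudio: { type: "openai", baseUrl: "http://localhost:1234/v1" },
  jan: { type: "openai", baseUrl: "http://localhost:1337/v1" },
  vllm: { type: "openai", baseUrl: "http://localhost:8000/v1" },
  sglang: { type: "openai", baseUrl: "http://localhost:30000/v1" },
};

// Hypothetical: expand a row into a default config with the documented
// defaults (local category, enabled, vision on, no API key).
function sketchDefaultConfig(id) {
  if (!Object.prototype.hasOwnProperty.call(LOCAL_PROVIDERS, id)) {
    return null; // not a built-in local provider in this sketch
  }
  return {
    id,
    ...LOCAL_PROVIDERS[id],
    category: "local",
    enabled: true,
    supportsVision: true,
    apiKey: "",
  };
}
```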
@@ -53,13 +56,17 @@ class BaseLLMProvider {

 ### Local Providers

-Three local providers are enabled by default with no API key needed:
+Six local providers are enabled by default with no API key needed unless the
+local server was started with auth:

 - **llama.cpp**: `http://localhost:8080` — runs `llama-server -m model.gguf`
 - **Ollama**: `http://localhost:11434/v1` — `ollama serve`
 - **LM Studio**: `http://localhost:1234/v1` — LM Studio's local inference server
+- **Jan**: `http://localhost:1337/v1` — Jan's local OpenAI-compatible API server
+- **vLLM**: `http://localhost:8000/v1` — vLLM's OpenAI-compatible server
+- **SGLang**: `http://localhost:30000/v1` — SGLang's OpenAI-compatible server

-All three default `supportsVision: true` since most models loaded locally in 2026 are multimodal.
+All six default `supportsVision: true` since most models loaded locally in 2026 are multimodal.

 **Context window.** Load local models with **at least a 16k-token context window** for reliable agent runs — that's the usable minimum. 8k can work with Compact mode enabled; 4k is too small to hold the system prompt + tool schemas. The agent reads the window from `provider.contextWindow` (`providers/base.js`) to drive auto-compaction; when a provider config doesn't set `contextWindow`, local providers default to a conservative **16k** (cloud/router default to 128k). Set `config.contextWindow` explicitly to match a larger local window, and make sure the model server is actually started with that much context (e.g. `llama-server -c 16384`).
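
Reviewer note: the context-window paragraph describes a fallback rule (explicit `contextWindow` wins, otherwise 16k for local and 128k for cloud/router) that feeds auto-compaction. A minimal sketch of that rule follows. It is not the code in `providers/base.js`; the function names are hypothetical, the exact token counts chosen for "16k" and "128k" are assumptions, and the 0.8 compaction threshold is invented purely for illustration.

```javascript
const DEFAULT_LOCAL_CONTEXT = 16384; // assumed token count for "16k"
const DEFAULT_CLOUD_CONTEXT = 128000; // assumed token count for "128k"

// Hypothetical: an explicit, positive contextWindow wins; otherwise fall
// back by category as the docs describe.
function resolveContextWindow(config, category) {
  const explicit = Number(config?.contextWindow);
  if (Number.isFinite(explicit) && explicit > 0) return explicit;
  return category === "local" ? DEFAULT_LOCAL_CONTEXT : DEFAULT_CLOUD_CONTEXT;
}

// Hypothetical trigger: compact once usage passes a fraction of the window.
// The default threshold is an illustrative value, not a documented one.
function shouldCompact(usedTokens, contextWindow, threshold = 0.8) {
  return usedTokens >= contextWindow * threshold;
}
```

Under these assumptions, a vLLM server started with a 32k window but left unconfigured in the extension would be compacted as if it had 16k, which is why the paragraph recommends setting `config.contextWindow` explicitly.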

@@ -74,7 +81,7 @@ filters the exposed tools through `COMPACT_TOOL_NAMES`; Ask mode is unchanged.
 | OpenAI-compatible | Regex against model name (`gpt-4o`, `gpt-5`, `claude-3`, `claude-sonnet-4`, `gemini-2.0-flash`, etc.) |
 | Anthropic | `claude-(3\|sonnet-4\|opus-4)` patterns |
 | llama.cpp | Explicit `supportsVision` config toggle |
-| Ollama / LM Studio | Explicit `supportsVision` config toggle (via OpenAI provider) |
+| Ollama / LM Studio / Jan / vLLM / SGLang | Explicit `supportsVision` config toggle (via OpenAI provider) |

 ### Anthropic Conversion
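
Reviewer note: the vision-detection table describes two strategies, a model-name regex for cloud providers and an explicit `supportsVision` toggle for local ones. The sketch below shows how the two might be combined. The function `detectVision` and the constant names are hypothetical, and the regexes cover only the model names quoted in the table, not the project's full patterns.

```javascript
const LOCAL_IDS = new Set(["llamacpp", "ollama", "lmstudio", "jan", "vllm", "sglang"]);

// Illustrative subsets built from the names quoted in the table.
const OPENAI_COMPAT_VISION = /gpt-4o|gpt-5|claude-3|claude-sonnet-4|gemini-2\.0-flash/i;
const ANTHROPIC_VISION = /claude-(3|sonnet-4|opus-4)/i;

// Hypothetical: local providers use the explicit toggle (on unless set to
// false, matching the documented default); others match on the model name.
function detectVision(providerId, config, model = "") {
  if (LOCAL_IDS.has(providerId)) {
    return config?.supportsVision !== false;
  }
  if (providerId === "anthropic") return ANTHROPIC_VISION.test(model);
  return OPENAI_COMPAT_VISION.test(model);
}
```

Because a local server can host any model under any name, a name-based regex would be unreliable there, which is presumably why the local rows rely on the toggle instead.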

manifest.json

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 {
   "manifest_version": 3,
   "name": "WebBrain",
-  "version": "15.1.1",
+  "version": "15.2.0",
   "description": "Open-source AI browser agent — chat with pages, automate tasks, multi-provider LLM support.",
   "permissions": [
     "sidePanel",

package-lock.json

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "webbrain",
-  "version": "15.1.1",
+  "version": "15.2.0",
   "description": "Open-source AI browser agent — chat with pages, automate tasks, multi-provider LLM support.",
   "private": true,
   "type": "module",
