feat: support disable thinking when VLLM model (#616)
* feat: add deep think switch button
Signed-off-by: Bob Du <[email protected]>
* feat: api model support VLLM
Signed-off-by: Bob Du <[email protected]>
* feat: api support disable thinking when VLLM model
Signed-off-by: Bob Du <[email protected]>
* docs: update readme
Signed-off-by: Bob Du <[email protected]>
---------
Signed-off-by: Bob Du <[email protected]>
README.en.md (63 additions, 0 deletions)
@@ -34,6 +34,8 @@ Some unique features have been added:
 [✓] Web Search functionality (Real-time web search based on Tavily API)
+
+[✓] VLLM API model support & Optional disable deep thinking mode
 
 > [!CAUTION]
 > This project is only published on GitHub, under the MIT license, free and for open-source learning use. There will be no account selling, paid services, or paid discussion groups of any kind. Beware of scams.
@@ -125,6 +127,10 @@ For all parameter variables, check [here](#docker-parameter-example) or see:
 [✓] Interface themes
+
+[✓] VLLM API model support
+
+[✓] Deep thinking mode switch
 
 [✗] More...
 
 ## Prerequisites
@@ -318,6 +324,63 @@ PS: You can also run `pnpm start` directly on the server without packaging.

## VLLM API Deep Thinking Mode Control

> [!TIP]
> Deep thinking mode control is only available when the backend is configured to use the VLLM API. It lets users choose whether to enable the model's deep thinking functionality.

### Features

- **VLLM API Exclusive**: Only available when the backend uses the VLLM API
- **Per-conversation Control**: Each conversation can enable or disable deep thinking mode independently
- **Real-time Switching**: Deep thinking mode can be switched at any time during a conversation
- **Performance Optimization**: Disabling deep thinking can improve response speed and reduce computational cost
### Prerequisites

**The following conditions must be met to use this feature:**

1. **Backend Configuration**: The backend must be configured to use the VLLM API
2. **Model Support**: The model in use must support deep thinking
3. **API Compatibility**: The VLLM API version must support thinking-mode control parameters

### Usage

#### 1. Enable/Disable Deep Thinking Mode

1. **Open a Conversation**: Enter a conversation session backed by the VLLM API
2. **Find the Toggle**: Locate the "Deep Thinking" toggle button in the conversation interface
3. **Switch the Mode**:
   - Enabled: The model performs deep thinking, giving more detailed and in-depth responses
   - Disabled: The model responds directly; faster, but potentially more concise
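The per-conversation toggle described above can be sketched roughly as follows; the names (`thinkingState`, `setDeepThinking`, `isDeepThinking`) are illustrative and not taken from the project source.

```typescript
// Sketch: each conversation keeps its own deep-thinking flag,
// so toggling one chat never affects another.
const thinkingState = new Map<string, boolean>();

function setDeepThinking(conversationId: string, enabled: boolean): void {
  thinkingState.set(conversationId, enabled);
}

function isDeepThinking(conversationId: string): boolean {
  // Default to enabled when a conversation has never been toggled.
  return thinkingState.get(conversationId) ?? true;
}
```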
#### 2. Usage Scenarios

**Enable deep thinking when:**

- Complex problems require in-depth analysis
- Logical reasoning and multi-step thinking are needed
- High-quality responses are required
- Response time is not critical

**Disable deep thinking when:**

- Simple questions need quick answers
- Fast responses are required
- Computational cost must be reduced
- Batch-processing simple tasks
#### 3. Technical Implementation

- **API Parameter**: Controlled through the VLLM API's `disable_thinking` parameter
- **State Persistence**: Each conversation session saves its deep thinking switch state independently
- **Real-time Effect**: Takes effect for the next message immediately after switching
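As a rough sketch, a chat request body with thinking disabled might be built as below. The `disable_thinking` field follows the bullet above; note that some vLLM deployments expose this control via `chat_template_kwargs` (for example `{ enable_thinking: false }`) instead, so check your server version. The model name is a placeholder.

```typescript
// Sketch, assuming the backend forwards a `disable_thinking` field
// to the VLLM server as described in this README.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildChatRequest(
  model: string,
  messages: ChatMessage[],
  deepThinking: boolean,
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, messages, stream: true };
  if (!deepThinking) {
    // Attach the flag only when thinking is turned off, so backends
    // that reject unknown fields see a plain request by default.
    body.disable_thinking = true;
  }
  return body;
}

// Example: a request with deep thinking disabled
const req = buildChatRequest(
  "qwen3-8b", // placeholder model name
  [{ role: "user", content: "Hi" }],
  false,
);
```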
### Notes

- **VLLM API Only**: Available only when the backend uses the VLLM API; other APIs (such as the OpenAI API) do not support this feature
- **Model Dependency**: Not all models support deep thinking mode; confirm that yours does
- **Response Differences**: Disabling deep thinking may affect the detail and quality of responses
- **Cost Considerations**: Enabling deep thinking typically increases computational cost and response time

## Frequently Asked Questions

Q: Why does Git always report an error when committing?