Description
Do you need to file an issue?
- I have searched the existing issues and this bug is not already filed.
- I believe this is a legitimate bug, not just a question or feature request.
Describe the bug
- On the first attempt, the knowledge-base query failed because of a misconfigured API URL, yet the API call itself appeared to succeed due to a fallback mechanism:
The chat module (chat_agent) automatically checks and repairs the URL before calling the LLM. Even though `.env` was mistakenly set to `.../v1/chat/completions`, the code truncates it to `.../v1`, so the chat feature worked normally.
The knowledge-base retrieval module (rag_tool / LightRAG) reads the URL from the environment variable directly and performs no such repair.
It therefore requested `.../v1/chat/completions/chat/completions`, which caused a 404 Not Found error.
When RAG fails, the system logs a WARNING and degrades to a plain conversation (without knowledge-base context), which is why a reply was still shown.
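The asymmetry described above can be sketched roughly as follows. This is an illustrative reconstruction, not the project's actual code; `normalize_base_url` is a hypothetical name for the repair that chat_agent performs.

```python
# Hypothetical sketch: chat_agent repairs the base URL before use,
# while rag_tool passes the raw environment value straight through.

def normalize_base_url(url: str) -> str:
    """Strip a trailing '/chat/completions' so '.../v1/chat/completions' becomes '.../v1'."""
    suffix = "/chat/completions"
    url = url.rstrip("/")
    if url.endswith(suffix):
        url = url[: -len(suffix)]
    return url

# chat_agent path: the misconfigured URL is silently repaired.
fixed = normalize_base_url("https://api.example.com/v1/chat/completions")

# rag_tool path: the raw value is used as the base URL, so the client
# appends the endpoint path again, yielding '.../chat/completions/chat/completions' -> 404.
raw = "https://api.example.com/v1/chat/completions"
doubled = raw + "/chat/completions"
```

The silent repair in one module and its absence in the other is exactly why chat worked while knowledge-base retrieval returned 404.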
- After I fixed the API configuration and queried the knowledge base again, the AI still said on the answer page that it could not find anything.
1. Stale LLM cache (the main problem)
- The `kg_query` function in `lightrag/operate.py` checks for a cached response before calling the LLM.
- Mechanism: the cache key (`args_hash`) is generated from the query string and parameters (such as mode, top_k, etc.).
- Flaw: the generated cache key does not include the retrieved context data (entities, relations, text chunks).
- Impact:
1. An earlier query for "AutoAgent" may have returned no results (the context was probably empty at the time).
2. That "no results" response was saved in `{kv_store_llm_response_cache.json}`.
3. Later, even after a large context was retrieved (142,338 characters), the system hit the cache as long as the query parameters were unchanged, and simply ignored the new context.
4. The system returned the stale "not found" response.
- **Evidence**:
- Log output: `[Backend] [Main] INFO: ══ LLM cache ══ Query cache hit, using cached response as query result`
- Code (`lightrag/operate.py`): the arguments to `compute_args_hash` do not include `context_result`.

After I modified the prompt, I was indeed able to retrieve information from the knowledge base.
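A minimal sketch of the flaw, assuming the caching works as described above; the function bodies are simplified stand-ins, not LightRAG's actual signatures:

```python
import hashlib

# Simplified stand-in for compute_args_hash: the key covers the query and
# parameters only -- the retrieved context is absent, which is the defect.
def compute_args_hash(query: str, mode: str, top_k: int) -> str:
    return hashlib.md5(f"{query}|{mode}|{top_k}".encode()).hexdigest()

cache: dict[str, str] = {}

def call_llm(query: str, context: str) -> str:
    # Stub LLM: can only answer when context is available.
    return f"answer from {len(context)} chars of context" if context else "no results"

def kg_query(query: str, mode: str, top_k: int, context_result: str) -> str:
    key = compute_args_hash(query, mode, top_k)
    if key in cache:
        return cache[key]  # cache hit: the fresh context_result is ignored
    response = call_llm(query, context_result)
    cache[key] = response
    return response

stale = kg_query("AutoAgent", "hybrid", 40, "")            # empty context -> "no results", now cached
fresh = kg_query("AutoAgent", "hybrid", 40, "x" * 142338)  # same key -> stale "no results" despite the new context
```

Including a hash of the retrieved context in the key, or invalidating entries when retrieval output changes, would avoid serving the stale response.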
Steps to reproduce
- Query the knowledge base using an API URL that is wrong but not entirely wrong (a base URL that already contains the endpoint path)
- Fix the API configuration and query the knowledge base again; the "nothing found" bug appears
Expected Behavior
- An API misconfiguration should be surfaced directly rather than papered over by a fallback mechanism; this feels like AI-generated spaghetti code
- The knowledge-base cache should be refreshed more often, for example via a time-based expiry mechanism
Related Module
Knowledge Base Management
Configuration Used
No response
Logs and screenshots
[Backend] [Main] INFO: Executing VLM enhanced query: 我想了解AutoAgent相关的内容...
[Backend] [Main] DEBUG: [aquery_llm] Query param: QueryParam(mode='hybrid', only_need_context=False, only_need_prompt=True, response_type='Multiple Paragraphs', stream=False, top_k=40, chunk_top_k=20, max_entity_tokens=6000, max_relation_tokens=8000, max_total_tokens=30000, hl_keywords=[], ll_keywords=[], conversation_history=[], history_turns=0, model_func=None, user_prompt=None, enable_rerank=True, include_references=False)
[Backend] [Main] DEBUG: Flattened cache hit(key:hybrid:keywords:f92959b52e76f0bb296fcb7854886fbf)
[Backend] [Main] DEBUG: High-level keywords: ['AutoAgent', 'Automated agents', 'AI agent technology']
[Backend] [Main] DEBUG: Low-level keywords: ['AutoAgent']
[Backend] [Main] INFO: Embedding func: 8 new workers initialized (Timeouts: Func: 30s, Worker: 60s, Health Check: 75s)
[Frontend] GET /settings 200 in 79ms (compile: 4ms, render: 76ms)
[Backend] INFO: 127.0.0.1:51054 - "GET /api/v1/config/status HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51056 - "GET /api/v1/config/ports HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51054 - "GET /api/v1/settings HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51056 - "GET /api/v1/settings/sidebar HTTP/1.1" 200 OK
[Backend] INFO:src.services.embedding.adapters.openai_compatible:Successfully generated 1 embeddings (model: text-embedding-3-large, dimensions: 3072)
[Backend] [EmbeddingClient] DEBUG: Generated 1 embeddings using openai
[Backend] [Main] DEBUG: Pre-computed query embedding for all vector operations
[Backend] [Main] INFO: Query nodes: AutoAgent (top_k:40, cosine:0.2)
[Backend] INFO: 127.0.0.1:51056 - "GET /api/v1/config/embedding HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51054 - "GET /api/v1/config/embedding HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51054 - "GET /api/v1/config/llm HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51056 - "GET /api/v1/config/llm HTTP/1.1" 200 OK
[Backend] INFO:src.services.embedding.adapters.openai_compatible:Successfully generated 1 embeddings (model: text-embedding-3-large, dimensions: 3072)
[Backend] [EmbeddingClient] DEBUG: Generated 1 embeddings using openai
[Backend] [Main] INFO: Local query: 40 entites, 184 relations
[Backend] [Main] INFO: Query edges: AutoAgent, Automated agents, AI agent technology (top_k:40, cosine:0.2)
[Backend] INFO: 127.0.0.1:51056 - "GET /api/v1/config/tts HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51054 - "GET /api/v1/config/tts HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51054 - "GET /api/v1/config/search HTTP/1.1" 200 OK
[Backend] INFO: 127.0.0.1:51056 - "GET /api/v1/config/search HTTP/1.1" 200 OK
[Backend] INFO:src.services.embedding.adapters.openai_compatible:Successfully generated 1 embeddings (model: text-embedding-3-large, dimensions: 3072)
[Backend] [EmbeddingClient] DEBUG: Generated 1 embeddings using openai
[Backend] [Main] INFO: Global query: 41 entites, 40 relations
[Backend] [Main] INFO: Raw search results: 63 entities, 184 relations, 0 vector chunks
[Backend] [Main] DEBUG: Before truncation: 63 entities, 184 relations
[Backend] [Main] INFO: After truncation: 63 entities, 173 relations
[Backend] [Main] DEBUG: Finding text chunks from 63 entities
[Backend] [Main] DEBUG: Vector similarity chunk selection: num_of_chunks=157, entity_info_count=63
[Backend] [Main] DEBUG: Vector similarity chunk selection: 79 unique chunk IDs collected
[Backend] [Main] DEBUG: Using pre-computed query embedding for vector similarity chunk selection
[Backend] [Main] DEBUG: Vector similarity chunk selection: 79 chunk vectors Retrieved
[Backend] [Main] DEBUG: Vector similarity chunk selection: 79 chunks from 79 candidates
[Backend] [Main] INFO: Selecting 79 from 79 entity-related chunks by vector similarity
[Backend] [Main] DEBUG: Finding text chunks from 173 relations
[Backend] [Main] INFO: Find 38 additional chunks in 36 relations (deduplicated 10)
[Backend] [Main] DEBUG: Vector similarity chunk selection: num_of_chunks=90, entity_info_count=36
[Backend] [Main] DEBUG: Vector similarity chunk selection: 38 unique chunk IDs collected
[Backend] [Main] DEBUG: Using pre-computed query embedding for vector similarity chunk selection
[Backend] [Main] DEBUG: Vector similarity chunk selection: 38 chunk vectors Retrieved
[Backend] [Main] DEBUG: Vector similarity chunk selection: 38 chunks from 38 candidates
[Backend] [Main] INFO: Selecting 38 from 38 relation-related chunks by vector similarity
[Backend] [Main] DEBUG: KG related chunks: 20 from entitys, 38 from relations
[Backend] [Main] INFO: Round-robin merged chunks: 20 -> 20 (deduplicated 0)
[Backend] [Main] DEBUG: Token allocation - Total: 30000, SysPrompt: 560, Query: 8, KG: 13848, Buffer: 200, Available for chunks: 15384
[Backend] [Main] WARNING: Rerank is enabled but no rerank model is configured. Please set up a rerank model or set enable_rerank=False in query parameters.
[Backend] [Main] DEBUG: Kept chunk_top-k: 20 chunks (deduplicated original: 20)
[Backend] [Main] DEBUG: Token truncation: 11 chunks from 20 (chunk available tokens: 15384, source: hybrid)
[Backend] [Main] INFO: Final context: 63 entities, 173 relations, 11 chunks
[Backend] [Main] INFO: Final chunks S+F/O: E14/2 E1/19 E2/22 E5/26 E1/32 E1/35 E2/38 E1/39 E1/41 E8/47 E2/48
[Backend] [Main] DEBUG: [_build_context_str] Converting to user format: 63 entities, 173 relations, 11 chunks
[Backend] [Main] DEBUG: [convert_to_user_format] Formatted 11/11 chunks
[Backend] [Main] DEBUG: [_build_context_str] Final data after conversion: 0 entities, 0 relationships, 0 chunks
[Backend] [Main] DEBUG: [_build_query_context] Context length: 142338
[Backend] [Main] DEBUG: [_build_query_context] Raw data entities: 63, relationships: 173, chunks: 11
[Backend] [Main] DEBUG: Retrieved raw prompt from LightRAG
[Backend] [Main] INFO: Found 0 image path matches in prompt
[Backend] [Main] INFO: No valid images found, falling back to normal query
[Backend] [Main] DEBUG: [aquery_llm] Query param: QueryParam(mode='hybrid', only_need_context=False, only_need_prompt=False, response_type='Multiple Paragraphs', stream=False, top_k=40, chunk_top_k=20, max_entity_tokens=6000, max_relation_tokens=8000, max_total_tokens=30000, hl_keywords=[], ll_keywords=[], conversation_history=[], history_turns=0, model_func=None, user_prompt=None, enable_rerank=True, include_references=False)
[Backend] [Main] DEBUG: Flattened cache hit(key:hybrid:keywords:f92959b52e76f0bb296fcb7854886fbf)
[Backend] [Main] DEBUG: High-level keywords: ['AutoAgent', 'Automated agents', 'AI agent technology']
[Backend] [Main] DEBUG: Low-level keywords: ['AutoAgent']
[Backend] INFO:src.services.embedding.adapters.openai_compatible:Successfully generated 1 embeddings (model: text-embedding-3-large, dimensions: 3072)
[Backend] [EmbeddingClient] DEBUG: Generated 1 embeddings using openai
[Backend] [Main] DEBUG: Pre-computed query embedding for all vector operations
[Backend] [Main] INFO: Query nodes: AutoAgent (top_k:40, cosine:0.2)
[Backend] INFO:src.services.embedding.adapters.openai_compatible:Successfully generated 1 embeddings (model: text-embedding-3-large, dimensions: 3072)
[Backend] [EmbeddingClient] DEBUG: Generated 1 embeddings using openai
[Backend] [Main] INFO: Local query: 40 entites, 184 relations
[Backend] [Main] INFO: Query edges: AutoAgent, Automated agents, AI agent technology (top_k:40, cosine:0.2)
[Backend] INFO:src.services.embedding.adapters.openai_compatible:Successfully generated 1 embeddings (model: text-embedding-3-large, dimensions: 3072)
[Backend] [EmbeddingClient] DEBUG: Generated 1 embeddings using openai
[Backend] [Main] INFO: Global query: 41 entites, 40 relations
[Backend] [Main] INFO: Raw search results: 63 entities, 184 relations, 0 vector chunks
[Backend] [Main] DEBUG: Before truncation: 63 entities, 184 relations
[Backend] [Main] INFO: After truncation: 63 entities, 173 relations
[Backend] [Main] DEBUG: Finding text chunks from 63 entities
[Backend] [Main] DEBUG: Vector similarity chunk selection: num_of_chunks=157, entity_info_count=63
[Backend] [Main] DEBUG: Vector similarity chunk selection: 79 unique chunk IDs collected
[Backend] [Main] DEBUG: Using pre-computed query embedding for vector similarity chunk selection
[Backend] [Main] DEBUG: Vector similarity chunk selection: 79 chunk vectors Retrieved
[Backend] [Main] DEBUG: Vector similarity chunk selection: 79 chunks from 79 candidates
[Backend] [Main] INFO: Selecting 79 from 79 entity-related chunks by vector similarity
[Backend] [Main] DEBUG: Finding text chunks from 173 relations
[Backend] [Main] INFO: Find 38 additional chunks in 36 relations (deduplicated 10)
[Backend] [Main] DEBUG: Vector similarity chunk selection: num_of_chunks=90, entity_info_count=36
[Backend] [Main] DEBUG: Vector similarity chunk selection: 38 unique chunk IDs collected
[Backend] [Main] DEBUG: Using pre-computed query embedding for vector similarity chunk selection
[Backend] [Main] DEBUG: Vector similarity chunk selection: 38 chunk vectors Retrieved
[Backend] [Main] DEBUG: Vector similarity chunk selection: 38 chunks from 38 candidates
[Backend] [Main] INFO: Selecting 38 from 38 relation-related chunks by vector similarity
[Backend] [Main] DEBUG: KG related chunks: 20 from entitys, 38 from relations
[Backend] [Main] INFO: Round-robin merged chunks: 20 -> 20 (deduplicated 0)
[Backend] [Main] DEBUG: Token allocation - Total: 30000, SysPrompt: 560, Query: 8, KG: 13848, Buffer: 200, Available for chunks: 15384
[Backend] [Main] WARNING: Rerank is enabled but no rerank model is configured. Please set up a rerank model or set enable_rerank=False in query parameters.
[Backend] [Main] DEBUG: Kept chunk_top-k: 20 chunks (deduplicated original: 20)
[Backend] [Main] DEBUG: Token truncation: 11 chunks from 20 (chunk available tokens: 15384, source: hybrid)
[Backend] [Main] INFO: Final context: 63 entities, 173 relations, 11 chunks
[Backend] [Main] INFO: Final chunks S+F/O: E14/2 E1/19 E2/22 E5/26 E1/32 E1/35 E2/38 E1/39 E1/41 E8/47 E2/48
[Backend] [Main] DEBUG: [_build_context_str] Converting to user format: 63 entities, 173 relations, 11 chunks
[Backend] [Main] DEBUG: [convert_to_user_format] Formatted 11/11 chunks
[Backend] [Main] DEBUG: [_build_context_str] Final data after conversion: 0 entities, 0 relationships, 0 chunks
[Backend] [Main] DEBUG: [_build_query_context] Context length: 142338
[Backend] [Main] DEBUG: [_build_query_context] Raw data entities: 63, relationships: 173, chunks: 11
[Backend] [Main] DEBUG: [kg_query] Sending to LLM: 28,518 tokens (Query: 8, System: 28510)
[Backend] [Main] DEBUG: Flattened cache hit(key:hybrid:query:8379bcf5f8b06912e43a515a315b777f)
[Backend] [Main] INFO: == LLM cache == Query cache hit, using cached response as query result
[Backend] [Chat.chat_agent] INFO: RAG retrieved 296 chars
[Backend] [Chat.chat_agent] DEBUG: LLM Input [chat_agent:chat_stream] system=225chars, user=18chars
[Backend] [Chat.chat_agent] DEBUG: LLM Output [chat_agent:chat_stream] response=203chars
[Backend] [ChatAPI] INFO: Chat completed: session=chat_1770560814697_5c6d5d33, 1141 chars
[Backend] INFO: connection closed
Additional Information
- DeepTutor Version: 0.6.0
- Operating System: Ubuntu 22.04
- Python Version: 3.10
- Node.js Version:
- Browser (if applicable): google chrome
- Related Issues: