Progress: core runtime refactor checkpoints by shuxueshuxue · Pull Request #192 · OpenDCAI/Mycel

shuxueshuxue · 2026-04-03T05:58:26Z

Summary

- add a minimal QueryLoop.aget_state/aupdate_state bridge for backend/web callers after the reopened ql-06 regression
- cover both live caller shapes: resumed-thread __start__ appends and RemoveMessage-based repair updates
- add backend-facing regression tests for _repair_incomplete_tool_calls() and get_thread_history() so the caller contract stays locked

Test Plan

- uv run pytest tests/unit/test_loop.py tests/test_query_loop_backend_bridge.py -q
- hostile re-review on backend :8010 reported the original caller-surface blocker no longer reproduces

- Add docs/architecture/ with 11 deep-dive docs covering CC patterns: query loop, tool execution, state/agents, security/permissions, API/prompt infra, PowerShell, plugins, settings/platform, compaction pipeline (4-layer, SM-Compact, Legacy Compact details)
- Add cc-patterns.md master blueprint with LangChain mapping, implementation priority roadmap (Phase 1-5), and PARTIAL gap registry
- Refactor core agent modules: chat_tool_service, delivery, service, agent runtime, registry, filesystem/search/wechat tool services
- Add core/runtime/prompts.py

- Phase 1: slim system prompt: move tool usage guidance to descriptions, keep only sub-agent type routing in system prompt
- Phase 2: rewrite all tool descriptions to convey non-intuitive boundary conditions (Read/Write/Edit/Glob/Grep/Bash/Agent/WebSearch/WebFetch/TaskOutput/TaskStop/TaskCreate/tool_search/load_skill)
- Phase 3: add pages param to Read schema; add line_numbers param to Grep schema and handler; add subagent_type enum to Agent schema
- Phase 4: mark WebSearch/WebFetch/tool_search/load_skill/TaskGet/TaskList/wechat_contacts as is_concurrency_safe + is_read_only
- Phase 5: sub-agent tool filtering: AGENT_DISALLOWED/EXPLORE_ALLOWED/PLAN_ALLOWED/BASH_ALLOWED constants; LeonAgent accepts extra_blocked_tools and allowed_tools; _run_agent applies per-type filters
- Phase 6: add LSP placeholder to tool_catalog (deferred, default=False)
- Extras: search_hint for Agent/TaskOutput/TaskStop/chat tools/wechat_send; TaskOutput marked is_read_only; Edit description adds .ipynb workaround; fix prompt caching to place cache_control on system_message content block; add forkContext parent message inheritance with _filter_fork_messages; expose set_current_messages ContextVar for sub-agent context passing

- Add --max-columns 500 to suppress minified/base64 output
- Add missing VCS excludes: .svn, .hg, .bzr, .jj, .sl
- Default head_limit 250 (matches CC's undocumented cap)
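As a rough illustration of the search changes listed above, the following sketch builds a ripgrep argument list with the column cap and the extra VCS excludes, and truncates output to the default head limit. The function and constant names are hypothetical and are not taken from the PR; only `--max-columns`, `--glob`, the directory names, and the value 250 come from the commit message.

```python
# Hypothetical sketch; names are illustrative, not the project's actual code.
# VCS directories named in the commit message (the pre-existing exclude list is not shown there).
EXTRA_VCS_EXCLUDES = [".svn", ".hg", ".bzr", ".jj", ".sl"]
DEFAULT_HEAD_LIMIT = 250  # default stated in the commit message


def build_rg_args(pattern: str, path: str) -> list[str]:
    """Build an rg command line that caps line width and skips VCS directories."""
    args = ["rg", "--max-columns", "500"]
    for directory in EXTRA_VCS_EXCLUDES:
        # A leading "!" in a ripgrep glob turns it into an exclude.
        args += ["--glob", f"!{directory}"]
    # "--" keeps a pattern that starts with "-" from being parsed as a flag.
    args += ["--", pattern, path]
    return args


def apply_head_limit(lines: list[str], head_limit: int = DEFAULT_HEAD_LIMIT) -> list[str]:
    """Keep only the first head_limit result lines."""
    return lines[:head_limit]
```

Building the argument list as a Python list (rather than a shell string) avoids quoting problems when the pattern contains spaces or shell metacharacters.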

Registers a DEFERRED LSP tool providing code intelligence: goToDefinition, findReferences, hover, documentSymbol, workspaceSymbol.
- _LSPSession: holds multilspy LanguageServer alive in a background asyncio task using start_server() context manager + Event-based lifecycle control
- LSPService: lazy per-language session pool, auto-detects language from file extension, converts absolute paths to workspace-relative
- Integrated into LeonAgent._init_services() with CleanupRegistry at priority 1
- Optional dep: pip install multilspy (or leonai[lsp])
- Supported: python, typescript, javascript, go, rust, java, ruby, kotlin, csharp
- Language servers auto-downloaded on first use per multilspy design

- multilspy moved from optional to core dependencies (avoid restart cost)
- Add 10 MB file size limit (matches CC LSP spec)
- Add gitignore filtering on returned locations via git check-ignore, batched in groups of 50 (matches CC batch size)
- Remove multilspy availability check from handler (always available now)
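The batched gitignore filtering described above could look roughly like the sketch below. It is an assumption-laden illustration, not the PR's code: the helper names are hypothetical, and it relies only on the documented behavior of `git check-ignore` (exit code 0 when at least one path is ignored, 1 when none are, 128 on a fatal error).

```python
import subprocess
from typing import Iterator

BATCH_SIZE = 50  # batch size stated in the commit message


def chunked(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def ignored_paths(paths: list[str], cwd: str) -> set[str]:
    """Return the subset of paths that git reports as ignored (hypothetical helper)."""
    ignored: set[str] = set()
    for batch in chunked(paths, BATCH_SIZE):
        proc = subprocess.run(
            ["git", "check-ignore", "--", *batch],
            cwd=cwd, capture_output=True, text=True, check=False,
        )
        # Exit 0: at least one path ignored; 1: none ignored; 128: fatal error.
        if proc.returncode == 0:
            # Paths with unusual characters may be quoted in the output unless -z is used.
            ignored.update(line for line in proc.stdout.splitlines() if line)
    return ignored


def filter_locations(paths: list[str], cwd: str) -> list[str]:
    """Drop ignored paths while preserving the original order."""
    ignored = ignored_paths(paths, cwd)
    return [p for p in paths if p not in ignored]
```

Batching bounds the command-line length per invocation while still avoiding one subprocess per returned location.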

Adds 4 missing LSP operations via multilspy internal API:
- goToImplementation (textDocument/implementation)
- prepareCallHierarchy (textDocument/prepareCallHierarchy)
- incomingCalls (callHierarchy/incomingCalls)
- outgoingCalls (callHierarchy/outgoingCalls)

Total supported operations: 9 (matches CC LSP tool surface). incomingCalls/outgoingCalls take the 'item' output from prepareCallHierarchy. Language auto-detected from item.uri for call hierarchy ops.

- _fmt_symbol: handle both SymbolInformation (workspaceSymbol, has location.uri) and DocumentSymbol (documentSymbol, has top-level range/selectionRange)
- request_definition/references/hover/document_symbols: catch AssertionError from multilspy when server returns None (maps to empty result / no hover)
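To make the two response shapes concrete, here is a minimal sketch of a formatter that accepts either one. The function name, the `default_uri` parameter, and the output fields are assumptions for illustration; the PR's actual `_fmt_symbol` is not shown in this thread. The field names `location`, `range`, and `selectionRange` follow the LSP specification.

```python
from typing import Any


def fmt_symbol_sketch(sym: dict[str, Any], default_uri: str = "") -> dict[str, Any]:
    """Normalize a SymbolInformation or DocumentSymbol into one flat record."""
    if "location" in sym:
        # SymbolInformation (workspaceSymbol): position is nested under location.
        uri = sym["location"]["uri"]
        start = sym["location"]["range"]["start"]
    else:
        # DocumentSymbol (documentSymbol): no uri; ranges sit at the top level.
        uri = default_uri
        start = sym["selectionRange"]["start"]
    return {
        "name": sym["name"],
        "kind": sym.get("kind"),
        "uri": uri,
        # LSP positions are zero-based; report one-based for readability.
        "line": start["line"] + 1,
        "character": start["character"] + 1,
    }
```

Because a DocumentSymbol carries no URI of its own, the caller has to supply the URI of the document it queried.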

…langserver

Python's Jedi server doesn't support goToImplementation or call hierarchy. Add _PyrightSession, a minimal asyncio LSP client over stdio, that talks to pyright-langserver (bundled with `pip install pyright`, already a core dep).

Changes:
- _PyrightSession: JSON-RPC/Content-Length stdio client, initialize handshake, textDocument/didOpen, callHierarchy/{incomingCalls,outgoingCalls}, textDocument/{implementation,prepareCallHierarchy}
- Acks server-to-client requests (window/workDoneProgress/create etc.)
- Keeps files open for session lifetime (required for call hierarchy)
- LSPService routes Python advanced ops to pyright, other languages to multilspy
- Fix _fmt_symbol: handle both SymbolInformation (workspaceSymbol) and DocumentSymbol (documentSymbol) response formats
- Fix AssertionError from multilspy null responses → empty result
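The JSON-RPC/Content-Length framing mentioned above is the LSP base protocol: each message is a header block ending in a blank line, followed by a UTF-8 JSON body of exactly `Content-Length` bytes. The sketch below shows that framing in a synchronous form for clarity; the function names are hypothetical and this is not the `_PyrightSession` implementation, which the commit describes as asyncio-based.

```python
import io
import json
from typing import Any, BinaryIO, Optional


def encode_message(payload: dict[str, Any]) -> bytes:
    """Frame a JSON-RPC payload with an LSP Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Content-Length counts bytes of the encoded body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def read_message(stream: BinaryIO) -> Optional[dict[str, Any]]:
    """Read one framed message; return None on a clean end of stream."""
    content_length: Optional[int] = None
    while True:
        line = stream.readline()
        if not line:
            return None  # EOF before a complete header block
        if line in (b"\r\n", b"\n"):
            break  # blank line terminates the headers
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            content_length = int(value.strip())
    if content_length is None:
        raise ValueError("missing Content-Length header")
    body = stream.read(content_length)
    return json.loads(body.decode("utf-8"))


# Usage: round-trip a request through an in-memory stream.
buffer = io.BytesIO(encode_message({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}))
first = read_message(buffer)
```

A real stdio client would also need to handle short reads from a pipe and reply to server-to-client requests, as the change list above notes.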

- pyproject.toml: add core.tools.lsp to packages list (it was missing, which would cause the lsp tool to be absent after pip install leonai)
- pyproject.toml: add pyright>=1.1.0 as core dep (required by _PyrightSession)
- lsp/service.py: remove unused _wait_for_idle, _active_progress, _idle_event, _progress_started from _PyrightSession (pyright doesn't send $/progress)
- plan-tool-alignment.md: replace Phase 6 placeholder with actual implementation summary (9 operations, dual-backend architecture, deps)

Language servers (multilspy + pyright) now live in a module-level _LSPSessionPool instead of per-LSPService instances. Sessions are keyed by (language, workspace_root), start lazily on first use, and survive agent restarts. Cleanup moved from CleanupRegistry to the backend lifespan finally block via `await lsp_pool.close_all()`.
- Add _LSPSessionPool with asyncio.Task-based dedup for concurrent starts
- Simplify LSPService to delegate all session management to lsp_pool
- Remove _cleanup_lsp_service from LeonAgent and CleanupRegistry
- Add lsp_pool.close_all() to backend/web/core/lifespan.py shutdown

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
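One way to get the Task-based dedup described above is to store the in-flight start task per key, so concurrent callers await the same task instead of starting a second server. The sketch below assumes a hypothetical `SessionPoolSketch` class, an injected `factory` coroutine, and sessions exposing an async `close()`; none of these names come from the PR apart from the `(language, workspace_root)` key and `close_all()`.

```python
import asyncio
from typing import Any, Callable, Coroutine

SessionKey = tuple[str, str]  # (language, workspace_root), as in the commit message


class SessionPoolSketch:
    """Hypothetical lazy pool: at most one start task per key."""

    def __init__(self, factory: Callable[[str, str], Coroutine[Any, Any, Any]]) -> None:
        self._factory = factory
        self._tasks: dict[SessionKey, asyncio.Task] = {}

    async def get(self, language: str, workspace_root: str) -> Any:
        key = (language, workspace_root)
        task = self._tasks.get(key)
        if task is None:
            # First caller creates the start task; later callers reuse it.
            task = asyncio.create_task(self._factory(language, workspace_root))
            self._tasks[key] = task
        try:
            # shield: cancelling one waiter must not cancel the shared start.
            return await asyncio.shield(task)
        except Exception:
            # Drop a failed start so a later call can retry.
            if task.done() and self._tasks.get(key) is task:
                del self._tasks[key]
            raise

    async def close_all(self) -> None:
        tasks, self._tasks = self._tasks, {}
        for task in tasks.values():
            if not task.done():
                task.cancel()  # start still in flight: abandon it
            elif not task.cancelled() and task.exception() is None:
                close = getattr(task.result(), "close", None)
                if close is not None:
                    await close()
```

Keeping the task (rather than the finished session) in the dictionary is what closes the race: the entry exists from the moment the first caller arrives, before the server has finished starting.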

shuxueshuxue · 2026-04-05T23:14:43Z

Latest closure update on current HEAD 90415ffa.

New slices landed after the previous checkpoint wave:

847f1ae5 Remove debug backdoors and fix path schemas
- deleted backend /api/debug/log
- removed frontend console.log interception + window.__debugEntries
- removed unconditional command-service debug print(...)
- simplified use-thread-stream.ts manager lifetime wiring
- fixed the real Windows CI regression from the new filesystem absolute-path schema by accepting both POSIX and Windows drive-absolute paths
90415ffa Fix ask-user question prompt identity
- AskUserQuestion answer state is now keyed by prompt position instead of rendered text, so duplicate question text no longer collides
- frontend title/message no longer present AskUserQuestion as a generic permission gate
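For the path-schema fix in 847f1ae5, a check that accepts both POSIX absolute paths and Windows drive-absolute paths can be sketched with `pathlib`'s pure path classes, which parse either flavor regardless of the host OS. The function name is hypothetical, and the choice to reject UNC and drive-relative paths is an assumption; the PR's actual schema validator is not shown here.

```python
from pathlib import PurePosixPath, PureWindowsPath


def is_absolute_path(value: str) -> bool:
    """Accept POSIX absolute paths and Windows drive-absolute paths (sketch)."""
    if PurePosixPath(value).is_absolute():
        return True
    win = PureWindowsPath(value)
    # Require a drive letter such as "C:" plus a root; this rejects
    # drive-relative paths like "C:file.txt" and UNC shares.
    has_drive_letter = len(win.drive) == 2 and win.drive[1] == ":"
    return has_drive_letter and win.is_absolute()
```

Using the pure classes keeps the result identical on Linux and Windows CI runners, which matters when the same schema is validated on both.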

Current verification on the integrated branch:

local Python workflow-equivalent pack
- uv run pytest tests/ --ignore=tests/test_e2e_providers.py --ignore=tests/test_sandbox_e2e.py --ignore=tests/test_daytona_e2e.py --ignore=tests/test_e2e_backend_api.py --ignore=tests/test_e2e_summary_persistence.py --ignore=tests/test_p3_e2e.py --maxfail=5 --timeout=60 -q
- 1014 passed, 44 skipped
targeted frontend proof
- vitest focused packs green
- touched-file eslint green
- cd frontend/app && npm run build green
targeted backend proof
- uv run pytest -q tests/Unit/core/test_agent_service.py -k ask_user_question tests/Integration/test_threads_router.py -k ask_user_question
- 4 passed
touched filesystem-service pyright
- 0 errors
fresh manual brutal probes still green:
- local thread m_50tMO7PmFp7f-56 -> runtime=idle, /tasks=[completed], /tmp/leon-nu56/local-agent/done.txt = NU56_LOCAL_AGENT_1775429554
- Daytona thread m_x6b9LVBMNj1l-70 -> runtime=idle, /tasks=[completed], final assistant token NU56_DAYTONA_AGENT_1775429554

Fresh GitHub / staging proof on this exact head:

CI run 24012576961 -> success
Deploy Staging run 24012576126 -> success
live staging containers are now on image tag 90415ffa64addbd8a639ab3f7d50c8ec342318ac
black-box staging check after deploy is green:
- POST /api/auth/login -> 200
- GET /api/threads/m_x6b9LVBMNj1l-21/runtime -> 200

At this point the remaining merge blocker is not CI/runtime correctness anymore; it is the required approving review gate.

shuxueshuxue · 2026-04-05T23:16:46Z

@nmhjklnm latest head 90415ffa is green on CI + staging and the remaining blocker is the required approval gate. Could you take a final pass when convenient?

shuxueshuxue · 2026-04-06T03:26:47Z

Latest closure delta on cb9262b8:

- fixed AskUserQuestion stale pending lifecycle without keeping debug instrumentation
- fixed Windows pricing cache/bundled models seam by making cache + bundled JSON reads/writes explicit UTF-8
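The UTF-8 fix matters because Python's text-mode file I/O falls back to the locale encoding when none is given, which on Windows is often a legacy code page rather than UTF-8. A minimal sketch of explicit-encoding JSON helpers follows; the function names are hypothetical and not taken from the PR.

```python
import json
from pathlib import Path
from typing import Any


def write_json_utf8(path: Path, payload: dict[str, Any]) -> None:
    """Write JSON with an explicit encoding instead of the platform default."""
    # ensure_ascii=False keeps non-ASCII text readable in the file,
    # which is only safe because the encoding is pinned to UTF-8.
    path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")


def read_json_utf8(path: Path) -> dict[str, Any]:
    """Read JSON that was written as UTF-8, regardless of the host locale."""
    return json.loads(path.read_text(encoding="utf-8"))
```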

Fresh proof:

- local real Playwright CLI on thread m_50tMO7PmFp7f-64: rendered the AskUserQuestion card, clicked Alpha, submitted, backend /permissions cleared to [], backend detail ended with PLAYWRIGHT_ASK_OKAlpha, and the same page later rendered that terminal text too
- focused local tests green
- CI run 24017380550 green, including Unit Tests (windows-latest)
- Deploy Staging run 24017379784 green

Checkpoint memory updated under nu-45 + nu-57.

shuxueshuxue · 2026-04-06T03:36:47Z

Latest live staging caller proof on current head cb9262b8:

- real Playwright CLI on staging thread m_50tMO7PmFp7f-65
- rendered the real AskUserQuestion card 回答问题 ("Answer the question")
- clicked Alpha, submitted
- staging backend /api/threads/m_50tMO7PmFp7f-65/permissions then returned requests=[]
- staging thread detail later ended with exact assistant text STAGING_PLAYWRIGHT_ASK_OK Alpha

So the AskUserQuestion caller proof is now green on both local and live staging, not only on local dev.

shuxueshuxue · 2026-04-06T04:03:23Z

superseded by #206

nmhjklnm and others added 30 commits April 1, 2026 22:48

feat(state): add three-layer state models

06d4277

feat(cleanup): add CleanupRegistry with priority ordering

7ee412e

feat(registry): add context_schema to ToolEntry

87931a9

feat(loop): implement QueryLoop replacing create_agent

4e2e25f

feat(fork): add context fork for sub-agents

b0b74a4

refactor(agent): replace create_agent with QueryLoop

e27aeb8

feat(agent-service): use context fork for sub-agent spawn

3b962d4

fix(compactor): align with CC L4b Legacy Compact design

d289d86

test: add unit tests for state/cleanup/fork/loop

914cd3d

test: add integration test for LeonAgent astream

c0d5362

fix(search): align Grep/Glob with CC ripgrep behavior

5c001d7

Refactor agent core through sa-04 subagent boundaries

96b6ca8

Refine subagent policy through sa-05

7aaf990

Refine sa-06 orchestration mailbox cleanup

bdb0628

Refine pt-02 tool system aggregate semantics

decd8c0

Refine pt-03 three-layer state rollup semantics

38d7451

Refine pt-04 subagent orchestration context sourcing

6f647fa

Refine pt-05 lifecycle cleanup semantics

a2f4f55

Refine pt-06 hook fan-out and prompt caching

2dec577

Tighten pt-08 framework-credit wording

03c9d3b

Refine api-01 retry and overflow recovery

c2c27d4

shuxueshuxue added 9 commits April 6, 2026 04:40

Auto-deploy staging on branch pushes

80bb966

Handle push refs in staging deploy

4327e8d

Add AskUserQuestion core interaction flow

1c4870b

Add MCP instruction delta middleware

1ebcc94

Add function-result-clearing prompt contract

84ac3e0

Remove frontend sandbox pause resume controls

3466d6a

Stabilize agent pool sync contract

4f43040

Remove debug backdoors and fix path schemas

847f1ae

Fix ask-user question prompt identity

90415ff

shuxueshuxue requested a review from nmhjklnm April 5, 2026 23:15

shuxueshuxue mentioned this pull request Apr 6, 2026

Split user-visible Resources from global monitor overview #205

Open

shuxueshuxue added 8 commits April 6, 2026 08:14

Prefer visible lease threads in resource monitor

5a6630c

Keep raw monitor truth out of resource projection

1aef5b8

Refresh resource cache on local run start

b557594

Fix Windows local metrics test patching

c23fb16

Refresh stale resource snapshots on live drift

0385da0

Abort stale thread permission fetches

ec0b2a2

Retry staging health checks during deploy

dff431d

Ignore stale permission fetches after navigation

05d12b1

shuxueshuxue mentioned this pull request Apr 6, 2026

refactor(agent): align core agent & tool system with CC engineering patterns #188

Closed

4 tasks

shuxueshuxue added 2 commits April 6, 2026 10:16

Invalidate empty cached pricing payloads

984236e

Fix ask-user state clearing and Windows pricing cache

cb9262b

mintlify bot deployed to staging - docs April 6, 2026 04:03

shuxueshuxue closed this Apr 6, 2026

shuxueshuxue removed the deploy-staging label Apr 6, 2026

shuxueshuxue wants to merge 240 commits into main from pr188-agent-optimize

2 participants
