feat(plugins): add Cognee memory provider plugin #1

Closed
nik1t7n wants to merge 236 commits into main from cognee-memory-plugin

Conversation

@nik1t7n (Owner) commented May 11, 2026

What changed

  • New plugin: plugins/memory/cognee/, the Cognee memory provider, with 4 files (plugin.yaml, __init__.py, client.py, cli.py)
  • Tests: tests/plugins/memory/test_cognee_provider.py, 28 tests covering all tools, lifecycle, background flows, and CLI integration
  • Documentation: plugins/memory/cognee/README.md, with setup guide, configuration, env vars, and tips

Plugin files

| File | Purpose |
| --- | --- |
| plugin.yaml | Manifest: name, version, dependencies, lifecycle hooks |
| __init__.py | CogneeMemoryProvider(MemoryProvider): tool dispatch, background prefetch, sync_turn, on_session_end |
| client.py | Async Cognee wrapper with event loop management and env var injection |
| cli.py | hermes memory cognee subcommand |

Why

Cognee is an open-source memory control plane that stores facts as a knowledge graph — not just vectors. It extracts entities, builds associations between them, and enables graph-based associative recall that traditional vector-only stores cannot provide.

Hermes currently has 8 memory providers (mem0, honcho, supermemory, etc.), but none support graph-based memory. Adding Cognee gives users:

  1. Semantic + graph recall — find related memories through graph traversal, not just embedding similarity
  2. Automatic entity extraction — facts are decomposed into nodes and edges on storage
  3. Multiple backends — works with DeepSeek, Gemini, OpenAI, and any OpenAI-compatible LLM
  4. Background flows — async prefetch before each turn, async sync after each response
  5. Self-improvement — Cognee can bridge and enrich stored memories automatically

How to test

# Run Cognee-specific tests (28 tests)
python -m pytest tests/plugins/memory/test_cognee_provider.py -v

# Verify existing memory providers are unaffected
python -m pytest tests/plugins/memory/ -v --ignore=tests/plugins/memory/test_cognee_provider.py

Checklist

  • 28/28 new tests passing
  • Conventional commit: feat(plugins): add cognee memory provider plugin
  • All 4 plugin files in plugins/memory/cognee/
  • Tests in tests/plugins/memory/test_cognee_provider.py
  • README.md with setup guide
  • No existing tests broken

Signed-off-by: Nikita Nosov <20nik.nosov21@gmail.com>

Ofer LaOr and others added 30 commits May 9, 2026 00:57
Self-review follow-up: handlePauseResume read job.state directly while
the rest of the page goes through getJobState(), which falls back to
the enabled flag when state is null/undefined. With the backend
normalizer in this PR, state is always populated on the wire, so this
has no observable effect today — but using the helper keeps the page
consistent and resilient against older Hermes backends that don't run
the normalizer.
…alvage (NousResearch#22409)

Adds jhin.lee@unity3d.com → leehack so contributor_audit.py strict
mode passes when the salvage of NousResearch#22053 (telegram DM topic reply
fallback) lands on main.
The send path uses Hermes' reply-anchor fallback for DM topic lanes
(message_thread_id + reply_to_message_id), but send_chat_action only
accepts message_thread_id — Telegram's Bot API 10.0 rejects it for
these lanes. Without this short-circuit, every typing tick (~every 2s
during agent runs) makes a doomed API call that gets logged as a
'thread not found' debug warning. Skip the call entirely when the
metadata indicates a DM topic reply-fallback lane; the user-visible
behavior is unchanged (no typing indicator either way for these
lanes), but the logs stay clean.

Identified during salvage review of NousResearch#22053.
…ousResearch#22248 (NousResearch#22416)

When an auxiliary LLM provider (or an upstream proxy) returns a non-JSON
body with `Content-Type: application/json` — e.g. an HTML 502 page from a
misconfigured gateway — the OpenAI SDK's `response.json()` raises a raw
`json.JSONDecodeError` (or wraps it in `APIResponseValidationError` whose
message contains "expecting value"). Previously this fell through to the
unknown-error branch and entered a 60s cooldown without retrying on the
main model, dropping the middle conversation turns instead.

This change folds JSON-decode detection into the existing fast-path
fallback chain: detect by `isinstance(e, JSONDecodeError)` OR substring
match for "expecting value", retry once on the main model, and use a
shorter 30s cooldown when already on main (the body shape tends to flip
back to valid quickly when the upstream proxy recovers).
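
The detection rule described above can be sketched as a small predicate. This is an illustration only: the function name is hypothetical and the real check lives inside the compression fallback chain.

```python
import json


def is_json_decode_failure(exc: Exception) -> bool:
    """Hypothetical helper: True for a raw JSONDecodeError, or for a wrapper
    exception whose message still carries the decoder's "expecting value" text."""
    if isinstance(exc, json.JSONDecodeError):
        return True
    return "expecting value" in str(exc).lower()
```

Matching on the message text is deliberately loose so that SDK wrappers around the decode error are caught as well.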

The three duplicated fallback bodies (model-not-found, unknown-error,
JSON-decode) are consolidated into a single `_fallback_to_main_for_compression`
helper that handles the shared bookkeeping (record aux-model failure for
`/usage`-style callers, clear summary_model, clear cooldown).

Also adds three unit tests covering: raw `JSONDecodeError` retries on main,
substring-match for wrapped exceptions, and the 30s cooldown when already
on main.

Salvage of NousResearch#22248 by @0xharryriddle. Closes NousResearch#22244.

Co-authored-by: Harry Riddle <ntconguit@gmail.com>
…/user are set

OpenViking 0.3.x requires X-OpenViking-Account and X-OpenViking-User headers for ROOT API key requests to tenant-scoped APIs. Previously the `!="default"` guard skipped these headers when account/user were the literal string "default", causing INVALID_ARGUMENT errors.

Remove the `!="default"` guard so headers are sent whenever account/user are truthy. Empty strings are still correctly skipped since `""` is falsy.

Update tests to reflect the new behavior:
- test_viking_client_headers_send_tenant_when_default: asserts "default" headers ARE present
- test_viking_client_headers_send_tenant_when_empty_falls_back_to_default: asserts "default" headers ARE present from constructor fallback
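
A minimal sketch of the new guard, assuming a free-standing helper (the function name is hypothetical; only the two header names come from the description above):

```python
from typing import Dict, Optional


def tenant_headers(account: Optional[str], user: Optional[str]) -> Dict[str, str]:
    """Send tenant headers whenever the value is truthy, including "default"."""
    headers: Dict[str, str] = {}
    if account:  # "" and None are skipped; "default" is no longer special-cased
        headers["X-OpenViking-Account"] = account
    if user:
        headers["X-OpenViking-User"] = user
    return headers
```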

Based on NousResearch#21775 by @happy5318
…oups (NousResearch#22423)

Telegram forum supergroups address the General topic as
`message_thread_id="1"` on incoming updates, but the Bot API rejects
sends with `message_thread_id=1` ("Message thread not found"). The
gateway adapter has a `_message_thread_id_for_send` helper that maps
"1" to None for that reason; the standalone `_send_telegram` helper
used by the `send_message` tool never got the same mapping, so any
`send_message` call to a Topics-enabled group's General topic
(target shape `telegram:<chat_id>:1`) failed with "Message thread
not found."

Reuse the adapter's helper when available, with an explicit fallback
to the same mapping for environments where the adapter import path
fails (e.g. python-telegram-bot missing in this venv).
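
The mapping itself is small. A sketch of what the fallback might look like, with a hypothetical function name and assuming thread ids arrive as strings or ints:

```python
from typing import Optional, Union


def thread_id_for_send(thread_id: Union[str, int, None]) -> Optional[int]:
    """Map the forum General topic ("1") to None; pass other ids through as int."""
    if thread_id is None:
        return None
    text = str(thread_id).strip()
    if text in ("", "1"):
        return None  # General topic: the Bot API rejects message_thread_id=1 on send
    return int(text)
```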

Fixes NousResearch#22267
…search#22043)

SQLite's WAL mode requires shared-memory (mmap) coordination and fcntl
byte-range locks that don't reliably work on network filesystems. Upstream
documents this explicitly:
  https://www.sqlite.org/wal.html#sometimes_queries_return_sqlite_busy_in_wal_mode

On NFS / SMB / some FUSE mounts / WSL1, 'PRAGMA journal_mode=WAL' raises
'sqlite3.OperationalError: locking protocol' (SQLITE_PROTOCOL). Before
this change, every feature backed by state.db or kanban.db broke silently:
  - /resume, /title, /history, /branch returned 'Session database not
    available.' with no cause
  - gateway logged the init failure at DEBUG (invisible in errors.log)
  - kanban dispatcher crashed every 60s, driving the known migration race
    (duplicate column name: consecutive_failures, NousResearch#21708 / NousResearch#21374)

Changes:
  - hermes_state.apply_wal_with_fallback(): shared helper that tries WAL
    and falls back to DELETE on SQLITE_PROTOCOL-style errors with one
    WARNING explaining why
  - hermes_state.get_last_init_error() + format_session_db_unavailable():
    capture the init failure cause and surface it in user-facing strings
    (with an NFS/SMB pointer for 'locking protocol')
  - hermes_cli/kanban_db.connect(): use the shared helper
  - gateway/run.py: bump SessionDB init failure log DEBUG -> WARNING
    (matches cli.py's existing correct behavior)
  - cli.py (4 sites) + gateway/run.py (5 sites): replace bare
    'Session database not available.' with format_session_db_unavailable()
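
For reviewers unfamiliar with the pattern, here is a rough sketch of a WAL-with-fallback helper. It is not the code in hermes_state; the function name and the match on the 'locking protocol' message text are assumptions made for illustration.

```python
import logging
import sqlite3

logger = logging.getLogger(__name__)


def apply_wal_with_fallback_sketch(conn: sqlite3.Connection) -> str:
    """Try WAL; on a 'locking protocol' error fall back to DELETE journaling."""
    try:
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    except sqlite3.OperationalError as exc:
        if "locking protocol" not in str(exc).lower():
            raise  # unrelated failure: do not mask it
        logger.warning(
            "WAL unavailable (%s); falling back to journal_mode=DELETE. "
            "This usually means the database is on NFS/SMB or a similar mount.",
            exc,
        )
        mode = conn.execute("PRAGMA journal_mode=DELETE").fetchone()[0]
    return str(mode).lower()
```

The helper returns the journal mode actually in effect, so callers can log or assert on it.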

Tests: 12 new tests in tests/test_hermes_state_wal_fallback.py + 1 new
test in tests/hermes_cli/test_kanban_db.py. Existing suites (state,
kanban, gateway, cli) remain green for all tests unrelated to pre-existing
failures on main.

Evidence: real-world user on NFSv3 mount (172.26.224.200:d2dfac12/home,
local_lock=none) reporting 'Session database not available.' on /resume;
'locking protocol' appears in 4 distinct log entries across backup,
kanban, TUI, and CLI paths in the same session.

closes NousResearch#22032
Maps egitimviscara@gmail.com to GitHub login uzunkuyruk so that
contributor_audit.py recognizes their authored commits in upcoming
salvage PRs (e.g. NousResearch#21933 fix).
Recover delegate_task batch inputs when open-weight models emit tasks as a JSON-encoded array string, and return clear errors for malformed task lists.

Co-authored-by: Cursor <cursoragent@cursor.com>
When _coerce_json fails to parse a string as JSON or parses to the wrong
type, log a clear WARNING instead of silently returning the original
value. When coerce_tool_args wraps a bare string into a single-element
list AND the string looks like a JSON array (starts with '['), warn
that the model likely emitted a JSON-encoded string instead of a
native array.

This improves diagnostics for the open-weight model output drift
described in NousResearch#21933 (JSON-array-as-string), as well as any other tool
whose array-typed argument arrives stringified through
handle_function_call.
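
A simplified sketch of the coercion-plus-warning idea follows. It folds the two steps into one hypothetical function and is not the shared pipeline's actual code.

```python
import json
import logging

logger = logging.getLogger(__name__)


def coerce_array_arg(value):
    """Return a list for an array-typed tool argument, warning on stringified JSON."""
    if isinstance(value, list):
        return value
    if isinstance(value, str) and value.strip().startswith("["):
        try:
            parsed = json.loads(value)
        except json.JSONDecodeError:
            logger.warning("argument looks like a JSON array but did not parse: %r", value)
        else:
            if isinstance(parsed, list):
                logger.warning("model emitted a JSON-encoded string instead of a native array")
                return parsed
    # Bare scalar (or unparseable string): wrap into a single-element list.
    return [value]
```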

Note: delegate_task does NOT go through coerce_tool_args (it is in
_AGENT_LOOP_TOOLS and dispatched directly from run_agent.py with raw
function_args from json.loads). The actual delegate_task fix for NousResearch#21933
is the previous commit. These logging changes apply to all other
array-typed arguments coerced via the shared pipeline.

Salvaged from PR NousResearch#22092.
WebUI sessions construct AIAgent(platform="webui") but PLATFORM_HINTS
had no "webui" entry, so the agent received no platform hint at all.
The WebUI frontend supports rich MEDIA:/absolute/path previews for
images, audio, video, PDF, HTML, CSV, diffs, and Excalidraw, but
without a hint the agent either ignores MEDIA: or falls back to
Markdown image syntax which silently fails for local files.

Add a webui hint that documents the MEDIA: render path and warns
against ![alt](/path) for local files.

Fixes NousResearch#21883
`ToolCall.extra_content` was annotated `Optional[Dict[str, Any]]`,
but neither `Optional` nor `Dict` are imported at the top of
`agent/transports/types.py` — only `Any` is.  The rest of the file
consistently uses PEP 604 / 585 syntax (e.g. `str | None`,
`dict[str, Any] | None`).

The file has `from __future__ import annotations`, so the missing
names don't crash class definition.  But the annotation IS evaluated
when anything calls `typing.get_type_hints(ToolCall)` —
introspection raises `NameError: name 'Optional' is not defined`.

ruff catches it cleanly:

    F821 Undefined name `Optional`  agent/transports/types.py:65:32
    F821 Undefined name `Dict`      agent/transports/types.py:65:41

Switch the annotation to `dict[str, Any] | None` to match the
rest of the file's style.  No new imports needed.

Verified:
  - ruff F-checks now pass on the file
  - `typing.get_type_hints(ToolCall)` succeeds where it raised before
  - 166/166 tests in tests/agent/transports/ pass on Windows + Python 3.12
Adds five regression tests for the Format 3 (Cloud Run relay) envelope
path:

- test_relay_flat_honors_declared_sender_type_bot: BOT sender_type
  propagates to msg['sender']['type'].
- test_relay_flat_defaults_sender_type_human_when_absent: backward
  compat; a missing field still flows as HUMAN.
- test_relay_flat_coerces_unknown_sender_type_to_human: defensive
  coercion; strip+upper normalizes whitespace/case, and anything outside
  {HUMAN, BOT} falls back to HUMAN.
- test_relay_flat_bot_sender_is_filtered_end_to_end: end-to-end
  through _on_pubsub_message; a relay envelope with sender_type=BOT
  is dropped by the BOT self-filter without dispatch.
- test_relay_flat_human_sender_dispatches: end-to-end negative
  control; human relay envelopes still reach the agent loop.

Also clarifies the operator contract in the adapter comment: the
relay must forward upstream sender.type as envelope.sender_type,
otherwise bot replies forwarded as HUMAN cannot be distinguished
from genuine humans by this filter.
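
The coercion rule the tests pin down can be summarized in a few lines. The function name below is hypothetical; the strip+upper normalization and the HUMAN default are taken from the test descriptions.

```python
ALLOWED_SENDER_TYPES = {"HUMAN", "BOT"}


def normalize_sender_type(raw) -> str:
    """Normalize a relay envelope's sender_type, defaulting to HUMAN."""
    if not isinstance(raw, str):
        return "HUMAN"  # field absent or not a string
    value = raw.strip().upper()
    return value if value in ALLOWED_SENDER_TYPES else "HUMAN"
```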
Comments are injected into the next worker's system prompt by
build_worker_context() as '**{author}** (timestamp): {body}'. The
previous code accepted args['author'] as a free-form override and
exposed it on KANBAN_COMMENT_SCHEMA, which let a worker:

  1. Receive a prompt-injection in a malicious task body.
  2. Call kanban_comment with author='hermes-system' (or any other
     authoritative-looking name) on a sibling task.
  3. The next worker assigned to that sibling task sees the forged
     comment in its boot context as what reads like a system-authored
     directive.

Always derive author from HERMES_PROFILE (the dispatcher already sets
this per worker at hermes_cli/kanban_db.py:3718), and remove the
'author' property from the tool schema so the LLM can't see the
override surface.

Cross-task commenting itself remains unrestricted (see NousResearch#19713) —
comments are the deliberate handoff channel between tasks; only the
author-override surface is closed.
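
In sketch form, the handler now resolves the author like this. The function name and the "unknown" fallback for an unset HERMES_PROFILE are assumptions, not taken from the change itself.

```python
import os
from typing import Mapping


def resolve_comment_author(args: dict, environ: Mapping[str, str] = os.environ) -> str:
    """Derive the comment author from the worker's profile only."""
    # args.get("author") is intentionally never consulted: a caller-supplied
    # author would let a worker forge authoritative-looking comments.
    return environ.get("HERMES_PROFILE") or "unknown"
```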

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
- Renames test_comment_custom_author -> test_comment_ignores_caller_supplied_author
  and inverts its assertion: an args['author'] override is silently
  ignored; the author always comes from HERMES_PROFILE.
- Adds test_comment_schema_omits_author_override to assert the
  'author' property is gone from KANBAN_COMMENT_SCHEMA so the
  forgery surface stays closed if someone re-adds the schema field
  by accident.
- Adds test_worker_can_comment_on_foreign_task to pin the NousResearch#19713
  policy decision: cross-task commenting must remain unrestricted.
  Without this guard, a future change accidentally adding
  _enforce_worker_task_ownership to _handle_comment would close the
  documented handoff channel between tasks.
Maps zhekinmaksim@gmail.com to GitHub login Zhekinmaksim so
contributor_audit.py recognizes their authored commit in the
upcoming NousResearch#21930 salvage PR.
… contexts

Follow-up to PR NousResearch#21293 (cli.py), which fixed the same anti-pattern.
`asyncio.get_event_loop()` is documented as effectively "always returns
the running loop when called from a coroutine" and emits
DeprecationWarning/RuntimeWarning in some interpreter configurations.
The Python docs explicitly recommend get_running_loop() inside coroutines.

Replaces the remaining 9 call sites that are unconditionally inside
async def bodies:

- tools/browser_cdp_tool.py — _cdp_call() (4 sites): deadline + remaining
  computations inside the async websockets.connect context manager.
- hermes_cli/web_server.py — get_status, _start_device_code_flow,
  submit_oauth_code (3 sites): all FastAPI async endpoints offloading
  blocking httpx / PKCE work to run_in_executor.
- environments/agent_loop.py — HermesAgentLoop (1 site): tool dispatch
  inside the async rollout loop.
- environments/benchmarks/terminalbench_2/terminalbench2_env.py —
  rollout_and_score_eval (1 site): test verification thread offload.

All 9 sites are unconditionally inside async def bodies, so a running
loop is guaranteed and no try/except RuntimeError fallback is needed
(unlike the cli.py case in NousResearch#21293, which ran from a background thread).
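
The swapped idiom, shown on a toy coroutine (the function names here are illustrative, not from the touched files):

```python
import asyncio


def blocking_double(x: int) -> int:
    """Stand-in for blocking work such as an httpx call or test verification."""
    return x * 2


async def offload(x: int) -> int:
    # Inside a coroutine a running loop is guaranteed, so no fallback is needed.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_double, x)
```

Outside a coroutine, `asyncio.get_running_loop()` raises RuntimeError, which is why the background-thread case in cli.py needed its own handling.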

Behavior is identical on supported Python versions; aligns the codebase
with the post-NousResearch#21293 idiom and avoids future warnings as the deprecation
hardens.

Salvaged from PR NousResearch#21930 by @Zhekinmaksim onto current main (the
original branch was 109 commits behind and carried unintended
stale-branch reverts of unrelated landed changes — _tail_lines
encoding=utf-8 and the Windows PTY bridge guard). Only the 9 swaps
from the PR's intended scope are applied here.
Maps obafemiferanmi1999@gmail.com (the commit-author email used on
PR NousResearch#21473's branch) to GitHub login KvnGz (the PR/branch owner) so
contributor_audit.py recognizes the authored commit in the upcoming
salvage PR.
Three tests in tests/agent/test_auxiliary_config_bridge.py read
in-tree source files (gateway/run.py and cli.py) via
Path.read_text() with no encoding argument.  The default falls
back to the system locale, which on Western Windows installs is
cp1252, and the read fails as soon as the source contains any
byte that isn't valid cp1252 (e.g. an em-dash in a comment):

    UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f
    in position 41190: character maps to <undefined>

Linux CI doesn't catch this because the default Linux locale is
UTF-8.  Windows contributors hit it on every run of the test suite.

Pin encoding="utf-8" on the three call sites that read repo
source files.  This matches the existing precedent in
hermes_cli/doctor.py:363, where the same pattern (with an
explanatory comment) was applied to fix the .env read on
non-UTF-8 Windows locales.
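
A small illustration of the pinned read, using a hypothetical helper name:

```python
from pathlib import Path


def read_repo_source(path) -> str:
    """Read a source file as UTF-8 regardless of the system locale."""
    # Without encoding=, Path.read_text() uses the locale's preferred
    # encoding, which is cp1252 on many Windows installs.
    return Path(path).read_text(encoding="utf-8")
```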

Affected tests now pass on Windows + Python 3.12:
  - TestGatewayBridgeCodeParity.test_gateway_has_auxiliary_bridge
  - TestGatewayBridgeCodeParity.test_gateway_no_compression_env_bridge
  - TestCLIDefaultsHaveAuxiliaryKeys.test_cli_defaults_can_merge_auxiliary
Plugin platforms (IRC, Teams, Google Chat) currently fail with
`No live adapter for platform '<name>'` when a `deliver=<plugin>` cron
job runs in a separate process from the gateway, even though the
platforms are eligible cron targets via `cron_deliver_env_var` (added
in NousResearch#21306). Built-in platforms (Telegram, Discord, Slack, etc.) use
direct REST helpers in `tools/send_message_tool.py` so cron can deliver
without holding the gateway in the same process; plugin platforms
historically depended on `_gateway_runner_ref()` which returns `None`
out of process.

This change adds an optional `standalone_sender_fn` field to
`PlatformEntry` so plugins can register an ephemeral send path that
opens its own connection, sends, and closes without needing the live
adapter. The dispatch site in `_send_via_adapter` falls through to the
hook when the gateway runner is unavailable, with a descriptive error
when neither path applies. The hook is optional, so existing plugins
are unaffected.

Reference migrations land in the same change for IRC, Teams, and
Google Chat, exercising the hook across stdlib (asyncio + IRC protocol),
Bot Framework OAuth client_credentials, and Google service-account
flows respectively.

Security hardening on the new code paths:
* IRC: control-character stripping on chat_id and message body to
  block CRLF command injection; bounded nick-collision retries; JOIN
  before PRIVMSG so channels with the default `+n` mode accept the
  delivery.
* Teams: TEAMS_SERVICE_URL validated against an allowlist of known
  Bot Framework hosts (`smba.trafficmanager.net`,
  `smba.infra.gov.teams.microsoft.us`) to block SSRF; chat_id and
  tenant_id constrained to the documented Bot Framework character set;
  per-request timeouts so a slow STS endpoint cannot starve the
  activity POST.
* Google Chat: chat_id and thread_id validated against strict
  resource-name regexes; service-account refresh wrapped in
  `asyncio.wait_for` so a hung token endpoint cannot stall the
  scheduler.

Test coverage: 20 new tests covering happy path, missing-config errors,
network failure modes, and each defensive validation. Existing tests
unchanged. `bash scripts/run_tests.sh tests/tools/test_send_message_tool.py
tests/gateway/test_irc_adapter.py tests/gateway/test_teams.py
tests/gateway/test_google_chat.py` reports 341 passed, 0 regressions.

Documentation: new "Out-of-process cron delivery" section in
website/docs/developer-guide/adding-platform-adapters.md and an entry
in gateway/platforms/ADDING_A_PLATFORM.md naming the hook.
…lone-all

When the source profile is the default (~/.hermes), shutil.copytree()
was copying multi-GB infrastructure alongside the ~40 MB of actual
profile data: hermes-agent/ (repo checkout + 3 GB venv), .worktrees/,
profiles/ (sibling profiles — recursive!), bin/ (installed binaries),
node_modules/ (hundreds of MB).

Add _CLONE_ALL_DEFAULT_EXCLUDE_ROOT frozenset with these five entries
and pass an ignore callback to copytree().  Exclusions are gated on
the source actually being the default profile (is_default_source) so
named-profile sources are never affected.

Also exclude at any depth: __pycache__/, *.pyc, *.pyo, *.sock, *.tmp.
Profile data (config.yaml, .env, auth.json, state.db, sessions/,
skills/, logs/) is preserved intact — clone-all means 'complete
snapshot minus infrastructure'.
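
One way to express the two exclusion tiers as a copytree() ignore callback is sketched below. The constant and function names are illustrative rather than the ones in the change; the excluded entries are the ones listed above.

```python
import fnmatch
import shutil
from pathlib import Path

EXCLUDE_AT_ROOT = frozenset({"hermes-agent", ".worktrees", "profiles", "bin", "node_modules"})
EXCLUDE_ANY_DEPTH = ("__pycache__", "*.pyc", "*.pyo", "*.sock", "*.tmp")


def make_clone_ignore(source_root: Path, is_default_source: bool):
    """Build an ignore callback for shutil.copytree()."""
    root = source_root.resolve()

    def ignore(directory: str, names: list) -> set:
        skipped = {
            name for name in names
            if any(fnmatch.fnmatch(name, pattern) for pattern in EXCLUDE_ANY_DEPTH)
        }
        # Infrastructure directories are only dropped at the top level of the
        # default profile, never inside named profiles or nested directories.
        if is_default_source and Path(directory).resolve() == root:
            skipped |= EXCLUDE_AT_ROOT.intersection(names)
        return skipped

    return ignore
```

Usage would look like `shutil.copytree(src, dst, ignore=make_clone_ignore(src, True))`.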

Mirrors the approach already used by _default_export_ignore() and
_DEFAULT_EXPORT_EXCLUDE_ROOT (the export-side exclusion set which is
broader because it produces a portable archive, not a live clone).

Co-authored-by: MustafaKara7 <karamusti912@gmail.com>
Co-authored-by: fahdad <30740087+fahdad@users.noreply.github.com>
Fixes NousResearch#5022
Based on PRs NousResearch#5025, NousResearch#5026, and NousResearch#21728
teknium1 and others added 26 commits May 10, 2026 15:20
…esearch#23435)

xAI's Responses API returns HTTP 400 ("Model X does not support
parameter reasoningEffort") for grok-4, grok-4-0709, grok-4-fast-*,
grok-4-1-fast-*, grok-3, grok-4.20-0309-*, and grok-code-fast-1 — even
though those models reason natively. Hermes was unconditionally sending
`reasoning: {effort: 'medium'}` to xAI for every Grok model, breaking
direct `--provider xai` for the entire grok-4 line.

Add a substring allowlist predicate (verified live against api.x.ai
2026-05-10) covering the only Grok families that accept the effort dial:
grok-3-mini*, grok-4.20-multi-agent*, grok-4.3*. The Responses transport
omits the `reasoning` key entirely for everything else while still
including `reasoning.encrypted_content` so we capture native reasoning
tokens.
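
The allowlist predicate might look roughly like this. The names are hypothetical; the three families are the ones listed above.

```python
REASONING_EFFORT_FAMILIES = ("grok-3-mini", "grok-4.20-multi-agent", "grok-4.3")


def supports_reasoning_effort(model: str) -> bool:
    """True only for Grok families that accept the reasoning effort dial."""
    name = model.lower()
    return any(family in name for family in REASONING_EFFORT_FAMILIES)
```

An allowlist fails closed: an unrecognized Grok model simply gets no `reasoning` key rather than an HTTP 400.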

Verified end-to-end: `hermes chat -q hi --provider xai --model grok-4-0709`
went from HTTP 400 to a successful reply.
…ousResearch#23436)

* docs(user-stories): add 116 stories from Discord archive

Mined teknium1/nous-discord-archive for first-person user stories that match
the existing collage voice ('I run X every day', 'my family uses Hermes for
Y', 'so I built Z'). Skipped pure project pitches, Q&A, install help, and
generic announcements.

- Added 'discord' as a source in UserStoriesCollage (label + brand color)
- Added 116 entries to userStories.json (237 total, up from 121)
- Each entry links back to the discord-archive thread or channel archive file

* docs(user-stories): interleave discord stories across the full collage

Shuffle userStories.json with a fixed seed so the 116 Discord-sourced
entries are mixed evenly with the existing 121 entries instead of
appearing as a contiguous block at the end. Even distribution: 10-16
discord entries per decile across the array (ideal would be ~11).
Workers running slow models (e.g. kimi-k2.6) can spend longer than
DEFAULT_CLAIM_TTL_SECONDS inside a single tool-free LLM call, making
no tool calls and therefore not heartbeating. release_stale_claims
previously reclaimed these healthy workers, producing the
spawn-then-immediately-reclaim loop reported in NousResearch#23025.

When a stale-by-TTL claim's host-local worker PID is still alive,
extend the claim (emit a claim_extended event) rather than killing
it. enforce_max_runtime / detect_crashed_workers remain the upper
bounds for genuinely wedged or dead workers. Reclaim events now also
record claim_expires, last_heartbeat_at, worker_pid, and host_local
so operators can see why a worker was killed.
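
The decision can be sketched as a pure function. Everything here (names, signature, the injected liveness check) is an assumption made for illustration; the real logic lives in release_stale_claims.

```python
from typing import Callable, Optional, Tuple


def resolve_stale_claim(
    claim_expires: float,
    now: float,
    ttl_seconds: float,
    host_local: bool,
    worker_pid: Optional[int],
    is_pid_alive: Callable[[int], bool],
) -> Tuple[str, float]:
    """Return (event, new_expiry) for a claim checked at time `now`."""
    if now < claim_expires:
        return ("claim_valid", claim_expires)
    if host_local and worker_pid is not None and is_pid_alive(worker_pid):
        # Stale by TTL but the worker process is alive: extend, do not reclaim.
        return ("claim_extended", now + ttl_seconds)
    return ("claim_reclaimed", claim_expires)
```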
Sub-issue 5 of NousResearch#22034.

Right-click on the composer always pasted from the clipboard, even when
the user had highlighted text — diverging from terminal-native behavior
(xterm/iTerm/gnome-terminal) where right-click copies an active selection
and only pastes when nothing is selected.

Extract a small pure helper, decideRightClickAction(value, range), and
route the existing onMouseDown right-click branch through it. Selection
present and non-empty -> writeClipboardText(slice). Otherwise fall back
to the existing emitPaste path.
…(carve-out of NousResearch#7404)

The auto-reset notice ("◐ Session automatically reset…") was being sent
with metadata=getattr(event, 'metadata', None), which can drop or
mis-route in Telegram forum topics: the event's metadata isn't
guaranteed to carry the originating thread_id, so the notice could leak
into General or another topic.

Use the existing self._thread_metadata_for_source(source) helper, which
already handles thread_id construction plus the Telegram DM topic
reply-fallback shape used everywhere else in the gateway.

Carve-out of NousResearch#7404. The PR's other hunk (line 7578, queued first
response) is already redundant on main — gateway/run.py:15782 has used
_status_thread_metadata since the _thread_metadata_for_source plumbing
landed.

Closes NousResearch#7355 (path B; paths A and C closed via prior salvage merges).
When kanban_complete rejects a created_cards list as hallucinated, the
task is intentionally left in-flight (the gate runs before the write
txn) so the worker can retry with a corrected list or pass
created_cards=[] to skip the check. The retry path already worked, but
the previous error wording read like a terminal failure and workers
were observed abandoning the run instead of trying again.

Spell out the recovery path explicitly in the tool_error response
("Your task is still in-flight ... Retry kanban_complete with ...") and
add regression coverage at both the kernel and tool layers so the
retry contract — and the wording the worker depends on to discover
it — is pinned.

Fixes NousResearch#22923
…sResearch#23454)

Slash commands (/clear, /new, /undo, /reload-mcp) are dispatched from the
process_loop daemon thread.  prompt_toolkit.run_in_terminal returns a
coroutine that only the main-thread event loop can drive, so calling it
from a daemon thread orphans the coroutine — the input prompt never
renders and user keystrokes leak into the composer instead of the
confirmation prompt (issue NousResearch#23185).

Mirror the thread-aware guard already in _run_curses_picker: when off the
main thread, fall back to a direct input() call.  Also wrap
run_in_terminal in try/except so WSL / Warp / other emulators that
silently drop the scheduled coroutine fall back to input() too.

Tests: tests/cli/test_prompt_text_input_thread_safety.py covers main
thread (run_in_terminal path), daemon thread (direct input fallback),
no-app, run_in_terminal-raises, and EOF handling.
The stream consumer measured message length using Python's len() (Unicode
code points), but Telegram's actual limit is in UTF-16 code units. This
caused messages containing characters outside the Basic Multilingual Plane
(emoji, rare CJK extension ideographs, etc.), which take two UTF-16 units
each, to exceed Telegram's 4096-unit limit, resulting in truncated messages
with formatting artifacts.

Changes:
- Add message_len_fn property to BasePlatformAdapter (defaults to len)
- Override in TelegramAdapter to return utf16_len
- Stream consumer uses adapter.message_len_fn for:
  - safe_limit calculation
  - overflow detection
  - truncate_message calls
  - split point calculation (via _custom_unit_to_cp)
  - fallback final send chunking

Fixes truncated messages with black square artifacts on Telegram when
the model generates responses containing characters that need two UTF-16
code units (e.g. emoji).
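
For reference, counting UTF-16 code units in Python takes one line. This is a sketch of what a utf16_len helper could look like, not necessarily the adapter's implementation.

```python
def utf16_len(text: str) -> int:
    """Number of UTF-16 code units needed to encode `text`."""
    # utf-16-le emits no byte-order mark; every code unit is two bytes.
    return len(text.encode("utf-16-le")) // 2
```

A string of 2200 emoji has len() == 2200 but needs 4400 UTF-16 units, which is the gap the overflow check was missing.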
…esearch#11170

New TestUtf16OverflowDetection class covers two scenarios:
- test_emoji_text_exceeding_utf16_limit_triggers_overflow_split: feeds
  2200 emoji codepoints (4400 UTF-16 units) — under Telegram's
  codepoint-equivalent limit but over its UTF-16 limit. Asserts
  truncate_message was called with len_fn=utf16_len, confirming the
  consumer detected the overflow.
- test_codepoint_only_adapter_falls_back_to_len: documents that
  adapters which don't subclass BasePlatformAdapter (or test MagicMocks)
  fall back to plain len for backwards compat.

The contributor's PR shipped no tests for the UTF-16 path.
…3456)

* feat(goals): /goal checklist + /subgoal user controls

Two-phase judge for /goal — Phase A decomposes the goal into a detailed
checklist on first turn; Phase B evaluates each pending item harshly
against the agent's most recent response. The goal completes only when
every item is in a terminal status (completed or impossible). Adds
/subgoal so the user can append, complete, mark impossible, undo,
remove, or clear items the judge missed or got wrong.

Mechanics:
- GoalState gains `checklist` and `decomposed` fields, both backwards
  compatible (old state_meta rows load unchanged).
- Phase A: aux call writes a harsh, exhaustive checklist; biased toward
  more items not fewer. Falls through to legacy freeform judge when
  decompose fails.
- Phase B: judge gets the checklist + last-response snippet + path to
  a per-session conversation dump at <HERMES_HOME>/goals/<sid>.json.
  A bounded read_file tool (max 5 calls per turn, restricted to that
  one file) lets the judge inspect history when the snippet is
  ambiguous. Stickiness in code: terminal items are frozen, only the
  user can revert via /subgoal undo.
- Continuation prompt shows checklist progress when non-empty;
  reverts to old prompt when empty.
- Status line shows M/N done counts.

CLI + gateway + TUI gateway all pass the agent reference into
evaluate_after_turn so the dump can be written. Gateway-side
/subgoal is allowed mid-run since it only modifies the checklist
the judge consults at turn boundaries.

Tests: 24 new cases — backcompat round-trip, Phase A decompose,
Phase B updates + new_items + stickiness, user override flows,
conversation dump (incl. unsafe-sid sanitization), judge read_file
restriction. Existing freeform-mode tests updated to patch the
renamed `judge_goal_freeform` and skip Phase A explicitly.

* fix(goals): off-by-one in judge index, message-list plumbing, prompt tuning

Three live-test findings from running /goal end-to-end against
gemini-3-flash-preview as the judge:

1. Off-by-one bug — the judge sees the checklist rendered with 1-based
   indices ('1. [ ] foo, 2. [ ] bar') but the apply layer indexed
   state.checklist as 0-based. Result: every judge update landed on
   the wrong item, evidence got attached to neighbouring rows, and
   the genuine 'first pending' item (usually #1) never got marked.
   Fix: convert the judge's 1-based index to 0-based in
   _parse_evaluate_response. Also tightened the user prompt to call out
   the 1-based scheme explicitly. New tests
   cover the parser conversion + an end-to-end fake-judge round-trip.

2. Conversation dump never happened — _extract_agent_messages tried
   common AIAgent attribute names (.messages, .conversation_history,
   etc.) but AIAgent doesn't expose the message list as an instance
   attribute; it lives inside run_conversation()'s scope. Result: the
   judge's read_file tool always saw history_path=unavailable. Fix:
   added an explicit messages= kwarg to evaluate_after_turn that all
   three call sites (CLI, gateway, TUI gateway) now pass directly.
   Agent-attribute extraction kept as back-compat fallback.

3. Prompt was too harsh on simple goals. The original 'be HARSH,
   default to leaving items pending' wording made the judge refuse
   to mark 'file exists' completed even after the agent ran ls,
   test -f, os.path.isfile, and find — burning the entire 8-turn
   budget on a fizzbuzz task. Softened to 'strict but not absurd'
   with explicit guidance on what counts as evidence and a directive
   not to require re-proving items already established earlier.
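
A minimal sketch of the index conversion from finding 1, under stated assumptions: the helper name `to_zero_based` and the update shape (`{"index": ..., "status": ...}`) are hypothetical, since the body of `_parse_evaluate_response` is not shown here. It only illustrates mapping the judge's 1-based numbering onto 0-based list positions and dropping out-of-range entries.

```python
from typing import Dict, List


def to_zero_based(updates: List[Dict], checklist_len: int) -> List[Dict]:
    """Convert judge-facing 1-based indices to 0-based list positions.

    Hypothetical helper: the update shape is assumed for illustration only.
    """
    converted = []
    for update in updates:
        idx = update["index"] - 1  # judge sees '1. [ ] foo'; the list is 0-based
        if 0 <= idx < checklist_len:  # drop indices that fall off either end
            converted.append({**update, "index": idx})
    return converted


checklist = ["create fizzbuzz.py", "run the script", "verify output"]
judge_updates = [
    {"index": 1, "status": "completed"},
    {"index": 4, "status": "completed"},  # out of range, ignored
]
applied = to_zero_based(judge_updates, len(checklist))
print(applied)
```

Doing the conversion once, at the parse boundary, keeps every later layer on a single indexing scheme.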

Re-tested live with the same fizzbuzz goal: now terminates in 2
turns with all 8 checklist items correctly attributed to their
own evidence. /subgoal user-action flow (add / complete / undo /
impossible) verified live as well.

Cherry-picked from PR NousResearch#10371. Two-layer defense for the spurious-thread_id
issue (NousResearch#3206):

1. _build_message_event filters DM thread_ids: only preserve thread_id
   for real topic messages (is_topic_message=True). Telegram puts
   message_thread_id on every DM that is a reply, but reply-chain ids
   route to nonexistent threads on send.

2. _send_message_with_thread_fallback helper: control sends
   (send_update_prompt, send_exec_approval / send_slash_confirm,
   send_model_picker) retry once without message_thread_id when
   Telegram returns BadRequest 'Message thread not found'. Mirrors
   the pattern PR NousResearch#3390 added for the streaming send path.
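
The retry-once behaviour in layer 2 could look roughly like the sketch below. It is not the actual `_send_message_with_thread_fallback`: the `BadRequest` class is a local stand-in for the Telegram library's error, and `send_with_thread_fallback`, `is_thread_not_found`, and `fake_send` are hypothetical names used only for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable


class BadRequest(Exception):
    """Local stand-in for the Telegram library's BadRequest error (assumption)."""


def is_thread_not_found(exc: Exception) -> bool:
    return isinstance(exc, BadRequest) and "thread not found" in str(exc).lower()


async def send_with_thread_fallback(
    send: Callable[..., Awaitable[Any]], **kwargs: Any
) -> Any:
    """Call send(**kwargs); on 'thread not found', retry once without the thread id."""
    try:
        return await send(**kwargs)
    except BadRequest as exc:
        if "message_thread_id" not in kwargs or not is_thread_not_found(exc):
            raise  # unrelated failure: surface it unchanged
        retry_kwargs = {k: v for k, v in kwargs.items() if k != "message_thread_id"}
        return await send(**retry_kwargs)


calls = []


async def fake_send(**kwargs: Any) -> str:
    """Fake sender that rejects any call carrying a thread id."""
    calls.append(dict(kwargs))
    if "message_thread_id" in kwargs:
        raise BadRequest("Message thread not found")
    return "sent"


result = asyncio.run(
    send_with_thread_fallback(fake_send, chat_id=1, text="hi", message_thread_id=42)
)
print(result, len(calls))
```

Catching only the specific error, rather than a bare `Exception`, keeps unrelated failures visible to the caller.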

Salvage notes:
- Conflict 1 (line ~4099): merged the contributor's DM is_topic_message
  filter with the existing forum General-topic default from NousResearch#22423,
  preserving both behaviors.
- Conflict 2 (line ~1664 / 1690): kept main's delete_message (PR NousResearch#23416)
  alongside the new helper. Tightened the helper's exception catch
  from bare 'Exception' to use the existing _is_bad_request_error +
  _is_thread_not_found_error helpers (lines 484-496) for consistency
  with the streaming send path.
- Widened the fix to send_update_prompt (was bare self._bot.send_message,
  same bug class).

Authored by rahimsais via PR NousResearch#10371 (re-attributed from donrhmexe@
local commit author).

Closes the architectural-pin part of NousResearch#19931. Most of what that issue
asked for is already implemented (logs under kanban root, env-pinned
workspace, dispatcher routing of unknown assignees, lifecycle
ownership, structured handoff conventions). What was missing:

1. A written contract integrators can point at when adding a new
   worker lane shape, and
2. The "code-changing workers should not auto-promote success to
   done" convention.

This commit ships both as docs+convention layered on existing primitives.
No kernel changes — the kanban_complete / kanban_block / kanban_comment
surfaces already support the review-required pattern; we just hadn't
written it down or made it visible to workers.

Changes:

- `agent/prompt_builder.py::KANBAN_GUIDANCE`: append the review-required
  exception to step 5 of the lifecycle. Workers get the cue
  auto-injected into their system prompt — drop structured metadata
  into a kanban_comment first, then end with
  kanban_block(reason="review-required: <summary>") instead of
  kanban_complete when the work needs review. Total prompt size went
  from ~3000 to ~3275 chars; well under the 4096 budget enforced by
  test_kanban_guidance_size.

- `skills/devops/kanban-worker/SKILL.md`: add a worked example to the
  existing "Good summary + metadata shapes" section between the
  Coding-task and Research-task examples. Same shape as the others
  (kanban_comment with structured handoff JSON, then kanban_block with
  the human-readable reason). Plus a one-line guide on when to use
  kanban_complete vs the review-required pattern.

- `website/docs/user-guide/features/kanban-worker-lanes.md` (new): the
  integrator-facing contract. Covers the hierarchy, the three things
  every lane must provide (assignee, spawn mechanism, lifecycle
  terminator), the env vars the dispatcher injects, the
  review-required convention, the failure modes the kernel handles
  for free, and an explicit "external CLI worker lane" deferred-
  pending-concrete-asker section that links to NousResearch#19931 and NousResearch#19924.

- `website/sidebars.ts`: link the new page under user-guide/features.

The "specialist worker lanes for external CLI tools (Codex / Claude
Code / OpenCode)" runner is NOT shipped here. The dispatcher's
spawn_fn parameter already supports plugin-shaped extension; the
per-CLI integration work (auth, sandbox policy, exit-code mapping)
needs a concrete asker. The new docs page tells would-be integrators
the contract any such lane must satisfy.

Refs NousResearch#19931

…Research#23482)

A Codex auxiliary timeout closes the underlying OpenAI client (so the
streaming hang doesn't sit until the user kills the session), but the
cached wrapper kept pointing at the now-dead transport. Subsequent
auxiliary calls (compression retry, memory flush, background review,
title generation routed via provider: main) reused that closed client
and failed fast with 'Connection error' until the gateway restarted —
even though the main agent route was healthy the whole time.

Sync `_get_cached_client` had no liveness check (async did, via loop
identity), and the connection-error fallback in `call_llm` only fired
on the auto provider path, so an explicit provider — including the
common `auxiliary.compression.provider: main` shape — never evicted.

Three fixes:

* New `_evict_cached_client_instance(target)` helper that drops the
  cache entry whose stored client is target (or wraps it via
  `_real_client`, for `CodexAuxiliaryClient`).
* `_CodexCompletionsAdapter._close_client_on_timeout` evicts the
  wrapper after closing the inner OpenAI client.
* `call_llm` and `async_call_llm` evict on `_is_connection_error`
  before re-raising, regardless of whether the provider is auto.

Net effect: one timeout costs one summary attempt + the existing 30s
compressor cooldown; the next compaction rebuilds the client and
works. Non-connection errors (4xx/5xx) do not evict, so cache hits
stay stable.

Closes NousResearch#23432

The existing _live_system_guard (PR NousResearch#23397) blocked os.kill / os.killpg
and a narrow subset of subprocess invocations. Tests still SIGTERMed the
live gateway today (May 10) because the guard had structural holes.

Plug them all:
- subprocess: also wrap getoutput, getstatusoutput
- os.system, os.popen - completely unwrapped before
- pty.spawn - completely unwrapped before
- asyncio.create_subprocess_exec / create_subprocess_shell - bypassed
  the subprocess module entirely; now wrapped
- Subprocess command inspection now looks at the WHOLE command string,
  not just tokens[0]. Catches sudo systemctl, env systemctl, bash -c
  'systemctl', setsid systemctl, /usr/bin/systemctl, etc.
- New process-killer block: pkill / killall / taskkill / fuser
  targeting hermes/python patterns is now refused
- os.kill PID 0 (own group) allowed; PID -1 (every process we can
  signal) refused
- subprocess.Popen wrapper preserves __class_getitem__ so third-party
  packages that use Popen[bytes] as a type annotation still import
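
To illustrate the whole-command inspection, here is a minimal sketch. The deny-list and the function name `touches_blocked_binary` are hypothetical and much narrower than the real guard; the point is only that every token is checked, including words nested inside a `bash -c '...'` argument.

```python
import os
import shlex
from typing import Sequence, Union

# Hypothetical deny-list; the real guard's patterns are not shown here.
BLOCKED_BINARIES = {"systemctl"}


def touches_blocked_binary(command: Union[str, Sequence[str]]) -> bool:
    """Return True if any token in the command names a blocked binary."""
    tokens = shlex.split(command) if isinstance(command, str) else list(command)
    expanded = []
    for token in tokens:
        # `bash -c 'systemctl stop x'` hides the command inside one token,
        # so split each token again to expose its inner words.
        try:
            expanded.extend(shlex.split(token))
        except ValueError:  # unbalanced quotes: keep the raw token
            expanded.append(token)
    # basename() makes /usr/bin/systemctl match the bare name.
    return any(os.path.basename(tok) in BLOCKED_BINARIES for tok in expanded)


for cmd in (
    "sudo systemctl restart hermes",
    "bash -c 'systemctl stop hermes'",
    "/usr/bin/systemctl status",
    "ls -la",
):
    print(cmd, "->", touches_blocked_binary(cmd))
```

Checking only `tokens[0]` would miss the first three commands' wrappers (`sudo`, `bash -c`) and catch only the absolute-path form if it were normalised.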

Coverage is locked in by tests/test_live_system_guard_self_test.py, which
exercises every primitive against a guaranteed-foreign PID and asserts
the guard fires. Adding a new kill primitive without updating the guard
breaks CI.

scripts/run_tests.sh now also force-loads ~/.hermes/pytest_live_guard.py
when present (developer-machine convenience), so even worktrees that
predate this commit get the protection on subsequent test runs through
the canonical wrapper.

Follow-up to TreyDong's fix: switch the auth header to
`X-Hermes-Session-Token` (the canonical pattern used by the rest of
the dashboard SPA — see `web/src/lib/api.ts` `fetchJSON()`). The
server still accepts both schemes, so the original `Authorization:
Bearer` form would also work; we standardize on X-header to match
every other dashboard fetch and only set the header when a token is
actually present.

Also add scripts/release.py AUTHOR_MAP entry for treydong.zh@gmail.com.

…9.5+)

Adds Telegram's native streaming-draft API as a streaming transport so DM
replies render with smooth animated previews as tokens arrive, dropping
the per-edit jitter of the legacy editMessageText polling path.

Adapter contract (gateway/platforms/base.py):
  - supports_draft_streaming(chat_type, metadata) -> bool. Default False.
    Telegram returns True only for DMs and only when the bound python-
    telegram-bot version exposes Bot.send_message_draft (PTB 22.6+).
  - send_draft(chat_id, draft_id, content, metadata) -> SendResult.
    Default raises NotImplementedError. Telegram delegates to PTB's
    send_message_draft. Drafts have no message_id (Bot API contract);
    SendResult.message_id is None on success.

Telegram adapter (gateway/platforms/telegram.py):
  - supports_draft_streaming gates on chat_type='dm' AND PTB capability.
  - send_draft trims to MAX_MESSAGE_LENGTH using utf16_len, threads
    message_thread_id through metadata, and routes failures back as
    SendResult(success=False, error=...) so the consumer can fall back.

Stream consumer (gateway/stream_consumer.py):
  - StreamConsumerConfig gains transport ('auto'|'draft'|'edit'|'off')
    and chat_type fields.
  - run() resolves _use_draft_streaming once via a probe at the top of
    the run, allocating a fresh class-wide draft_id_counter so each
    response animates as its own preview (no animation collision across
    consecutive responses to the same chat).
  - _send_or_edit gains a pre-edit branch: when drafts are active AND
    not finalizing AND no edit-path message_id is established, the
    frame routes through _send_draft_frame instead of edit_message.
    Drafts intentionally do NOT set _already_sent so the gateway's
    final sendMessage path still fires — drafts have no message_id and
    the user needs a real message in their chat history.
  - _reset_segment_state bumps the draft_id when the consumer is in
    draft mode so each text block after a tool boundary animates as a
    fresh preview below the tool-progress bubble (avoids the inter-
    tool-call leak openclaw documented in their NousResearch#32535).
  - Per-response fallback: any send_draft failure (transient network,
    server reject, capability gap) flips _use_draft_streaming to False
    for the rest of the run, gracefully returning to the edit path.

Gateway config (gateway/config.py):
  - StreamingConfig.transport default flips edit -> auto. The auto path
    is identical to edit on every chat type that doesn't currently
    support drafts (groups, supergroups, forum topics, every non-
    Telegram platform), so the default is backwards-compatible for
    non-DM users.

Lifecycle model (Telegram Bot API 9.5):
  1. sendMessageDraft(chat_id, draft_id, text='') opens the bubble.
  2. Repeated sendMessageDraft calls with the SAME draft_id animate
     the preview as text grows.
  3. Drafts have no message_id and cannot be edited or deleted.
  4. When the response finishes the gateway's normal sendMessage path
     delivers the final answer; the draft preview clears naturally on
     the client and the user sees a real message in their history.
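
The draft_id lifecycle and the per-response fallback can be sketched as follows. This is a simplified model, not the code in gateway/stream_consumer.py: `DraftStreamer`, `push`, `on_tool_boundary`, and `flaky_send` are hypothetical names, and the `send_draft` callable is assumed to return a plain success flag.

```python
import itertools
from typing import Callable, List

# Module-level monotonic counter, standing in for the class-wide counter.
_draft_ids = itertools.count(1)


class DraftStreamer:
    """Hypothetical consumer: drafts while they work, edit path after a failure."""

    def __init__(self, send_draft: Callable[[int, str], bool]) -> None:
        self._send_draft = send_draft
        self.use_drafts = True
        self.draft_id = next(_draft_ids)  # fresh id per response
        self.edit_frames: List[str] = []

    def push(self, text: str) -> None:
        if self.use_drafts and self._send_draft(self.draft_id, text):
            return
        self.use_drafts = False  # per-response fallback: stay on the edit path
        self.edit_frames.append(text)

    def on_tool_boundary(self) -> None:
        if self.use_drafts:
            self.draft_id = next(_draft_ids)  # next text block animates fresh


seen = []


def flaky_send(draft_id: int, text: str) -> bool:
    """Fake transport that fails on one specific frame."""
    seen.append((draft_id, text))
    return text != "boom"


streamer = DraftStreamer(flaky_send)
first_id = streamer.draft_id
streamer.push("Hel")
streamer.push("Hello")
streamer.on_tool_boundary()
second_id = streamer.draft_id
streamer.push("boom")   # draft fails here, frame falls back to the edit path
streamer.push("after")  # drafts stay disabled for the rest of the response
print(seen, streamer.edit_frames)
```

A counter cannot repeat within a process, which is the property the description relies on to avoid animation collisions between consecutive responses.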

Inspired by PR NousResearch#3412 by @NivOO5. Re-authored against current main
(stream_consumer.py is now ~4x larger than at NousResearch#3412's branch base, with
new _NEW_SEGMENT/_COMMENTARY/finalize/_on_new_message machinery the
original PR didn't account for) but the design call (DM-only, edit-
fallback, transport=auto|draft|edit|off) is faithful to the original
proposal, with two improvements baked in:

  1. Per-response draft_id (monotonic counter, not a time hash) — no
     collision risk across consecutive responses on the same chat.
  2. Tool-boundary draft_id bump — prevents the inter-tool-call leak
     openclaw hit during their rollout (their NousResearch#32535).

Closes NousResearch#21439 (duplicate feature request).

Added tests/gateway/test_stream_consumer_draft.py with 11 tests
covering:
- Transport selection: auto+dm-supported -> draft; auto+group -> edit;
  explicit edit; explicit draft on unsupported adapter -> edit;
  MagicMock adapter -> edit (back-compat for the existing test suite).
- Happy path: DM stream animates draft frames with a single shared
  draft_id, then finalizes via a regular adapter.send.
- Group fallback: drafts entirely skipped in non-DM chats.
- Failure fallback: send_draft returning success=False disables drafts
  for the rest of the response.
- Draft_id lifecycle: consecutive responses use distinct ids; tool
  boundaries bump the id so post-tool text animates fresh below the
  tool-progress bubble (the openclaw NousResearch#32535 leak guard).
- _already_sent contract: drafts must NOT set the flag so the gateway's
  fallback final-send still fires (drafts have no message_id).

Updated website/docs/user-guide/messaging/telegram.md with a
'Streaming transport' section explaining auto|draft|edit|off, the
DM-only constraint, and the per-response fallback behaviour.

Out-of-scope behavior change in NousResearch#23521: the kanban notifier-routing fix
also flipped the 'kanban create --created-by' default from 'user' to the
active profile name. Revert to keep PR scope focused on the notifier
ownership fix; the profile-aware author default can be its own change.

Add Cognee (cognee.ai) as the 9th external memory provider for Hermes Agent.

Cognee provides vector + knowledge-graph recall with semantic search,
automatic entity extraction, and cross-session memory persistence.

The plugin implements the MemoryProvider ABC with three tools:
- cognee_remember — store durable facts (builds knowledge graph)
- cognee_recall — semantic + graph completion search
- cognee_forget — delete memories (with confirm guard)

Background flows:
- queue_prefetch: async recall before each turn (top_k=5, 8s timeout)
- sync_turn: async conversation save after each response
- on_session_end: persist last 40 messages with self-improvement

Configured via memory.provider: cognee in config.yaml.
Works with any OpenAI-compatible LLM + Gemini embeddings.
Env vars are set via apply_to_environment() using direct assignment
(not setdefault) so they survive inter-provider conflicts.
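
The difference between direct assignment and `setdefault` is easy to demonstrate with the standard library alone. The variable name `DEMO_LLM_API_KEY` and its values are made up for this example; the body of `apply_to_environment()` itself is not shown here.

```python
import os

# Hypothetical variable name, used only for this demonstration.
KEY = "DEMO_LLM_API_KEY"

# Pretend another provider already exported its own value.
os.environ[KEY] = "value-from-another-provider"

# setdefault keeps the existing value, so the new setting is silently ignored.
os.environ.setdefault(KEY, "value-for-cognee")
after_setdefault = os.environ[KEY]

# Direct assignment always overwrites.
os.environ[KEY] = "value-for-cognee"
after_assignment = os.environ[KEY]

del os.environ[KEY]  # clean up the demo variable
print(after_setdefault, after_assignment)
```
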
@nik1t7n nik1t7n closed this May 11, 2026
@github-actions

🚨 CRITICAL Supply Chain Risk Detected

This PR contains a pattern that has been used in real supply chain attacks. A maintainer must review the flagged code carefully before merging.

🚨 CRITICAL: Install-hook file added or modified

These files can execute code during package installation or interpreter startup.

Files:

hermes_cli/setup.py

Scanner only fires on high-signal indicators: .pth files, base64+exec/eval combos, subprocess with encoded commands, or install-hook files. Low-signal warnings were removed intentionally — if you're seeing this comment, the finding is worth inspecting.

@github-actions

🔎 Lint report: cognee-memory-plugin vs origin/main

ruff

Total: 0 on HEAD, 0 on base (➖ 0)

🆕 New issues: none

✅ Fixed issues: none

Unchanged: 0 pre-existing issues carried over.

ty (type checker)

Total: 8137 on HEAD, 7837 on base (🆕 +300)

🆕 New issues (177):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 70 |
| unresolved-import | 41 |
| invalid-method-override | 29 |
| unresolved-attribute | 22 |
| invalid-assignment | 9 |
| not-subscriptable | 2 |
| unresolved-reference | 2 |
| invalid-return-type | 1 |
| unsupported-operator | 1 |

First entries
gateway/platforms/wecom.py:1438: [invalid-method-override] invalid-method-override: Invalid override of method `send_voice`: Definition is incompatible with `BasePlatformAdapter.send_voice`
tests/plugins/memory/test_cognee_provider.py:78: [unresolved-import] unresolved-import: Cannot resolve imported module `cognee`
acp_adapter/session.py:624: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float | None`, found `str | list[str] | bool | Unknown`
tests/agent/test_plugin_llm.py:766: [unresolved-attribute] unresolved-attribute: Unresolved attribute `_config_cache` on type `<module 'hermes_cli.config'>`.
tests/tools/test_delegate.py:269: [invalid-argument-type] invalid-argument-type: Argument to function `delegate_task` is incorrect: Expected `list[dict[str, Any]] | None`, found `Literal["[{\"goal\": \"bad}"]`
tests/gateway/test_line_plugin.py:467: [not-subscriptable] not-subscriptable: Cannot subscript object of type `None` with no `__getitem__` method
tests/gateway/test_tts_media_routing.py:96: [invalid-assignment] invalid-assignment: Object of type `AsyncMock` is not assignable to attribute `send_voice` of type `def send_voice(self, chat_id: str, audio_path: str, caption: str | None = None, reply_to: str | None = None, metadata: dict[str, Any] | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]`
tests/run_agent/test_provider_parity.py:67: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float | None`, found `str | Unknown | int`
tests/run_agent/test_token_persistence_non_cli.py:86: [unresolved-attribute] unresolved-attribute: Unresolved attribute `session_search` on type `ModuleType`
tests/hermes_cli/test_destructive_slash_confirm_gate.py:32: [unresolved-attribute] unresolved-attribute: Attribute `get` is not defined on `str`, `list[Unknown]`, `list[str]`, `int` in union `str | dict[Unknown, Unknown] | list[Unknown] | ... omitted 25 union elements`
gateway/platforms/bluebubbles.py:531: [invalid-method-override] invalid-method-override: Invalid override of method `send_image_file`: Definition is incompatible with `BasePlatformAdapter.send_image_file`
tests/hermes_cli/test_curator_recent_run_notice.py:18: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
agent/plugin_llm.py:999: [invalid-argument-type] invalid-argument-type: Argument to function `async_call_llm` is incorrect: Expected `int`, found `int | None`
agent/plugin_llm.py:1000: [invalid-argument-type] invalid-argument-type: Argument to function `async_call_llm` is incorrect: Expected `int | float`, found `int | float | None`
tests/gateway/test_goal_verdict_send.py:143: [invalid-argument-type] invalid-argument-type: Argument to function `save_goal` is incorrect: Expected `GoalState`, found `GoalState | None`
tests/run_agent/test_stream_interrupt_retry.py:32: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float | None`, found `str | bool`
tests/gateway/test_slash_access_dispatch.py:25: [unresolved-import] unresolved-import: Cannot resolve imported module `pytest`
tests/gateway/test_slash_access.py:91: [invalid-argument-type] invalid-argument-type: Argument to bound method `SlashAccessPolicy.is_admin` is incorrect: Expected `str | None`, found `Literal[12345]`
agent/plugin_llm.py:955: [invalid-argument-type] invalid-argument-type: Argument to function `call_llm` is incorrect: Expected `int`, found `int | None`
tests/hermes_cli/test_destructive_slash_confirm_gate.py:26: [invalid-argument-type] invalid-argument-type: Method `__getitem__` of type `Overload[(i: SupportsIndex, /) -> str, (s: slice[SupportsIndex | None, SupportsIndex | None, SupportsIndex | None], /) -> list[str]]` cannot be called with key of type `Literal["destructive_slash_confirm"]` on object of type `list[str]`
tests/agent/test_plugin_llm.py:71: [invalid-argument-type] invalid-argument-type: Argument is incorrect: Expected `frozenset[Unknown] | None`, found `bool | None`
tests/gateway/test_tts_media_routing.py:86: [unresolved-attribute] unresolved-attribute: Object of type `bound method _MediaRoutingAdapter.send_voice(chat_id: str, audio_path: str, caption: str | None = None, reply_to: str | None = None, metadata: dict[str, Any] | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]` has no attribute `assert_not_awaited`
tests/hermes_cli/test_kanban_db.py:1441: [invalid-argument-type] invalid-argument-type: Argument to function `_safe_int` is incorrect: Expected `str | None`, found `Literal[1700000000]`
acp_adapter/session.py:624: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float`, found `str | list[str] | bool | Unknown`
tests/plugins/memory/test_cognee_provider.py:48: [unresolved-attribute] unresolved-attribute: Object of type `Self@forget` has no attribute `forgotten`
... and 152 more

✅ Fixed issues (38):

| Rule | Count |
| --- | --- |
| invalid-argument-type | 15 |
| unresolved-attribute | 10 |
| unresolved-import | 6 |
| invalid-assignment | 3 |
| unresolved-reference | 2 |
| no-matching-overload | 1 |
| call-non-callable | 1 |

First entries
tests/gateway/test_tts_media_routing.py:101: [unresolved-attribute] unresolved-attribute: Object of type `bound method _MediaRoutingAdapter.send_voice(chat_id: str, audio_path: str, caption: str | None = None, reply_to: str | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]` has no attribute `assert_awaited_once_with`
agent/transports/types.py:65: [unresolved-reference] unresolved-reference: Name `Optional` used when not defined
acp_adapter/session.py:623: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `bool`, found `str | list[str] | bool`
skills/mlops/training/trl-fine-tuning/templates/basic_grpo_training.py:13: [unresolved-import] unresolved-import: Cannot resolve imported module `torch`
acp_adapter/session.py:609: [no-matching-overload] no-matching-overload: No overload of bound method `MutableMapping.update` matches arguments
plugins/platforms/google_chat/adapter.py:526: [unresolved-attribute] unresolved-attribute: Attribute `Credentials` is not defined on `None` in union `Unknown | None`
acp_adapter/session.py:623: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int | float`, found `str | list[str] | bool`
tests/gateway/test_tts_media_routing.py:96: [invalid-assignment] invalid-assignment: Object of type `AsyncMock` is not assignable to attribute `send_voice` of type `def send_voice(self, chat_id: str, audio_path: str, caption: str | None = None, reply_to: str | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]`
tests/gateway/test_tts_media_routing.py:106: [unresolved-attribute] unresolved-attribute: Object of type `bound method _MediaRoutingAdapter.send_document(chat_id: str, file_path: str, caption: str | None = None, file_name: str | None = None, reply_to: str | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]` has no attribute `assert_not_awaited`
gateway/platforms/telegram.py:1258: [invalid-argument-type] invalid-argument-type: Argument to constructor `int.__new__` is incorrect: Expected `str | Buffer | SupportsInt | SupportsIndex | SupportsTrunc`, found `str | None`
tests/gateway/test_tts_media_routing.py:86: [unresolved-attribute] unresolved-attribute: Object of type `bound method _MediaRoutingAdapter.send_voice(chat_id: str, audio_path: str, caption: str | None = None, reply_to: str | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]` has no attribute `assert_not_awaited`
run_agent.py:12816: [invalid-argument-type] invalid-argument-type: Argument to function `_is_oauth_token` is incorrect: Expected `str`, found `str | dict[Unknown | str, Unknown | str | dict[str, str]] | Any | ... omitted 3 union elements`
gateway/platforms/base.py:2793: [invalid-assignment] invalid-assignment: Invalid subscript assignment with key of type `Literal["stop_event"]` and value of type `Event` on object of type `dict[str, dict[str, str] | None]`
acp_adapter/session.py:623: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `int`, found `str | list[str] | bool`
plugins/platforms/google_chat/adapter.py:896: [unresolved-attribute] unresolved-attribute: Attribute `Unauthenticated` is not defined on `None` in union `Unknown | None`
plugins/platforms/google_chat/adapter.py:789: [unresolved-attribute] unresolved-attribute: Attribute `NotFound` is not defined on `None` in union `Unknown | None`
tests/gateway/test_tts_media_routing.py:81: [unresolved-attribute] unresolved-attribute: Object of type `bound method _MediaRoutingAdapter.send_document(chat_id: str, file_path: str, caption: str | None = None, file_name: str | None = None, reply_to: str | None = None, **kwargs) -> CoroutineType[Any, Any, SendResult]` has no attribute `assert_awaited_once_with`
run_agent.py:12819: [invalid-argument-type] invalid-argument-type: Argument to function `len` is incorrect: Expected `Sized`, found `(str & ~AlwaysFalsy) | (dict[Unknown | str, Unknown | str | dict[str, str]] & ~AlwaysFalsy) | (Any & ~AlwaysFalsy) | ... omitted 3 union elements`
acp_adapter/session.py:623: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `str`, found `str | list[str] | bool`
plugins/example-dashboard/dashboard/plugin_api.py:6: [unresolved-import] unresolved-import: Cannot resolve imported module `fastapi`
acp_adapter/session.py:623: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `IterationBudget`, found `str | list[str] | bool`
skills/mlops/training/trl-fine-tuning/templates/basic_grpo_training.py:15: [unresolved-import] unresolved-import: Cannot resolve imported module `datasets`
gateway/platforms/base.py:2797: [invalid-argument-type] invalid-argument-type: Argument to bound method `BasePlatformAdapter._keep_typing` is incorrect: Expected `Event | None`, found `dict[str, str] | None`
acp_adapter/session.py:623: [invalid-argument-type] invalid-argument-type: Argument to `AIAgent.__init__` is incorrect: Expected `list[str]`, found `str | list[str] | bool`
plugins/platforms/google_chat/adapter.py:2684: [call-non-callable] call-non-callable: Object of type `None` is not callable
... and 13 more

Unchanged: 4099 pre-existing issues carried over.

Diagnostics are surfaced as warnings — this check never fails the build.

nik1t7n pushed a commit that referenced this pull request May 15, 2026
…registries

Both web_search_registry._resolve() and image_gen_registry.get_active_provider()
walked their registered providers and returned the first one matching the
capability flag — without checking whether that provider was actually
usable. On a fresh install with no credentials at all, this meant
get_active_search_provider() returned `brave-free` (legacy preference
order) even though BRAVE_SEARCH_API_KEY was unset, leading the
dispatcher to surface a "BRAVE_SEARCH_API_KEY is not set" error for a
provider the user never chose. Same bug shape in image_gen for FAL.

Resolution semantics now match tools.web_tools._get_backend():

  1. Explicit config name wins, ignoring is_available() — the dispatcher
     surfaces a precise "X_API_KEY is not set" error rather than silently
     switching backends. Matches user expectation: "I configured X, tell
     me what's wrong with X."
  2. Fallback (no explicit config) walks the legacy preference order
     filtered by is_available() — pick the highest-priority backend the
     user actually has credentials for.

is_available() is wrapped in a try/except so a buggy provider doesn't
brick resolution.
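
The two resolution rules can be sketched as below. This is an illustration under assumptions, not the registry code: the `Provider` class, `safe_is_available`, and `resolve` are hypothetical names, and returning `None` for an unregistered explicit name is a guess, since the description does not cover that case.

```python
from typing import Dict, List, Optional


class Provider:
    """Hypothetical provider with a possibly buggy availability check."""

    def __init__(self, name: str, available: bool = True, broken: bool = False) -> None:
        self.name = name
        self._available = available
        self._broken = broken

    def is_available(self) -> bool:
        if self._broken:
            raise RuntimeError("provider bug")
        return self._available


def safe_is_available(provider: Provider) -> bool:
    """Treat a crashing is_available() as 'not available'."""
    try:
        return bool(provider.is_available())
    except Exception:
        return False


def resolve(
    providers: Dict[str, Provider],
    preference: List[str],
    configured: Optional[str],
) -> Optional[Provider]:
    if configured:
        # Rule 1: explicit config wins, even if the provider is unusable,
        # so the caller can report a precise error for that provider.
        return providers.get(configured)
    # Rule 2: walk the preference order, skipping unusable providers.
    for name in preference:
        provider = providers.get(name)
        if provider is not None and safe_is_available(provider):
            return provider
    return None


providers = {
    "brave-free": Provider("brave-free", available=False),  # no API key set
    "buggy": Provider("buggy", broken=True),
    "other": Provider("other"),
}
preference = ["brave-free", "buggy", "other"]
fallback = resolve(providers, preference, None)
explicit = resolve(providers, preference, "brave-free")
print(fallback.name, explicit.name)
```
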

E2E verified:
  - No creds + no config: get_active_search_provider() -> None
  - Explicit brave-free + no key: get_active_search_provider() -> brave-free
    (and .is_available() correctly reports False)

This fix was identified during the spike (NousResearch#25182 finding #1) and is
folded into the same PR rather than shipped as a follow-up.