memory: provenance tagging + injection-scanner gate (prompt-injection defence) by TinkerOfThings · Pull Request #599 · open-jarvis/OpenJarvis

TinkerOfThings · 2026-06-26T21:17:15Z

Summary

Hardens the automatic long-term-memory path against prompt injection. The fact extractor distils facts from raw chat exchanges that can contain hostile input (scraped pages, tool output, pasted content); those facts are stored and surfaced unfiltered. This adds provenance, an injection gate, and quarantine-on-surfacing — and revives the injection scanner in Rust-less environments.

Changes

Provenance: memory.store.Fact gains a trust tier, round-tripped through JSONL (legacy facts default to "" = trusted). FactStore.add/add_many accept trust.
Injection gate: MemoryService scans each exchange with InjectionScanner before extraction; an overt injection attempt is suppressed (it never reaches the extraction model or the store). Scanning fails open: a scanner error can't block memory, since provenance still applies. build_memory_service wires a guarded default scanner.
Untrusted tagging: all auto-extracted facts are stored with trust="untrusted".
Quarantine on surfacing: jarvis memory list shows a Trust column (⚠ untrusted) plus a data-not-instructions warning.
Scanner Python fallback: InjectionScanner hard-required the openjarvis_rust extension and crashed on construction when it wasn't built. Added a pure-Python fallback using its own _INJECTION_PATTERNS, mirroring the RUST_AVAILABLE fallback pattern used elsewhere (e.g. security.ssrf). This fixes 10 previously-failing scanner tests in Rust-less environments.
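
For reviewers who want the shape of the change without reading the diff, here is a minimal sketch of the provenance round-trip and the fail-open gate described above. It is not the code in this PR: `Fact` is reduced to two fields, and `dump_fact`, `load_fact`, `remember`, and the `extract`/`scan` callables are hypothetical stand-ins for the real FactStore, MemoryService, extraction model, and InjectionScanner.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, List, Optional


@dataclass
class Fact:
    text: str
    trust: str = ""  # "" = trusted (legacy default); "untrusted" = auto-extracted


def dump_fact(fact: Fact) -> str:
    """Serialise one fact as a JSONL line (hypothetical helper)."""
    return json.dumps(asdict(fact))


def load_fact(line: str) -> Fact:
    """Parse one JSONL line; records written before the trust field default to ""."""
    record = json.loads(line)
    return Fact(text=record["text"], trust=record.get("trust", ""))


def remember(
    exchange: str,
    extract: Callable[[str], List[str]],
    scan: Optional[Callable[[str], bool]] = None,
) -> List[Fact]:
    """Gate, then extract, then tag (hypothetical stand-in for the service path)."""
    if scan is not None:
        try:
            flagged = scan(exchange)
        except Exception:
            # Fail open: a scanner error must not block memory,
            # because the untrusted tag below still applies.
            flagged = False
        if flagged:
            # Overt injection attempt: the exchange never reaches
            # the extraction model or the store.
            return []
    return [Fact(text=t, trust="untrusted") for t in extract(exchange)]
```

The gate runs before extraction so that a flagged exchange is dropped without costing a model call, while the untrusted tag is the layer that still holds when the scanner misses something or raises.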

Testing

Test-first throughout; 6 new tests covering provenance round-trip, untrusted tagging, skip-on-injection, fail-open, and the CLI quarantine marker.
No regressions: failures in the affected suites dropped from 114 to 104 (10 fixed). The remaining failures all stem from pre-existing dependencies on the openjarvis_rust extension and are unrelated to this change.

🤖 Generated with Claude Code

… defence)

The automatic memory extractor distils facts from raw chat exchanges that may contain hostile input. Harden that path:

- store.Fact gains a `trust` tier, round-tripped through JSONL (legacy facts default to "" = trusted). FactStore.add/add_many accept `trust`.
- MemoryService stores all auto-extracted facts as trust="untrusted", and now scans each exchange with InjectionScanner BEFORE extraction: an overt injection attempt is suppressed (never reaches the extraction model or store). Scanning fails open (a scanner error can't block memory; provenance still applies). build_memory_service wires a guarded default scanner.
- `jarvis memory list` surfaces a Trust column (⚠ untrusted) + a data-not-instructions warning.

Also gives InjectionScanner a pure-Python fallback (its own _INJECTION_PATTERNS) when the openjarvis_rust extension isn't built; this mirrors the RUST_AVAILABLE fallback pattern used elsewhere. This fixes 10 previously-failing scanner tests in rust-less environments.

No regressions: affected suites 114→104 failing (10 fixed), +6 new tests; the remaining failures are all pre-existing openjarvis_rust dependencies.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
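
The optional-extension fallback mentioned in this commit can be sketched roughly as follows. This is an illustration only: the two regexes are invented examples rather than the project's actual _INJECTION_PATTERNS, `InjectionScannerSketch` and `is_injection` are hypothetical names, and `native_scan` stands in for whatever the Rust extension exposes, which is not shown in this PR text.

```python
import re
from typing import Callable, Optional

try:
    import openjarvis_rust  # optional compiled extension named in the PR
    RUST_AVAILABLE = True
except ImportError:
    RUST_AVAILABLE = False

# Illustrative patterns only; the real list lives in the scanner module.
_INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"disregard\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]


class InjectionScannerSketch:
    """Constructs without error whether or not the extension is built."""

    def __init__(self, native_scan: Optional[Callable[[str], bool]] = None):
        # native_scan is a hypothetical hook for the Rust-backed check.
        self._native_scan = native_scan

    def is_injection(self, text: str) -> bool:
        if RUST_AVAILABLE and self._native_scan is not None:
            return bool(self._native_scan(text))
        # Pure-Python fallback: flag text matching any known pattern.
        return any(p.search(text) for p in _INJECTION_PATTERNS)
```

Deciding availability once at import time keeps construction from ever raising, which is the failure mode the commit message describes for Rust-less environments.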

…ening)

The JSON-array path is unchanged; the line fallback now accepts ONLY genuine list items (bullets/numbered), not arbitrary prose lines. Prevents a model from being steered into minting facts out of narrative or injected text. The bullet fallback (test_line_fallback_for_bullets) still works.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
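
A rough sketch of the stricter parsing this commit describes, assuming the extractor's raw output is either a JSON array or free text containing list items. `parse_facts` and `_LIST_ITEM` are hypothetical names, and the exact set of bullet markers accepted by the real parser is not stated in the PR.

```python
import json
import re
from typing import List

# A genuine list item: a bullet (-, *, •) or a number followed by . or ),
# then whitespace, then the item text.
_LIST_ITEM = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+(.+?)\s*$")


def parse_facts(raw: str) -> List[str]:
    """Prefer a JSON array; otherwise keep only bulleted or numbered lines."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None
    if isinstance(data, list):
        return [str(item).strip() for item in data if str(item).strip()]

    facts = []
    for line in raw.splitlines():
        match = _LIST_ITEM.match(line)
        if match:  # prose lines are dropped rather than stored as facts
            facts.append(match.group(1))
    return facts
```

Under this sketch, a reply that mixes a preamble, two bullets, and a trailing injected sentence yields only the two bullet texts.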

TinkerOfThings requested review from ANarayan, jonsaadfalcon and robbym-dev as code owners June 26, 2026 21:17

TinkerOfThings wants to merge 2 commits into open-jarvis:main from TinkerOfThings:memory-injection-defence
