seccomp: honor sandbox.seccomp.syscalls.on_block (errno | kill | log | log_and_kill) by erans · Pull Request #233 · canyonroad/agentsh

erans · 2026-04-16T00:02:49Z

Summary

The sandbox.seccomp.syscalls.on_block schema was advertised with four values but only one path was wired up: the filter always resolved to SCMP_ACT_ERRNO, regardless of config. on_block: kill, log, and log_and_kill silently degraded to errno.
This branch fully implements the four modes end-to-end: errno and kill resolve kernel-side (SCMP_ACT_ERRNO(EPERM) / SCMP_ACT_KILL_PROCESS); log and log_and_kill route through SCMP_ACT_NOTIFY to a handler that emits a typed seccomp_blocked event and (for log_and_kill) sends SIGKILL via pidfd_send_signal.
Default is unchanged (errno). Deployments that explicitly set on_block: kill expecting the advertised schema now get a real kernel-side kill rather than a silent EPERM fallback — call-out in the changelog.

Behavioral change

`on_block`	Before this PR	After this PR
unset / `errno`	EPERM	EPERM (unchanged)
`kill`	EPERM (silent fallback)	SIGSYS via `SCMP_ACT_KILL_PROCESS`
`log`	EPERM (silent fallback)	EPERM + `seccomp_blocked` audit event
`log_and_kill`	EPERM (silent fallback)	SIGKILL via `pidfd_send_signal` + `seccomp_blocked` audit event

Startup warning is emitted when log / log_and_kill is selected but no audit emitter is registered, so silent event drops don't surprise operators.

Design / spec

Design doc: docs/superpowers/specs/2026-04-15-seccomp-on-block-design.md.

Notable implementation choices

TID→TGID resolution. seccomp_notif.pid is a TID, not a TGID (see man 2 seccomp). On kernel 6.8 (Ubuntu 24.04 target), pidfd_open(non_TGL_tid, 0) returns ENOENT — the trapped non-leader thread's task never becomes a valid pidfd target. Without TID→TGID resolution, log_and_kill would silently fail for multi-threaded agents whose first blocked syscall came from a non-leader thread. Fix: parse /proc/<tid>/status Tgid: before opening the pidfd. PIDFD_THREAD would solve this directly but requires kernel 6.9+, below our floor.
TOCTOU recheck. Between NotifIDValid and pidfd_send_signal, the target could exit and its PID could be recycled. The handler revalidates the notif id after opening the pidfd but before signalling. An ENOENT on recheck is the canonical "notif id is gone" signal — we skip the signal (pidfd may reference a reused PID) and tag outcome killed. Any other revalidation error is NOT evidence the target exited; we refuse to signal and tag denied.
Per-thread emitter lifecycle. Early review flagged a regression where a nil emitter short-circuited the entire notify loop — silently disabling enforcement. Fixed by splitting the entry guard: a nil fd returns, but a nil emit only warns. Regression test captures slog output to confirm the loop is actually entered.

Test plan

Unit tests: OnBlockAction enum + validation, filter construction per mode, pidfd seams (pidfd_open ESRCH/EPERM, pidfd_send_signal ESRCH/EINVAL), TOCTOU recheck (ENOENT vs non-ENOENT), TID→TGID resolution (main thread, non-leader, non-existent), event builder schema, block-list dispatch.
Integration tests (//go:build integration && linux): one per mode — TestSeccompOnBlock_{Errno,Kill,Log,LogAndKill} — plus a 100-goroutine storm test (TestSeccompOnBlock_LogAndKill_ConcurrentCalls) that fires ptrace exclusively from non-TGL threads, proving the TID→TGID path works under load. A pre-fix handler would see ENOENT on every pidfd_open; the storm test would fail st.Signaled().
Full go test ./... green.
Cross-compile: GOOS=windows go build ./... and GOOS=darwin go build ./... green (the feature is Linux-only; no-op stubs on the other platforms).
roborev review --reasoning maximum --wait run after every commit; all above-LOW findings addressed before proceeding.

Known pre-existing failures (not introduced by this branch)

internal/ptrace/TestIntegration_{NetworkDenyConnect,FileRedirect,ConnectRedirect,ScratchPage,TracerPidNotMasked,ConnectExitRetainedTLS,WriteEscalationOnTLSConnect} — fail identically on main without this branch's changes. Unrelated to seccomp on_block.
internal/platform/darwin/* build failure — macOS-only packages; expected on a Linux build host.

Docs

docs/seccomp.md: new "Syscall Block Actions" table, corrected YAML example default, updated Audit Events to match the real emitted wire format (source=seccomp, typed Fields map, TID semantics on pid).
docs/spec.md §13.3: expanded single-line "immediately kills" claim into the four-mode breakdown with the real event JSON.
internal/events/types.go: docblock above EventSeccompBlocked listing field keys and meanings so consumers don't have to grep the handler.

🤖 Generated with Claude Code

sandbox.seccomp.syscalls.on_block is parsed and defaulted ("kill") but never wired into filter construction — seccomp_linux.go hardcodes ActErrno(EPERM). Spec wires it through: errno | kill | log | log_and_kill, with kernel fast-path for silent modes and ActNotify dispatch for audit modes. Default shifts to "errno" to preserve current runtime behavior for existing deployments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Rejects unknown on_block values at config load with a clear error. Default shifts from "kill" to "errno" so upgrades preserve the EPERM runtime behavior introduced in b670835. Adds OnBlockAction typed enum + ParseOnBlock helper for use by downstream filter construction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Roborev flagged that the inline comment on Sandbox.Seccomp.Syscalls.OnBlock still listed only kill|log_and_kill. Updates it to reflect the full set validated in the previous commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Propagates sandbox.seccomp.syscalls.on_block from server Config into the agentsh-unixwrap JSON payload so the wrapper can honor it when building the seccomp filter. No behavior change yet — wrapper still hardcodes EPERM until Task 3. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replaces the hardcoded ActErrno(EPERM) for blocked syscalls with a switch on OnBlockAction: errno -> ActErrno, kill -> ActKillProcess, log/log_and_kill -> ActNotify plus populated Filter.blockList map. The notify dispatch for log modes lands in Task 5. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Roborev flagged that on_block=log and on_block=log_and_kill install SCMP_ACT_NOTIFY rules on block-listed syscalls, which turns the main seccomp filter into a USER_NOTIF filter — but neither mainFilterUsesUserNotify (wrap.go) nor hasNotifyFeatures (core.go) recognized this, so: - the signal filter could stack a second USER_NOTIF filter on top, triggering the Alpine/musl notification-delivery failure documented in TestAlpineEnvInject_BashBuiltinDisabled, and - ptrace sync was incorrectly skipped for sessions that only used block-list logging. Adds a blockListUsesNotify helper (plain-string, cross-platform safe because internal/seccomp is linux+cgo only) and wires it into both gates. Adds four targeted subtests to TestWrap_SignalFilterUsesSessionPolicy covering log / log_and_kill / errno-with-block / log-with-empty-block.

Second-pass roborev on f6cf673 caught that blockListUsesNotify was comparing the raw configured names — so a block-list containing only unknown-on-this-arch syscalls would still flip the gate, causing ptrace-sync to wait for a READY/GO handshake the wrapper would never send (seccomp.ResolveSyscalls silently skips unknown names). Add resolvableBlockListCount, split by build tag: on linux+cgo it routes through internal/seccomp.ResolveSyscalls; on other targets it returns 0 (seccomp isn't loaded, so ActNotify rules are never installed). blockListUsesNotify now requires at least one resolved name before counting the block-list as a USER_NOTIF feature. Adds a subtest covering the unresolved-name case.

Second-pass roborev on a4eb566 noted that disabled_by_onblock_log and disabled_by_onblock_log_and_kill would fail on non-linux or linux without cgo: resolvableBlockListCount returns 0 there, so the gate won't close — but the tests unconditionally expect it to. Gate both subtests with a runtime check on the helper itself (resolvableBlockListCount(["ptrace"]) == 0 → skip). The !linux/!cgo behavior is already covered by enabled_when_onblock_log_with_only_unknown_names, which passes on every build.

Small wrappers around pidfd_open and pidfd_send_signal used by the seccomp notify handler to deliver SIGKILL under log_and_kill mode. *Fn indirections give tests a seam for injecting ESRCH/EPERM/EINVAL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Roborev Low finding on ea1c529: pidfd_open returns the lowest free descriptor, which can be 0 if stdin is closed. Use GreaterOrEqual so a successful syscall that happens to land at fd 0 doesn't fail the test.

handleBlockListNotify runs when a log / log_and_kill blocked syscall traps: validates the notification is still live, resolves the syscall name, kills via pidfd for log_and_kill, emits a seccomp_blocked event, and responds EPERM. Pidfd is wrapped in *Fn seams for test injection covering ESRCH / EPERM / EINVAL branches. Not yet wired into ServeNotifyWithExecve — that happens in Task 6. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… fix) Closes a TOCTOU race roborev flagged on d4e34e9. Between handleBlockListNotify's initial NotifIDValid check and attemptKill's pidfd_open, the target can exit and its PID can be recycled by an unrelated process. SIGKILL would then land on the wrong target. attemptKill now revalidates the seccomp notif id *after* pidfd_open. While the id is valid, the kernel guarantees the trapped task has not been released, so the just-opened pidfd must reference the original caller. If the recheck fails, we skip SIGKILL and report "killed" (the original target is gone regardless). The check goes through a new notifIDValidFn test seam so unit tests can drive the race path without standing up a real seccomp fd. Added TestAttemptKill_NotifIDInvalidAfterOpen to cover the branch. Also fixes the MEDIUM in blocklist_linux_test.go: TestAttemptKill_Success had its pidfd stub return a fabricated fd (42) which attemptKill would then unix.Close — potentially closing an unrelated descriptor in the test process. All attemptKill tests now open /dev/null via openDevNullFD() so deferred closes hit a real throwaway fd. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tests Roborev MED on 7eac1e1: attemptKill treated ANY notifIDValidFn error as "target is gone" and silently downgraded log_and_kill to "killed", producing inaccurate audit outcomes on unrelated failures (bad listener fd, transient ioctl error, EINVAL, etc.). Narrowed to isENOENT-only for the "killed" branch — the canonical "notif id is no longer valid" signal from the kernel. Any other error now warns and returns "denied" so the audit record reflects that we could not deliver the signal. Split the existing TestAttemptKill_NotifIDInvalidAfterOpen into two cases: _ENOENT (killed, no signal) and _UnexpectedError (denied, no signal). Also fixes roborev LOW: the test helper opened "/dev/null" literally; switch to os.DevNull so the constant is sourced from the stdlib. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…tify ServeNotifyWithExecve now accepts a *BlockListConfig and routes notifications for block-listed syscall numbers to the dedicated handler. The config is built by the server from Sandbox.Seccomp.Syscalls — empty for errno/kill modes (kernel fast path), populated for log/log_and_kill modes. Warns at startup if log mode is selected but no emitter is wired. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Roborev HIGH on 8f64189: ServeNotifyWithExecve returned immediately when emit==nil, so a session configured with on_block=log_and_kill but no emitter wired would not service any seccomp notifications — block-list enforcement (SIGKILL) silently stops working. Split the entry guard: return only on nil fd (cannot operate), warn on nil emit and proceed (enforcement must still run). Downstream emit consumers (buildUnixSocketEvent, handleBlockListNotify) already nil-check internally; execve/file handlers manage their own emitters. Add TestServeNotifyWithExecve_NilEmitterAllowed as a regression guard against reintroducing the short-circuit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Roborev MEDIUM on a4ce79b: the previous timing-only assertion would pass even if the emit==nil short-circuit were reintroduced, because the broken and fixed paths both exit within the 1s deadline. Install a buffer-backed slog handler for the test duration and assert that "NotifReceive error" appears in the captured logs. That log only fires if the loop actually called seccomp.NotifReceive on the pipe fd; the broken path returns before the loop and never emits it. Verified by temporarily reintroducing the short-circuit — test fails as expected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Spawns a wrapped child with each on_block value and asserts: - errno: EPERM, child exits 0, no event - kill: SIGSYS termination, no event - log: EPERM, exactly one seccomp_blocked event with outcome=denied - log_and_kill: SIGKILL termination, exactly one event with outcome=killed - concurrent ptrace storm resolves to exactly one event (no deadlock) - default block-list resolves on the current arch (regression gate) Plus non-interference skips for file_monitor/unix_socket/signal where shared fixtures don't yet exist. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…aded callers seccomp_notif.pid is the TID of the thread that made the blocked syscall, not the TGID (man 2 seccomp, "struct seccomp_notif"). On kernel 6.8 (Ubuntu 24.04 target) pidfd_open(non_TGL_tid, 0) returns ENOENT because PIDTYPE_TGID is not set on the struct pid for non-leader threads. PIDFD_THREAD would solve this directly but requires 6.9+. This caused log_and_kill to fail silently whenever the first notification came from a non-leader thread: pidfd_open errored, outcome was logged as "denied", and the target kept running past the supposedly-blocked call. Fix: parse /proc/<tid>/status Tgid: before opening the pidfd. Resolution is isolated behind a resolveTGIDFn test seam so unit tests can drive the non-leader path without spawning threads. Errors on the /proc read: ENOENT surfaces as unix.ESRCH ("target already gone", skip the kill attempt, tag outcome killed), non-ESRCH errors fall back to TID so attemptKill still observes whatever errno the kernel returns. The audit event's PID field still carries the TID (which thread trapped is forensically useful on multi-threaded agents); only the pidfd target changes to TGID. Regression coverage: * blocklist_linux_test.go — three unit tests exercise main-thread, non-leader-thread, and non-existent TID paths against the real /proc filesystem. * seccomp_onblock_test.go — the ptrace-storm helper now fires from 100 goroutines each pinned to a distinct OS thread via LockOSThread; no ptrace fires from the TGL thread. A pre-fix handler would see ENOENT on every pidfd_open and never deliver SIGKILL, so st.Signaled() would be false and the test would fail loudly. With the fix, SIGKILL lands and at least one event records outcome=killed. The storm test's assertion stays at >=1 event because the kernel may queue several notifications from sibling threads before SIGKILL propagates; exact count is inherently racy. The load-bearing invariant is sawKilled — that proves TGID resolution worked for a non-leader TID. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…on masking The ptrace-storm helper's goal is to guarantee every ptrace call comes from a non-leader thread — so a regression to pidfd_open(TID) would cause the test to fail loudly. Per-worker runtime.LockOSThread() on its own is not sufficient. When the main goroutine parks on wg.Wait, the Go scheduler can reuse the M that was running main (the TGL thread) for a freshly spawned worker. That worker then locks to the TGL thread and fires ptrace from it, producing a notification the broken handler CAN deliver SIGKILL to — the test would pass even against a pre-fix binary. Fix: lock the main goroutine to the TGL before spawning any workers. This reserves the TGL so no other goroutine (including worker goroutines) can execute on it (per go doc runtime.LockOSThread). Add a belt-and-braces check in each worker: after LockOSThread, if Gettid() still equals the TGL TID, abort with exit code 3 rather than silently mask a regression. This protects against future runtime changes (or unforeseen scheduling paths) that might subvert the reservation. Flagged by roborev medium on d8d7f0a. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- docs/seccomp.md: replace "immediately terminates" overview, correct the YAML example default (errno, not kill), add the four-value action table with kernel mechanism / caller effect / event outcome columns, rewrite the Audit Events stanza against the actual emitted wire format (source, typed Fields map, TID semantics for pid). - docs/spec.md §13.3: expand the one-line "immediately kills" claim into the four-mode breakdown, include the real event JSON, call out that pid is the trapping TID rather than the TGID. - internal/events/types.go: add a docblock above EventSeccompBlocked describing the Fields payload so downstream consumers know what keys and types to expect without grepping the handler implementation. No behavior change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

erans and others added 20 commits April 15, 2026 13:35

seccomp(pidfd): tighten pidfdOpen assertion to accept fd 0

b532123

Roborev Low finding on ea1c529: pidfd_open returns the lowest free descriptor, which can be 0 if stdin is closed. Use GreaterOrEqual so a successful syscall that happens to land at fd 0 doesn't fail the test.

erans merged commit 2b36edd into main Apr 16, 2026
8 of 9 checks passed

erans deleted the feat/seccomp-on-block branch April 16, 2026 00:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

seccomp: honor sandbox.seccomp.syscalls.on_block (errno | kill | log | log_and_kill)#233

seccomp: honor sandbox.seccomp.syscalls.on_block (errno | kill | log | log_and_kill)#233
erans merged 20 commits intomainfrom
feat/seccomp-on-block

erans commented Apr 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

erans commented Apr 16, 2026

Summary

Behavioral change

Design / spec

Notable implementation choices

Test plan

Known pre-existing failures (not introduced by this branch)

Docs

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant