seccomp: honor sandbox.seccomp.syscalls.on_block (errno | kill | log | log_and_kill)#233
Merged
seccomp: honor sandbox.seccomp.syscalls.on_block (errno | kill | log | log_and_kill)#233
Conversation
sandbox.seccomp.syscalls.on_block is parsed and defaulted ("kill") but
never wired into filter construction — seccomp_linux.go hardcodes
ActErrno(EPERM). Spec wires it through: errno | kill | log | log_and_kill,
with kernel fast-path for silent modes and ActNotify dispatch for audit
modes. Default shifts to "errno" to preserve current runtime behavior
for existing deployments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rejects unknown on_block values at config load with a clear error. Default shifts from "kill" to "errno" so upgrades preserve the EPERM runtime behavior introduced in b670835. Adds OnBlockAction typed enum + ParseOnBlock helper for use by downstream filter construction. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Roborev flagged that the inline comment on Sandbox.Seccomp.Syscalls.OnBlock still listed only kill|log_and_kill. Updates it to reflect the full set validated in the previous commit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Propagates sandbox.seccomp.syscalls.on_block from server Config into the agentsh-unixwrap JSON payload so the wrapper can honor it when building the seccomp filter. No behavior change yet — wrapper still hardcodes EPERM until Task 3. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces the hardcoded ActErrno(EPERM) for blocked syscalls with a switch on OnBlockAction: errno -> ActErrno, kill -> ActKillProcess, log/log_and_kill -> ActNotify plus populated Filter.blockList map. The notify dispatch for log modes lands in Task 5. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Roborev flagged that on_block=log and on_block=log_and_kill install
SCMP_ACT_NOTIFY rules on block-listed syscalls, which turns the main
seccomp filter into a USER_NOTIF filter — but neither
mainFilterUsesUserNotify (wrap.go) nor hasNotifyFeatures (core.go)
recognized this, so:
- the signal filter could stack a second USER_NOTIF filter on top,
triggering the Alpine/musl notification-delivery failure documented
in TestAlpineEnvInject_BashBuiltinDisabled, and
- ptrace sync was incorrectly skipped for sessions that only used
block-list logging.
Adds a blockListUsesNotify helper (plain-string, cross-platform safe
because internal/seccomp is linux+cgo only) and wires it into both
gates. Adds four targeted subtests to TestWrap_SignalFilterUsesSessionPolicy
covering log / log_and_kill / errno-with-block / log-with-empty-block.
Second-pass roborev on f6cf673 caught that blockListUsesNotify was comparing the raw configured names — so a block-list containing only unknown-on-this-arch syscalls would still flip the gate, causing ptrace-sync to wait for a READY/GO handshake the wrapper would never send (seccomp.ResolveSyscalls silently skips unknown names). Add resolvableBlockListCount, split by build tag: on linux+cgo it routes through internal/seccomp.ResolveSyscalls; on other targets it returns 0 (seccomp isn't loaded, so ActNotify rules are never installed). blockListUsesNotify now requires at least one resolved name before counting the block-list as a USER_NOTIF feature. Adds a subtest covering the unresolved-name case.
Second-pass roborev on a4eb566 noted that disabled_by_onblock_log and disabled_by_onblock_log_and_kill would fail on non-linux or linux without cgo: resolvableBlockListCount returns 0 there, so the gate won't close — but the tests unconditionally expect it to. Gate both subtests with a runtime check on the helper itself (resolvableBlockListCount(["ptrace"]) == 0 → skip). The !linux/!cgo behavior is already covered by enabled_when_onblock_log_with_only_unknown_names, which passes on every build.
Small wrappers around pidfd_open and pidfd_send_signal used by the seccomp notify handler to deliver SIGKILL under log_and_kill mode. *Fn indirections give tests a seam for injecting ESRCH/EPERM/EINVAL. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Roborev Low finding on ea1c529: pidfd_open returns the lowest free descriptor, which can be 0 if stdin is closed. Use GreaterOrEqual so a successful syscall that happens to land at fd 0 doesn't fail the test.
handleBlockListNotify runs when a log / log_and_kill blocked syscall traps: validates the notification is still live, resolves the syscall name, kills via pidfd for log_and_kill, emits a seccomp_blocked event, and responds EPERM. Pidfd is wrapped in *Fn seams for test injection covering ESRCH / EPERM / EINVAL branches. Not yet wired into ServeNotifyWithExecve — that happens in Task 6. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… fix) Closes a TOCTOU race roborev flagged on d4e34e9. Between handleBlockListNotify's initial NotifIDValid check and attemptKill's pidfd_open, the target can exit and its PID can be recycled by an unrelated process. SIGKILL would then land on the wrong target. attemptKill now revalidates the seccomp notif id *after* pidfd_open. While the id is valid, the kernel guarantees the trapped task has not been released, so the just-opened pidfd must reference the original caller. If the recheck fails, we skip SIGKILL and report "killed" (the original target is gone regardless). The check goes through a new notifIDValidFn test seam so unit tests can drive the race path without standing up a real seccomp fd. Added TestAttemptKill_NotifIDInvalidAfterOpen to cover the branch. Also fixes the MEDIUM in blocklist_linux_test.go: TestAttemptKill_Success had its pidfd stub return a fabricated fd (42) which attemptKill would then unix.Close — potentially closing an unrelated descriptor in the test process. All attemptKill tests now open /dev/null via openDevNullFD() so deferred closes hit a real throwaway fd. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tests Roborev MED on 7eac1e1: attemptKill treated ANY notifIDValidFn error as "target is gone" and silently downgraded log_and_kill to "killed", producing inaccurate audit outcomes on unrelated failures (bad listener fd, transient ioctl error, EINVAL, etc.). Narrowed to isENOENT-only for the "killed" branch — the canonical "notif id is no longer valid" signal from the kernel. Any other error now warns and returns "denied" so the audit record reflects that we could not deliver the signal. Split the existing TestAttemptKill_NotifIDInvalidAfterOpen into two cases: _ENOENT (killed, no signal) and _UnexpectedError (denied, no signal). Also fixes roborev LOW: the test helper opened "/dev/null" literally; switch to os.DevNull so the constant is sourced from the stdlib. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tify ServeNotifyWithExecve now accepts a *BlockListConfig and routes notifications for block-listed syscall numbers to the dedicated handler. The config is built by the server from Sandbox.Seccomp.Syscalls — empty for errno/kill modes (kernel fast path), populated for log/log_and_kill modes. Warns at startup if log mode is selected but no emitter is wired. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Roborev HIGH on 8f64189: ServeNotifyWithExecve returned immediately when emit==nil, so a session configured with on_block=log_and_kill but no emitter wired would not service any seccomp notifications — block-list enforcement (SIGKILL) silently stops working. Split the entry guard: return only on nil fd (cannot operate), warn on nil emit and proceed (enforcement must still run). Downstream emit consumers (buildUnixSocketEvent, handleBlockListNotify) already nil-check internally; execve/file handlers manage their own emitters. Add TestServeNotifyWithExecve_NilEmitterAllowed as a regression guard against reintroducing the short-circuit. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Roborev MEDIUM on a4ce79b: the previous timing-only assertion would pass even if the emit==nil short-circuit were reintroduced, because the broken and fixed paths both exit within the 1s deadline. Install a buffer-backed slog handler for the test duration and assert that "NotifReceive error" appears in the captured logs. That log only fires if the loop actually called seccomp.NotifReceive on the pipe fd; the broken path returns before the loop and never emits it. Verified by temporarily reintroducing the short-circuit — test fails as expected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spawns a wrapped child with each on_block value and asserts: - errno: EPERM, child exits 0, no event - kill: SIGSYS termination, no event - log: EPERM, exactly one seccomp_blocked event with outcome=denied - log_and_kill: SIGKILL termination, exactly one event with outcome=killed - concurrent ptrace storm resolves to exactly one event (no deadlock) - default block-list resolves on the current arch (regression gate) Plus non-interference skips for file_monitor/unix_socket/signal where shared fixtures don't yet exist. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…aded callers
seccomp_notif.pid is the TID of the thread that made the blocked syscall,
not the TGID (man 2 seccomp, "struct seccomp_notif"). On kernel 6.8
(Ubuntu 24.04 target) pidfd_open(non_TGL_tid, 0) returns ENOENT because
PIDTYPE_TGID is not set on the struct pid for non-leader threads.
PIDFD_THREAD would solve this directly but requires 6.9+.
This caused log_and_kill to fail silently whenever the first notification
came from a non-leader thread: pidfd_open errored, outcome was logged as
"denied", and the target kept running past the supposedly-blocked call.
Fix: parse /proc/<tid>/status Tgid: before opening the pidfd. Resolution
is isolated behind a resolveTGIDFn test seam so unit tests can drive the
non-leader path without spawning threads. Errors on the /proc read: ENOENT
surfaces as unix.ESRCH ("target already gone", skip the kill attempt, tag
outcome killed), non-ESRCH errors fall back to TID so attemptKill still
observes whatever errno the kernel returns.
The audit event's PID field still carries the TID (which thread trapped
is forensically useful on multi-threaded agents); only the pidfd target
changes to TGID.
Regression coverage:
* blocklist_linux_test.go — three unit tests exercise main-thread,
non-leader-thread, and non-existent TID paths against the real
/proc filesystem.
* seccomp_onblock_test.go — the ptrace-storm helper now fires from 100
goroutines each pinned to a distinct OS thread via LockOSThread; no
ptrace fires from the TGL thread. A pre-fix handler would see ENOENT
on every pidfd_open and never deliver SIGKILL, so st.Signaled() would
be false and the test would fail loudly. With the fix, SIGKILL lands
and at least one event records outcome=killed.
The storm test's assertion stays at >=1 event because the kernel may
queue several notifications from sibling threads before SIGKILL
propagates; exact count is inherently racy. The load-bearing invariant
is sawKilled — that proves TGID resolution worked for a non-leader TID.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…on masking The ptrace-storm helper's goal is to guarantee every ptrace call comes from a non-leader thread — so a regression to pidfd_open(TID) would cause the test to fail loudly. Per-worker runtime.LockOSThread() on its own is not sufficient. When the main goroutine parks on wg.Wait, the Go scheduler can reuse the M that was running main (the TGL thread) for a freshly spawned worker. That worker then locks to the TGL thread and fires ptrace from it, producing a notification the broken handler CAN deliver SIGKILL to — the test would pass even against a pre-fix binary. Fix: lock the main goroutine to the TGL before spawning any workers. This reserves the TGL so no other goroutine (including worker goroutines) can execute on it (per go doc runtime.LockOSThread). Add a belt-and-braces check in each worker: after LockOSThread, if Gettid() still equals the TGL TID, abort with exit code 3 rather than silently mask a regression. This protects against future runtime changes (or unforeseen scheduling paths) that might subvert the reservation. Flagged by roborev medium on d8d7f0a. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- docs/seccomp.md: replace "immediately terminates" overview, correct the YAML example default (errno, not kill), add the four-value action table with kernel mechanism / caller effect / event outcome columns, rewrite the Audit Events stanza against the actual emitted wire format (source, typed Fields map, TID semantics for pid). - docs/spec.md §13.3: expand the one-line "immediately kills" claim into the four-mode breakdown, include the real event JSON, call out that pid is the trapping TID rather than the TGID. - internal/events/types.go: add a docblock above EventSeccompBlocked describing the Fields payload so downstream consumers know what keys and types to expect without grepping the handler implementation. No behavior change. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
sandbox.seccomp.syscalls.on_blockschema was advertised with four values but only one path was wired up: the filter always resolved toSCMP_ACT_ERRNO, regardless of config.on_block: kill,log, andlog_and_killsilently degraded toerrno.errnoandkillresolve kernel-side (SCMP_ACT_ERRNO(EPERM)/SCMP_ACT_KILL_PROCESS);logandlog_and_killroute throughSCMP_ACT_NOTIFYto a handler that emits a typedseccomp_blockedevent and (forlog_and_kill) sendsSIGKILLviapidfd_send_signal.errno). Deployments that explicitly seton_block: killexpecting the advertised schema now get a real kernel-side kill rather than a silentEPERMfallback — call-out in the changelog.Behavioral change
on_blockerrnokillSCMP_ACT_KILL_PROCESSlogseccomp_blockedaudit eventlog_and_killpidfd_send_signal+seccomp_blockedaudit eventStartup warning is emitted when
log/log_and_killis selected but no audit emitter is registered, so silent event drops don't surprise operators.Design / spec
Design doc:
docs/superpowers/specs/2026-04-15-seccomp-on-block-design.md.Notable implementation choices
seccomp_notif.pidis a TID, not a TGID (seeman 2 seccomp). On kernel 6.8 (Ubuntu 24.04 target),pidfd_open(non_TGL_tid, 0)returnsENOENT— the trapped non-leader thread's task never becomes a valid pidfd target. Without TID→TGID resolution,log_and_killwould silently fail for multi-threaded agents whose first blocked syscall came from a non-leader thread. Fix: parse/proc/<tid>/statusTgid: before opening the pidfd.PIDFD_THREADwould solve this directly but requires kernel 6.9+, below our floor.NotifIDValidandpidfd_send_signal, the target could exit and its PID could be recycled. The handler revalidates the notif id after opening the pidfd but before signalling. AnENOENTon recheck is the canonical "notif id is gone" signal — we skip the signal (pidfd may reference a reused PID) and tag outcomekilled. Any other revalidation error is NOT evidence the target exited; we refuse to signal and tagdenied.nilemitter short-circuited the entire notify loop — silently disabling enforcement. Fixed by splitting the entry guard: anilfd returns, but anilemit only warns. Regression test captures slog output to confirm the loop is actually entered.Test plan
OnBlockActionenum + validation, filter construction per mode, pidfd seams (pidfd_open ESRCH/EPERM, pidfd_send_signal ESRCH/EINVAL), TOCTOU recheck (ENOENT vs non-ENOENT), TID→TGID resolution (main thread, non-leader, non-existent), event builder schema, block-list dispatch.//go:build integration && linux): one per mode —TestSeccompOnBlock_{Errno,Kill,Log,LogAndKill}— plus a 100-goroutine storm test (TestSeccompOnBlock_LogAndKill_ConcurrentCalls) that fires ptrace exclusively from non-TGL threads, proving the TID→TGID path works under load. A pre-fix handler would see ENOENT on every pidfd_open; the storm test would failst.Signaled().go test ./...green.GOOS=windows go build ./...andGOOS=darwin go build ./...green (the feature is Linux-only; no-op stubs on the other platforms).roborev review --reasoning maximum --waitrun after every commit; all above-LOW findings addressed before proceeding.Known pre-existing failures (not introduced by this branch)
internal/ptrace/TestIntegration_{NetworkDenyConnect,FileRedirect,ConnectRedirect,ScratchPage,TracerPidNotMasked,ConnectExitRetainedTLS,WriteEscalationOnTLSConnect}— fail identically onmainwithout this branch's changes. Unrelated to seccomp on_block.internal/platform/darwin/*build failure — macOS-only packages; expected on a Linux build host.Docs
docs/seccomp.md: new "Syscall Block Actions" table, corrected YAML example default, updated Audit Events to match the real emitted wire format (source=seccomp, typed Fields map, TID semantics on pid).docs/spec.md §13.3: expanded single-line "immediately kills" claim into the four-mode breakdown with the real event JSON.internal/events/types.go: docblock aboveEventSeccompBlockedlisting field keys and meanings so consumers don't have to grep the handler.🤖 Generated with Claude Code