Context
PR #846 fixes writable virtiofs volumes by allowing the Linux xattr syscall family in the VMM seccomp profile. That is a practical compatibility fix: host-side virtiofs handling can call fgetxattr and related syscalls while preserving file metadata such as security.capability, and blocking those syscalls SIGSYS-kills the VMM during otherwise normal writes.
However, the current seccomp granularity is VMM-process-wide. The vmm filter is applied with TSYNC to all threads in the shim/VMM process, so the xattr allowlist is available not only to the virtiofs path that needs it, but also to other threads sharing the same process boundary, such as libkrun runtime threads and embedded networking/helper threads.
Problem
This is acceptable as a short-term unblocker, but it is not the ideal long-term security shape. The current architecture forces one broad VMM seccomp profile to cover multiple responsibilities:
libkrun / core VMM execution
virtiofs host-side file serving
networking/helper runtime threads
other shim-managed runtime work
Because those components share one process-wide filter, any syscall required by one component expands the allowed syscall surface for all of them. PR #846 makes this visible with the xattr family (getxattr, setxattr, listxattr, removexattr, including l* and f* variants).
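The widening effect can be sketched as set arithmetic. This is a minimal model, not the project's actual filter code: the component names and every non-xattr syscall listed are hypothetical placeholders, and only the xattr names come from the description of PR #846.

```python
# Model of how one process-wide allowlist grows to the union of all needs.
# ASSUMPTION: component names and non-xattr syscalls are illustrative only.

XATTR_BASES = ("getxattr", "setxattr", "listxattr", "removexattr")


def xattr_family():
    """Expand the four base names into their plain, l*, and f* variants."""
    return {prefix + base for base in XATTR_BASES for prefix in ("", "l", "f")}


# Hypothetical per-component syscall needs.
COMPONENT_NEEDS = {
    "core_vmm": {"ioctl", "read", "write", "mmap"},
    "virtiofs": {"openat", "read", "write", "fstat"} | xattr_family(),
    "networking": {"recvmsg", "sendmsg", "epoll_wait"},
}


def shared_profile(needs):
    """A single process-wide filter must allow every component's syscalls."""
    allowed = set()
    for syscalls in needs.values():
        allowed |= syscalls
    return allowed


def excess_surface(needs):
    """Syscalls each component can reach under the shared filter but does not need."""
    shared = shared_profile(needs)
    return {name: shared - syscalls for name, syscalls in needs.items()}


if __name__ == "__main__":
    family = xattr_family()
    for name, extra in sorted(excess_surface(COMPONENT_NEEDS).items()):
        # Show which xattr syscalls each component gained without needing them.
        print(name, sorted(extra & family))
```

In this model the full xattr family shows up as excess surface for every component except the one that requires it, which is the situation described above.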
Desired direction
Refactor the libkrun/VMM runtime boundary so security policy can be applied at a finer granularity, ideally per process or per narrowly-scoped component. For example:
Keep the core VMM/libkrun process on the smallest syscall profile it actually needs.
Run virtiofs/file-serving work behind a separate process boundary with an explicit xattr-capable seccomp profile.
Keep networking/helper paths on their own profile where possible.
Avoid one shared process-wide allowlist accumulating every syscall needed by any subsystem.
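One way to make the direction above checkable is a small policy test that fails whenever a restricted syscall group appears in a profile that does not own it. The sketch below is an assumption-laden illustration: the profile names, the `leaking_components` helper, and all non-xattr syscalls are hypothetical and do not reflect the project's real profiles.

```python
# Policy check: only designated components may allow a restricted syscall group.
# ASSUMPTION: profile names and non-xattr syscalls are illustrative only.

XATTR_SYSCALLS = frozenset(
    prefix + base
    for base in ("getxattr", "setxattr", "listxattr", "removexattr")
    for prefix in ("", "l", "f")
)

# Hypothetical split layout: one profile per process boundary.
SPLIT_PROFILES = {
    "core_vmm": frozenset({"ioctl", "read", "write", "mmap"}),
    "virtiofs": frozenset({"openat", "read", "write", "fstat"}) | XATTR_SYSCALLS,
    "networking": frozenset({"recvmsg", "sendmsg", "epoll_wait"}),
}


def merged(profiles):
    """Model a shared layout: every component runs under the union profile."""
    union = frozenset().union(*profiles.values())
    return {name: union for name in profiles}


def leaking_components(profiles, restricted, owners):
    """Map each non-owner component to the restricted syscalls its profile allows."""
    leaks = {}
    for name, allowed in profiles.items():
        if name in owners:
            continue
        leaked = allowed & restricted
        if leaked:
            leaks[name] = sorted(leaked)
    return leaks


if __name__ == "__main__":
    owners = {"virtiofs"}
    print("split :", leaking_components(SPLIT_PROFILES, XATTR_SYSCALLS, owners))
    print("shared:", sorted(leaking_components(merged(SPLIT_PROFILES), XATTR_SYSCALLS, owners)))
```

A check of this shape could run in CI against whatever profile definitions the project ends up with, so that a future compatibility fix for one subsystem cannot silently widen the others.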
Acceptance criteria
Related