@ddiss ddiss commented Dec 3, 2025

Sangeetha and I have been working on some changes to replace Dracut with
a rust-based initramfs / cpio image generator. The main reasons for this
are:

  • Some distros are moving away from using Dracut, so we can't assume it's
    locally available
  • We only use a small portion of Dracut functionality: base and systemd
    modules, and kernel / user dependency gathering for cpio
  • Dracut is slow: it forks many processes, stages all initramfs content
    and is mostly written in bash

The rewrite builds on my previous dracut-cpio implementation and adds:

  • elf dependency gathering using https://github.com/cole14/rust-elf
  • kernel module dependency gathering via native modules.dep (etc.)
    parsing (thanks @thackara !)
    • VMs still use regular kmod / modprobe.
  • basic rapido-vm and rapido-init programs, to run qemu and start autorun
    scripts
  • basic rapido.conf parsing

It's otherwise kept as minimal as possible, with rust-elf and the std
library the only major external dependencies. One single-file crosvm
argument parser is also bundled.
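
As a rough illustration of the modules.dep parsing mentioned above, the sketch below turns modules.dep-style text into a map from module path to its listed dependency paths. It assumes the usual "<module path>: <dependency path> ..." line layout; the function name parse_modules_dep is hypothetical and is not taken from this branch.

```rust
use std::collections::HashMap;

/// Parse modules.dep-style text into a map of module path -> listed
/// dependency paths. Each line is assumed to look like
/// "<module path>: <dep path> <dep path> ...".
/// Hypothetical helper for illustration only.
fn parse_modules_dep(text: &str) -> HashMap<String, Vec<String>> {
    let mut deps = HashMap::new();
    for line in text.lines() {
        // Skip blank lines and anything without the "path:" separator.
        let Some((module, rest)) = line.split_once(':') else {
            continue;
        };
        let list: Vec<String> = rest.split_whitespace().map(str::to_string).collect();
        deps.insert(module.trim().to_string(), list);
    }
    deps
}
```

A module without dependencies simply maps to an empty list, so callers can treat "no entry" and "no dependencies" differently if they need to.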

The preliminary benchmark results look good, particularly for initramfs
image generation (cut):

---------------+-------------------------------+-----------------------|
               | Before: rapido e4c6077        | After: rs_wip 9a90973 |
               | dracut-059+suse.769.g693ea004 |  rustc 1.91.0         |
---------------+-------------------------------+-----------------------|
simple-example |   2.389s +- 0.117             | 0.075s +- 0.001       |
cut            |                               |                       |
---------------+-------------------------------+-----------------------|
simple-example |   7.294s +- 0.065             | 4.746s +- 0.003       |
cut+boot+exit  |                               |                       |
---------------+-------------------------------+-----------------------|
simple-network |   2.572s +- 0.135             | 0.098s +- 0.000       |
cut            |                               |                       |
---------------+-------------------------------+-----------------------|
simple-network |   7.460s +- 0.126             | 4.926s +- 0.011       |
cut+boot+exit  |                               |                       |
---------------+-------------------------------+-----------------------|

To try these changes yourself, check out this branch and run:

cargo build --offline --release
./rapido cut simple-example

You may need to change your rapido.conf file a little if you
use env variables or shell callouts.

I'm flagging this as WIP, as there are still a few things to do:

  • boot with systemd as init, instead of only rapido-init
  • clean up interface before locking it in (parameter naming, etc.)
  • think about distro packaging / path assumptions
    • at the moment, it assumes bins are in target/release/* and conf is
      in the working directory
  • improve test coverage

I don't expect to convert remaining cut scripts before merge. Dracut and
rust based functionality should be able to live side by side, although
rapido.conf parsing is much less flexible in rust: no invocations,
currently no env var expansion, variables must be wrapped in {}.

ddiss added 30 commits November 6, 2025 10:56
Crosvm's rust argument library is very small and simple, while still
providing helpful functionality. It will be consumed by dracut-cpio in a
subsequent commit.

The unmodified, BSD licensed argument.rs source is lifted as-is from
https://chromium.googlesource.com/chromiumos/platform/crosvm
(release-R92-13982.B b6ae6517aeef9ae1e3a39c55b52f9ac6de8edb31).
The one-line crosvm.rs wrapper is needed to ensure that crosvm::argument
imports continue to work.

Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio is a minimal cpio archive creation utility written in Rust.
It provides support for a minimal set of features needed to create
performant and space-efficient initramfs archives:
- "newc" archive format only
- reproducible; inode numbers, uid/gid and mtime can be explicitly set
- data segment copy-on-write reflinks
  + using Rust io::copy()'s native copy_file_range() support[1]
  + optional archive data segment alignment for optimal reflink use[2]
- hardlink support
- comprehensive tests asserting GNU cpio binary output compatibility

1. Rust io::copy() copy_file_range()
   rust-lang/rust#75272

2. Data segment alignment
   We're bending the newc spec a bit to inject zeros after the file path
   to provide data segment alignment. These zeros are accounted for in
   the namesize, but some applications may only expect a single
   zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs
   handle this fine as long as PATH_MAX isn't exceeded.

Signed-off-by: David Disseldorp <[email protected]>
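
The data segment alignment note above can be sketched as a small calculation: pick a namesize large enough that the header plus the zero-padded name end on the requested boundary. This is only an illustration under stated assumptions; aligned_namesize and its parameters are hypothetical names, the alignment is assumed to be a non-zero multiple of 4, and the PATH_MAX limit mentioned above is not checked here.

```rust
/// Size of a newc header: 6 byte magic plus 13 fields of 8 hex characters.
const NEWC_HDR_LEN: u64 = 110;

/// Return the namesize to record for a path of `path_len` bytes so that the
/// file data following the header and name starts on an `align`-byte boundary.
/// `hdr_off` is the archive offset at which this entry's header starts.
/// `align` is assumed to be a non-zero multiple of 4 (hypothetical helper).
fn aligned_namesize(hdr_off: u64, path_len: u64, align: u64) -> u64 {
    // The minimum namesize covers the path and a single zero terminator.
    let min_namesize = path_len + 1;
    let data_off = hdr_off + NEWC_HDR_LEN + min_namesize;
    // Extra zeros needed to reach the next multiple of `align`.
    let pad = (align - (data_off % align)) % align;
    min_namesize + pad
}
```

With a 4 byte alignment this degenerates to the regular newc padding, except that the padding is counted in namesize rather than left implicit.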
This is a workaround for GRUB2's Btrfs implementation, which doesn't
correctly handle gaps between extents.

A fix has already been proposed upstream via
https://lists.gnu.org/archive/html/grub-devel/2021-10/msg00206.html

Given that this bug is severe, it makes sense to include this minimal
workaround.

Signed-off-by: David Disseldorp <[email protected]>
This will be used for future device major/minor testing. Convert the
current fifo test to use it.

Signed-off-by: David Disseldorp <[email protected]>
This tests dracut-cpio's handling of rmajor / rminor values compared to
GNU cpio. The test requires root, due to mknod invocation for block
device node creation.

Signed-off-by: David Disseldorp <[email protected]>
dev_t -> major/minor number mapping is more complicated than the
incorrect major=(dev_t >> 8) minor=(dev_t & 0xff) mapping that we
currently perform. Fix mapping to match Linux / glibc behaviour.

Fixes: dracutdevs/dracut#1695
Reported-by: Ethan Wu <[email protected]>
Signed-off-by: David Disseldorp <[email protected]>
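
For reference, the Linux / glibc dev_t layout that the commit above refers to can be sketched as follows. The bit masks follow glibc's gnu_dev_major() and gnu_dev_minor(); the function name dev_major_minor is hypothetical and this is not necessarily how the fix itself is written.

```rust
/// Split a Linux dev_t into (major, minor), following the bit layout used by
/// glibc's gnu_dev_major() / gnu_dev_minor(). Hypothetical helper name.
fn dev_major_minor(dev: u64) -> (u32, u32) {
    // Major: bits 8..19 hold the low 12 bits, bits 44..63 the high 20 bits.
    let major = ((dev >> 32) & 0xffff_f000) | ((dev >> 8) & 0x0000_0fff);
    // Minor: bits 0..7 hold the low 8 bits, bits 20..43 the high 24 bits.
    let minor = ((dev >> 12) & 0xffff_ff00) | (dev & 0x0000_00ff);
    (major as u32, minor as u32)
}
```

The simple shift-and-mask mapping only agrees with this for minor numbers below 256 and major numbers below 4096, which is why it went unnoticed for common devices.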
The previous dracut-cpio commits were applied verbatim from patches
generated from the dracut source via:
git format-patch 94fc50262f5e6c28d92782dc231fbb6c61855954^..94fc50262f5e6c28d92782dc231fbb6c61855954
git format-patch a9c67046431ccf5fd4f4c16c890695df388f0d38^..a9c67046431ccf5fd4f4c16c890695df388f0d38
git format-patch 0af11c5ea5018a3e1049a2207a9a671049651876^..0af11c5ea5018a3e1049a2207a9a671049651876
git format-patch 80e70f76d92b1a1c8e5cd10a06b70ef3f97d0899^..80e70f76d92b1a1c8e5cd10a06b70ef3f97d0899
git format-patch 8bd7ddf8197c14532cf05edac3203d08798af6f2^..8bd7ddf8197c14532cf05edac3203d08798af6f2
git format-patch acc629abb0d7a26f692f99e5a9cf8c8401bc6a86^..acc629abb0d7a26f692f99e5a9cf8c8401bc6a86

This change moves the nested src/dracut-cpio/ Cargo project to the
root, with source in src/main.rs . third_party is also moved under
src.

Signed-off-by: David Disseldorp <[email protected]>
warning: call to `.clone()` on a reference in this situation does
nothing
   --> src/main.rs:289:27
    |
289 |     let mut outpath = path.clone();
    |                           ^^^^^^^^
    |
    = note: the type `Path` does not implement `Clone`, so calling
`clone` on `&Path` copies the reference, which does not do anything and
can be removed
    = note: `#[warn(noop_method_call)]` on by default
help: remove this redundant call
    |
289 -     let mut outpath = path.clone();
289 +     let mut outpath = path;

Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio unit tests compare binary archive output with that of GNU
cpio, for the same set of input files. A recent change to upstream GNU
cpio, commit 6a94d5e ("New option --ignore-dirnlink"), causes some tests
to fail.
The failure is due to GNU cpio `--reproducible` now hardcoding directory
nlink values to 2, instead of using the st_nlink value reported by
stat().

Fix the unit tests by dropping the GNU cpio `--reproducible` alias
parameter, and instead specify `--ignore-devno --renumber-inodes`
explicitly, matching pre-6a94d5e GNU cpio `--reproducible` behaviour.

This fix has also been submitted to upstream dracut-ng.

Signed-off-by: David Disseldorp <[email protected]>
The 'std::' namespace prefix is unnecessary so drop it.

Signed-off-by: David Disseldorp <[email protected]>
In preparation for reusing the core cpio archiving library code for
rapido.

Signed-off-by: David Disseldorp <[email protected]>
The newly bundled dracut-cpio source is GPL-2.0 licensed, with an
additional BSD licensed third_party crosvm argument parsing library.
People should rely on the per-file license headers as an indicator.

Signed-off-by: David Disseldorp <[email protected]>
The compiler warns that it's unused for the normal build.

Signed-off-by: David Disseldorp <[email protected]>
Obtained from a https://github.com/cole14/rust-elf clone via:
  $ git archive --format=tgz -o rust-elf-v0.8.0.tgz v0.8.0
  $ sha256sum rust-elf-v0.8.0.tgz
    082a203a8b47a94cff85d055b03852dc69291c0d4c51d82d7b43c65fdd2a9188

The v0.8.0 tag references c4d5222a34a97e113f863f80399284767d725e28 .

Signed-off-by: David Disseldorp <[email protected]>
Use rust-elf to recursively walk through all ELF dependencies for a
given binary.

Plenty of missing bits:
- hardcoded input: only looks for 'ls' dependencies
- directory and kernel modules handling
- cpio archive generation
- lacks proper error handling
- is written by someone who doesn't know idiomatic rust (me)
  - not sure I want to learn it; heavy abstraction and explicit
    lifetimes scare me

Signed-off-by: David Disseldorp <[email protected]>
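
The dependency walk described above can be sketched independently of rust-elf. The example below uses an iterative worklist rather than recursion, and takes the DT_NEEDED lookup as a caller-supplied closure, so it makes no claims about the rust-elf API. gather_deps and the closure signature are hypothetical.

```rust
use std::collections::{BTreeSet, VecDeque};

/// Collect `root` and everything it transitively depends on.
/// `needed` returns the direct dependencies of a name (for ELF binaries this
/// would be the DT_NEEDED entries). Hypothetical helper for illustration.
fn gather_deps<F>(root: &str, mut needed: F) -> BTreeSet<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([root.to_string()]);
    while let Some(name) = queue.pop_front() {
        // insert() returns false if the name was already visited.
        if !seen.insert(name.clone()) {
            continue;
        }
        for dep in needed(&name) {
            if !seen.contains(&dep) {
                queue.push_back(dep);
            }
        }
    }
    seen
}
```

Tracking visited names keeps shared libraries such as libc from being processed once per dependent, and also guards against dependency cycles.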
`cargo test` runs tests in parallel by default, unless the
--test-threads=1 parameter is provided. This causes the dracut-cpio
unit tests to fail due to their working directory changes.

Add a mutex and hold it over the course of the changed directory, so
that the --test-threads=1 parameter is no longer needed.

Suggested-by: Benjamin Drung <[email protected]>
Fixes: dracut-ng/dracut-ng#1702
Signed-off-by: David Disseldorp <[email protected]>
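
A minimal sketch of the locking approach described above, assuming a single process-wide mutex is acceptable for all tests that change directory. CWD_LOCK and with_cwd are hypothetical names and are not taken from the dracut-cpio tests.

```rust
use std::env;
use std::io;
use std::path::Path;
use std::sync::Mutex;

/// Serialises tests that change the process-wide working directory.
static CWD_LOCK: Mutex<()> = Mutex::new(());

/// Run `f` with the working directory set to `dir`, then switch back to the
/// previous directory. The lock is held for the whole duration.
fn with_cwd<T>(dir: &Path, f: impl FnOnce() -> T) -> io::Result<T> {
    // A poisoned lock only means another test panicked while holding it;
    // the guard itself is still usable.
    let _guard = CWD_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    let prev = env::current_dir()?;
    env::set_current_dir(dir)?;
    let ret = f();
    env::set_current_dir(prev)?;
    Ok(ret)
}
```

Note that this sketch does not restore the directory if `f` panics; a drop guard would be needed for that.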
Drop the hardcoded 'ls' binary and allow users to provide a space
separated install list, e.g.
  ./rapido-cut --install "ls bash" my.initramfs

The actual cpio archive generation isn't hooked up yet.

Signed-off-by: David Disseldorp <[email protected]>
Archive entries are written out as the files are found, so ideally we
can reuse any file handle we have around from the elf parsing.
Next steps will be to use the rust-elf file-seek API, so that we're
not buffering the entire file data.

Signed-off-by: David Disseldorp <[email protected]>
rust-elf provides a seeking file API via the "std" feature, so use it
instead of buffering the entire file for parsing.

Signed-off-by: David Disseldorp <[email protected]>
We're archiving paths immediately as they're found, so we shouldn't
need to track this separately, at least not for now.

Signed-off-by: David Disseldorp <[email protected]>
Perform the stat in the caller instead, so that rapido-cut can avoid
a double stat.

Signed-off-by: David Disseldorp <[email protected]>
This should be squashed with the previous commit to avoid breaking
bisect.

Signed-off-by: David Disseldorp <[email protected]>
The main reason to split this from archive_path is to support pre-opened
file descriptors. E.g. for rapido-cut this will allow us to reuse the
elf_deps fd instead of closing and reopening to write file data to the
cpio archive.

Signed-off-by: David Disseldorp <[email protected]>
Reuse the elf_deps fd for it.

Signed-off-by: David Disseldorp <[email protected]>
Callers must now use archive_file for files and archive_path for
anything else, to allow for open-fd reuse when writing file data.
Convert dracut-cpio and hardlink callers.
A typo that I made in the initial dracut-cpio implementation sees the
stat()-reported major number used for both major and minor number in the
cpio archive.

Device major / minor numbers (as opposed to rmajor / rminor numbers) are
mostly ignored by initramfs, with the exception of hardlink association.
I've not seen any bug reports from users hitting this in the wild, but
theoretically cross-device archives could carry incorrectly colliding
major/minor/inode triplets which erroneously trigger initramfs hardlink
handling.

Signed-off-by: David Disseldorp <[email protected]>
initramfs / cpio allow for the tracking of hardlinks for nlink >= 2
entries using a combination of the inode, device major and minor
numbers.

dracut-cpio uses unique inode numbers within an archive via the global
state.ino counter. Device major/minor numbers are also renumbered, with
each unique source device obtaining a major/minor number mapped from the
index within dev_seen()/DevState array.

With archive-unique inode numbers, device major/minor mapping is
unnecessary. This change sees dracut-cpio behave the same as GNU
cpio --ignore-devno, where archive device major/minor numbers are
hardcoded to zero.

Hardlink tracking is simplified, replacing per-device HardlinkState
arrays with a global state.hls array. A hash could be used for faster
source inode+dev -> archive HardlinkState mapping, but the extra
size and complexity isn't worth it IMO, given that hardlinks should be
rare.

Signed-off-by: David Disseldorp <[email protected]>
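
The simplified hardlink tracking described above could look roughly like the sketch below: one archive-wide inode counter plus a flat list that is searched linearly. Only the state.ino / state.hls / HardlinkState names come from the commit message; the field names, ArchiveState and the method are hypothetical, and the caller is assumed to pass nlink only for non-directory entries.

```rust
/// Archive-side record for a source inode with nlink >= 2.
struct HardlinkState {
    src_dev: u64,
    src_ino: u64,
    archive_ino: u32,
}

/// Hypothetical archive-wide state.
struct ArchiveState {
    /// Next archive-unique inode number.
    ino: u32,
    /// All hardlinked source inodes seen so far.
    hls: Vec<HardlinkState>,
}

impl ArchiveState {
    fn new() -> Self {
        ArchiveState { ino: 0, hls: Vec::new() }
    }

    /// Return the inode number to write into the archive for a source file.
    fn archive_ino(&mut self, src_dev: u64, src_ino: u64, nlink: u64) -> u32 {
        if nlink >= 2 {
            // Linear search: hardlinks are expected to be rare.
            let found = self
                .hls
                .iter()
                .find(|h| h.src_dev == src_dev && h.src_ino == src_ino);
            if let Some(hl) = found {
                return hl.archive_ino;
            }
        }
        let ino = self.ino;
        self.ino += 1;
        if nlink >= 2 {
            self.hls.push(HardlinkState { src_dev, src_ino, archive_ino: ino });
        }
        ino
    }
}
```

Because every non-hardlinked entry receives a fresh inode number, the archived device major/minor fields can stay at zero without creating false hardlink matches.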
Inode numbers are unique (for non-hardlinks) within the archive, so
device ID mapping is unnecessary. Confirm that dracut-cpio behaves like
GNU cpio --ignore-devno. Check this by archiving the /tmp directory
alongside a working-directory nested file; despite differing source
device IDs, the archived major/minor numbers should be zero.

The test is skipped if stat(/tmp) fails, or working-directory and /tmp
device ids match.

Signed-off-by: David Disseldorp <[email protected]>
This is currently all my work (at SUSE), including dracut-cpio. Using
(GPL-2.0 OR GPL-3.0) instead of GPL-2.0 only makes the rust codebase a
little more compatible with other projects, while remaining copyleft.

Signed-off-by: David Disseldorp <[email protected]>
Likely only useful for rapido.rs, but I've split it out as a separate
project at https://github.com/ddiss/kv-conf . This source corresponds to
commit 1f8d1448c613f87ed681a928c61b686914273b6e.

Signed-off-by: David Disseldorp <[email protected]>
ddiss added 30 commits December 12, 2025 10:06
str::from_utf8() retains zeros, so we need to explicitly drop the
trailer and any padding.

Signed-off-by: David Disseldorp <[email protected]>
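
To illustrate the point above: str::from_utf8() treats NUL bytes as valid UTF-8, so a zero-terminated and zero-padded buffer has to be cut before conversion. The sketch below cuts at the first zero byte; name_from_bytes is a hypothetical helper and the actual change may trim differently.

```rust
use std::str;

/// Decode a zero-terminated (and possibly zero-padded) byte field.
/// Hypothetical helper for illustration.
fn name_from_bytes(buf: &[u8]) -> Result<&str, str::Utf8Error> {
    // from_utf8() would keep the zeros, so stop at the first one.
    let end = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
    str::from_utf8(&buf[..end])
}
```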
Leap 15.6 doesn't carry it. It might be worth doing the same for modules.softdep
and modules.alias if there are systems that don't carry them.
Drop a few superfluous comments too.

Signed-off-by: David Disseldorp <[email protected]>
I used this for early testing kv-conf using my local rapido.conf, and
forgot to remove it before committing. rapido-cut should be useful for
catching rapido.conf / kv-conf issues nowadays.

Signed-off-by: David Disseldorp <[email protected]>
The current bin and lib search paths assume a usr-merge system, which
results in missing libraries on non-usr-merge systems such as Leap 15.6.

Change missing tracking to actually return an error, so that users are
aware of the search failure. In future we could add best-effort
searching via --try-install or something.

Signed-off-by: David Disseldorp <[email protected]>
This effectively reverts 0bae18c ("rapido-vm: start /init instead of
/rapido-init"). Dracut installs into /init and doesn't allow for it
to be replaced, so we need to use a different path and specify it at
VM boot time. Use "/rdinit".

We'll need to rework a few things when rapido-cut supports systemd init,
but this will do for now.

Signed-off-by: David Disseldorp <[email protected]>
Drop state entries for delayed tracking of missing items. It's cleaner
if we just return an error immediately.

Signed-off-by: David Disseldorp <[email protected]>
The current (String, Option<String>) tuple is cumbersome, especially
given that dst is mostly None.
This enum should also allow us to more easily support static &str
entries.

Signed-off-by: David Disseldorp <[email protected]>
More prep for supporting static &str Gather entries.

Signed-off-by: David Disseldorp <[email protected]>
This enum entry accepts a (&'static str) member, so we can drop a whole
bunch of to_string() calls for base dependency names.

Signed-off-by: David Disseldorp <[email protected]>
Cleanup from previous commit.

Signed-off-by: David Disseldorp <[email protected]>
rapido-cut --install now aborts on missing bin/lib items. xfstests and
other test suites have a bunch of soft dependencies, where certain tests
will only run if a specific binary is available.

--try-install will archive a binary if available, or skip it if missing.
ELF dependencies for a --try-install binary are handled as mandatory: if
a binary is installed then it's reasonable to assume that the package
manager has ensured that all dependencies are available.

Signed-off-by: David Disseldorp <[email protected]>
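
The difference between --install and --try-install described above can be sketched as a lookup that either fails or returns nothing when a name is missing. The Policy enum, locate function and search-directory parameter are hypothetical; only the two flag names come from the commit message.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// How a missing item should be treated (hypothetical type).
enum Policy {
    /// --install: a missing item is an error.
    Install,
    /// --try-install: a missing item is skipped.
    TryInstall,
}

/// Look for `name` in each of `dirs`, in order.
fn locate(name: &str, dirs: &[&Path], policy: Policy) -> io::Result<Option<PathBuf>> {
    for dir in dirs {
        let candidate = dir.join(name);
        if candidate.exists() {
            return Ok(Some(candidate));
        }
    }
    match policy {
        Policy::TryInstall => Ok(None),
        Policy::Install => Err(io::Error::new(
            io::ErrorKind::NotFound,
            format!("{name} not found in search paths"),
        )),
    }
}
```

Following the commit message, ELF dependencies of a binary found via the optional path would then be looked up with the mandatory policy.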
We only have one static user (rapido-init), so there's no need to retain
a String for it. We may need to revert this (or add a separate type) if
we decide to again provide a way for users to install to an arbitrary
destination.

Signed-off-by: David Disseldorp <[email protected]>
This cleans up the lib and bin callers a bit, and puts us in a position
to support GatherEnt entries with additional custom search paths derived
from RUNPATH.

Signed-off-by: David Disseldorp <[email protected]>
This type allows extra search paths to be specified alongside library
names. It'll be used in future for ELF RUNPATH handling.

Signed-off-by: David Disseldorp <[email protected]>
Parse DT_RUNPATH alongside DT_NEEDED and ensure that any provided paths
are used alongside the default library search paths.

As an example, fstests cut scripts install fio, which depends on
librados.so. librados.so requires libceph-common.so, which lives under
/usr/lib64/ceph/ on Leap 15.6 and Tumbleweed. /usr/lib64/ceph/ is
carried as a DT_RUNPATH entry for librados.so. This change ensures that
libceph-common.so is successfully located.

We should also be able to remove the systemd library search path from
defaults, as it appears that systemd also uses DT_RUNPATH.

Signed-off-by: David Disseldorp <[email protected]>
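
As a sketch of how DT_RUNPATH entries might be combined with default search paths: the example below splits a colon-separated RUNPATH string and places those directories ahead of the defaults, mirroring the order the dynamic loader uses. That ordering, the lib_search_dirs name and the lack of $ORIGIN handling are assumptions of this example, not statements about the commit above.

```rust
use std::path::PathBuf;

/// Build an ordered, de-duplicated list of library search directories from an
/// optional colon-separated DT_RUNPATH string and a list of defaults.
/// Hypothetical helper; $ORIGIN and similar tokens are not expanded.
fn lib_search_dirs(runpath: Option<&str>, defaults: &[&str]) -> Vec<PathBuf> {
    let mut dirs: Vec<PathBuf> = Vec::new();
    let mut push_unique = |d: &str| {
        let p = PathBuf::from(d);
        if !d.is_empty() && !dirs.contains(&p) {
            dirs.push(p);
        }
    };
    // RUNPATH directories first, then the defaults.
    for d in runpath.unwrap_or("").split(':') {
        push_unique(d);
    }
    for d in defaults.iter().copied() {
        push_unique(d);
    }
    dirs
}
```

For the librados.so case above, a RUNPATH of /usr/lib64/ceph would put that directory in front of the defaults, so libceph-common.so is found there.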
Leap 15.6 kernel builds with CONFIG_MODPROBE_PATH="/sbin/modprobe",
which needs to exist for auto filesystem module load on mount.
Ensure that a symlink is present.
Unfortunately we can't rely on a /sbin -> /usr/sbin symlink alone,
as Leap carries separate binaries in each dir.

Signed-off-by: David Disseldorp <[email protected]>
All systemd binaries on 15.6 and Tumbleweed carry
RUNPATH=/usr/lib64/systemd, so we probably don't need to retain it as
a default library search path.

Signed-off-by: David Disseldorp <[email protected]>
xfstests has plenty of soft dependencies, where a test may be run if a
binary (kmod, option, etc.) exists or otherwise skipped. Move a bunch
of these binaries from --install to --try-install so that we don't abort
if these soft dependencies are missing.

One other change here is that we now pull in nano instead of vim. vim
may be a symlink to gvim, which pulls in a huge number of libraries.
nano should be a little bit lighter.

Signed-off-by: David Disseldorp <[email protected]>
resize is part of the xterm package and may not be around. There really
should be an easier way to propagate the terminal size.

Signed-off-by: David Disseldorp <[email protected]>
Add "use std::str;" where needed, and switch one
io::ErrorKind::InvalidFilename to InvalidData.
1.85 is the most recent rustc version available on Ubuntu 24.04.3.

Signed-off-by: David Disseldorp <[email protected]>
Ubuntu carries this in /etc/ld.so.conf.d/x86_64-linux-gnu.conf . We
could also parse /etc/ld.so.conf entries, but I'd really prefer not
to if we can get away with it.

Signed-off-by: David Disseldorp <[email protected]>
Ubuntu 24.04.3 LTS can run rust-based rapido using the distro kernel.
However, simple-example isn't an option due to a lack of zram, so use
simple-network instead.

Signed-off-by: David Disseldorp <[email protected]>
Signed-off-by: David Disseldorp <[email protected]>
It appears to be present on the Github VM, so see whether we can use it.
Some additional changes split out the cut / boot tests, so that we can
inspect the initramfs image before boot.

Signed-off-by: David Disseldorp <[email protected]>
Signed-off-by: David Disseldorp <[email protected]>
On Ubuntu, initramfs-tools-core carries an lsinitramfs script, which is
a wrapper for unmkinitramfs --list, which is a wrapper for GNU cpio.

Signed-off-by: David Disseldorp <[email protected]>
Use something a little shorter. I've left the src/bin/kmod/main.rs
example code as-is.

Signed-off-by: David Disseldorp <[email protected]>
Pass the mutable line buffer to kv_process, so that it can clear any
processed data and retain unprocessed multi-line portions. A (now
non-mut) map still needs to be provided for ${} variable substitution.

This should put us in a better position to support callback based kv
parsing or (perhaps?) iter().

Signed-off-by: David Disseldorp <[email protected]>
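
A minimal sketch of ${} variable substitution against a read-only map, as mentioned above. This is not the kv-conf implementation: the expand function, the error type, and the choice to treat an unknown variable as an error are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Replace every ${NAME} in `val` with the corresponding value from `map`.
/// Text without the ${...} wrapping (e.g. "$NAME") is left untouched.
/// Hypothetical helper; unknown names are treated as errors here.
fn expand(val: &str, map: &HashMap<String, String>) -> Result<String, String> {
    let mut out = String::new();
    let mut rest = val;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated ${{ in {val:?}"))?;
        let name = &after[..end];
        let sub = map
            .get(name)
            .ok_or_else(|| format!("unknown variable {name:?}"))?;
        out.push_str(sub);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

Since the map is only read, earlier key/value pairs can be substituted into later values without the parser needing mutable access to it during expansion.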