Describe the problem you're observing
Relatively new Ubuntu install with root on ZFS (following the guide). I've configured a few LXC containers and not much else. The rpool/ROOT dataset is on a 3-drive raidz1 and has a single snapshot, fresh-install, taken shortly after the system was installed.
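For reference, the layout described above is what the standard zpool/zfs commands report (a sketch, output omitted; only rpool/ROOT and the fresh-install snapshot name are taken from this report):

```sh
# Pool/dataset layout described above (sketch; output omitted)
zpool status rpool                     # 3-drive raidz1 vdev
zfs list -t snapshot -r rpool/ROOT     # single snapshot: fresh-install
```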
I checked out the zfs-2.3.1 branch to compile from source and upgrade from noble's stock 2.2.2, following the standard steps listed here. No special flags, just building the native deb-based DKMS packages. Everything compiled successfully and I then ran sudo apt-get install --fix-missing ./*.deb to install the resulting packages (after rm'ing the dracut package).
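For completeness, this is roughly the sequence (a sketch of the standard packaging steps referenced above; the exact make target and package file names are assumptions and may differ slightly):

```sh
# Approximate build/upgrade sequence (sketch of the standard packaging steps;
# exact make target and package file names may differ)
git checkout zfs-2.3.1
sh autogen.sh
./configure                        # no special flags
make -j"$(nproc)" native-deb       # native Debian/DKMS packages
rm ./*dracut*.deb                  # drop the dracut package (this system uses initramfs-tools)
sudo apt-get install --fix-missing ./*.deb
```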
The install got most of the way through, and then the entire system soft-locked while apt was configuring mdadm. I was still able to ssh in and run some troubleshooting commands: the 1-minute load average was around 2500 (!), with thousands of kworker/events_unbound threads, each with a mount child process. Examining further showed that ls /.zfs/snapshot/fresh-install/etc (run as part of mdadm.postinst configure -> grub-mkconfig -o /boot/grub/grub.cfg -> /etc/grub.d/10_linux_zfs) was blocking, which likely triggered the issue.
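Roughly the kind of inspection, over ssh, that surfaces the above; these are standard process tools, not necessarily the exact commands used:

```sh
# Inspecting the hang over ssh (approximate commands)
uptime                                     # 1-minute load average ~2500
ps -e | grep -c kworker                    # thousands of kernel worker threads
ps auxf | grep -B1 ' mount'                # kworker/events_unbound threads each with a mount child
dmesg | grep -B2 -A25 'blocked for more'   # hung-task stacks (attached below)
```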
After waiting for some time, I rebooted the system (userland ZFS had seemingly updated successfully, but the host was still using the older 2.2.2 kernel module) and ran sudo dpkg --configure -a to finish the upgrade (and to see if the issue recurred). The system then soft-locked again at the same configuration step, which eventually calls ls /.zfs/snapshot/fresh-install/etc. After rebooting once more, zfs version now shows:
zfs-2.3.1-1
zfs-kmod-2.3.1-1
Manually running ls /.zfs/snapshot/fresh-install/etc now causes no issues (likely because the kernel module is no longer 2.2.2).
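As a sanity check after the upgrade, the automount path can be exercised directly (a sketch; the automounted snapshot shows up as a transient zfs mount until it expires):

```sh
# Sanity-check the snapshot automount after the upgrade (sketch)
ls /.zfs/snapshot/fresh-install/etc > /dev/null && echo "automount OK"
findmnt -t zfs | grep fresh-install        # the snapshot appears as a transient zfs mount
```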
Based on the stack trace below, zpl_snapdir_automount seems to be involved, and overall my experience looks very similar to #6154 (and possibly related to #13327 as well), but I'm not familiar enough with the code to troubleshoot further.
Describe how to reproduce the problem
Reproduced once: after rebooting, dpkg re-ran /usr/bin/perl /usr/share/debconf/frontend /var/lib/dpkg/info/mdadm.postinst configure, which triggered /usr/sbin/grub-mkconfig -o /boot/grub/grub.cfg -> /etc/grub.d/10_linux_zfs and eventually ls /.zfs/snapshot/fresh-install/etc. I haven't yet tried to reproduce it on a fresh install.
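The dpkg run shouldn't be strictly necessary to reproduce; with the mismatched userland/kmod in place, the same path can presumably be hit directly with the commands from the chain above:

```sh
# Direct trigger for the same code path (assumes 2.3.1 userland over the still-loaded
# 2.2.2 module, and the fresh-install snapshot on the root dataset)
ls /.zfs/snapshot/fresh-install/etc
# or via the same grub hook dpkg goes through:
sudo grub-mkconfig -o /boot/grub/grub.cfg
```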
Include any warning/errors/backtraces from the system logs
thousands of kworker/mount processes:
ps auxf of kworker/mount:
ps auxf snippet of apt install:
ps auxf of dpkg --configure -a:
hung_task_timeout_secs output for ls:
hung_task_timeout_secs output for mount.zfs:
stack trace from a kworker thread & mount child process:
dpkg.log:
iostat -x showing low load:

There's a known issue in 2.3.1 where its userspace programs can break or act strangely against a 2.3.0-or-earlier kernel module. Since the snapshot automounter calls out to userspace for its work, I'd put money on that being the problem here.
The issue is fixed in #17137 and scheduled for release in 2.3.2. However, it's probably not worth taking that patch now, as you've already done the reboot.
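For anyone who hasn't rebooted yet, a quick check before anything walks /.zfs/snapshot (grub-mkconfig, ls, etc.) is to confirm that the userland tools and the loaded module agree, for example:

```sh
# Check for a userland/kmod mismatch before triggering snapshot automounts (sketch)
zfs version                    # first line: userland; second line: zfs-kmod (loaded module)
cat /sys/module/zfs/version    # version string reported by the loaded module itself
# if these disagree (e.g. zfs-2.3.1 vs zfs-kmod-2.2.2), reboot or reload the module
# before running grub-mkconfig or anything else that touches /.zfs/snapshot
```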