System lockup on access to ZFS snapshots (mismatched module version) #17252

Open
IvanVolosyuk opened this issue Apr 17, 2025 · 4 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@IvanVolosyuk
Contributor

IvanVolosyuk commented Apr 17, 2025

I am experiencing a strange lockup of my system. So far I have reproduced it twice by typing this in a terminal:
losetup -P /dev/loop0 /zpool/dataset/.zfs/snapshot/2025-04-17/<TAB>

Looks somewhat similar to #11169

Right after I press Tab, tmux hangs in every open shell. I cannot type in any shell, including VT1 with an open root console.
The filesystem in general seems to work, but something is badly locked up with all ptys?

Both the VM and Chrome on the host can still access the filesystem.
There is a Windows VM that keeps running fine, and games in it can access the filesystem without problems. I can shut down the VM, but that doesn't help with the lockup.
I tried to open snapshot dirs from the browser, but all */.zfs/snapshot directories just hang.

Update: It turns out the kernel module and the userspace utils were at mismatched versions.

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Gentoo |
| Distribution Version | Live |
| Kernel Version | 6.12.21-gentoo |
| Architecture | x86_64 |
| OpenZFS Version (kernel) | 2.2.7 |
| OpenZFS Version (userspace) | 2.3.0 |

Describe the problem you're observing

Partial system lockup

Describe how to reproduce the problem

Type in shell:
cd /zpool/dataset/.zfs/snapshot/2025-03-17/<TAB>
After I press <TAB>, all terminals are dead. Snapshot dirs in every pool hang when I try to list the files in them.

Include any warning/errors/backtraces from the system logs

logs.txt

@IvanVolosyuk IvanVolosyuk added the Type: Defect Incorrect behavior (e.g. crash, hang) label Apr 17, 2025
@IvanVolosyuk
Contributor Author

It turns out that if I access the snapshot dir from Chrome, all terminals lock up as well. Tmux is dead, VT1 is dead: I cannot type a single character. Magic. I'll try an older kernel version and a newer ZFS to see if it makes a difference.

@IvanVolosyuk
Contributor Author

OK, looks like I figured it out: the userspace ZFS version was 2.3.0, while the kernel module was 2.2.7. I didn't expect that this could cause a system lockup.
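
For anyone checking for the same kind of mismatch, here is a minimal sketch. It assumes that `zfs version` prints one `zfs-<version>` line for userspace and one `zfs-kmod-<version>` line for the loaded module, and it compares only the leading `X.Y.Z` part, ignoring distribution and `rc` suffixes. The helper name `zfs_version_check` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `zfs version` style text on stdin and reports
# whether the userspace and kernel-module versions agree.
# Exit status: 0 = match, 1 = mismatch, 2 = could not parse.
zfs_version_check() {
    awk '
        /^zfs-kmod-/ { sub(/^zfs-kmod-/, ""); kmod = $0; next }
        /^zfs-/      { sub(/^zfs-/, "");      user = $0 }
        END {
            # Keep only the part before the first "-" (e.g. "2.3.0").
            split(user, u, "-"); split(kmod, k, "-")
            if (u[1] == "" || k[1] == "") { print "unknown"; exit 2 }
            if (u[1] == k[1]) { print "match " u[1]; exit 0 }
            print "MISMATCH userspace=" u[1] " kernel=" k[1]; exit 1
        }'
}

# Typical use on a live system:
#   zfs version | zfs_version_check
# Example with canned input matching the versions in this report:
#   printf 'zfs-2.3.0-r1\nzfs-kmod-2.2.7-r1\n' | zfs_version_check
#   prints: MISMATCH userspace=2.3.0 kernel=2.2.7
```

Running a check like this after every kernel or ZFS upgrade, before touching `.zfs/snapshot`, would have flagged the 2.3.0 / 2.2.7 combination described here.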

@snajpa
Contributor

snajpa commented Apr 17, 2025

A stack trace from that mount.zfs process would be useful; even with a version mismatch, it still shouldn't cause crashes, I would say :)
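
One way to collect such a trace is sketched below. It assumes a working shell is still reachable (for example over SSH, since the local terminals are reported dead) and that `/proc/<pid>/stack` is readable, which usually requires root and a kernel built with stack trace support. Both helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "PID STAT COMM" lines on stdin and prints the
# PIDs of tasks in uninterruptible sleep (STAT beginning with "D").
list_blocked_pids() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Hypothetical helper: print the kernel stack of every blocked task.
dump_blocked_stacks() {
    ps -eo pid=,stat=,comm= | list_blocked_pids | while read -r pid; do
        echo "== pid $pid ($(cat "/proc/$pid/comm" 2>/dev/null))"
        cat "/proc/$pid/stack" 2>/dev/null || echo "(stack not readable)"
    done
}

# Typical use, as root:
#   dump_blocked_stacks
```

If SysRq is enabled, `echo w > /proc/sysrq-trigger` asks the kernel to log all blocked tasks itself, and the result can then be read from the kernel log.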

@IvanVolosyuk
Contributor Author

IvanVolosyuk commented Apr 17, 2025

It doesn't actually crash: a VM running on the system is fully functional and Chrome seems to be working as well, but for whatever reason all the terminals I had open are borked. Looks like the systemd logs contain the stack trace, @snajpa:

Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x0000000079ff9000-0x0000000079ffbfff] reserved
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x0000000079ffc000-0x0000000079ffffff] usable
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x000000007a000000-0x000000007bffffff] reserved
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x000000007d7f3000-0x000000007fffffff] reserved
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x00000000e0000000-0x00000000efffffff] reserved
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x00000000f7000000-0x00000000ffffffff] reserved
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x0000000100000000-0x000000205de7ffff] usable
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x000000205eec0000-0x00000020a01fffff] reserved
Apr 17 21:10:58 toster kernel: BIOS-e820: [mem 0x000000fd00000000-0x000000ffffffffff] reserved
Apr 17 21:10:58 toster kernel: Disabled PAT check type (experimental)
Apr 17 21:10:58 toster kernel: NX (Execute Disable) protection: active
Apr 17 21:10:58 toster kernel: APIC: Static calls initialized
Apr 17 21:10:58 toster kernel: e820: update [mem 0x597b2018-0x597d6a57] usable ==> usable
Apr 17 21:10:58 toster kernel: e820: update [mem 0x5979d018-0x597b1257] usable ==> usable
Apr 17 21:10:58 toster kernel: e820: update [mem 0x5978f018-0x5979c857] usable ==> usable
Apr 17 21:10:58 toster kernel: e820: update [mem 0x59784018-0x5978ee57] usable ==> usable
Apr 17 21:10:58 toster kernel: extended physical RAM map:
Apr 17 21:10:58 toster kernel: reserve setup_data: [mem 0x0000000000000000-0x000000000009ffff] usable
Apr 17 21:10:58 toster kernel: reserve setup_data: [mem 0x00000000000a0000-0x00000000000fffff] reserved
Apr 17 21:05:30 toster kernel: RAX: ffffffffffffffda RBX: 00005577d6a46d30 RCX: 00007f6533895f4b
Apr 17 21:05:30 toster kernel: RDX: 0000000000000000 RSI: 00007fffd05958a0 RDI: 00007fffd0598cc0
Apr 17 21:05:30 toster kernel: RBP: 00007fffd0598b90 R08: 0000000000000000 R09: 0000000000000000
Apr 17 21:05:30 toster kernel: R10: 0000000000000100 R11: 0000000000000246 R12: 00007fffd0595b40
Apr 17 21:05:30 toster kernel: R13: 00007fffd0598cc0 R14: 0000000000000000 R15: 00005577d6a3c300
Apr 17 21:05:30 toster kernel:  </TASK>
Apr 17 21:05:30 toster kernel: task:mount.zfs       state:D stack:0     pid:8437  tgid:8437  ppid:8436   flags:0x00000002
Apr 17 21:05:30 toster kernel: Call Trace:
Apr 17 21:05:30 toster kernel:  <TASK>
Apr 17 21:05:30 toster kernel:  __schedule+0x4af/0xb50
Apr 17 21:05:30 toster kernel:  schedule+0x27/0xd0
Apr 17 21:05:30 toster kernel:  schedule_timeout+0x125/0x140
Apr 17 21:05:30 toster kernel:  ? kick_pool+0x65/0x130
Apr 17 21:05:30 toster kernel:  wait_for_completion_state+0x112/0x1e0
Apr 17 21:05:30 toster kernel:  call_usermodehelper_exec+0x142/0x170
Apr 17 21:05:30 toster kernel:  zfsctl_snapshot_mount+0x565/0x840
Apr 17 21:05:30 toster kernel:  zpl_snapdir_automount+0x10/0x20
Apr 17 21:05:30 toster kernel:  __traverse_mounts+0x89/0x200
Apr 17 21:05:30 toster kernel:  step_into+0x34b/0x760
Apr 17 21:05:30 toster kernel:  ? lookup_fast+0xb2/0xe0
Apr 17 21:05:30 toster kernel:  path_lookupat+0x6d/0x1b0
Apr 17 21:05:30 toster kernel:  filename_lookup+0xc2/0x1a0
Apr 17 21:05:30 toster kernel:  ? common_perm_cond+0x3e/0x1b0
Apr 17 21:05:30 toster kernel:  ? generic_fillattr+0x49/0x110
Apr 17 21:05:30 toster kernel:  ? from_kgid_munged+0x12/0x20
Apr 17 21:05:30 toster kernel:  ? cp_new_stat+0x13c/0x150
Apr 17 21:05:30 toster kernel:  user_path_at+0x37/0x50
Apr 17 21:05:30 toster kernel:  user_statfs+0x34/0x90
Apr 17 21:05:30 toster kernel:  __do_sys_statfs+0x10/0x30
Apr 17 21:05:30 toster kernel:  ? syscall_trace_enter+0xfb/0x190
Apr 17 21:05:30 toster kernel:  do_syscall_64+0x52/0x120
Apr 17 21:05:30 toster kernel:  entry_SYSCALL_64_after_hwframe+0x50/0x58
Apr 17 21:05:30 toster kernel: RIP: 0033:0x7f22ddacff4b
Apr 17 21:05:30 toster kernel: RSP: 002b:00007ffe229961b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000089
Apr 17 21:05:30 toster kernel: RAX: ffffffffffffffda RBX: 000055e714ef6d30 RCX: 00007f22ddacff4b
Apr 17 21:05:30 toster kernel: RDX: 0000000000000000 RSI: 00007ffe229961e0 RDI: 00007ffe22999600
Apr 17 21:05:30 toster kernel: RBP: 00007ffe229994d0 R08: 0000000000000000 R09: 0000000000000000
Apr 17 21:05:30 toster kernel: R10: 0000000000000100 R11: 0000000000000246 R12: 00007ffe22996480
Apr 17 21:05:30 toster kernel: R13: 00007ffe22999600 R14: 0000000000000000 R15: 000055e714eec300
Apr 17 21:05:30 toster kernel:  </TASK>
Apr 17 21:05:30 toster kernel: task:mount.zfs       state:D stack:0     pid:8440  tgid:8440  ppid:8439   flags:0x00000002
Apr 17 21:05:30 toster kernel: Call Trace:
Apr 17 21:05:30 toster kernel:  <TASK>
Apr 17 21:05:30 toster kernel:  __schedule+0x4af/0xb50
Apr 17 21:05:30 toster kernel:  schedule+0x27/0xd0
Apr 17 21:05:30 toster kernel:  schedule_timeout+0x125/0x140
Apr 17 21:05:30 toster kernel:  ? pwq_tryinc_nr_active+0xc7/0x160
Apr 17 21:05:30 toster kernel:  ? insert_work+0x3f/0x80
Apr 17 21:05:30 toster kernel:  wait_for_completion_state+0x112/0x1e0
Apr 17 21:05:30 toster kernel:  call_usermodehelper_exec+0x142/0x170
Apr 17 21:05:30 toster kernel:  zfsctl_snapshot_mount+0x565/0x840
Apr 17 21:05:30 toster kernel:  zpl_snapdir_automount+0x10/0x20
Apr 17 21:05:30 toster kernel:  __traverse_mounts+0x89/0x200
Apr 17 21:05:30 toster kernel:  step_into+0x34b/0x760
Apr 17 21:05:30 toster kernel:  ? lookup_fast+0xb2/0xe0
Apr 17 21:05:30 toster kernel:  path_lookupat+0x6d/0x1b0
Apr 17 21:05:30 toster kernel:  filename_lookup+0xc2/0x1a0
Apr 17 21:05:30 toster kernel:  ? common_perm_cond+0x3e/0x1b0
Apr 17 21:05:30 toster kernel:  ? generic_fillattr+0x49/0x110
Apr 17 21:05:30 toster kernel:  ? from_kgid_munged+0x12/0x20
Apr 17 21:05:30 toster kernel:  ? cp_new_stat+0x13c/0x150
Apr 17 21:05:30 toster kernel:  user_path_at+0x37/0x50
Apr 17 21:05:30 toster kernel:  user_statfs+0x34/0x90
Apr 17 21:05:30 toster kernel:  __do_sys_statfs+0x10/0x30
Apr 17 21:05:30 toster kernel:  ? syscall_trace_enter+0xfb/0x190
Apr 17 21:05:30 toster kernel:  do_syscall_64+0x52/0x120
Apr 17 21:05:30 toster kernel:  entry_SYSCALL_64_after_hwframe+0x50/0x58
Apr 17 21:05:30 toster kernel: RIP: 0033:0x7f1a13dc2f4b
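
Both complete traces above show a `mount.zfs` task in state D, entered through `statfs` and waiting in `call_usermodehelper_exec` under `zfsctl_snapshot_mount`. To pull just the list of blocked tasks out of a long log like this one, a small filter can help. This is a sketch that assumes the `task:<name> state:D ... pid:<n>` header format seen in this log and task names without spaces; the helper name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints
# "<pid> <task name>" for every task header reported in state D.
blocked_tasks_from_log() {
    awk '/task:[^ ]+ +state:D / {
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^task:/) name = substr($i, 6)   # strip "task:"
            if ($i ~ /^pid:/)  pid  = substr($i, 5)   # strip "pid:"; tgid/ppid do not match
        }
        print pid, name
    }'
}

# Typical use against the kernel messages of the previous boot:
#   journalctl -k -b -1 | blocked_tasks_from_log
```

For the two task headers in the log above, this prints `8437 mount.zfs` and `8440 mount.zfs`.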

@IvanVolosyuk IvanVolosyuk changed the title System lockup on access to ZFS snapshots. System lockup on access to ZFS snapshots (mismatched module version) Apr 17, 2025