zfs online/replace fails on vdev with changed name #16983

Open
eharris opened this issue Jan 23, 2025 · 6 comments
Labels
Type: Defect Incorrect behavior (e.g. crash, hang)

Comments

@eharris
Contributor

eharris commented Jan 23, 2025

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Debian |
| Distribution Version | 12 (bookworm) |
| Kernel Version | 6.1.0-30-amd64 |
| Architecture | amd64 |
| OpenZFS Version | zfs-2.2.7-1~bpo12+1 |

Describe the problem you're observing

When a vdev changes name, the pool is degraded and zpool status gives the message:

action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.

zpool online with the new device name is silently ignored: it prints no warning or error, and it takes no action.

zpool replace (even with -f) fails, complaining that the new device is already a member of the active pool, even though zpool status shows it as the old name, and offline.

I did find issue #3242, a feature request for zpool online to be able to rename a device, which would fix the underlying problem. I'm opening this ticket as a bug, though, since I would expect neither zpool online to silently ignore the command nor zpool replace to recognize that the device belongs to the pool yet still fail, especially with -f.

At the least, it seems like the error messages (or no message) given by the commands should be improved.

Describe how to reproduce the problem

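truncate -s 1G /blah/file1                        # create two 1 GiB sparse files to use as vdevs
truncate -s 1G /blah/file2
zpool create test mirror /blah/file1 /blah/file2
zpool offline test /blah/file2                    # take one side of the mirror offline
mv /blah/file2 /blah/file3                        # simulate the device reappearing under a new name
zpool online test /blah/file3                     # silently ignored: no message, no action
zpool replace -f test /blah/file3                 # fails: device reported as already a member of the pool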
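
Since the report hinges on one command printing nothing at all, it may help to record the exit status and output of each step explicitly. The following is a minimal sketch; `run_step` is a hypothetical helper written for illustration and is not part of OpenZFS.

```shell
# Hypothetical helper (not part of OpenZFS): run a command, then report
# its exit status and combined stdout/stderr, marking empty output explicitly.
run_step() (
  out=$("$@" 2>&1)
  rc=$?
  printf '$ %s\n  exit=%d output=%s\n' "$*" "$rc" "${out:-<none>}"
)

# Example calls against the test pool from the steps above (need root and ZFS):
#   run_step zpool online test /blah/file3
#   run_step zpool replace -f test /blah/file3
```

A step that prints `output=<none>` while the pool stays degraded is the silent case described above.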

@eharris eharris added the Type: Defect Incorrect behavior (e.g. crash, hang) label Jan 23, 2025
@ixhamza
Member

ixhamza commented Jan 23, 2025

A warning message for zpool online was recently merged into master, though: #16244.

@n0xena

n0xena commented Jan 24, 2025

In regard to #16984 and #16985, this issue is somewhat invalid: you are trying to simulate something that won't happen on a real pool and is more of a hypothetical test. To put it another way: WHY would someone do such an operation on a real pool? WHAT would be the point of removing a drive just to re-add the very same drive at a later point in time? Can you give a real-world example?
If a drive fails, it gets replaced with a new drive and the old drive gets disposed of. I highly doubt that any serious admin would reuse a drive that was pulled from another pool.

@eharris
Contributor Author

eharris commented Jan 24, 2025

@n0xena I was asked for an easy way to reproduce, and I have given one.
As for it being invalid, it is not at all. I have several USB-attached SSD adapters that identify as:

/dev/disk/by-id/usb-General_Generic_PCIE_0123456789ABCDEF-0:0

The number at the end changes based on how many there are and in what order they were detected, so drives move around often, in real life.
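
To see which by-id name currently points at which kernel device, the symlinks can be resolved in a loop. This is a rough sketch; `list_links` is a hypothetical helper name, and the default directory is an assumption based on the path quoted above.

```shell
# Hypothetical helper: print "link -> resolved target" for every symlink
# in a directory (defaults to /dev/disk/by-id).
list_links() {
  dir=${1:-/dev/disk/by-id}
  for link in "$dir"/*; do
    [ -L "$link" ] || continue          # skip anything that is not a symlink
    printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
  done
}

# Example: list_links /dev/disk/by-id
```

Running it before and after replugging the adapters shows whether the trailing number moved to a different device.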

@ixhamza
Copy link
Member

ixhamza commented Jan 24, 2025

It still seems to be a duplicate of the #3242 feature request. The zpool online error message was recently improved, and zpool replace is designed to replace an existing device with a new one. If the disk is placed in the same slot, it also works. However, I don't think we can expect it to function the same way with a file vdev, as your example involves changing the file path.

@amotin
Member

amotin commented Jan 24, 2025

When a pool is imported, ZFS can search all available disks for its vdevs. When a removed disk is hot-plugged, it is up to zed/zfsd to reattach it. I just don't remember what zed/zfsd do if the name is different. I wonder if zpool online (or some equivalent used by zed/zfsd) might internally have an argument to specify a different device name. If not, then maybe one could be added. There is a reason why we use GPTID-based names in TrueNAS: they don't depend on disk identification.
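
Because import rescans for vdevs, one possible workaround is to export the pool and import it again while pointing `zpool import -d` at the directory that holds the renamed device. The sketch below assumes the pool can be taken offline briefly and has not been verified against the pool in this report; `reimport_pool` is a hypothetical helper that only prints the commands unless `RUN=1` is set.

```shell
# Hypothetical helper: print (or, with RUN=1, execute) an export followed by
# an import that searches the given directory for the pool's vdevs.
reimport_pool() (
  pool=$1
  search_dir=$2
  run() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then "$@"; fi
  }
  run zpool export "$pool" &&
  run zpool import -d "$search_dir" "$pool"
)

# Dry run for the test pool from this report:
#   reimport_pool test /blah
```

The dry-run default keeps the export from happening by accident on a pool that is in use.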

@eharris
Contributor Author

eharris commented Jan 24, 2025

@ixhamza #3242 is a feature request, and if that is fixed, that would be great (although I'm not holding my breath since it is 10 years old at this point).
This was filed as a bug because the information from online/replace is missing/insufficient/misleading. This ticket is not about asking for the feature to solve the problem, it is about fixing the messages to be more informative/useful as long as the capability isn't available.
