Deduplication is hit and miss: are there settings to determine what is matched? #17151
AncientMystic asked this question in Q&A (unanswered).
I am noticing that even when I move the exact same data, ZFS misses a lot of duplicates during deduplication. For example, I copied 163 GB of game files for a VM twice, and dedup only caught 72 GB, even though 100% of the data is duplicated.
Backups are the same: multiple identical copies of Windows, Linux, etc., and it is catching around 44% or less of the duplicate files on average.
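For reference, these are the properties I understand control what gets matched; the pool and dataset names below are placeholders, not my actual layout:

```sh
# Dedup is a per-dataset property and only applies to blocks written
# after it is enabled; data copied in beforehand is never deduplicated
# retroactively.
zfs get dedup,checksum,compression,recordsize tank/vmdata

# Enabling it; the "verify" variants (e.g. sha256,verify) do a
# byte-for-byte comparison before treating two blocks as identical.
zfs set dedup=on tank/vmdata
```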
Another example: I was hoping to use dedup to keep AI models in multiple VMs without taking up so much space, since they are huge and many apps want them in different locations or set up with different environments to run efficiently, so deduplication would be perfect for this. Yet ZFS somehow does not see a single one of the models I have moved onto the dataset as a duplicate, despite there being duplicates of all of them for different VMs.
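From what I have read, ZFS dedup matches whole on-disk blocks rather than files, so block size and alignment matter. A sketch of the properties involved, again with placeholder names (vm-100-disk-0 is just the usual Proxmox zvol naming, not necessarily mine):

```sh
# Datasets store files in records of up to "recordsize" (default 128K);
# two identical files in the same dataset should produce identical records.
zfs get recordsize tank/models

# Zvols (what Proxmox uses for VM disks) are carved into fixed
# "volblocksize" blocks; the guest filesystem decides where file data
# lands inside them, so identical guest files rarely align.
zfs get volblocksize tank/vm-100-disk-0
```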
I am wondering: is this to be expected, or is there something I am missing, such as a setting I should specify to make it check data for deduplication more thoroughly?
I am running ZFS on Proxmox with 96 GB of RAM, so I am fairly sure I have at least enough memory to run deduplication.
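For what it is worth, a rough sizing sanity check, assuming the commonly cited figure of about 320 bytes of memory per unique block in the dedup table (DDT):

```sh
# Back-of-the-envelope DDT sizing at the default 128K recordsize:
#   163 GiB / 128 KiB ≈ 1,335,296 unique blocks
#   1,335,296 * 320 B ≈ 427 MB of dedup table
# What the pool actually matched can be read directly:
zpool status -D tank        # DDT histogram (entries by reference count)
zdb -DD tank                # detailed DDT statistics
zpool get dedupratio tank   # achieved dedup ratio for the pool
```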
-
Reply (1 comment, 3 replies):
Are you trying to dedup data in VM images / volumes?