ZFS Snapshots: The case of the vanishing disk usage
3 Jun 2022
I have a home server that runs TrueNAS (which is based on FreeBSD) and uses ZFS for its storage filesystem. I recently ran into the common issue of not knowing where my disk usage was going, discovered that disk usage accounting for snapshots is more complex than for regular filesystems, and thought I’d write it up. Hopefully others find it helpful and/or enlightening.
The Problem
Disk usage generally seems like the sort of thing that should be easy to reason about. With modern filesystems, however, that isn’t really true anymore, particularly when it comes to copy-on-write snapshots like those ZFS supports.
On my server I have a directory called ‘backup’. Every now and then I zip up some files and dump them there in case I need them later.
Here is what du claims is the amount of disk space consumed by that directory:
% du -d 0 -h /mnt/Vol_1/Home/Jacques/backup/
256G /mnt/Vol_1/Home/Jacques/backup/
I also take monthly snapshots of this folder. Here is what zfs claims is the amount of disk space consumed by those snapshots:
% zfs list -t snapshot /mnt/Vol_1/Home/Jacques/backup/
NAME USED AVAIL REFER MOUNTPOINT
Vol_1/Home/Jacques/backup@snap-2021-04-01_03:00 72K - 27.3G -
Vol_1/Home/Jacques/backup@snap-2021-05-01_03:00 72K - 581G -
Vol_1/Home/Jacques/backup@snap-2021-06-01_03:00-25monthlifetime 384K - 581G -
Vol_1/Home/Jacques/backup@snap-2021-07-01_03:00-25monthlifetime 384K - 609G -
Vol_1/Home/Jacques/backup@snap-2021-08-01_03:00-25monthlifetime 376K - 609G -
Vol_1/Home/Jacques/backup@snap-2021-09-01_03:00-25monthlifetime 376K - 609G -
Vol_1/Home/Jacques/backup@snap-2021-10-01_03:00-25monthlifetime 344K - 609G -
Vol_1/Home/Jacques/backup@snap-2021-11-01_03:00-25monthlifetime 352K - 609G -
Vol_1/Home/Jacques/backup@snap-2022-02-01_03:00-25monthlifetime 376K - 637G -
Vol_1/Home/Jacques/backup@snap-2022-03-01_03:00-25monthlifetime 360K - 637G -
Vol_1/Home/Jacques/backup@snap-2022-04-01_03:00-25monthlifetime 416K - 637G -
Vol_1/Home/Jacques/backup@snap-2022-05-01_03:00-25monthlifetime 432K - 256G -
Vol_1/Home/Jacques/backup@snap-2022-06-01_03:00-25monthlifetime 432K - 256G -
Not a whole lot! This is one of the excellent features of ZFS. Until I change or delete the files that a snapshot captured, the snapshot itself consumes almost no disk space.
So you can imagine my surprise when I got a notification saying that I was running out of disk space, checked what zfs (rather than du) says about the usage, and got this:
% zfs list -t filesystem /mnt/Vol_1/Home/Jacques/backup/
NAME USED AVAIL REFER MOUNTPOINT
Vol_1/Home/Jacques/backup 637G 279G 256G /mnt/Vol_1/Home/Jacques/backup
637GB used. So we’re just…missing…around 381GB of disk space? That’s odd. A closer look at some man pages reveals a reasonable explanation though. In particular the zfsprops man page says this about the USED property:
The used space of a snapshot (...) is space that is referenced exclusively by this snapshot
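This is actually quite sensible. The USED column tells you how much storage will be freed up by destroying just that one individual snapshot. Data that is shared by two or more snapshots is not counted against any of them, which is why each individual figure is so small. How do you tell how much space would be freed up by destroying multiple snapshots?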
The Solution
Ask zfs destroy while very carefully explaining to it that you don’t actually want to delete anything:
zfs destroy -nv filesystem@snapshot1%snapshot2
This will ask ZFS to destroy all snapshots from snapshot1 to snapshot2 (inclusive).
Note the -nv in there though. -n tells it to do a “dry run” so that it doesn’t actually delete anything and -v tells it to show additional info.
Let’s give that a try:
% zfs destroy -nv Vol_1/Home/Jacques/backup@snap-2021-04-01_03:00%snap-2022-06-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2021-04-01_03:00
would destroy Vol_1/Home/Jacques/backup@snap-2021-05-01_03:00
would destroy Vol_1/Home/Jacques/backup@snap-2021-06-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2021-07-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2021-08-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2021-09-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2021-10-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2021-11-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2022-02-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2022-03-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2022-04-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2022-05-01_03:00-25monthlifetime
would destroy Vol_1/Home/Jacques/backup@snap-2022-06-01_03:00-25monthlifetime
would reclaim 382G
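The last line of that output is the figure of interest. As a minimal sketch, the dry run could be wrapped in a small shell function that prints only that figure. The function name reclaim_estimate and its argument order are made up for illustration; the only real pieces are zfs destroy -nv and the snapshot range syntax shown above.

```shell
# Hypothetical helper (not from the original post): print only the reclaim estimate.
# $1 = dataset, $2 = first snapshot name, $3 = last snapshot name
reclaim_estimate() {
    # -n keeps this a dry run, so nothing is destroyed; -v prints the summary.
    # awk keeps the third field of the "would reclaim <size>" line.
    zfs destroy -nv "$1@$2%$3" | awk '/^would reclaim/ { print $3 }'
}
```

For the run shown above, calling it with Vol_1/Home/Jacques/backup, snap-2021-04-01_03:00 and snap-2022-06-01_03:00-25monthlifetime would print 382G. Because the -n flag is hard-coded, the helper never deletes anything.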
If we assume that showing 382 instead of 381 is just down to rounding, then that explains the missing storage capacity. Interestingly, you may note that in the zfs list output above, the REFER value for the filesystem (256G) actually does match what du told us.
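For the special case of all snapshots of a dataset, the zfsprops man page also documents a usedbysnapshots property: the space that would be freed if every snapshot of the dataset were destroyed, which it notes is not simply the sum of the snapshots’ USED values. The sketch below puts the two numbers side by side. The function name snapshot_space is hypothetical, and it assumes that zfs list -t snapshot accepts the dataset name the same way it does in the listing earlier in this post.

```shell
# Hypothetical helper (not from the original post): compare the two ways of counting.
# $1 = dataset name, e.g. Vol_1/Home/Jacques/backup
snapshot_space() {
    # Per-snapshot USED values in exact bytes (-p), without a header (-H), summed.
    sum=$(zfs list -H -p -o used -t snapshot "$1" |
        awk '{ s += $1 } END { printf "%.0f\n", s }')
    # Space that destroying every snapshot of the dataset would free, in bytes.
    total=$(zfs get -H -p -o value usedbysnapshots "$1")
    echo "sum of USED: $sum bytes; usedbysnapshots: $total bytes"
}
```

On a dataset like the one above, the first number should come out at a few megabytes and the second at several hundred gigabytes, since most of the data is shared between snapshots rather than exclusive to any one of them.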
I should close by saying that I would certainly have spent far longer investigating this had I not stumbled upon one Matthew McDonald on GitHub who built a tool to help solve this mystery and wrote a readme that helpfully explains what’s going on. Many thanks to Matthew!