Welp, after a few harrowing days, I managed to recover all my data from a failing ZFS pool. It had been serving me well, with no significant issues since 2013, about 10 years at this point. But the time had come to destroy it, which I did. Goodbye old friend. 😢
On the bright side, I have two shiny new pools with shiny new disks, and a bunch of old hardware that needs a new use.
Also, I have to say that using a tool that automatically takes ZFS snapshots on some kind of schedule is a must-have. I have a filesystem that was badly corrupted but one of the snapshots survived intact. I was lucky that it was mostly a slowly-changing filesystem and the snapshot was up to date. I use sanoid for this purpose but I suppose there are many.
@prologic@twtxt.net Phew, ZFS. It's been a long time since I did the Solaris Admin courses 1 and 2. It is nice to work with ZFS when it works and you know how to set it up
@prologic@twtxt.net Oof, a long story. One disk went bad, and I replaced the disk. As that disk was resilvering, a second disk started dropping off the bus and then reappearing. This array is 6 disks arranged in 3 mirrors (so, 2 disks for each mirror), and that meant I had two mirrors with only one disk each supporting them for a while 😱 Anyway, something about that disk disappearing and reappearing threw the entire array into… disarray (pardon the pun). I can’t even explain what happened but it was really in a bad state and the resilvering just wouldn’t complete.
I bought some new disks, made a new array, and used zfs send to get as many filesystems as possible from the old array to the newly-built one. One filesystem was cranky, so I used an older snapshot of that one instead and was lucky that it worked fine. Finally, I rsynced a few directories that seemed like they might have been out of date in that old snapshot, and that worked too.
Having a tool that automatically takes snapshots on a regular interval saved my ass.
@prologic@twtxt.net Backups were at least an hour old, whereas I sent snapshots from the existing and fully up-to-date filesystems.
@prologic@twtxt.net Yeah I try to minimize TLAs