Das Nuclear Boot

So the startup I’ve been fortunate enough to be employed by for the last year has shut down operations, and I’ve been asked to stick around as part of the “shutdown team.” As I head up our European IT operations from London, this means I’m going to be doing a few things repeatedly over the next couple of weeks:

  1. Inventory
  2. Negotiating our way out of the contracts I’d previously negotiated into—telecom and colocation facilities for our far-flung empire, mostly
  3. Negotiating the sale of equipment
  4. Wiping hard drives
  5. Beating away the headhunters with a stick

For this post, I’m going to concentrate on item 4: wiping hard drives. In the past, I’d never been part of a bulk drive disposal. I’ve never been party to a large-scale infrastructure refresh at the server level, so old drives tended to simply get stuck in a cabinet as they died, or get re-used one at a time in desktops after a low-level format.

I’ll first get the pedantry out of the way: nothing you can do to a standard drive will completely remove the data. If you want to destroy a drive such that no information can be recovered, you need to approach it Fargo-style and throw it into an industrial chipper-shredder. Then you need to melt down the remains. Or you could just melt the thing from the start. It’s up to you.

However, if you simply want to make it cost-prohibitive to recover the data, then you really need to do very little. A couple of overwrite passes (on spinning rust) or resets (on SSD) are really all you need, and a random-data overwrite is unnecessary; zeros and ones are enough. I’m not aware of anyone who has successfully recovered overwritten/deleted data, and most of the people who were worried about it are skeptical that it’s something which can be accomplished today.

XKCD #538

When asked about this a couple of months ago, I would have answered differently: “Oh, you need the 34-pass overwrite to be sure it’s going to be clean, which will take a week on your 4TB drive, assuming you have enough entropy to generate 125MBps of random data for a week straight.” This is because when I don’t know for sure, my initial instincts are very 538: useful if I were working for a secretive government agency or a successful business, but totally the wrong approach when you’re trying to wind up a failed startup’s business and sell off the leftovers before year’s end. In that situation, you look for shortcuts.
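For a sense of scale, the figures in that old answer can be sanity-checked with a few lines of Python. This is only a back-of-envelope sketch: it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and a perfectly sustained write rate, and the function name is made up for illustration.

```python
# Back-of-envelope: how long does a multi-pass overwrite take?
# Assumption: decimal units and a perfectly sustained write rate.

def overwrite_days(capacity_bytes: int, bytes_per_second: float, passes: int) -> float:
    """Days needed to write the whole drive `passes` times."""
    seconds = passes * capacity_bytes / bytes_per_second
    return seconds / 86_400  # seconds per day


CAPACITY = 4 * 10**12   # the 4TB drive from the quote
RATE = 125 * 10**6      # 125MBps
PASSES = 34             # pass count from the quote

one_pass_hours = overwrite_days(CAPACITY, RATE, 1) * 24
all_passes_days = overwrite_days(CAPACITY, RATE, PASSES)

print(f"one pass:  {one_pass_hours:.1f} hours")
print(f"{PASSES} passes: {all_passes_days:.1f} days")
```

On those assumptions a single pass is roughly nine hours, and the full run comes out nearer two weeks than one, which only makes the hunt for shortcuts easier to justify.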

So the easiest shortcut to use has got to be the SATA security commands, which have a lot of advantages and a couple of hefty downsides. On the advantages side, it works (normatively) with SSDs. For spinning rust, it’s faster than writing data to a block device since it happens entirely on the disk itself, so you aren’t going to be bandwidth-constrained by the PCI bus.

On the downside, it’s pretty touchy to use properly, and it won’t actually work properly if any of the conditions aren’t met. In other words, there’s a long checklist of things that need to be done properly:

Make sure the mainboard has onboard SATA
You need the ability to send a SATA command directly to the drive you want to wipe. This means no RAID controllers, so you’re better off putting the drives directly into a workstation.
Make sure the SATA controller is in AHCI mode
By default, your controller should come in this mode already, though most motherboards will allow you to break SATA secure erase by putting the controller into RAID or IDE compatibility mode.
Ensure the BIOS controller will allow SATA security commands
The SATA security stuff we’re using to do secure erase is classified alongside the “hard drive password” stuff you should be able to see in the BIOS of your PC.
Ensure the drive supports SATA security commands
There’s nothing that says a SATA drive has to support secure erase. I’ve got quite a few HP-branded SATA drives that simply don’t support the security bits of SATA.
Ensure the drive isn’t in “frozen” mode
Some BIOSen will set your drive as “frozen” automatically on boot (i.e. they will fiddle with the SATA settings all on their own), which precludes you from setting the options you need in order to send the erase commands and wipe the disk. I’ve seen this on Apple-branded laptops, HP-Compaq 8400 Microtowers, and Lenovo C30 workstations. My Thinkpad T420 doesn’t do this, so it’s hit-and-miss, even within a brand. The solution is to suspend the machine to RAM, then wake it up. For whatever reason, the BIOS doesn’t re-mark the drives as frozen when it’s re-initializing the devices.
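The last two items on that checklist can be read off the "Security:" section that `hdparm -I` prints for a drive. Below is a minimal sketch of a parser for that section, assuming the usual layout (an unindented "Security:" header followed by indented lines such as "supported" or "not frozen"). The function names and the sample text are illustrative, not taken from any particular tool.

```python
def parse_security_state(hdparm_output: str) -> dict:
    """Pull the supported/enabled/locked/frozen flags out of `hdparm -I` text.

    Assumes a "Security:" header followed by indented lines like "supported"
    or "not frozen". Flags that never appear are left as None.
    """
    flags = {"supported": None, "enabled": None, "locked": None, "frozen": None}
    in_section = False
    for line in hdparm_output.splitlines():
        if line.strip() == "Security:":
            in_section = True
            continue
        if not in_section:
            continue
        if line.strip() and not line[0].isspace():
            break  # the next unindented header ends the Security section
        words = line.split()
        if len(words) == 1 and words[0] in flags:
            flags[words[0]] = True
        elif len(words) == 2 and words[0] == "not" and words[1] in flags:
            flags[words[1]] = False
    return flags


def ready_for_secure_erase(flags: dict) -> bool:
    """True only if the drive supports security and is neither frozen nor locked."""
    return (flags["supported"] is True
            and flags["frozen"] is False
            and flags["locked"] is False)


# Illustrative sample of the Security section for a drive the BIOS has frozen.
SAMPLE = (
    "Security: \n"
    "\tMaster password revision code = 65534\n"
    "\t\tsupported\n"
    "\tnot\tenabled\n"
    "\tnot\tlocked\n"
    "\t\tfrozen\n"
    "\tnot\texpired: security count\n"
    "\t\tsupported: enhanced erase\n"
    "Checksum: correct\n"
)

state = parse_security_state(SAMPLE)
print(state, ready_for_secure_erase(state))
```

In practice the text would come from something like `subprocess.run(["hdparm", "-I", device], capture_output=True, text=True).stdout`. A drive with no Security section at all leaves every flag as `None`, so it is treated as not ready.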

That’s a lot of caveats to proper operations, and it makes verification kind of troublesome. That said, most of the ugliness in the drive wiping comes from the BIOS options, and they can all be set in one go. The rest of the process is a pretty trivial algorithm:

  1. Boot into an image.
  2. Iterate all SCSI/SAS/SATA drives and pick out the SATA ones.
  3. If any drives are marked as frozen, suspend and wake the box.
  4. Set the security password on all the drives.
  5. Run secure erase on all the drives.
  6. If the drives do not support secure erase, aren’t SATA, etc., then just write random data to them.
  7. Verify that the disk was actually erased.
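Steps 2 through 6 can be sketched as a small planner that only builds the command lines and never runs them. This is a hedged illustration, not the code from my plugin: the `Drive` class, the `plan_wipe` function, the throwaway password, and the ten-second `rtcwake` nap are all assumptions made for the example. The `hdparm` security options, `rtcwake -m mem`, and `dd` are the real tools.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    device: str               # e.g. "/dev/sda"
    transport: str            # e.g. "sata", "sas", "usb" (lsblk's TRAN column)
    security_supported: bool  # from the Security section of `hdparm -I`
    frozen: bool              # likewise


PASSWORD = "nuke"  # hypothetical throwaway password; it dies with the data
SUSPEND_AND_WAKE = ["rtcwake", "-m", "mem", "-s", "10"]  # suspend to RAM, wake after 10s


def uses_secure_erase(drive: Drive) -> bool:
    return drive.transport == "sata" and drive.security_supported


def plan_wipe(drives):
    """Return the commands to run, in order, without executing anything."""
    secure = [d for d in drives if uses_secure_erase(d)]
    fallback = [d for d in drives if not uses_secure_erase(d)]
    plan = []
    # Step 3: one suspend/wake cycle if any secure-erase candidate is frozen.
    if any(d.frozen for d in secure):
        plan.append(SUSPEND_AND_WAKE)
    # Step 4: set the security password on all the drives.
    for d in secure:
        plan.append(["hdparm", "--user-master", "u",
                     "--security-set-pass", PASSWORD, d.device])
    # Step 5: run secure erase on all the drives.
    for d in secure:
        plan.append(["hdparm", "--user-master", "u",
                     "--security-erase", PASSWORD, d.device])
    # Step 6: everything else just gets random data written over it.
    for d in fallback:
        plan.append(["dd", "if=/dev/urandom", f"of={d.device}", "bs=1M"])
    return plan


drives = [
    Drive("/dev/sda", "sata", security_supported=True, frozen=True),
    Drive("/dev/sdb", "sas", security_supported=False, frozen=False),
]
for command in plan_wipe(drives):
    print(" ".join(command))
```

Keeping the planning separate from the execution means the plan can be printed and eyeballed before anything destructive happens. A real run would also need to re-check the frozen flag after the wake, and to tolerate `dd` exiting with an error once it hits the end of the device.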

Fortunately, all this can be automated, so I did (save the verification step) and put the results into a dracut plugin that automatically wipes all your drives. What does this mean for you? It means you can generate an initramfs which will automatically nuke any disks plugged into your system.
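For the verification step that I left out, one cheap approach is to spot-check random blocks on the wiped device. The sketch below assumes that a securely erased drive reads back as all zeros, which is common but not guaranteed, and it will (correctly) fail on drives that got the random-data fallback. The function name and defaults are made up for the example, and a passing result is evidence rather than proof.

```python
import os
import random


def looks_zeroed(path, samples=64, block_size=4096, seed=None):
    """Read `samples` randomly chosen blocks and report whether all are zero-filled.

    Works on regular files and on block devices (the size is found by seeking
    to the end, since os.path.getsize reports 0 for block devices).
    """
    rng = random.Random(seed)
    with open(path, "rb") as f:
        size = f.seek(0, os.SEEK_END)
        blocks = size // block_size
        if blocks == 0:
            return False  # nothing to sample, so refuse to call it verified
        for _ in range(samples):
            f.seek(rng.randrange(blocks) * block_size)
            if any(f.read(block_size)):  # any non-zero byte fails the check
                return False
    return True
```

Called as `looks_zeroed("/dev/sdX")` with read access to the device, it returns `False` as soon as it sees a non-zero byte. Raising `samples` trades time for confidence.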

Why would you ever want to do that? Well, you wouldn’t, at least not on your machine. What you would want to do is combine it with a TFTP-backed PXE environment so you could take any random desktop, load it up with disks and netboot it to wipe all the drives.

More instructions are in the README file in the above repo, and (as always) patches are welcome.