15 February 2008

enRAIDing by remote control

Facing an interesting challenge. I updated a few machines in... remote places, & in so doing had to run the installer sans RAID (it complained about different machines in different ways) in a hurry, & now I address the challenge of enRAIDing a system by remote control.

I’ve grabbed a disk image of one of the remote machines, & plugged it into a local machine as a test-bed. This has revealed an immediate issue: the SATA controller in this machine is different to either of the others, so it won’t boot the disk image.

At the moment, I’ve booted a rescue image, borrowed part of the prospective RAID partner-drive & am bzip2’ing images of each partition into the borrowed partition, to implement a simple worst-case restore scenario.
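For the curious, the imaging step could be sketched roughly as below. This is only an illustration in Python using the standard bz2 module, not the commands actually run from the rescue image; the function names image_partition and restore_partition are made up for this sketch, and the paths passed in would be whatever device nodes and image files apply on the machine in question.

```python
import bz2


def image_partition(source_path, image_path, chunk_size=1024 * 1024):
    """Stream a partition (or any file) into a bzip2-compressed image.

    Returns the number of uncompressed bytes read from the source.
    """
    total = 0
    with open(source_path, "rb") as src, bz2.open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            total += len(chunk)
    return total


def restore_partition(image_path, target_path, chunk_size=1024 * 1024):
    """Decompress an image back onto a target; the worst-case restore path.

    Returns the number of bytes written to the target.
    """
    total = 0
    with bz2.open(image_path, "rb") as src, open(target_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            total += len(chunk)
    return total
```

Reading in fixed-size chunks keeps memory use flat no matter how large the partition is, which matters on a rescue image with little RAM to spare.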

When that’s secure, I’ll get into making the confused SATA drivers happy, then use the resulting bootable hard disk to walk through the process of making a small chroot-ish area on a transient partition, then walking the machine through switching sshd (OpenSSH) into that, then...

Copying the live partitions into one-partition (degraded) RAID versions of themselves on the secondary disk, then unmounting & “reformatting” each live partition into a RAID component, joining that to (& so updating it from) the freshly-copied RAIDed version of itself, remounting it again, then repeating for each partition. The exciting part is going to be booting from the transient partition into a system with working LAN & sshd so I can reRAID the root partitions.
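The per-partition dance above can be sketched as a dry-run plan. The Python below only builds the command lists and runs nothing, so it is safe to play with. It assumes RAID 1 managed with mdadm, an ext3 filesystem, and a staging mount point of /mnt/raid-staging; all of those, plus the function names, are assumptions for illustration rather than the exact procedure used here. It also leaves out changing the partition type, and it will not work as-is for a mounted root partition (hence the transient boot partition).

```python
import shlex


def raid1_migration_plan(live_part, spare_part, md_device, mount_point,
                         fstype="ext3", staging="/mnt/raid-staging"):
    """Return the ordered commands to move one live partition onto RAID 1.

    Nothing is executed; each command is a list of arguments.
    """
    return [
        # Create a degraded mirror holding only the secondary disk's partition.
        ["mdadm", "--create", md_device, "--level=1", "--raid-devices=2",
         "missing", spare_part],
        ["mkfs", "-t", fstype, md_device],
        ["mkdir", "-p", staging],
        ["mount", md_device, staging],
        # Copy the live file tree onto the new array, preserving attributes.
        ["cp", "-a", mount_point + "/.", staging + "/"],
        ["umount", staging],
        # Retire the live partition, then join it to the array; the resync
        # overwrites it from the freshly-copied member.
        ["umount", mount_point],
        ["mdadm", md_device, "--add", live_part],
        ["mount", md_device, mount_point],
    ]


def format_plan(plan):
    """Render a plan as shell-quoted lines for review before running by hand."""
    return "\n".join(shlex.join(cmd) for cmd in plan)
```

For example, format_plan(raid1_migration_plan("/dev/sda5", "/dev/sdb5", "/dev/md5", "/home")) gives a checklist to eyeball before typing anything 4000km away from the reset button. The device names in that call are hypothetical.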

By remote control. Watching someone try this via ’doze would be... entertaining. For the onlookers.

Stay tuned for an (eventual) outcome.


Leon Brooks said...

Compression wonderment: bzip2 squashed a basically-blank 8.5GB partition down to a 32MB compressed image file.

It’s a pity that this won’t apply to all of the partitions, else I could fit the entire hard disk into a 1GB Flash stick fairly easily. At the moment, it’s compressed 4 of 8 partitions & used just over a GB to store 10.6GB raw.

Leon Brooks said...

I’ve given myself an unfair advantage: I’d left a brace of unused partitions on each drive, which I’ll be able to press into service as transient boot partitions.

This means that I can basically cp -a enough file trees onto a t-b-p (transient boot partition) to reboot onto that with working LAN & sshd, then rework the other partitions, then reboot back onto the reworked system. Sounds fairly simple... except for the gap of over 4000km between keyboard & workings. About a tenth of the way around the world. (-:

Leon Brooks said...

So far, on the model system, I’ve been able to “cheat” successfully.

I set up the second drive as a set of RAID partitions (with mods to /etc/raidtab & /etc/mdadm.conf), added a new boot option to /boot/grub/menu.lst which treats the RAID as root (etc) & has an initrd containing modules like raid1, then booted into the RAIDed partitions (still retaining, up to now, the ability to boot old-style in case some part doesn’t work as well as it should).
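As a rough illustration of the extra boot option, here is a small Python helper that renders a GRUB legacy stanza of the kind that goes into /boot/grub/menu.lst. Every value in the usage example (title, GRUB root, kernel & initrd paths, md device) is a placeholder invented for the sketch, not taken from the real machines, and the helper name is made up too.

```python
def grub_legacy_entry(title, grub_root, kernel, root_device, initrd):
    """Render one GRUB legacy (menu.lst) boot stanza as text.

    root_device is the device the kernel mounts as /, e.g. an md array;
    initrd should be an image that includes the raid1 module.
    """
    lines = [
        f"title {title}",
        f"root {grub_root}",
        f"kernel {kernel} root={root_device} ro",
        f"initrd {initrd}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values only; adjust to the real disk layout.
raid_entry = grub_legacy_entry(
    title="Linux (RAID root)",
    grub_root="(hd1,0)",
    kernel="/vmlinuz",
    root_device="/dev/md1",
    initrd="/initrd-raid.img",
)
```

Appending a stanza like this alongside the existing one, rather than replacing it, is what keeps the old-style boot available as a fallback until the RAIDed system has proven itself.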

Then merged the old partitions into the RAID & deleted the old-style boot option.