15 February 2008

enRAIDing by remote control

Facing an interesting challenge. I updated a few machines in... remote places, & in so doing had to run the installer sans RAID (it complained about different machines in different ways) in a hurry, & now I address the challenge of enRAIDing a system by remote control.

I’ve grabbed a disk image of one of the remote machines, & plugged it into a local machine as a test-bed. This has revealed an immediate issue: the SATA controller in the local machine differs from the controllers in the remote machines, so it won’t boot the disk image.

At the moment, I’ve booted a rescue image, borrowed part of the prospective RAID partner-drive & am bzip2’ing images of each partition into the borrowed partition, to implement a simple worst-case restore scenario.
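
For anyone wanting to script the same safety net, here is a minimal Python sketch of imaging one partition into a bzip2 file & restoring it again. The function names & any paths passed in (say /dev/sda1 & a file on the borrowed partition) are illustrative assumptions, not what was actually run here; the real job was done with plain bzip2 from the rescue image.

```python
import bz2
import shutil


def image_partition(source_path, image_path, chunk_size=1024 * 1024):
    """Stream a partition (or any file) into a bzip2-compressed image."""
    with open(source_path, "rb") as src, bz2.open(image_path, "wb") as dst:
        # Copy in chunks so a multi-GB partition never has to fit in RAM.
        shutil.copyfileobj(src, dst, chunk_size)


def restore_partition(image_path, target_path, chunk_size=1024 * 1024):
    """Worst-case restore: decompress the image back over the target."""
    with bz2.open(image_path, "rb") as src, open(target_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)
```

Reading from or writing to a real block device needs root, & the partition being imaged should not be mounted read-write at the time.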

When that’s secure, I’ll get into making the confused SATA drivers happy, then use the resulting bootable hard disk to walk through the process of making a small chroot-ish area on a transient partition, then walking the machine through switching OpenSSH’s sshd into that, then...

Copying the live partitions into one-partition RAID versions of themselves, then unmounting & “reformatting” each live partition into a RAID component, joining that to (updating it from) the freshly-copied RAIDed version of itself on the secondary disk, remounting it again, then repeating the refrain for each partition. The exciting part is going to be booting from the transient partition into a system with working LAN & sshd so I can reRAID the root partitions.
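
The per-partition dance can be sketched as an ordered list of commands. The Python below only builds & prints the list; it runs nothing. It assumes mdadm-managed RAID1, & the device names, the staging mount point & the ext3 filesystem type are hypothetical placeholders rather than details of the real machines.

```python
def raid1_migration_plan(md, live_part, spare_part, mount_point,
                         fstype="ext3", staging="/mnt/staging"):
    """Return the shell commands, in order, for moving one mounted
    partition onto a two-disk RAID1 mirror. Nothing is executed."""
    return [
        # 1. One-partition (degraded) mirror on the secondary disk;
        #    "missing" holds the slot for the live partition.
        f"mdadm --create {md} --level=1 --raid-devices=2 missing {spare_part}",
        # 2. Fresh filesystem on the array, then copy the live data in.
        f"mkfs -t {fstype} {md}",
        f"mount {md} {staging}",
        f"cp -a {mount_point}/. {staging}/",
        f"umount {staging}",
        # 3. Retire the live partition & join it to the array;
        #    the resync overwrites it from the freshly-copied half.
        f"umount {mount_point}",
        f"mdadm --add {md} {live_part}",
        # 4. Remount, this time from the array.
        f"mount {md} {mount_point}",
    ]


if __name__ == "__main__":
    # Hypothetical devices for a /home partition.
    for cmd in raid1_migration_plan("/dev/md2", "/dev/sda6", "/dev/sdb6", "/home"):
        print(cmd)
```

The root partition can’t be unmounted from underneath the running system, which is why it needs the detour via the transient partition.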

By remote control. Watching someone try this via ’doze would be... entertaining. For the onlookers.

Stay tuned for an (eventual) outcome.

3 comments:

Leon Brooks said...

Compression wonderment: bzip2 squashed a basically-blank 8.5GB partition down to a 32MB compressed image file.

It’s a pity that this won’t apply to all of the partitions, else I could fit the entire hard disk into a 1GB Flash stick fairly easily. At the moment, it’s compressed 4 of 8 partitions & used just over a GB to store 10.6GB raw.
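
For scale, 8.5GB into 32MB is a ratio of roughly 270:1. A small Python sketch shows why blank space squashes so well; it uses the standard bz2 module on an all-zero buffer, which is an idealised stand-in (a “basically-blank” partition still carries filesystem metadata, so it won’t do quite this well).

```python
import bz2


def compression_ratio(raw: bytes) -> float:
    """Raw size divided by bzip2-compressed size."""
    return len(raw) / len(bz2.compress(raw))


if __name__ == "__main__":
    blank = bytes(8 * 1024 * 1024)  # 8 MiB of zeros
    print(f"{compression_ratio(blank):.0f}:1")
```

Partitions full of already-compressed or otherwise high-entropy data will land much nearer 1:1, which is why the trick doesn’t carry over to the whole disk.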

Leon Brooks said...

I’ve given myself an unfair advantage: I’d left a brace of unused partitions on each drive, which I’ll be able to press into service as transient boot partitions.

This means that I can basically cp -a enough file trees onto a t-b-p to reboot onto that with working LAN & sshd, then rework the other partitions, then reboot back onto the reworked system. Sounds fairly simple... except for the gap of over 4000km between keyboard & workings. About a tenth of the way around the world. (-:

Leon Brooks said...

So far, on the model system, I’ve been able to “cheat” successfully.

I set up the second drive as a set of RAID partitions (with mods to /etc/raidtab & /etc/mdadm.conf), added a new boot option to /boot/grub/menu.lst which treats the RAID as root (etc) & has an initrd containing modules like raid1, then booted into the RAIDed partitions (still retaining, up to now, the ability to boot old-style in case some part doesn’t work as well as it should).
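
As a rough illustration of the extra boot option, this Python sketch renders a GRUB legacy stanza of the usual title/root/kernel/initrd shape. Every value in the example call (title, GRUB device, kernel & initrd file names, /dev/md0 as root) is a made-up placeholder, not the entry from the model system.

```python
def grub_legacy_stanza(title, grub_root, kernel, initrd, root_dev):
    """Render one menu.lst boot entry as text."""
    return "\n".join([
        f"title {title}",
        f"root {grub_root}",
        # root= tells the kernel which device to mount as /
        f"kernel {kernel} root={root_dev} ro",
        # the initrd must contain the RAID modules (e.g. raid1)
        f"initrd {initrd}",
    ])


if __name__ == "__main__":
    print(grub_legacy_stanza("Linux (RAID root)", "(hd1,0)",
                             "/vmlinuz", "/initrd-raid.img", "/dev/md0"))
```

Appending the new entry while leaving the old-style one in place is what preserves the fallback boot path.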

Then merged the old partitions into the RAID & deleted the old-style boot option.