13-Jul-2011

DR test

I mentioned doing disaster recovery via Continuous Access in the last entry. Last Sunday, we made our first attempt at a test with this solution in place. Previously, we had confirmed that replication was working by mounting a replicated disk read-only, but this was the first time we would attempt to fully boot the systems.

Due to some appalling planning on the part of both my company and the people leasing us the EVA/CA solution, things didn't go very well, I'm afraid to say.

Why is it so difficult to set up a network isolated enough that at the flick of a switch (or the disconnection of no more than two cables) I can boot the DR machines with their production IP addresses? I was faced with the second alternative here (disconnecting cables), but I was told that access to the iLO ran over the same piece of copper, and that if we pulled it I'd have to take a laptop into the computer room and connect directly to the blade enclosure. D'oh!

Yes, things like consoles are called "out of band" for a reason.

Secondly, we are not replicating all our disks. For our critical data disks, we run three-way host-based mirroring, but we replicate only the first member to reduce the bandwidth requirements on the replication link. Because the second and third members at the DR site had never been initialised, VMS was, umm, reluctant to add them to the shadowsets at boot time.

Lessons learned and DCL corrected. Hopefully the network issues will be addressed next week.

Let's hope the next test goes off a little better.

Posted at July 13, 2011 8:03 PM