November, 2005

Disaster Recovery Optimization Techniques

By Gilles Schipper
GSA, Inc.

While working recently with a customer on the design and implementation of disaster recovery (DR) plan for their large HP 3000 system, it became apparent that the mechanics of its implementation had room for improvement.

In this specific example, the customer has a production N-class HP 3000 in its primary location and a backup HP 3000 Series 969 system in a secondary location several hundred miles removed from the primary.

The process of implementing the DR was more manual-intensive than it needed to be. As an aside, it was completed entirely from a remote location — thanks to the Internet, VPNs and the use of the HP Secure Web Console on the 969.

One of the most labor-intensive aspects of the DR exercise was to rebuild the IO configuration of the DR machine (the 969) from the full backup tape of the production N-Class machine, which included an integrated system load tape (SLT) as part of the backup.

The ability to integrate the SLT on the same tape as the full backup is very convenient. It results in a simplified recovery procedure as well as the assurance that the SLT to be used will be as current as possible.

When rebuilding a system from scratch from a SLT/Backup, if the target system (in this case the 969) differs in architecture from the source system (N-4000) it is usually necessary to modify all the device paths and device configuration specifications with SYSGEN and then rebooting the system in order to even be able to utilize the tape drive of the target system to restore any files at all.

(This would be apart from the files restored during the INSTALL process — which does not require proper configuration of any IO component at all).

Some would argue that this system re-configuration needs to be completed only once since any future system rebuilds would require only a “data refresh” rather than a complete system re-INSTALL.

I say that this would be true only in very stable system environments where IO configurations — including network printer configurations — are static and where TurboIMAGE transaction logging is not utilized. Otherwise there could be unpleasant results and complications from using stale configurations in a real disaster recovery situation.

In any case, there really is no reason to take any chances, since the labor-intensive step of creating a proper DR target system configuration environment is achievable minus the labor-intensive part – or at least without repetition of the manual chore of re-configuring the target system each time the DR is exercised.

Unless both the production system and the DR system are architecturally similar (i.e. they belong to same HP 3000 family) the configuration of the target system (the DR machine) cloned from the source system (the production machine) will be non-trivial.

At a minimum, before data restore can begin on the DR machine, the path hierarchy of the tape drive associated with the backup tape must be re-created.

Further, if the subsequent restore requires more than just the system disk, all the path components for all the disk drives must also be created.

In a real DR situation, this task can be daunting at best – particularly since ii may be difficult in that event to be able to access the appropriate documentation that describes the pertinent SYSGEN configuration requirements. How much preferable would it be to be able to complete this configuration well in advance of the hope-to-never-happen event.

In fact, it is entirely possible to create an appropriate DR configuration environment that is (almost) completely integrated into one’s production environment.

SYSGEN IO requirements

In order to provision a potential DR HP 3000 system’s IO configuration requirements into an existing production HP3000 SLT, it is only necessary to configure all of the DR path components into the existing production system’s IO configuration.

The fact that these paths do not exist on the production (source) system is immaterial — as long as you can withstand the menacing, although perfectly innocuous — console error messages that accompany a reboot of a system so configured.

[Ed. Note: An example of these console error messages is shown on the NewsWire’s blog (3000newswire.com/blog). Look for Schipper’s article on the Homesteading page.]

There is also the small matter of actual device numbers — and that is why I included the “almost” when mentioning “completely integrated” earlier.

Clearly, it is not possible to have duplicate device numbers when configuring both production and DR devices into the production SYSGEN IO configuration. So, in order to distinguish between the two systems (one the real production, the other virtual DR), I simply add 100 (you can choose any number) to the device numbers associated with the virtual machine. Then when actually testing or invoking the DR process, it is a simple matter to change the device numbers in a batch job designed for that purpose.

Another batch job could be pre-built that would add the appropriate disk drives and volume sets to the system’s disk pool, using VOLUTIL. These batch jobs would be included in the full backup tape and could be restored almost immediately following the INSTALL by referencing :file tape;dev=107 (to use my example of adding 100 to the corresponding virtual device)

The command :restore *tape;{fileset}; directory;olddate; keep;create;show (where {fileset} corresponds to the fileset that would include the appropriate device number change and volutil batch jobs. One could take this technique one step further in the case where the DR target machine is unknown.

In such a situation, you could create a SYSGEN IO configuration that includes path constructs for any possible virtual machine that you could think of and include them in the host configuration – adding 100 for devices associated with virtual machine 1, 200 for virtual machine 2, and so on.

More details on this recovery procedure, including an example of a SYSGEN IO configuration of a model 969 integrated into a host N4000 are in the version of this story hosted on the NewsWire’s blog.


Licensed under the Creative Commons License.