New FTP transfers enable 3000 shadowing
software, tested job offers some disaster recovery
By Wirt Atmar
First, the job is
extremely simple and, of course, free. Indeed, there's
nothing complicated about any of it, nor are there any costs other
than a few minutes' time. The job is constructed as shown below,
streaming itself to run at 3 AM every morning:
A full CSLT/store backup of the primary machine onto a DDS-2 drive, covering all files (approximately 3.5GB) with hardware compression, takes about 45 minutes. But there's no need for us to transfer all of the user files from one machine to the other on a regular basis. The seven jobs represented in the jobfile above transfer only the regularly active development and corporate accounts and their databases. These seven jobs represent about 2.5GB worth of material.
My initial experiments indicated that transfers across the LAN were just about as fast as DDS/compression on backups. That still seems to be true. A complete CSLT/store backup of the primary takes about 45 minutes; the transfer of about half that material across the LAN takes 23 minutes.
There is a caveat in that 23-minute number, however. That transfer rate occurs when we allow all seven FTP jobs to run in parallel. A 10Mbit LAN is roughly the equivalent of six T1 lines. If the seven jobs all run in parallel, we seem to consume about 60 percent of the LAN's bandwidth, leaving the equivalent of two T1s available for all other internal traffic, and that seems more than enough. While this intense internal traffic is flowing, normal terminal communications and Web page draws seem unaffected.
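The T1 comparison above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes the nominal T1 rate of 1.544 Mbit/s and takes the 10Mbit and 60 percent figures straight from the text, so the results are slightly more generous than the rounded "six" and "two" quoted above. The names used here are illustrative only.

```python
# Back-of-the-envelope check of the LAN figures quoted in the text.
T1_MBIT = 1.544       # nominal T1 line rate in Mbit/s (assumed)
LAN_MBIT = 10.0       # 10Mbit LAN, as stated in the text
UTILIZATION = 0.60    # share of the LAN the seven parallel jobs seem to use


def t1_equivalents(mbit):
    """Express a bandwidth in Mbit/s as a number of T1 lines."""
    return mbit / T1_MBIT


lan_t1 = t1_equivalents(LAN_MBIT)
spare_t1 = t1_equivalents(LAN_MBIT * (1 - UTILIZATION))

print(f"LAN capacity: {lan_t1:.1f} T1s")
print(f"Left over while the jobs run: {spare_t1:.1f} T1s")
```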
However, with all seven jobs running simultaneously, the disks on both machines are being exercised to their maximums, to the point that I consider it excessive wear on the disks. They're really clattering. Thus, we now set the job limit to just one above the background jobs, so that the seven jobs execute in single file; however, doing this increases the transfer time from 23 minutes to 64 minutes.
A good portion of a single file's FTP transfer time is dead time, consumed just in the negotiation between the two machines. Once the file transfer begins flowing, the data moves at a good speed. When all seven jobs are running simultaneously, whatever dead time on the LAN is left over by one job is filled in by one or several of the others. Nevertheless, the excess wear on the disks doesn't seem worth the additional 40 minutes you save, especially at three o'clock in the morning.
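The effect of the job limit can be mimicked outside MPE. The sketch below is a Python analogy, not the MPE job queue: `run_transfers` and `fake_transfer` are hypothetical names, and the sleeping stand-in job only represents a transfer. It shows how a limit of one forces the seven jobs into single file, while a limit of seven lets them overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_transfers(jobs, job_limit):
    """Run each callable in `jobs`, at most `job_limit` at a time.

    Returns the peak number of jobs that were active at the same moment.
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def tracked(job):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        try:
            job()
        finally:
            with lock:
                active -= 1

    with ThreadPoolExecutor(max_workers=job_limit) as pool:
        # list() waits for every job and re-raises any exception from a job.
        list(pool.map(tracked, jobs))
    return peak


def fake_transfer():
    """Stand-in for one FTP job; a real job would move files instead."""
    time.sleep(0.01)


jobs = [fake_transfer] * 7
serial_peak = run_transfers(jobs, job_limit=1)    # single file
parallel_peak = run_transfers(jobs, job_limit=7)  # all seven may overlap
print(serial_peak, parallel_peak)
```

With a limit of one, no two jobs are ever active together, which is what spares the disks at the cost of the longer elapsed time.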
I've tried multiple FTP syntaxes, but the one used in the job above seems by far to be the most certain. It only transfers MPE-named files, but that's all we currently use, so it's not a problem for us. IMAGE databases, KSAM files, and all regular files transfer with no problem. Symbolic links transfer, but seem to lose their symbolicness in the process. HFS files don't transfer at all using this syntax and the MPE version we're currently on.
I'm also planning on upgrading both machines to 6.5. Once that's done, we'll be able to use Jeff Vance's new Store-to-Disk option. If my understanding is correct, the great advantage of doing this is that we'll be able to get all of the advantages of using the STORE command and assemble the store material into one file that can be FTPed between the two machines, and then RESTOREd on the shadow.
I've never been particularly fond of partial backups, however, and in order to make the Store-to-Disk (STD) option work well, that's what we'll have to do. The FTP jobs above transfer every file in the specified groups and accounts, regardless of the recency of their modifications.
One final note: FTPing
2.5GB of files between the two machines does not have the same impact
on users as a STORE-like backup, where all of the files are
marked for some time for exclusive access. Only one file at a time is
locked as the MGET process walks through the file list. That
attribute can obviously be either good or bad, depending on
individual circumstances. Nonetheless, using the simple job that
I've outlined above, that is the current behavior.
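The one-file-at-a-time behavior can be illustrated with a generic FTP client loop. This is a minimal sketch in Python, not the MPE FTP job itself: `mget` and `FakeFTP` are hypothetical names, the file names are made up, and a plain binary fetch like this makes no attempt to preserve MPE file attributes. The only library call it relies on is the `retrbinary(cmd, callback)` method of `ftplib.FTP`, which the stand-in class imitates so the example runs without a server.

```python
import os
import tempfile


def mget(ftp, names, dest_dir):
    """Fetch each remote file in turn; only one file is in flight at a time."""
    fetched = []
    for name in names:
        local_path = os.path.join(dest_dir, name)
        with open(local_path, "wb") as fh:
            # Same call shape as ftplib.FTP.retrbinary(cmd, callback).
            ftp.retrbinary(f"RETR {name}", fh.write)
        fetched.append(local_path)
    return fetched


class FakeFTP:
    """Stand-in for ftplib.FTP so the sketch runs without a server."""

    def __init__(self, files):
        self.files = files
        self.log = []  # order in which files were requested

    def retrbinary(self, cmd, callback):
        name = cmd.split(" ", 1)[1]
        self.log.append(name)
        callback(self.files[name])


demo = FakeFTP({"FILEA": b"alpha", "FILEB": b"beta"})
with tempfile.TemporaryDirectory() as tmp:
    paths = mget(demo, ["FILEA", "FILEB"], tmp)
    contents = []
    for path in paths:
        with open(path, "rb") as fh:
            contents.append(fh.read())
```

Against a real server, the `ftp` argument would be a logged-in `ftplib.FTP` object and the name list could come from its `nlst()` method.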
Overall, I'm quite
tickled as to how it's working. If we should lose the primary
machine, we are (nearly) guaranteed to have a perfect replicate
machine, available and ready to go, in building three.