NewsWire TestDrive -- CSM: Hassle-Free, Safe Data Compression/Decompression

3000 NewsWire TestDrive

[June, 1996 NewsWire]

TEST DRIVE ROAD REPORT

Compression Storage Manager (CSM)
SolutionSoft Systems, Inc.
2350 Mission College Blvd. Suite 715
Santa Clara, CA 95054
Tel: 408.988.7378
Fax: 408.988.4777
Also available from
ORBIT Group International
Tel: 510.215.9000
info@orbitsw.com

An online archiving solution for the HP 3000. U.S. prices range from $1,500 to $11,000 depending upon CPU tier. A demonstration copy is available.


Review by John Burke

Compression Storage Manager (CSM), from SolutionSoft Systems, Inc., of Santa Clara, Calif., provides an online archiving solution for the HP 3000. It will compress files based on a set of rules and then, automatically, decompress them when they are accessed.

You may be thinking, "Who cares about data compression?" The cost per megabyte of online storage has plummeted in recent years. Let me give you two words to think about: Data Warehousing.

It is true that the cost, operating expense and footprint of online storage have plummeted and, for a time, even kept ahead of the voracious storage requirements of today's systems. No more. The beast is about to overtake you if it hasn't already.

So, why should you pay money for a compression utility? After all, there are perfectly good ones, excellent even, available from the CSL, free from Telamon (LZW) or as part of the POSIX interface included with MPE/iX 5.0. Two features of CSM clearly stand out: compression "in place" and auto file decompression. Together, these features enable online archiving and provide the front end for future Hierarchical Storage Management (HSM) products. Despite what one HP marketplace publication recently contended, HSM is decidedly not just for Unix systems, nor was the idea originally conceived with Unix systems in mind!

Key features

Compression "in place" means that a file compressed with CSM appears the same as the original file: all file and security attributes are unchanged, including the last access date. This contrasts with the more traditional approach, where a new compressed file is created with a different name, possibly containing one or more other compressed files (the PC model for this is PKZIP). With CSM, programs and procedures that expect files to have a certain appearance will continue to operate without modification, because CSM automatically decompresses the file when a process first tries to access it. To my knowledge, these features are unique to CSM.
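
To make the idea concrete, here is a minimal user-level sketch in Python of what "in place" compression and decompress-on-access look like from the outside. It is only an analogy under stated assumptions, not CSM's mechanism: CSM hooks file access inside MPE/iX, while this sketch uses gzip, a temporary file and an atomic rename. The function names compress_in_place and read_transparently are hypothetical, chosen for illustration.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def compress_in_place(path):
    """Replace a file's contents with a gzip stream, keeping its name,
    permission bits and timestamps (an analogy, not CSM's method)."""
    st = os.stat(path)
    with open(path, "rb") as src:
        data = src.read()
    # Write the compressed copy next to the original, then swap it in with
    # an atomic rename so a crash never leaves a half-written file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    try:
        with os.fdopen(fd, "wb") as dst:
            dst.write(gzip.compress(data))
        os.chmod(tmp, st.st_mode & 0o7777)
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    # Put back the original access and modification times.
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))


def read_transparently(path):
    """Return the logical contents, decompressing only when needed.

    Assumption: uncompressed files never start with the gzip magic bytes.
    """
    with open(path, "rb") as f:
        data = f.read()
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data
```

A caller that always goes through read_transparently sees the same file name and the same bytes whether or not the file has been compressed, which is the property that lets existing programs and procedures run unchanged.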

Other notable CSM features include:


Performance

CSM is blazingly fast, which is necessary, of course, for online archiving or HSM. For example, CSM compressed a 436,487-record ASCII file (138,032 sectors, or 35Mb), achieving a compression ratio of 72 percent (38,608 sectors, or 9.9Mb after) using just 33 seconds of CPU. Decompression was even faster at 28 seconds (in all my tests, decompression took about 10 percent less CPU time than compression). All the while, I monitored system use and never saw CSM grab more than 70 percent of the CPU on a lightly loaded system. While hardly a rigorous yardstick, my system users were never aware that I was testing a compression utility on large files.
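
For readers who want to check the arithmetic, the short Python sketch below reproduces the size and ratio figures from the sector counts above. The 256-byte sector size is an assumption on my part; it is consistent with the sector and megabyte figures quoted in this review.

```python
SECTOR_BYTES = 256  # assumed sector size; consistent with the review's figures


def megabytes(sectors):
    """Convert a sector count to decimal megabytes."""
    return sectors * SECTOR_BYTES / 1_000_000


def space_saved_percent(before, after):
    """Percentage of the original space that compression freed."""
    return 100 * (1 - after / before)


before_sectors, after_sectors = 138_032, 38_608
print(f"{megabytes(before_sectors):.1f} Mb -> {megabytes(after_sectors):.1f} Mb")
print(f"space saved: {space_saved_percent(before_sectors, after_sectors):.0f} percent")
```

Note that "compression ratio" here means the share of space saved, not the size of the compressed file relative to the original.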

Testing was done on a 918LX with 64Mb of RAM and two 2Gb disk drives running the Express-3 release of MPE/iX 5.0. So, as reviewers are fond of saying, your mileage may vary. To put it all in perspective, using the COPY command on the same file consumed 11 seconds of CPU (one-third of the CPU time CSM needed to achieve 72 percent compression). LZW, a popular compression/archiving utility contributed and maintained by Telamon, clocked in at 138 CPU seconds (about 4.2 times CSM's CPU time). Finally, FCOPY staggered in at a whopping 352 CPU seconds! Note also that LZW and FCOPY both registered 90 percent-plus CPU utilization at times.
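
The comparison is easier to see side by side. This small Python sketch takes the CPU seconds reported above and expresses each one relative to CSM's compression time; the dictionary labels and the relative_to helper are illustrative names only.

```python
# CPU seconds reported in this review for the same 436,487-record file.
cpu_seconds = {
    "COPY": 11,
    "CSM decompress": 28,
    "CSM compress": 33,
    "LZW": 138,
    "FCOPY": 352,
}


def relative_to(baseline, timings):
    """Express every timing as a multiple of the baseline entry."""
    base = timings[baseline]
    return {name: secs / base for name, secs in timings.items()}


ratios = relative_to("CSM compress", cpu_seconds)
for name, secs in cpu_seconds.items():
    print(f"{name:<15}{secs:>4} s  {ratios[name]:5.2f}x CSM compress")
```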

Safety

CSM uses MPE/iX's Storage Manager and Transaction Manager (XM) when compressing and decompressing files. This provides a fail-safe transition guaranteeing data integrity (just as XM does for IMAGE databases). Transaction Manager allows recovery even after a system abort or other crash. I am taking the product architect's word for this since I was not about to crash my system deliberately -- not a big risk, since he (Paul Wang, President of SolutionSoft Systems) was also the architect for Transaction Manager while working for HP's CSY several years ago.

CSM is fully integrated with MPE/iX including the POSIX Hierarchical File System (HFS). It supports all MPE file types, retaining all security and file attributes after compression (even access date, for example). Under auto file decompression, CSM uses HP's Architected Interface Facility (AIF) Procedure Exits (PE) to trap any file access and perform its decompression magic, if necessary, before returning to the calling process.

What about compatibility? A stand-alone version of CSM that will decompress any file compressed by CSM is freely available to anyone. Ever try to read a tape created with hardware compression when you do not have a compatible drive?

Installation/setup

Installation is a snap: do a RESTORE ... ;CREATE and execute a command file from MANAGER.SYS, and the installation is complete. SolutionSoft Systems (SSS) recommends you have auto file decompression enabled at all times, and that you incorporate it into your start-up procedures. Auto file decompression is enabled by running the program PENABLE.PUB.SYS from an SM-capable user (STREAM a MANAGER.SYS job from SYSSTART at system start-up to RUN the program). It only takes a moment to RUN PENABLE. Once enabled, auto file decompression remains enabled until explicitly disabled by the PDISABLE program or by a system reboot. CSM is compatible with MPE/iX 4.0, 4.5, 5.0 and 5.5.

As part of the installation documentation, you are led through several tests that demonstrate how seamlessly CSM can be incorporated into your environment and how it works on all types of files.

SolutionSoft suggests several usage scenarios for CSM:

A real-world test

We keep historical information (payroll and accounting) for our customers. In the Classic 3000 days, when the HP7933 (404Mb) was a BIG disk drive (after all it held over three times as much data as the previous BIG disk drive, the 120-Mb HP7925), we had to store all historical data from previous years on tape. If a customer needed to access historical data, they had to contact us and arrange for a RESTORE. Going back in time further, we even kept source program generations offline on tape. When we moved to MPE/iX, we had so much more capacity in both online storage and tape backup (2Gb DAT) that we started maintaining at least one year of history online. Our customers love it. But, the 121 history data files occupy almost 1.9 million sectors (476Mb). Only 52 of the files (512K sectors, or 131Mb) were accessed at all in the last 30 days.

We are a small shop -- a lights out operation. We do an automated full backup every night after the batch processing is complete. We are getting close to the 2Gb limit of our backup device. What to do?

We could buy a new DAT drive with hardware compression (expensive). Or, we could buy a backup product with software compression (also expensive). These solutions would not impact our customers.

A free solution would be to use LZW or something similar and manually administer compression/decompression of history files. This would be only slightly easier to administer than the old tape-archive approach and would certainly be a step back for our customers.

A third option (less expensive than the first option without the negatives of the second) would be to acquire CSM for online archiving. Here is how it might work for us (and for you):

We could use CSM to compress the 476Mb occupied by the history files to only 169Mb, a savings of 307Mb! (This approach will go a long way toward keeping our backup to one tape.) Then, we would add to our backup job (right before the actual backup) a RUN of CSM that compresses only those history files that have been decompressed (which adds at most a couple of minutes up front, but saves many minutes during the actual STORE). Our customers would see little, if any, change, since most access to the history files is batch reporting. Even with interactive access, there would only be a slight delay.
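
As a quick sanity check on those numbers, the Python sketch below works out the savings and how much of the history data sat untouched during the 30-day window. All figures come from this review; the constant and function names are just illustrative.

```python
HISTORY_MB = 476     # all 121 history files, uncompressed
COMPRESSED_MB = 169  # the same files after compression with CSM
ACTIVE_MB = 131      # the 52 files accessed in the last 30 days
TOTAL_FILES, ACTIVE_FILES = 121, 52


def savings(before_mb, after_mb):
    """Return (megabytes saved, percent of the original saved)."""
    saved = before_mb - after_mb
    return saved, 100 * saved / before_mb


saved_mb, saved_pct = savings(HISTORY_MB, COMPRESSED_MB)
idle_mb = HISTORY_MB - ACTIVE_MB
idle_files = TOTAL_FILES - ACTIVE_FILES
print(f"saved {saved_mb} Mb ({saved_pct:.1f} percent of the history data)")
print(f"untouched in 30 days: {idle_mb} Mb in {idle_files} files")
```

The second line is the interesting one for a nightly job: only the files that were accessed, and therefore decompressed, since the last run would need to be compressed again.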

Conclusion

CSM is an impressive product, with everything I look for: features, performance, ease of use and more. It will be well worth your time to look at how CSM fits into your current environment, and at how future HSM products from the SSS/ORBiT collaboration could change the way you look at storage management.


Copyright 1997, The 3000 NewsWire. All rights reserved.
Ron Seybold, Editor In Chief The 3000 NewsWire Independent Information to Maximize Your HP3000 http://www.3000newswire.com/newswire rseybold@zilker.net 512-657-3264