
net.digest

Net.digest summarizes significant discussions on the HP 3000-L Internet mailing list edited by longtime HP 3000 columnist John Burke, who provides commentary on HP 3000 issues. Advice is provided on a best effort, Good Samaritan basis. Test these concepts for yourself before applying them on your systems.

Analysis by John Burke

Not supported here?
In this case, ignore it
When the 9x9 series of machines was introduced, it was with a "gotcha" that has had a varying, and sometimes quite unpleasant, impact on sites upgrading from earlier 9xx machines: the HPIB interface is officially unsupported -- you cannot configure in an HPIB card on a 9x9 order. This was justified because, as a practical matter, it makes little sense to bring HPIB disk or tape drives over to 9x9 systems: the transfer rate of HPIB is too limiting. However, many sites have perfectly good HPIB HP256x printers they would like to use as system printers on their 9x9 machines. And printers are obviously unaffected by the transfer rate limitations of HPIB. What to do?

Until support for network printers was introduced with MPE/iX 5.5, the only printers supported on 9x9 systems were the high-end HP 5000 family of printers with the SCSI interface and any printer with a serial interface. Serial interface printers can be connected to any 9xx system through a DTC. The only officially supported (and I might add, officially recommended) way to connect 256x printers to a 9x9 has been to convert them from the HPIB interface to a serial interface and hang them off a DTC.

Many of us have frequently discussed how unacceptable serial interface printers are for critical system printer functions -- the recovery process from printer problems such as paper jams is prone to failure. The HPIB CIPER protocol, however, supported a very robust recovery technique called checkpoint recovery. SCSI printers fall somewhere in between as far as recovery is concerned.

This ongoing thread resumed when someone posted a query about how to best make use of an existing HPIB 2564C printer on a new 9x9 system.

Fortunately, a site in the Northwest demonstrated some time ago that "unsupported" is not the same as "does not work", at least where the HPIB interface and 9x9 systems are concerned. The newer HPIB interface cards work just fine in 9x9 systems (search the HP3000-l archives for the complete story). So, the unofficial recommendation is to keep your 9x7 or 9x8 HPIB card, slap it into your new 9x9 system, and continue using your existing HPIB 256x printers. We all routinely use officially unsupported software utilities (virtually everything in TELESUP and all of the CSL, for example). Is using an officially unsupported hardware interface such a large step to take?

To get a detailed expert perspective, I am going to quote extensively from several postings by Larry "MPE/iX Spoolers 'R' Us" Byler that deal with printers, the MPE/iX spooler and recovery issues:

"If checkpoint recovery (the ability to quickly and correctly resume printing at a specific page following an interruption) is important in your shop, I would strongly recommend retaining your HP-IB interface, supported or not.

"Similar 'recovery' on serial printers is notoriously haphazard and not always accurate because the printer does not participate in the recovery process, as it does when it is an HP-IB printer.

"The HP 5000 SCSI printers support what I've been calling 'page level recovery', the ability to restart printing at a specific page, printing all previous pages. However, the spooler must re-transmit the entire spool file from the beginning (with the printer in silent-run mode until it detects the start of the target page). 'Checkpoint recovery,' as I use it, was available only under the CIPER protocol used by HP256x HP-IB printers. With this protocol, the printers sent a snapshot of their state (or 'checkpoint') as of the top of each printed page back to the spooler, which archived it in the checkpoint file. If it was necessary to resume to a particular page, the spooler fetched the checkpoint for that page, sent it to the printer (thus re-establishing the printer's state as of that page), then started transmitting data from that point in the spool file (not the beginning).

"There were data-dependent cases even for the CIPER printers which made it impossible to use the checkpoints, but this merely degraded those spool files to 'page level recovery' mode. For those files, it was necessary to silent-run from the beginning, but you still restarted at the desired page.

"The LaserJet printers and the large SCSI printers were not designed to return checkpoints because they have entirely too much state information. Returning that, even at every N pages, would seriously degrade the forward print throughput. Archiving it would cause a massive explosion in the size of the checkpoint file.

"Back in the late '80s when we were designing the Native Mode Spooler, we and the Boise team had a neat concept of indirect checkpoints. With this scheme, the printer returns only the most simplistic state information in its page-by-page checkpoints. But it also recognizes all data that changes its global environment (font and form downloads, etc.), and returns a structure that defines the limits (start and end) of this data in the spool file, but not the data itself. We called this an indirect checkpoint. The spooler archives as many of these structures as the printer returns. When it's necessary to resume to a particular page, the spooler first transmits all the data within all the indirect checkpoint limits (and only that data); this re-establishes the desired printer state. It's then possible to transmit actual print data starting at the target page rather than at the beginning.

"The concept was never implemented, for several reasons, some of them technical. The biggest reason was that Boise was already moving away from system printers and in the direction of desktop printing (with the early LaserJets just a few years away). This massive cooperative effort to reduce silent-run time is useful only in large (hundreds of pages) jobs. They simply couldn't justify this kind of effort for a printer aimed at small output jobs."
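To make the difference between these recovery modes concrete, here is a minimal sketch in Python that counts how many bytes the spooler would have to retransmit to resume at a given page under each scheme Larry describes. Nothing here comes from the spooler itself: the SpoolFile structure, the function names and all of the sizes are hypothetical, and the size of the checkpoint itself is ignored.

```python
from dataclasses import dataclass, field


@dataclass
class SpoolFile:
    total_bytes: int
    page_starts: list  # byte offset where each page begins (page 1 first)
    # (start, end) byte ranges holding font/form downloads and similar
    # environment-changing data; purely illustrative
    setup_ranges: list = field(default_factory=list)


def page_level_resend(spool, page):
    # Page level recovery: the whole file is retransmitted and the
    # printer silent-runs until it reaches the target page.
    return spool.total_bytes


def checkpoint_resend(spool, page):
    # Checkpoint recovery: the archived checkpoint restores the printer
    # state, so only data from the target page onward is sent.
    return spool.total_bytes - spool.page_starts[page - 1]


def indirect_checkpoint_resend(spool, page):
    # Indirect checkpoints: resend only the environment-changing data
    # that precedes the target page, then the data from that page on.
    target = spool.page_starts[page - 1]
    setup = sum(min(end, target) - start
                for start, end in spool.setup_ranges if start < target)
    return setup + (spool.total_bytes - target)


# A made-up 300-page file, 2,000 bytes per page, with 1,500 bytes of
# font and form downloads at the very beginning.
spool = SpoolFile(total_bytes=600_000,
                  page_starts=[i * 2_000 for i in range(300)],
                  setup_ranges=[(0, 1_500)])

for mode in (page_level_resend, checkpoint_resend, indirect_checkpoint_resend):
    print(mode.__name__, mode(spool, 251))
```

With these made-up numbers, resuming at page 251 costs 600,000 bytes under page level recovery, 100,000 under checkpoint recovery and 101,500 under the indirect scheme, which is why the savings only matter for jobs running to hundreds of pages.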

In response to a question about whether MPE/iX could support checkpoint recovery to serial printers, Larry responded:

"As far as I know, there are no plans to change the printers, nor the data path through the DTC, to use a bidirectional protocol. PJL won't help if the serial print path, link, and/or DTC don't support bidirectional traffic."

Why did that spool file
print on that printer?

This thread was prompted by the following (paraphrased) question:

"I recently added a second printer to our system and configured it with the same device class as the first printer. I thought it was logical to assume that since we now have two printers with the same device class that any printout sent to that class would generate a console message requiring a reply to resolve which printer we intended (as is currently the case with our three tape drives, all configured with the device class TAPE). Instead, the spooler seems to choose arbitrarily a printer on its own. I once sent several print jobs at the same time to the device class shared by both printers and found, surprisingly, that the printouts had been split between the two printers.

"I have since worked around the problem by allowing the users, via command files, to set up a default printer based on the LDEV number, rather than device class. However, the above-mentioned behavior doesn't seem normal, or consistent with what I've experienced in the past with MPE. Any comments or explanations?"

That is a reasonable assumption, but tape drives (and terminals) are the only devices that generate REPLY requests -- it's been that way since the beginning. Printers have never routinely generated REPLY requests to determine the physical print device in the manner that tape drives generate REPLY requests to determine the physical tape device. Printing is under the control of the MPE/iX spooler, a sophisticated printer-traffic cop.

There are two things going on here that determine the behavior the questioner is seeing. The first, "pooled printers", has been a feature of MPE since at least the late 1970s (as was pointed out by several responders). The other, the order in which files are actually printed, changed with the MPE/iX Native Mode Spooler.

If there are two or more printers in the same device class, the spool file at the head of the queue for that class goes to the first available printer in the class. If multiple printers in the same device class are available to print a file, the actual printer used is more or less random. The reason is that "file ready" messages are sent to the spooler process for each of the printers. Whichever process is dispatched first grabs the file; the rest find no file to print and go back to sleep.
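A small simulation shows why several jobs sent to one class at the same time end up split between the printers. This is only a sketch in Python, not spooler code: the dispatch function, the job names and the print durations are invented, and ties between idle printers are broken by LDEV number here, whereas on a real system the winner is whichever spooler process happens to be dispatched first.

```python
import heapq


def dispatch(jobs, printers):
    """Hand each queued job, in queue order, to the printer that is free first.

    jobs:     list of (name, print_time) tuples, head of the queue first
    printers: list of LDEV numbers sharing one device class
    """
    # Each heap entry is (time the printer becomes free, LDEV).
    free_at = [(0, ldev) for ldev in printers]
    heapq.heapify(free_at)
    assignment = []
    for name, print_time in jobs:
        now, ldev = heapq.heappop(free_at)  # first available printer
        assignment.append((name, ldev))
        heapq.heappush(free_at, (now + print_time, ldev))
    return assignment


# Four hypothetical jobs sent to a class shared by LDEVs 100 and 101.
jobs = [("A", 5), ("B", 2), ("C", 2), ("D", 1)]
split = dispatch(jobs, [100, 101])
print(split)
```

In this run the long job A ties up LDEV 100 while B, C and D all print on LDEV 101, so the split depends on printer availability, not on anything the user specified.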

The situation gets a little complex when you have printers that "share" queues.

Say you have two printers, LDEV 100 and LDEV 101, both in the same device class "LP". If you specify LDEV 100, your print file will be placed in a queue for LDEV 100. If you specify LDEV 101, your print file will be placed in a separate queue for LDEV 101. If you specify "LP", your print file will be placed in a queue that is serviced by either LDEV 100 or LDEV 101, depending on which is available first. Thus, you have three queues for the two printers.

Suppose now you have the following situation:

LDEV   Classes
100    LP, LP1
101    LP, LP2

You now can have five queues: "LP", "LP1", "LP2", "100", and "101". If you print to "LP", your printout can go to either LDEV 100 or LDEV 101. If you print to "LP1", your printout can only go to LDEV 100.
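The way the queues fall out of the configuration can be sketched in a few lines of Python. The build_queues function and the dictionary layout are made up for illustration; only the LDEV numbers and class names come from the example above.

```python
def build_queues(config):
    """Map every queue name to the LDEVs that can service it.

    config: {ldev: [device class names]}
    Each LDEV gets a queue of its own, and each class name gets a queue
    serviced by every LDEV configured in that class.
    """
    queues = {}
    for ldev, classes in config.items():
        queues.setdefault(str(ldev), []).append(ldev)
        for name in classes:
            queues.setdefault(name, []).append(ldev)
    return {name: sorted(ldevs) for name, ldevs in queues.items()}


config = {100: ["LP", "LP1"], 101: ["LP", "LP2"]}
queues = build_queues(config)
for name in sorted(queues):
    print(name, "->", queues[name])
```

Only "LP" maps to both printers; the other four queues each map to a single LDEV, which is what makes it possible to close one LDEV's queue while output sent to the class keeps spooling.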

This can come in handy if you need to run your printer "hot". If all your programs are "well behaved" and always use the class name for a printer, you can stop spooling on the LDEV, close the queue for the LDEV, and print to the printer "hot" (unspooled) while applications are able to continue spooling output to the open device class.

When multiple files are available for a device, the file chosen is the highest priority file above the system-wide (or device-specific) outfence. If there are multiple files at that priority, the file that first entered the READY state (that is, was closed with an implicit or explicit FCLOSE by the user) is selected. Thus, the order in which files are printed is not under user control. Note that this behavior was new with the Native Mode spooler.
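As a rough illustration of that selection rule, here is a sketch in Python. The SpoolEntry fields, the next_to_print function and the sample values are hypothetical; the logic simply restates the rule above: discard anything at or below the outfence, take the highest priority, and break ties by the earliest READY time.

```python
from dataclasses import dataclass


@dataclass
class SpoolEntry:
    name: str
    priority: int
    ready_time: float  # when the file entered the READY state


def next_to_print(entries, outfence):
    """Return the entry the rule above would select, or None."""
    eligible = [e for e in entries if e.priority > outfence]
    if not eligible:
        return None
    # Highest priority first; among equals, the earliest READY time wins.
    return min(eligible, key=lambda e: (-e.priority, e.ready_time))


entries = [
    SpoolEntry("O1", priority=8, ready_time=10.0),
    SpoolEntry("O2", priority=8, ready_time=5.0),
    SpoolEntry("O3", priority=6, ready_time=1.0),
]
chosen = next_to_print(entries, outfence=7)
print(chosen.name)
```

Here O3 became READY first but sits below the outfence, so it is never considered; O2 beats O1 only because it became READY earlier.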

On Classic MPE systems, altering anything (for example output priority) changed the timestamp used to resolve print order for spoolfiles in the same queue and with the same output priority. On a Classic system, if you altered the output priority of a group of spoolfiles in the same queue to a priority greater than the current applicable OUTFENCE, the spool files would print in the order that they were altered. With the Native Mode Spooler, the print order is no longer controllable and hence is unpredictable.

If you find this is a problem and you are adventurous, you can follow my example and create a program using the Operating System AIFs to mimic the functions of ALTSPOOLFILE, altering the READY date and time to RIGHT NOW whenever output priority is changed. Another impact of the Native Mode spooler on print order is the behavior of multiple spool files, each with multiple copies specified. The Classic spooler would automatically interleave files. The Native Mode spooler does not and, in fact, the interleave behavior can only be achieved with considerable intervention.

Transient space: How much you need,
where you need it, plus basics on user volumes


This is one of those times when the answer to one question generated a lengthy discussion about an accepted "rule of thumb" in configuring new disk drives; i.e. permanent and transient space should both be set at 75 percent -- at least for LDEV 1. The original question and answer:

Q: "I have a 4Gb drive that I am trying to add to my 937. We are running 5.0 Express 3. The model number is HPA3352A and is made by Seagate. How can I configure that drive on my HP 3000? Using Sysgen and doing the adev, I get 'This device is not supported'. Is it supported and what ID type do I use??"

A: Shut down your system and Ctl-B RS
ISL>ODE
ODE>MAPPER
MAPPER>RUN

Write down the info for the new peripheral (should be ST15150N), then restart the system. Say the path is 52.2. Then in SYSGEN, configure in ldev x.

SYSGEN>AP 52.2 ID=PSEUDO
SYSGEN>AD x ID=ST15150N PATH=52.2.0
Hold, Exit, Keep, Exit

Shut down your system and
Ctl-B RS
START NORECOVERY

Do not forget: VOLUTIL> NEWVOL MPEXL_SYSTEM_VOLUME_SET:MEMBERx x 100 100

Almost immediately, someone from HP responded:

"Make that volutil> NEWVOL MPEXL_SYSTEM_VOLUME_SET:MEMBERx x 75 75 ... or use 80/80 for permanent and transient space, but never 100 percent for permanent. Great way to crash a system eventually. I'm sure you realize users are like kids and money -- they use all resources available without understanding the impact of what they're doing. Go 100 percent on permanent space without room for transient space and you don't have to worry about performance anymore ... the system's down anyway."

To which a well known, long time performance guru replied:

"Actually, I have always done 100/100 with nary a problem. You can go to 100 percent permanent and still leave room for transient by just not using all the permanent. I have configured hundreds of systems this way and never seen a problem. You will not crash your system this way."

And the cyberspace debate was on. From one poster:

"True, but only if you don't get a rogue process that chews up ALL of the disk space. I'd prefer the process to come down, than the entire system. For the system set, I relax it a little more to 90/10 though."

And another:

"'By just not using all the permanent'? This reminds me of the Unix user who justifies not being able to un rm by saying that you shouldn't delete files that you don't mean to, and if you do, well, some people just shouldn't use computers.

"I've prevented (dataset) space problems with careful and scrupulous monitoring, and monitored carefully by building tools to monitor automatically, but ... crashing or any other loss of service is a high price to pay for what may be an aberrant surge in activity (like the space problem I did suffer during a physical inventory)."

The guru responded:

"So much for the short answer.

"As I said before, I have configured literally hundreds of systems at 100/100 for the system volume set. In fact, you can even increase the allocation on ldev 1 from the default 75/75 to something like (86-93)/100. And I have done that on many systems, as well.

"My point isn't to be careless about how much free space you have. Quite the opposite. I should have said that I usually recommend that you keep 20 percent free space. If you do, you can stay with the 100/100. Disk space utilization bears watching.

"I should have also said that I am a big proponent of user volumes. If you move all your applications from the system volume set and configure in plenty of available space on the system volume set (to be used for MPE, transient and spool space), you won't run out of disk space used for transient.

"If you are using user volumes (and why aren't you, if you aren't?) you would almost never configure at less than 100/100 since transient space is only actually used on the system volume set. The only legitimate reason for configuring less than 100/100 is if you want to keep some disk space in reserve.

"So, in summary I would rather spend a little time managing my disk environment than pay a 25 percent premium for each of my disk drives, which is what I would do if I didn't configure all of my space."

Several users reported personal experiences:

"We recently had a rogue process (COBOL/VPlus program creating a gigantic stack dump) fill up all available transient space on the system volume. MPE/iX simply shuts all queues and puts all incoming jobs in a WAIT state. A little hair-raising on a busy system, but pretty easy to deal with when you know what's happening."

"Yes. We have seen several crashes that were related to low transient disc space. One of the systems I work with has very high transient space requirements, and we got burned a number of times until we learned to keep a sufficient amount of unfragmented transient space."

A voice of moderation added:

"... MPE/iX will get 'cranky' as it runs out of transient space (or <long-name> volume set space in general), shutting spool queues, refusing additional logons, etc.; but it takes some real persistent trouble to crash it.

"As to the VOLUTIL perm/trans issue of 75/75, 100/100, or whatever your choice, there are some 'undocumented' features at play here which you can only discover by trial by fire, so to speak.

"(1) For user volumes, transient space is a moot point (no transient space is ever allocated on a user volume).

"(2) Permanent+Transient allocation must be >= 100. You can't 'reserve' any space to act as a buffer when you're running tight on space.

"(3) Regardless of your LDEV 1 allocations, as of MPE/iX 5.0 the system will not allocate more than 50 percent of available space provided space is available on another volume.

"For user volumes, setting 100/0 is typical; setting permanent to anything less than 100 can however be used to provide a buffer (do you want to run out of space with no immediate fix, or run out of space and fix it with a quick volutil command?). You can use 95/5 (or similar) allocation to provide a buffer against exhausting your disc space catastrophically.

"For system volumes, however, you can't do this by rule (2) above, so as you run low on space the volumes other than ldev 1 will fill to capacity due to rule (3).

"This is typically not a problem with MPE applications, but if you are running any Posix applications, particularly those using fork() (and that includes the shell itself), you can run into a long-known but outstanding bug (up through 5.5) that will cause fork() to fail with an error (resource busy, try again). fork() clones the parent's stack on the same ldev and must have contiguous space."

Then a cry from the wilderness:

"Ok. I could use a primer on user volumes. I tried reading Volume Management and just didn't get it on a casual read. Here I am dealing with an Oracle-on-Unix guru who wants to know how to control the logical devices (meaning the UNIX conception of logical devices, not MPE's LDEVs), so the more I can understand this, the better."

The response:

"I just cruised the 3000-L archives for a discussion we had about this a few months ago. With the miracles of cut-and-paste, here are some of the thoughts: [Quote:]

There has been lots of dialogue about the pros and cons of doing this but I think that we can summarize by saying the following:

1. There are lots of pros.
2. There don't seem to be any serious remaining cons.

The pros are:

1. Better system resiliency in the event of a disk mech failure.
2. Better overall performance due to the smoothing out of transaction management posting activity (each volume set effectively receives its own XM).
3. Better positioning for high availability solutions (mirrored disk requires user volumes).
4. More operational control of disk environment. You don't need to perform a complete system install in order to add new disk drives.

The cons are:

1. Initially, it's a pain in the butt to set up.
2. Requires (slightly) more overall disk space to manage effectively.
3. Slightly more operational awareness required in a user volume environment.

At HP World, several of us were involved in an HP 3000 system performance roundtable. I think that we reached consensus that user volumes are good but mainly for environments that have more than four to five disk drives.

Another issue of importance is that you must be sure to keep enough disk space in the MPEXL_SYSTEM_VOLUME_SET to handle all of your transient disk space requirements. Most of us at the conference agreed that you need a minimum of three to four disk drives to handle the transient requirements in a medium or larger installation. [End quote]

So, what do I do? I've got only a small system with two 2Gb drives, so no user volumes. LDEV 1 is set to 90,100 and LDEV 2 is set to 100,100. To check your system, use the SHOWVOL command in VOLUTIL. To change any allocations, use the ALTVOL command.
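For anyone who wants to see what a given pair of percentages works out to, here is a minimal sketch in Python applied to the two-drive configuration just described. The space_limits function is hypothetical and does not talk to VOLUTIL; it only does the arithmetic, and the check on the sum is the "Permanent+Transient must be >= 100" rule as reported by the poster quoted earlier, not something taken from HP documentation.

```python
def space_limits(volumes):
    """Convert permanent/transient percentages into per-volume ceilings.

    volumes: {ldev: (size_gb, permanent_pct, transient_pct)}
    returns: {ldev: (max_permanent_gb, max_transient_gb)}
    """
    limits = {}
    for ldev, (size_gb, perm_pct, trans_pct) in volumes.items():
        # Rule (2) as reported in the thread; treat it as an assumption.
        if perm_pct + trans_pct < 100:
            raise ValueError(f"ldev {ldev}: permanent + transient is below 100")
        limits[ldev] = (size_gb * perm_pct / 100, size_gb * trans_pct / 100)
    return limits


# Two 2Gb drives: LDEV 1 at 90/100 and LDEV 2 at 100/100.
limits = space_limits({1: (2.0, 90, 100), 2: (2.0, 100, 100)})
for ldev, (perm_gb, trans_gb) in sorted(limits.items()):
    print(f"ldev {ldev}: permanent up to {perm_gb:.1f} Gb, "
          f"transient up to {trans_gb:.1f} Gb")
```

The two ceilings on a volume overlap rather than add up: permanent and transient space compete for the same disk, which is the whole point of the 75/75 versus 100/100 argument above.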


Copyright 1996, The 3000 NewsWire. All rights reserved.