
December 2002

Net.digest summarizes helpful technical discussions on the HP 3000 Internet newsgroup and mailing list. Advice here is offered on a best-effort, Good Samaritan basis. Test these concepts for yourself before applying them to your HP 3000s.

Edited by John Burke

Wirt Atmar postulated in November, “on many days now, if it weren’t for the off-topic material, there wouldn’t be any postings at all.” While it may well eventually come to this, I do not think we are there yet. As someone who has monitored the 3000-L mailing list carefully for the last seven years, I have seen a change in the nature of postings, particularly the off-topic postings, but not a significant lessening of useful content. Unfortunately, the off-topic postings have become meaner and more confrontational — in part, I think, because people are still mad about 11/14/2001, but also because politics and religion have dominated. However, I still always find more helpful content than I have room for in the NewsWire.

Last month I moved my personal HP 3000 to my home (among other things, you can find searchable versions of all my Hidden Value and net.digest columns there at www.burke-consulting.com). In my first section this month I give you a recipe for how to do this kind of move. Also, in looking back on previous editions of net.digest, I noticed I’ve devoted little space to IMAGE/SQL. So this month there are two sections discussing various aspects of capacity management.

As always, I would like to hear from readers. Even negative comments are welcome. If you think I’m full of it, that I goofed, or that I’m a horse’s behind, let me know. If you spot something on 3000-L and would like someone to elaborate on what was discussed, let me know. Are you seeing a pattern here? You can reach me at john@burke-consulting.com.

Want to set up an HP 3000 at home? Here’s how

It would appear from the traffic on the Internet that more and more people are installing HP 3000s in their homes or offices behind simple routers that connect to the Internet via DSL. With prices for used systems less than $3,000, and sometimes substantially less, I expect the trend toward home HP 3000s will continue. Since I just went through this very same process of installing an HP 3000 at home myself, I thought I’d share what I learned and what I had to do to make everything work.

About two years ago I purchased a Linksys BEFSR41 EtherFast Cable/DSL Router with 4-Port 10/100 Switch so that I could connect multiple computers to my DSL line. This is a very popular device with those users posting to 3000-L. It is currently available for under $80 from almost any bricks and mortar store or Internet retailer. Linksys has continually updated the firmware, not just to correct bugs, but also to add numerous features over the years. My DSL provider uses PPPoE and the Linksys device can be programmed to keep the connection continuously open. The BEFSR41 uses Network Address Translation (NAT) to enable multiple computers to share one IP address.

About a year ago I purchased a static IP address and a domain name because I was planning for the day when I would have a server at home. In early November I moved my HP 3000 home and proceeded to attempt to get it live on the Internet, not expecting any problems. You see, I had already had it on the Internet for some time using the firewall and T1 Internet connection where I worked, so I knew things like DNS were configured and working correctly. Also, several months earlier I had been experimenting with putting my Linux server on the Internet. I discovered, after several conversations with Linksys support (very good, by the way), that MTU (Maximum Transmission Unit) had to be enabled on the router (see the Filters tab) and set to 1492 (because of PPPoE and DSL; otherwise, for an Ethernet LAN it would be 1500).

With this setting, I was able to telnet to my Linux box across the Internet, ftp to and from the Linux server and access Web pages served up by Apache. Therefore, I was not expecting any problems when I hooked up my 3000. I changed the IP address and gateway to conform to my LAN and pointed DNS to my ISP’s name servers. I then set up forwarding on the Linksys router for port 23 (telnet), ports 20 and 21 (FTP), ports 1537 and 1570 (VT) and port 80 (HTTP) to my HP 3000 and proceeded to test. Telnet and VT appeared to work just fine, but neither FTP nor HTTP seemed to work. In each case I could connect, but could not transfer data. Or so I thought.
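For anyone repeating this setup, a quick reachability probe can confirm that the router really is forwarding each service before you start chasing deeper problems. The short Python sketch below is one way to do it; the hostname is a placeholder for your own server, and port 20 (ftp-data) is left out because data connections normally originate from the server side. Keep in mind that a successful connect only proves reachability; as I was about to learn, it says nothing about whether data will actually flow.

# Minimal reachability probe for the forwarded ports (run from outside the LAN).
# HOST is a placeholder; substitute your own hostname or static IP.
import socket

HOST = "www.example.com"
PORTS = {21: "FTP", 23: "telnet", 80: "HTTP", 1537: "VT", 1570: "VT"}

for port, service in sorted(PORTS.items()):
    try:
        with socket.create_connection((HOST, port), timeout=5):
            print(f"port {port:4d} ({service}): connection accepted")
    except OSError as err:
        print(f"port {port:4d} ({service}): no answer ({err})")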

I discovered after a lot of testing and help from various friends that the problem was related to the size of the file or page. Anything 1KB or less worked fine. Nothing 2KB or more worked. Gavin Scott pointed me in the direction of a posting he saw on Slashdot that, after a lot of sleuthing and reading of linked articles, led to the solution. The key was in a comment by Bruce Toback (“some TCP/IP stacks use the Do Not Fragment (DNF) bit to discover the MTU: they’ll send a large packet, and if it bounces back due to an MTU problem, they’ll send ever smaller packets until they find the MTU for the path”) and these two definitions from one of the referenced articles:

“a. MTU - Maximum Transfer Unit. This is the maximum number of bytes that your computer will send out in a packet. This should be set according to what your connection can handle. For ethernet this should be set to 1500. For PPPoE links this should be set to 1492.

“b. MSS - Maximum Segment Size. This is used in negotiating what the MTU of a connection between two hosts will be. Essentially this is saying ‘please don’t send me packets bigger than X.’ This should typically be set to 40 less than your MTU to allow room for headers.” [Note that this means my MSS should be 1492 - 40 = 1452.]
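To put numbers on those two definitions: PPPoE steals 8 bytes from every Ethernet frame (a 6-byte PPPoE header plus a 2-byte PPP protocol ID), and the IP and TCP headers take another 20 bytes each. A back-of-the-envelope sketch in Python, purely for illustration, makes the arithmetic explicit:

# Back-of-the-envelope MTU/MSS arithmetic for a PPPoE DSL link.
ETHERNET_MTU = 1500     # standard Ethernet payload size
PPPOE_OVERHEAD = 8      # 6-byte PPPoE header + 2-byte PPP protocol ID
IP_TCP_HEADERS = 40     # 20-byte IP header + 20-byte TCP header, no options

pppoe_mtu = ETHERNET_MTU - PPPOE_OVERHEAD   # 1492
mss = pppoe_mtu - IP_TCP_HEADERS            # 1452

print("MTU over PPPoE:", pppoe_mtu)
print("MSS to advertise:", mss)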

I reasoned that the MPE/iX TCP/IP stack was probably not sophisticated enough to use the discovery method, so I set MAXIMUM SEGMENT SIZE (NETXPORT.NI.LAN1) to 1452 in NMMGR (it was set at the default of 1514), cycled the network and, voilà, FTP and HTTP access worked. You can see for yourself by pointing your browser to www.burke-consulting.com. By the way, don’t bother trying telnet, FTP or VT — I’ve turned them off at the router.
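If you would rather script that sanity check than eyeball it in a browser, a short fetch that reports how many bytes came back will do: anything comfortably larger than one segment shows the fix took. This is only an illustrative sketch; substitute the URL of the server you are testing.

# Fetch a page and confirm more than one TCP segment's worth of data arrives.
# Before the MSS fix, pages over a couple of kilobytes simply never completed.
from urllib.request import urlopen

URL = "http://www.example.com/"   # substitute the server you are testing

with urlopen(URL, timeout=30) as response:
    body = response.read()

print("received", len(body), "bytes")
if len(body) > 1452:
    print("response spanned multiple segments; large transfers are working")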

Warning note: As I was doing all this, thanks to a posting by Steve Cooper, I became aware that the BEFSR41 is vulnerable to a remote DoS attack if the firmware version is less than 1.42.7 and the Remote Management Interface (RMI) is enabled (the default is disabled). Firmware version 1.43 corrects the problem. Until you can update, be sure to disable RMI.

And this time, maybe it will stay dead

One of the great Urban Legends about the HP 3000 has to do with prime numbers and the capacity of IMAGE master datasets. It came up again in a posting to 3000-L about MDX, part of which said, “I was looking in the IMAGE manual and now I’m trying to set up automatic capacity management on all my production databases. I’ve always heard about the prime number factor for manual and automatic masters. I was wondering if that’s still necessary with automatic capacity management, or is it handled internally by IMAGE?”

Fred White, who is sometimes referred to as the Father of IMAGE, replied: “In a recent email, someone inquired about Automatic Capacity Management asking about the advisability of using prime number capacities for Master datasets.

“The ‘prime number’ capacity for Master datasets is a never-to-be forgotten fiction accidentally created in the original IMAGE manual way back in 1974 and subsequently propagated by a couple of so-called experts.

“The thing to avoid is a capacity one of whose factors is a large power of 2 (i.e., 2**N where N>>1). Consequently, the simplest valid ‘rule’ is to select any acceptable ODD integer.

“The Adager Web site (www.adager.com) is the source for numerous technical papers about IMAGE, one of which, entitled ‘Dynamic Dataset Expansion’, was authored by me. I strongly suggest that those of you who choose to use IMAGE’s Automatic Capacity Management feature read this article.”

What Fred does not say is that Adager, unfortunately, may help perpetuate the Urban Legend in its capacity change module for master datasets by suggesting prime number values when modifying the capacity. Since every prime number other than 2 is odd, and therefore not divisible by any power of 2, these are perfectly appropriate suggestions. But as Fred points out, it is not necessary to hunt for a prime number when any odd integer will do.
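To see why odd is good enough, consider a toy model; this is not IMAGE’s actual hashing algorithm, and the key values and capacities below are invented. Binary keys land roughly at key mod capacity, so keys that share low-order bit patterns pile onto a handful of addresses whenever the capacity has a large power-of-2 factor, while any odd capacity, prime or not, spreads them out.

# Toy model (not IMAGE's real hashing): place each key at key mod capacity
# and count how many distinct home addresses a batch of keys occupies.

def distinct_homes(keys, capacity):
    return len({key % capacity for key in keys})

# Hypothetical keys assigned in steps of 16, e.g. account numbers.
keys = [100000 + 16 * i for i in range(5000)]

for capacity in (8192,      # 2**13: keys pile onto very few addresses
                 10240,     # 2**11 * 5: still a large power-of-2 factor
                 10007,     # prime
                 10001):    # merely odd, and just as good as the prime here
    print(capacity, distinct_homes(keys, capacity), "distinct homes for", len(keys), "keys")

In this toy run the two even capacities leave the 5,000 keys sharing only a few hundred home addresses, while the prime and the plain odd capacity each give every key its own home.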

How to use MDX or DDX

While we’re on the subject of automatic capacity management, let’s consider how best to make use of it. I am still amazed at how few people use DDX and MDX. Both got some bad publicity when first introduced because of bugs, but both have been stable for some time now. Anyone contemplating a migration should especially consider them: with DDX and MDX properly configured, capacity management becomes one less administrative task, and those energies can be redirected toward your migration project. Ken Sletten, Chairman of SIGIMAGE/SQL, outlined his experience with DDX and presented a few best practices:

“We have been using dataset dynamic expansion (DDX) on our detail datasets for years without any problems (there were some real problems with DDX in certain cases in the early years, but if you are on a recent or current release of TurboIMAGE you should have no worries on that score). Two other things to consider:

“1. Don’t make your DDX increments too small. While in large part dependent on the daily ‘record add rate,’ all else being doable, I tend to make the DDX increment somewhere around one percent of the CURRENT capacity, plus or minus a bit. There is nothing magic about that formula; it’s just a rough convention that we adopted. I have heard stories about people setting DDX increments at less than 10 entries. If you add a lot of entries with very small DDX increments, the disc fragmentation could get really ugly. Several of our large, ‘add-active’ datasets have increments of 20,000 entries or more.

“2. Properly set up (and perhaps/probably with an occasional run with a defrag product), DDX on Details can run until you get close to your MAXCAP with no problem. However, that is not the case (or at least likely will not be the case) with MDX. MDX should, I believe, usually be considered as just a temporary stop-gap (although temporary might last for a while) until you get a chance to normally expand and rehash the master with the database tool of your choice. If you push too many master entries into the expansion area, you could run into serious performance problems.”

This recommendation of Ken’s has to do with the way MDX is implemented. I agree that MDX should be viewed as a temporary fix to a master set capacity issue that allows production to continue until a suitable maintenance window when the dataset can be properly re-sized.
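To make Ken’s roughly-one-percent rule of thumb concrete, here is a small sketch. The one-percent figure is his; the 1,000-entry floor and the sample capacities are purely illustrative choices.

# Rough DDX-increment calculator following the ~1 percent rule of thumb.
# The 1,000-entry floor is an illustrative guard against tiny increments.

def ddx_increment(current_capacity, pct=0.01, floor=1000):
    return max(floor, int(round(current_capacity * pct)))

for capacity in (50000, 750000, 2000000):
    print("capacity", capacity, "-> suggested increment", ddx_increment(capacity))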

Fred White added, “Ken is ‘right on.’ Increments should not be huge (adequate free space may not exist at the time of a dynamic expansion) nor should they be tiny (yielding fragmentation of disk space with lower performance).

“Don’t try to save space with masters. Make them larger than needed but not way too large (serial read performance degradation). The ‘wasted’ space generally results in fewer synonyms, which results in improved non-serial performance (DBFINDs, DBPUTs and keyed DBGETs).

“Do try to save space with details. Make them as small as you can afford. If they don’t expand or if they expand only a few times, you’ve saved disk space. If the initial capacity is set too large, you’ve wasted disk space.

“Also, most databases have more details than masters and most of those details are much larger than most of the masters. That’s why your efforts to conserve disk space should focus on details.”

Thanks to Ken and Fred for the valuable advice. Unfortunately, there are few hard and fast rules in designing capacity management schemes. And, often those rules have to be qualified with “it depends.”

But here are a few more observations you can use in designing your own strategy. If you know that your application does few, if any, serial reads of large master datasets, then size them to minimize synonyms, maximizing DBFIND, DBPUT and DBGET performance. Also, do not fall into the trap of setting the initial capacity of detail datasets too high on the theory that you will “grow into it” in a couple of years. Remember, you will be backing up all that empty space. I have seen situations where smart-sizing detail datasets, using DDX to expand only when necessary, has significantly reduced backup time.
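A quick calculation shows why that empty space matters on backup night. The entry size and record counts below are made up; the arithmetic is the point.

# Back-of-the-envelope: empty space dragged through every backup by an
# oversized detail dataset. All figures below are made-up examples.

def wasted_mb(capacity, entries_in_use, entry_bytes):
    return (capacity - entries_in_use) * entry_bytes / (1024 * 1024)

entry_bytes = 256
print("sized for two years of growth:", round(wasted_mb(5000000, 1200000, entry_bytes)), "MB of air")
print("sized snugly, with DDX as the safety net:", round(wasted_mb(1300000, 1200000, entry_bytes)), "MB of air")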

John Burke is the editor of the NewsWire’s Hidden Value and net.digest columns and has more than 20 years’ experience managing HP 3000s.


Copyright The 3000 NewsWire. All rights reserved.