
February 2002

The Fun of Understanding Unix

Linux shows the potential for an OS to be simpler than MPE


By Gavin Scott

The Linux world is so much more interesting than other Unix flavors right now. Just in the last few months it has started to look really viable, and not just as a server OS. We installed RedHat 7.1 on a laptop recently and it’s very impressive.

I’m actually not writing this in response to HP’s notice of the end of its 3000 support, but as a result of recent experimentation with RedHat Linux. The pace at which Linux is improving and the rate at which high-quality “free” (both as in speech and as in beer) software is being produced is just amazing. Linux now has several “journaled” file systems similar to MPE’s Transaction Manager, known as XM. And we’re probably just about at the point where the Linux free database people (PostgreSQL, etc.) start talking to the file system people about doing explicit transactions between the database and file system. This is the point where you have the real equivalent of MPE and IMAGE’s database-to-XM integration — the engineering responsible for much of the data integrity we so often use to tout MPE’s superiority.

Just the thought of being able to build real and robust business solutions using nothing but free software is quite intoxicating. Imagine never having to worry about the future viability of some small middleware vendor or even your DBMS supplier.

For commercial developers it might mean that the money is going to be more in building specific, custom, end-user solutions rather than a lot of meta-solutions, middleware, tools and the like. That’s because in the Linux model, anything that is general enough to be used by a lot of people is going to eventually be cost-free.

The more I come to learn about Unix, and Linux in particular, the more sense it all seems to make. It presents many challenges to people used to MPE’s limited number of ways of doing things (which were well chosen to be just those things you needed to do). But once you understand the fundamentally simple way that everything works, Unix becomes much easier to comprehend.

A toy for older kids

One can perhaps think of MPE as a set of simplified building blocks, not unlike one of those preschool puzzles with only four pieces that fit together in one obvious way. There is a database-shaped piece, a COBOL-shaped piece, a spooler-shaped piece, and so on, and you can only put these pieces together in a limited number of ways.

To be an MPE user or developer, you only need to understand the shapes of a few pieces and how to put them together to create an application. You don’t have to understand how those pieces themselves are constructed, and for the most part you can’t, because there’s no way to take them apart to see how they work.

Unix is more like a toy for older kids (Legos, for example) where the generality of the system is expanded by replacing the limited set of special purpose blocks with a different set of smaller and simpler ones. These simpler blocks also turn out to be what the larger “preschool” blocks are made of.

A typical MPE user looks at Unix and sees an apparently infinite complexity of incomprehensible programs and arcane scripts of one sort or another, and assumes that the complexity of Unix is just too great for any mortal to deal with. The assumption is that where MPE has a small number of brightly colored building blocks, Unix seems to have an unending variety of complex blocks that are all rather ugly and it’s not clear what the “right way” is to do anything.

The difficulty arises from not realizing that while on MPE the big friendly building blocks can be considered fundamental and opaque entities, on Unix everything that looks like an MPE-sized building block is really just a construction of small, regular, Lego-sized blocks. The key to understanding Unix is to develop a good understanding of the fundamental Lego-sized blocks that everything is made from.

Comparing simple and complex

On MPE this would be the equivalent of needing to understand the complete behavior of all of the myriad MPE Intrinsics before you could use, say, an IMAGE database. This would be quite difficult due to the very rich nature of the MPE Intrinsic interface. And MPE files have more options than a Swiss Army knife has blades (MPE subsystems often have to go to a lot of trouble to hide these underlying gory details from the user).

On Unix, when we come to look at the equivalent of the MPE Intrinsics (basically the set of “system calls” supported by the kernel and its associated driver modules), we find that the Unix API on which everything else is built is much simpler and far more consistent than MPE’s equivalents. Consequently, it’s not that hard to understand essentially everything about how Unix works, at least at the level at which an application program interacts with it.

The implementation of Unix may be terribly complex (the kernel source code, for example), but this complexity is hidden from application programs behind a surprisingly simple set of interfaces.

In MPE it’s the high-level stuff that is simple, and the low-level stuff that’s complex. In Unix, it’s the low-level stuff that’s simple and the high-level stuff that’s complex. In either case, the key to using the system effectively is to understand a great deal about the level of the system that is simple, which then makes it possible to deal comfortably with those parts of the system which are more complex.

If you try to understand Unix by looking at the syntax of all the available (high level) commands, then you’ll probably never understand it. If, however, you step back and learn a little about the system call API (i.e., the primitive functions that the kernel provides from which everything else is built), then it becomes much easier to deal with the seemingly never-ending collection of exotic high-level building blocks the system provides. When you see something that looks incomprehensible, it’s just a matter of remembering that whatever this program claims to do, it has to implement it in terms of a small set of very simple operations.
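To give a flavor of just how small that primitive vocabulary is, here is a sketch of a file copy written against nothing but four POSIX system calls: open, read, write, and close. (The function name, buffer size, and error handling are my own illustrative choices, not anything specific to MPE or this article; real tools like cp add many refinements on top of exactly this pattern.)

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy a file using only four primitive system calls:
   open, read, write, close.  Everything from cp to a
   print spooler is ultimately built from calls like these. */
int copy_file(const char *src, const char *dst)
{
    char buf[4096];
    ssize_t n = 0;

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    /* Shuttle bytes until read() reports end-of-file (0). */
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (write(out, buf, n) != n) {  /* treat short writes as errors */
            n = -1;
            break;
        }
    }

    close(in);
    close(out);
    return n < 0 ? -1 : 0;
}
```

Note there is no “file type,” record format, or access-option negotiation here, as an MPE programmer might expect: a file is just a stream of bytes named by a descriptor, and that is the whole contract.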

The result can be a recognition that the whole Unix system is very rich (i.e. complicated), and that many Unix applications require a lot of work to understand, because the flexibility of the simple low-level primitives from which everything is constructed tends to encourage a lot of flexibility at the higher levels. But this complexity is not an inherent requirement, and someone looking to implement a specific solution can do so in as simple a manner as they desire.

It’s just that most Unix developers, once they learn the underlying fundamental simplicity of Unix, tend to build large baroque constructions because it’s easy to build these things on top of a simple (rock solid) foundation layer. In contrast, it’s quite hard to build them on top of a much richer (and therefore less understood and harder-to-deal-with) foundation such as one finds in an OS like MPE.

Once the set of things a program can do is limited by the underlying simplicity of the system, the interfaces between programs become much simpler because my program doesn’t have to account for an endless set of possible things that your program did. This makes it much easier to write tools with a higher degree of generality, and promotes the reuse of existing programs rather than the generation of custom programs for each problem.
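The pipe is the classic embodiment of this idea: the only contract between two cooperating programs is an ordered stream of bytes. The sketch below (the pipe_roundtrip name and message are my own, purely for illustration) shows two processes that know nothing about each other beyond that stream, using only the pipe, fork, read, and write primitives:

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Two processes talking through a pipe.  The only interface
   between them is a stream of bytes; neither side needs to
   account for anything else the other might have done. */
int pipe_roundtrip(const char *msg, char *out, size_t outlen)
{
    int fds[2];                       /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {                   /* child: the "producer" */
        close(fds[0]);
        write(fds[1], msg, strlen(msg));
        close(fds[1]);                /* closing signals end-of-stream */
        _exit(0);
    }

    close(fds[1]);                    /* parent: the "consumer" */
    ssize_t n = read(fds[0], out, outlen - 1);
    close(fds[0]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return 0;
}
```

This is exactly the machinery the shell uses when you type `ls | wc -l`: it wires the byte stream between two existing programs rather than writing a new one.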

A full toolbox

If you start with a full 3GB install of a current Linux distribution (RedHat 7.2 or something else of the same vintage), then the variety of useful tools available to you just boggles the mind. For almost any task you might want to accomplish, there’s a good chance that there is already a tool that will perform that task for you, or there exists a small set of tools that you can string together with a little “glue” to accomplish your task without any traditional “programming.”

The problem then becomes one of discovering (or choosing between) those tools that the system already includes, and figuring out how to fit the puzzle pieces together to get the picture you desire.

In general, I think you’ll see that the things you find in Unix are harder to understand initially, but once you understand something you’ll find that it’s actually simpler than, say, the MPE way of doing things.

Another way of saying it is that you need to understand what you’re doing up front to a greater degree in Unix than you do in MPE. You won’t get as far in Unix by just trying to “bang on it until it works.” Because MPE tries to limit the number of things you can do to those that are useful, you can often try all the possible things until one works. Unix tries to make it possible to do anything you want, so you won’t get very far with this method unless you have an infinite amount of time to try all the possibilities.

Which is better?

Now, is Unix better than MPE? I think you’ll have a hard time ever convincing someone who likes MPE that (at least generic) Unix is preferable. Unix trades ease of use for generality, whereas MPE tries to retain ease of use at the cost of greater underlying system complexity. Unix expects more out of its users than MPE does.

I guess my point in all of this is that Unix is probably still not as good as MPE for the kind of things most of us use MPE for. But it is possible for anyone to fully understand it and in many ways it’s actually simpler than MPE is. It can even be a lot of fun.

Consider the facts together: Unix can actually be an “okay” operating system; the free software world is advancing at a remarkable pace; and you gain the “freedom” of building an application without selling parts of your soul to an operating system company, a database company, and a half dozen middleware tools vendors. It’s not hard to see Linux becoming the development system of choice for a lot of people, maybe even those who, under different circumstances, would have never thought of using anything other than MPE.

Gavin Scott is vice president at Allegro Consultants, a company specializing in software, services, and consulting for clients with Hewlett-Packard PA-RISC based computers.


Copyright The 3000 NewsWire. All rights reserved.