Robelle Tech: New Interest in IMAGE Logging

June 2005

Boosting your e3000 productivity

New Interest in IMAGE Logging – Part II

Last month we talked about the basics of enabling and using transaction logging with TurboIMAGE. That discussion was mainly in the context of providing an audit trail for database changes. This month we examine logging from a data protection point of view, looking at the issue of “broken chains” and logical data consistency.

Broken Chains

Under the original IMAGE database, broken chains were a frequent topic of discussion. Now that we have the Transaction Manager on MPE/iX, broken chains are a minor issue. Let’s see why.

A broken chain typically occurs when you do a chained DBGET and the next entry (i.e., record) on the chain cannot be found. So what is a chain? It is a list of entries that have the same key value (for example, all Line-Item entries with Order-Number 123005). Each IMAGE entry in Line-Item maintains a forward and a backward chain pointer. If those pointers become corrupted, you will find out when your application program reports a broken chain.

Before Transaction Manager (XM), a broken chain could happen when the system crashed while in the midst of a DBPUT or DBDELETE operation. One of the entries had been updated on the chain, but the other entries that point to it had not all been updated.
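To make the failure concrete, here is a minimal Python sketch of a chain whose update is interrupted halfway. It is a toy model only: the `Entry` class, the `append_entry` and `check_chain` functions, and the `crash_midway` flag are hypothetical names invented for illustration, and the two-step write order is an assumption, not a description of how IMAGE lays out or updates its pointers.

```python
class Entry:
    """One detail entry; `prev` and `next` stand in for IMAGE's chain pointers."""

    def __init__(self, recno, key):
        self.recno = recno
        self.key = key
        self.prev = None  # record number of the previous entry on the chain
        self.next = None  # record number of the next entry on the chain


def append_entry(dataset, tail, new, crash_midway=False):
    """Add `new` to the end of the chain using two separate pointer writes."""
    dataset[new.recno] = new
    # Write 1: the old tail now points forward to the new entry.
    dataset[tail].next = new.recno
    if crash_midway:
        return  # simulated system failure before the second write
    # Write 2: the new entry points back to the old tail.
    new.prev = tail


def check_chain(dataset, head):
    """Walk forward from `head`; return the record number where the
    chain is inconsistent, or None if the chain is intact."""
    prev, cur = None, head
    while cur is not None:
        entry = dataset.get(cur)
        if entry is None or entry.prev != prev:
            return cur
        prev, cur = cur, entry.next
    return None


dataset = {1: Entry(1, 123005)}
append_entry(dataset, 1, Entry(2, 123005))
append_entry(dataset, 2, Entry(3, 123005), crash_midway=True)
print(check_chain(dataset, 1))
```

The first append completes both writes, so entries 1 and 2 agree with each other. The second append stops after the forward pointer is written, so entry 3 is reachable but its backward pointer does not match, and the walker reports it.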

With Transaction Manager, the operating system keeps track of all writes to the disk drives for TurboIMAGE. When recovering from a system failure, XM ensures that all the writes for a single DBPUT call are either completed, or are rolled back. In either case, the database is left in a physically consistent state.
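The all-or-nothing behavior can be pictured with a generic write-log recovery sketch in Python. This is not XM's actual mechanism, which the article does not describe; the log record layout and the `recover` function are assumptions chosen only to show how a group of writes can be either fully applied or fully discarded after a failure.

```python
def recover(disk, log):
    """Replay a write log after a failure.

    `log` is a list of ("write", txn_id, address, value) and
    ("commit", txn_id) records. Writes that belong to a transaction
    with no commit record are discarded, so a half-finished update
    never reaches the recovered disk image.
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, address, value = rec
            disk[address] = value
    return disk


disk = {"a": 0, "b": 0, "c": 0}
log = [
    ("write", 1, "a", 10),
    ("write", 1, "b", 20),
    ("commit", 1),
    ("write", 2, "c", 30),  # the failure happened before this group committed
]
print(recover(disk, log))
```

Both writes of the first group are applied because its commit record was logged. The lone write of the second group is dropped, which leaves address "c" at its old value.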

Below are some email exchanges that help clarify “broken chains.” They are from Jon Diercks’ 2001 Threads column on the Interex Web site at www.interex.org/pubcontent/enterprise/nov01/thrd1101.jsp

Will doing an ABORTJOB on a program cause a broken chain in an IMAGE database? I’ve seen six broken chains (different systems, all on 6.0) in the past twelve months.

“Yes, from a logical frame the database may have only one-half of a transaction. But, from a relationship view there should be no broken chains,” said Larry Simonsen.

Dennis Heidner, one of the authors of The IMAGE Handbook, said, “The broken chains don’t mean ABORTJOB is causing the problem. Each process running has a special hook put into it so that when it terminates and the process has an IMAGE database open, the abort process handler is called. It ensures that the database is closed in an orderly manner.

“Broken chains could be a symptom of a TurboIMAGE bug, defective hardware, using deferred posting, and system aborts.

“Phantom broken chains can be reported if you are using a weak locking scheme, i.e., changing key value or deleting entries while another process is walking the same chain.”

The last point by Heidner is important. If you don’t lock while reading the entries on a chain, then another user could be changing the chain and you could get a broken chain error. But these are only phantom broken chains. If you rerun the code, you will not get the error again. It is possible that such phantom broken chain errors have been eliminated in the most recent versions of TurboIMAGE.
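A phantom broken chain can be reproduced in a small single-threaded Python simulation. The sketch assumes a reader that fetches an entry's forward pointer and then loses control to another process before following it. The `walk` and `delete_entry` functions and the `interfere` hook are hypothetical and model no TurboIMAGE intrinsic.

```python
def walk(dataset, head, interfere=None):
    """Walk a chain forward and return the record numbers visited.

    `interfere` is a hypothetical hook called between reads. It stands
    in for another process that updates the chain while the reader
    holds no lock.
    """
    visited, cur = [], head
    while cur is not None:
        if cur not in dataset:
            raise LookupError(f"broken chain: entry {cur} not found")
        visited.append(cur)
        nxt = dataset[cur]["next"]  # the reader saves the forward pointer
        if interfere:
            interfere(dataset, cur)
        cur = nxt
    return visited


def delete_entry(dataset, recno):
    """Unlink `recno` correctly, fixing both neighbours' pointers."""
    entry = dataset.pop(recno)
    if entry["prev"] is not None:
        dataset[entry["prev"]]["next"] = entry["next"]
    if entry["next"] is not None:
        dataset[entry["next"]]["prev"] = entry["prev"]


def other_process(dataset, just_read):
    # Deletes entry 2 right after the reader has fetched entry 1's pointer.
    if just_read == 1:
        delete_entry(dataset, 2)


chain = {
    1: {"prev": None, "next": 2},
    2: {"prev": 1, "next": 3},
    3: {"prev": 2, "next": None},
}
try:
    walk(chain, 1, interfere=other_process)
except LookupError as err:
    print(err)         # the reader's saved pointer is stale
print(walk(chain, 1))  # rerunning the walk finds a consistent chain
```

The delete itself is correct, so the chain is never damaged on disk. Only the reader's stale copy of the pointer is wrong, which is why the error disappears on the second pass and why locking the chain while reading it avoids the problem.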

Logical Consistency

In the exchange above, Larry Simonsen says “from a logical frame, the database may have only one-half of a transaction.” Let’s look into that.

A typical logical transaction would be the creation of a new order. This might involve a DBPUT to Order-Header and several DBPUTs to Order-Line. The Transaction Manager (XM) ensures that you will get a complete Order-Line entry that is properly linked to the other lines for that order and customer, but it does nothing to ensure that you get the entire order.

XM does not guarantee logical consistency of your data.

How do you ensure logical consistency?

You use DBXBEGIN and DBXEND calls around all the DBPUT, DBUPDATE and DBDELETE calls that you make for your logical transaction. Yes, the definition of a logical transaction is up to the programmer.
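The bracketing idea can be illustrated with a toy Python model. The context manager below is a hypothetical stand-in for the DBXBEGIN and DBXEND pair, not a binding to TurboIMAGE: it takes an in-memory snapshot and restores it if anything fails inside the bracket. A Python exception plays the role of an aborted program, and the `fail_after` parameter exists only to simulate that abort.

```python
import copy
from contextlib import contextmanager


@contextmanager
def logical_transaction(db):
    """Toy stand-in for bracketing updates with DBXBEGIN and DBXEND."""
    snapshot = copy.deepcopy(db)  # state at the "begin" call
    try:
        yield db
    except Exception:
        db.clear()
        db.update(snapshot)  # undo every change made inside the bracket
        raise


def create_order(db, order_no, lines, fail_after=None):
    """Put one header entry and several line entries as one logical transaction."""
    with logical_transaction(db):
        db["Order-Header"].append({"Order-Number": order_no})
        for i, item in enumerate(lines):
            if fail_after is not None and i == fail_after:
                raise RuntimeError("simulated abort in mid-order")
            db["Order-Line"].append({"Order-Number": order_no, "Item": item})


db = {"Order-Header": [], "Order-Line": []}
create_order(db, 123005, ["widget", "gadget"])
try:
    create_order(db, 123006, ["bolt", "nut"], fail_after=1)
except RuntimeError:
    pass
print(len(db["Order-Header"]), len(db["Order-Line"]))
```

The second order fails after its header and first line have been added, yet neither survives: the database holds only the complete first order. Where the bracket begins and ends is the programmer's decision, which is the sense in which the programmer defines the logical transaction.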

There can be a lot of confusion about logical consistency, mostly because IMAGE kept adding logging and recovery features over a 30-year period. Luckily, Gavin Scott gives a clear explanation of the current state of affairs in the same email exchange noted above:

“It’s amazing how much superstition exists surrounding this kind of stuff, and how many unnecessary rituals and sacrifices are performed daily to appease the mythical pantheon of data integrity gods.

“Real broken chains are (supposed to be) impossible to achieve with IMAGE on MPE/iX, no matter what application programs do, or how they are aborted, or how many times the system crashes!

“The Transaction Manager provides absolute protection against internal database inconsistencies, as long as there are no bugs in the system and as long as the hardware is not corrupting data. No action or configuration is required on the part of the user.

“Logical inconsistencies (order detail without an associated order header record, for example) can easily be created by aborting an application that’s in the middle of performing a database update that spans multiple records. Of course, IMAGE doesn’t care whether your data is logically correct or not, that’s the job of application programmers.

“Using DBBEGIN/DBEND will have no effect whatsoever on logical integrity, unless you actually run DBRECOV to roll forward or roll back the database to a consistent point every time you abort a program or suffer any other failure.

“By using the DBXBEGIN/DBXEND ‘XM style’ transactions, you can extend IMAGE’s guarantee of physical integrity to the logical integrity of your database. The system will ensure that no matter what happens, either all changes inside a DBX transaction will be applied, or none of them will be. Of course, it’s still possible to use this feature incorrectly (locking strategies are non-trivial as you need to lock the data that you read as well as that which you intend to write in many cases).

“MPE/V introduced a feature called Intrinsic-Level Recovery (ILR) which could be (and still can be) enabled for a database. This was sort of a mini-XM that forced updates to disk each time an Intrinsic call completed in order to ensure structural integrity of the database in the face of system failures.

“I believe that on MPE/iX, enabling ILR for a database does something really nasty like forcing an XM post after every update intrinsic call, which is a serious performance problem. ILR is no longer required on MPE/iX as XM will ensure integrity without it. With ILR you might be guaranteed that every committed transaction will survive a system abort, whereas without it XM might end up having to roll back the last fraction of a second’s worth of transactions. For almost any application this difference is negligible. Do not turn ILR on!

“There are more complexities if your application performs transactions that affect multiple databases or databases and non-database files. It’s possible to do multi-database IMAGE transactions, but only if the databases reside on the same volume set, I believe.”