| SUBSCRIBE | News Headlines | Technical Headlines | Planning Features | Advanced Search |

June 2001

Get the most out of your Robelle Tools

Suprlink: which is the Input file?

By Mike Shumko

If you have Robelle’s Suprtool, you also have Suprlink. They come together. Suprlink is a fast, efficient tool for joining two or more files by a common field. It works by scanning all the files at once and creating a merged output record combining all the fields. In Suprlink terms, you do:

input fileA
link fileB
output fileZ
xeq

You will get very different results if you reverse the Input and Link between fileA and fileB:
input fileB
link fileA
output fileZ
xeq

The results in one case will be wrong, though you may not immediately recognize the error, because in both cases you will probably get a lot of data in the output file. How do you choose which is the Input file and which is the Link file

The general rule of thumb is that the file that has multiple records for the same key value is the Input file, while any file that has only one record per key value is a Link file. If the files came from IMAGE dataset selections, and one of the datasets was a master while the other was a detail set, then the choice is simple: the detail file is the Input, and the master file is the Link.

Consider this example: You have a file of sales orders, one record per order, and a second file of customer names and addresses, one record per customer. Both files have the sequential customer ID, and you want to create a third file that has the customer name and address attached to each order.

ORDHEAD file:

CUST-EDP FULL-ORDER-N ENTRY-DA LAST-SHI

123 P00025470000 20010511 00000000
123 P00023690000 20010516 20010517
124 P00023580000 20010516 20010517
124 P00025890000 20010517 20010517
124 P00024670000 20010517 20010517

CUSTS file:

CUST-EDP NAME CITY ST

123 Murray Sporting Goods Inc. Akron OH
124 Johnson Valves Ltd. Perrysburg OH

The Input file will be the file of order headers, because there are multiple orders for any individual customer. The Link file will be the customer addresses, because you will have only one address for any customer. The resulting file will have all the order header data, with the corresponding customer data appended to each record.

Input ordhead
Link custs

CUST-EDP FULL-ORDER-N ENTRY-DA LAST-SHI NAME CITY ST
123 P00025470000 20010511 00000000 Murray Sporting Goods Inc. Akron OH
123 P00023690000 20010516 20010517 Murray Sporting Goods Inc. Akron OH
124 P00023580000 20010516 20010517 Johnson Valves Ltd. Toledo OH
124 P00025890000 20010517 20010517 Johnson Valves Ltd. Toledo OH
124 P00024670000 20010517 20010517 Johnson Valves L Toledo OH

If you were to reverse the Input and Link files, the output file would have all the customer data, with only the first corresponding invoice appended to each customer. This is not the result we were looking for.

Input custs
Link ordhead

CUST-EDP NAME CITY ST FULL-ORDER-N ENTRY-DA LAST-SHI

123 Murray Sporting Goods Inc. Akron OH P00025470000 20010511 00000000
124 Johnson Valves Ltd. Perrysburg OH P00023580000 20010516 20010517

So there you have it: Input = Repetition, Link = Singular. By the way, the example records used here are from the Ecometry database.

— mike.shumko@robelle.com

 


Copyright The 3000 NewsWire. All rights reserved.