The Records of a File

A record of a file – also referred to as a logical record – is a collection of related fields of information. For each field, you define in your program:
  • The data type (binary or character, for example).
  • The length to hold the largest item of data that may occur.

The sum of all field lengths in a record is the length of the record. Figure 1 gives an example of the layout of a logical record.

Figure 1. Logical Record
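
As a rough illustration of how field definitions determine the record length, the following Python sketch describes a record layout and adds up its field lengths. The field names, types, and lengths are hypothetical and are not taken from Figure 1.

```python
import struct

# Hypothetical record layout: each field has a data type and a fixed length.
# The struct codes are "I" (4-byte binary integer) and "20s" (20 characters).
FIELDS = [
    ("policy_number", "I"),
    ("name", "20s"),
    ("premium_cents", "I"),
]

# ">" selects big-endian byte order with no padding between fields.
record_format = ">" + "".join(code for _, code in FIELDS)

# Length of each individual field, in bytes.
field_lengths = {name: struct.calcsize(">" + code) for name, code in FIELDS}

# The record length is the sum of all field lengths.
record_length = struct.calcsize(record_format)

print(field_lengths)
print(record_length)
```

With this layout, the three fields occupy 4, 20, and 4 bytes, so the record is 28 bytes long.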

Frequently, all logical records of a file have the same length. If this length is the same as the length of the unit of data transfer, the file is said to have fixed-length unblocked records.

A list of names, for example, can be defined as a fixed-length record file, one record per name. Most likely, the names do not have the same length. This means that names shorter than the specified length are padded with blanks and longer names are truncated to make them fit into the name field of the record.
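
The padding and truncation described above can be sketched in a few lines of Python. The field length of 10 and the function name are assumptions chosen for illustration only.

```python
NAME_LENGTH = 10  # hypothetical length of the name field


def to_fixed_length(name: str, length: int = NAME_LENGTH) -> str:
    """Truncate names that are too long and pad short names with blanks."""
    return name[:length].ljust(length)


for name in ("Lee", "Montgomery-Smith"):
    # Brackets make the trailing blanks visible in the output.
    print("[" + to_fixed_length(name) + "]")
```

Every value returned by to_fixed_length has exactly the same length, so each name fills one fixed-length record.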

The various record formats that you may choose for a file are discussed in this section.

Variable-Length Records

An insurance company's file of policy holders is a typical application that works with records of variable length. The more claims there have been against a policy holder, the longer that holder's record in the file. Figure 2 shows the format of a stored record in such a file of policy holders.

Figure 2. Stored Record of Variable Length
where:
  BL = Block length: A four-byte field called block descriptor. Needed also for variable-length unblocked records because they are considered as blocks with a blocking factor of 1. The value in BL includes the length of both BL and RL.
  RL = Record length: A four-byte field called record descriptor. The value in RL includes the length of field RL.

The block and record lengths are to be stored in fields BL and RL as follows:

  Bytes   Contents
  0-1     Length in binary format
  2-3     Reserved.
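
The descriptor layout above can be illustrated with a minimal Python sketch that builds and reads a block of variable-length records. It assumes big-endian binary lengths and zeros in the reserved bytes; the function names are hypothetical, and this is not how IOCS itself is implemented.

```python
import struct


def descriptor(length):
    """Four-byte descriptor: bytes 0-1 hold the length, bytes 2-3 are reserved.

    Assumption: the reserved bytes are written as zeros.
    """
    return struct.pack(">HH", length, 0)


def build_block(records):
    """Prefix each record with RL, then prefix the whole block with BL."""
    # RL includes the four bytes of the RL field itself.
    body = b"".join(descriptor(4 + len(rec)) + rec for rec in records)
    # BL includes the BL field and every RL field in the block.
    return descriptor(4 + len(body)) + body


def parse_block(block):
    """Recover the data portion of each record from a block."""
    (block_length,) = struct.unpack_from(">H", block, 0)
    records = []
    offset = 4  # skip the BL field
    while offset < block_length:
        (record_length,) = struct.unpack_from(">H", block, offset)
        if record_length < 4:
            raise ValueError("record descriptor shorter than the RL field")
        records.append(block[offset + 4:offset + record_length])
        offset += record_length
    return records


block = build_block([b"ABC", b"DEFGH"])
print(block.hex())
print(parse_block(block))
```

An unblocked variable-length record is simply the case where build_block receives a single record, which matches the blocking factor of 1 mentioned above.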

Spanned Records

The spanned format is used when the records of a file are too long to fit into the given block size. One part of the record is in one block and the remainder in another. The parts have to be reassembled for processing. The system does this automatically for EBCDIC records.

Figure 3 shows how records are divided into variable length segments.

Figure 3. Format of a Spanned Record
  BL = Block-Length field
  RL = Record-Length field

For BL, the format description given in Figure 2 applies.

For RL, the format is as follows:
  X'LLLL0f00'
where
f=0: only segment
f=1: first segment
f=2: last segment
f=3: middle segment.
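
To make the segment codes concrete, the following Python sketch cuts one record into segments and gives each segment a descriptor of the form X'LLLL0f00'. It assumes that LLLL is a big-endian length that includes the four descriptor bytes, as for RL in Figure 2. The function names and the max_data parameter are hypothetical; in practice the system does this work for you.

```python
import struct

# Values of f in the descriptor X'LLLL0f00'.
ONLY, FIRST, LAST, MIDDLE = 0, 1, 2, 3


def split_into_segments(record, max_data):
    """Cut a record into segments holding at most max_data data bytes each."""
    if max_data <= 0:
        raise ValueError("max_data must be positive")
    chunks = [record[i:i + max_data] for i in range(0, len(record), max_data)]
    if not chunks:
        chunks = [b""]  # an empty record still yields one segment
    segments = []
    for index, chunk in enumerate(chunks):
        if len(chunks) == 1:
            code = ONLY
        elif index == 0:
            code = FIRST
        elif index == len(chunks) - 1:
            code = LAST
        else:
            code = MIDDLE
        # Assumption: LLLL counts the 4-byte descriptor plus the data.
        segments.append(struct.pack(">HBB", 4 + len(chunk), code, 0) + chunk)
    return segments


def reassemble(segments):
    """Join the data portions of the segments back into one record."""
    return b"".join(segment[4:] for segment in segments)


segments = split_into_segments(b"ABCDEFGHIJ", 4)
for segment in segments:
    print(segment[:4].hex(), segment[4:])
print(reassemble(segments))
```

A record that fits into a single segment receives f=0; longer records produce one first segment, any number of middle segments, and one last segment.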

Spanned records may be useful when a file is to be moved between device types that allow different block sizes. The maximum block size of the receiving device may be smaller than one record. In this case, every record has to be cut into segments, which is done by IOCS. Spanned records may also be useful in text processing applications where very long strings of text must be written. You need not be concerned about the maximum data capacity of the I/O areas. IOCS divides your records into segments that never exceed the size of the output area in your program.

Undefined Records

Any record format that does not conform to the rules for fixed- or variable-length records is considered an undefined record. IOCS allows a program to process such records but provides no blocking or deblocking support for them. Therefore, if you want to block or deblock such records, your program must provide these functions.

Programs that write undefined records must communicate the size of each record to IOCS. Programs that read such records are informed of their length by IOCS routines.

Block of Records

To save time and space in processing, records can be grouped into blocks. This results in larger transfer units. For example, data stored on tape by one write operation is separated from the data stored by the next write operation by an inter-record gap. The smaller your records are, the more gaps (unused space) occur within a file of data. Gaps allow the tape device to accelerate before it starts to read or write the data. They also allow the tape to come to a halt after a record of data has been read or written.

The time needed to start and stop a tape is significant for the overall speed with which your program can read from or write to a tape. Fewer gaps therefore result in faster processing.

To reduce the number of gaps, you can group two or more records into a block, a technique called blocking. The number of records in one block is called the blocking factor. Figure 4 illustrates the blocking of records.

Figure 4. Block and Blocking Factor
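
The effect of the blocking factor on the number of gaps can be estimated with a short Python sketch. It counts only the gaps between consecutive blocks and ignores labels and tapemarks; the function names are hypothetical.

```python
import math


def block_records(records, blocking_factor):
    """Group records into blocks of at most blocking_factor records each."""
    return [records[i:i + blocking_factor]
            for i in range(0, len(records), blocking_factor)]


def gaps_between_blocks(record_count, blocking_factor):
    """Number of gaps separating consecutive blocks on the tape."""
    blocks = math.ceil(record_count / blocking_factor)
    return max(blocks - 1, 0)


print(block_records(list(range(7)), 3))
print(gaps_between_blocks(1000, 1))
print(gaps_between_blocks(1000, 10))
```

Under these assumptions, 1000 unblocked records are separated by 999 gaps, while a blocking factor of 10 reduces this to 99 gaps. The last block may hold fewer records than the blocking factor, as the seven-record example shows.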