B⁺Tree Indexing

B⁺Tree indexing is a method of accessing and maintaining data. It should be used for large files that have unusual, unknown, or changing distributions because it reduces I/O processing when files are read. Also consider B⁺Tree indexing for files with long overflow chains.

Note:

Unlike block indexing, where the number of I/Os can increase substantially as the file gets larger, the number of I/Os using B⁺Tree indexing remains minimal regardless of the size of the file.
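The following rough estimate illustrates why the I/O count stays small. It is a minimal sketch that assumes a balanced tree in which every node block holds the same number of TLRECs (the fan-out). The function name and the sample numbers are illustrative assumptions, not TPFDF values.

```python
def index_levels(data_blocks: int, fanout: int) -> int:
    """Return the number of node levels a balanced tree with the given
    fan-out needs in order to address `data_blocks` data blocks."""
    if data_blocks < 1 or fanout < 2:
        raise ValueError("need at least 1 data block and a fan-out of 2 or more")
    levels = 1
    capacity = fanout          # the root alone can address `fanout` blocks
    while capacity < data_blocks:
        capacity *= fanout     # each extra level multiplies the reach by the fan-out
        levels += 1
    return levels


if __name__ == "__main__":
    for blocks in (100, 10_000, 1_000_000):
        print(blocks, "data blocks:", index_levels(blocks, fanout=100), "node level(s)")
```

Under these assumptions, growing the file by a factor of 10 000 adds only two node levels to the path from the root to a data block.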

A B⁺Tree file consists of a data file, which contains logical records (LRECs), and an index file, which contains technical logical records (TLRECs). The B⁺Tree index file, which consists of blocks (also called nodes), is maintained internally by the TPFDF product. The index file has its own file ID, DSECT, and DBDEF statements. The prime block of the B⁺Tree index file (also called the root node) is pointed to by the header in the prime block of the B⁺Tree data file.

B⁺Tree indexing is similar to block indexing but has the following advantages:

Dynamically updates TLRECs
Has an unlimited number of TLRECs
Can use mixed key organization for file operations
Is used for all LREC searches when the specified keys entirely or partially match the default keys in the DBDEF statement.

Notes:

If the distribution of overflows can be predicted, you may want to use an algorithm. A well-distributed algorithm that has one or two overflows for each subfile will probably outperform a B⁺Tree index.
You can define a file to use an algorithm and B⁺Tree indexing. The algorithm accesses the subfile and B⁺Tree indexing accesses the record in the subfile.
B⁺Tree files cannot use FARF6 file addresses.

B⁺Tree Index File Node Blocks

B⁺Tree index file node blocks contain TLRECs that contain file addresses of the blocks at a lower level of a file. Those lower-level blocks may be other node blocks or data blocks. TLRECs in node blocks have the following layout:

Size of the TLREC (2 bytes).
Primary key (1 byte).
- Key 03 for a node pointing to another node
- Key 04 for a node pointing to a data block.
File address of the lower-level node or data block (4 bytes).
Record code check (RCC) equal to X'00' for leaf nodes, or the last byte of the data file ID for nonleaf nodes (1 byte).
The concatenation of the primary key and its associated default keys found in the DBDEF of the data file. These keys point to the first TLREC of a child node or the first LREC of a block in the B⁺Tree data file.
FARF6 is not supported for B⁺Tree node files. In the DBDEF macro, the F6PRP and F6OFP parameters must be set to NO (the default) for B⁺Tree node files; that is, OP5 bits 3 and 4 must be off.
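The TLREC layout listed above can be sketched as a byte structure. This is an illustration only: the big-endian byte order, the assumption that the size field covers the whole TLREC, and the names `pack_tlrec` and `unpack_tlrec` are not taken from the TPFDF product, and the sample key bytes in the usage note are plain ASCII rather than real file contents.

```python
import struct

# Fixed part of a node TLREC as listed above: size (2 bytes), primary key (1),
# file address (4), record code check (1) = 8 bytes, followed by the keys.
_FIXED = struct.Struct(">HB4sB")   # big-endian is an assumption of this sketch

KEY_NODE = 0x03   # TLREC points to another node block
KEY_DATA = 0x04   # TLREC points to a data block


def pack_tlrec(primary_key: int, file_address: bytes, rcc: int, keys: bytes) -> bytes:
    """Build one TLREC; the size field is assumed to cover the whole TLREC."""
    if primary_key not in (KEY_NODE, KEY_DATA):
        raise ValueError("primary key must be X'03' or X'04'")
    if len(file_address) != 4:
        raise ValueError("file address must be 4 bytes")
    size = _FIXED.size + len(keys)
    return _FIXED.pack(size, primary_key, file_address, rcc) + keys


def unpack_tlrec(raw: bytes) -> dict:
    """Split a TLREC back into its fields."""
    size, primary_key, file_address, rcc = _FIXED.unpack_from(raw)
    return {
        "size": size,
        "primary_key": primary_key,
        "file_address": file_address,
        "rcc": rcc,
        "keys": raw[_FIXED.size:size],   # concatenated primary and default keys
    }
```

For example, `pack_tlrec(KEY_DATA, bytes.fromhex("0001a2b3"), 0x00, b"\x80" + b"90000" + b"ADAMS ")` builds a 20-byte TLREC: 8 fixed bytes plus a 12-byte key area. The file address in that call is a made-up value.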

B⁺Tree Data File Data Blocks

Data blocks in a B⁺Tree data file are the same as data blocks in other files, except that the STDSBA field in the prime block header points to the root node of the B⁺Tree index.

B⁺Tree Data File Characteristics

To use B⁺Tree indexing, set the DBDEF macro parameters, equivalent DSECT parameters, or equivalent default values for a B⁺Tree data file as follows:

OP1 bits 2, 3, and 6 must be off.
OP3 bit 5 must be on.
FARF6 is not supported for B⁺Tree data files. In the DBDEF macro, the F6PRP and F6OFP parameters must be set to NO (the default) for B⁺Tree data files; that is, OP5 bits 3 and 4 must be off.
The RBV algorithm value cannot be #TPFDB0D.
The TQK value must be greater than 4.
If present, the PIN value must be less than or equal to 50.
The TYP value must be R.
The NOC variable cannot be present.
The SKE variable cannot be present.
The NLR variable cannot be present.

Also, the B⁺Tree data file DBDEF must have the following:

A PKY statement to define default keys
NODEID=fileid (where fileid is the &SW00WID value shown in the associated B⁺Tree index file)
KEYCHECK=YES
UNIQUE=YES
DELEMPTY=YES if the DBDEF includes statements for recoup to perform multiple ECB chain chasing (see Multiple ECB Chain Chasing).
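The rules above lend themselves to a mechanical check. The following is a minimal sketch, not a TPFDF utility: the settings dictionary, the function names, and the convention that bit 0 is the leftmost character of an option byte string such as '00000110' are all assumptions made for illustration. The conditional DELEMPTY rule is left out to keep the example short.

```python
def bit_is_on(option_byte: str, bit: int) -> bool:
    """Test one bit of an option byte written as an 8-character string.
    Bit 0 is taken to be the leftmost character (an assumption)."""
    return option_byte[bit] == "1"


def check_btree_data_file(settings: dict) -> list:
    """Return the list of rules that a hypothetical settings dictionary breaks."""
    problems = []
    op1 = settings.get("OP1", "00000000")
    op3 = settings.get("OP3", "00000000")
    op5 = settings.get("OP5", "00000000")
    if any(bit_is_on(op1, b) for b in (2, 3, 6)):
        problems.append("OP1 bits 2, 3, and 6 must be off")
    if not bit_is_on(op3, 5):
        problems.append("OP3 bit 5 must be on")
    if any(bit_is_on(op5, b) for b in (3, 4)):
        problems.append("OP5 bits 3 and 4 must be off (F6PRP and F6OFP set to NO)")
    if settings.get("RBV") == "#TPFDB0D":
        problems.append("RBV cannot be #TPFDB0D")
    if settings.get("TQK", 0) <= 4:          # TQK is passed as an integer here
        problems.append("TQK must be greater than 4")
    if "PIN" in settings and settings["PIN"] > 50:
        problems.append("PIN must be less than or equal to 50")
    if settings.get("TYP") != "R":
        problems.append("TYP must be R")
    for name in ("NOC", "SKE", "NLR"):
        if name in settings:
            problems.append(name + " cannot be present")
    for name in ("PKY", "NODEID"):
        if not settings.get(name):
            problems.append(name + " is required")
    for name in ("KEYCHECK", "UNIQUE"):
        if settings.get(name) != "YES":
            problems.append(name + "=YES is required")
    return problems
```

An empty result means that none of the listed rules is violated; each string in a non-empty result names one rule to correct.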

Note:

Before you can implement B⁺Tree indexing in an ALCS environment, enable C language support. See TPFDF Installation and Customization for more information.

Additional Considerations When Using B⁺Tree Indexing

If you use B⁺Tree indexing support, keep the following points in mind:

The combination of all keys, including the primary key, must be unique. Individual keys do not have to be unique. For example, in Figure 66, there is more than one LREC with a salary of $90000 but only one with a salary of $90000 and the name ADAMS.
Each subfile has its own B⁺Tree index. For example, a file using algorithm #TPFDB01 (26 subfiles) would have 26 separate B⁺Tree structures.
More than one file can use the same B⁺Tree index file DSECT. Each file would have its own B⁺Tree index structure. For example, data files IR26DF and IR27DF can both have a DBDEF statement for NODEID=B070.
Packing a B⁺Tree data file builds or rebuilds the B⁺Tree structure unless the file contains only a prime block with no overflow blocks.
You must pack a B⁺Tree file to validate the file references after CRUISE capture and restore processing because CRUISE capture and restore processing releases B⁺Tree files.
If a file is changed from block indexing to B⁺Tree indexing, existing TLRECs of the block index file are deleted when the file is packed.
B⁺Tree indexing does not support extended LRECs.
Each B⁺Tree node block must be large enough to contain at least four TLRECs.
The number of TLRECs that can fit into a node block depends on the size of the blocks and the size of the keys that are used. Because the header of each node block is 26 bytes, and each TLREC uses 8 bytes in addition to the key size, use the following formula to calculate the number of TLRECs that fit in your node blocks. Round the result down to a whole number.
```
Number of TLRECs   =   (block size - 26) / (8 + key size)
```
If you convert an existing file to allow it to use B⁺Tree indexing, reassemble any assembler applications that use it, to flag any incompatible options. Applications that assemble cleanly do not have to be reloaded.
Never pack a B⁺Tree index file.
If a B⁺Tree file is converted to a non-B⁺Tree file, the nodes will be reported lost when recoup is run. Use the CRUISE PACK function to remove the node file references from the headers of the subfiles. You must remove the old node file references if you intend to convert the file into a B⁺Tree file again.
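The TLREC formula and the four-TLREC minimum from the list above can be combined into a quick capacity check. This is a small sketch: the function names are invented for illustration, and the block and key sizes in the usage note are sample values rather than recommendations.

```python
NODE_HEADER_BYTES = 26   # header of each node block, per the formula above
TLREC_FIXED_BYTES = 8    # bytes each TLREC uses in addition to its keys
MIN_TLRECS = 4           # each node block must hold at least four TLRECs


def tlrecs_per_node(block_size: int, key_size: int) -> int:
    """Apply (block size - 26) / (8 + key size), rounded down."""
    if key_size < 1 or block_size <= NODE_HEADER_BYTES:
        raise ValueError("block size must exceed the header and key size must be positive")
    return (block_size - NODE_HEADER_BYTES) // (TLREC_FIXED_BYTES + key_size)


def node_block_is_large_enough(block_size: int, key_size: int) -> bool:
    """Check the four-TLREC minimum for a given block and key size."""
    return tlrecs_per_node(block_size, key_size) >= MIN_TLRECS
```

With a 1055-byte block and a 12-byte key, `tlrecs_per_node(1055, 12)` gives (1055 - 26) // 20 = 51. A 100-byte block with the same key would hold only 3 TLRECs and would fail the minimum.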

Structure of a Data File That Uses B⁺Tree Indexing

A data file that uses B⁺Tree indexing has a B⁺Tree index file associated with it. The data file consists of data blocks that contain LRECs. The B⁺Tree index file consists of node blocks that contain TLRECs.

Figure 66 shows data file GR91SR, which uses B⁺Tree index file IR70SR. The figure shows only a portion of the index and data files and is not intended to show a complete B⁺Tree structure. For data file GR91SR, the figure shows 4 data blocks. For B⁺Tree index file IR70SR, it shows a root node and 4 leaf nodes.

Figure 67 shows the DSECT and the DBDEF statements for GR91SR. Figure 68 shows the DSECT and the DBDEF statements for IR70SR.

Figure 66. Sample B⁺Tree File

Defining the DSECT and DBDEF for a Data File That Uses B⁺Tree Indexing

Figure 67 shows part of the DSECT and DBDEF for data file GR91SR, which uses B⁺Tree indexing. No matter what data is in an LREC, it is organized according to this definition.

The DBDEF includes statements that are necessary for recoup to perform single-ECB chain chasing. Chain chasing a B⁺Tree file involves chasing a normal chain of data blocks and a companion chain of node blocks.

Figure 67. B⁺Tree Data File DSECT and DBDEF

          &SW00WID SETC  'B073'      ** FILE ID
          &SW00TQK SETC  '15'        ** HIGHEST TLREC

          GR91SIZ   DS  H            ** SIZE OF LOGICAL RECORD
          GR91KEY   DS  X            ** LOGICAL RECORD IDENTIFIER
          #GR91K80  EQU   X'80'      ** LOGICAL RECORD KEY X'80'
          GR91ORG   EQU *            ** START OF LOGICAL RECORD DESCRIPTION
          GR91SAL   DS  CL5          ** SALARY
          GR91DPT   DS  CL4          ** DEPARTMENT
          GR91NAM   DS  CL6          ** LAST NAME
          GR91E80   EQU *



   DBDEF FILE=GR91SR,
         NODEID=B070,                ** B+TREE INDEX FILE
         KEYCHECK=YES,               ** REQUIRED FOR A B+TREE FILE
         UNIQUE=YES,                 ** REQUIRED FOR A B+TREE FILE
         (ID3=(CHK0),RID=B070,ADR=STDSBA-STDREC),
         (PKY=#GR91K80,              ** KEY x'80'
         KEY1=(PKY=#GR91K80,UP),     ** UP ORG ON PKY
         KEY2=(R=GR91SAL,DOWN),      ** DOWN ORG ON SALARY
         KEY3=(R=GR91NAM,UP)), ...   ** UP ORG ON LAST NAME
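The mixed key organization in this DBDEF (UP on the primary key, DOWN on salary, UP on last name) can be illustrated with a short sketch. It models LRECs as plain Python tuples and relies on Python's stable sort. The function names and all sample records except the $90000 ADAMS entry mentioned earlier are invented for illustration, and real fields are fixed-length character fields whose collating order may differ.

```python
def organize(lrecs):
    """Order (pky, salary, name) tuples: pky UP, salary DOWN, name UP.
    Sorting by the least significant key first works because the sort is stable."""
    ordered = sorted(lrecs, key=lambda r: r[2])        # KEY3: last name, UP
    ordered.sort(key=lambda r: r[1], reverse=True)     # KEY2: salary, DOWN
    ordered.sort(key=lambda r: r[0])                   # KEY1: primary key, UP
    return ordered


def combined_keys_are_unique(lrecs):
    """UNIQUE=YES: the combination of all keys must not repeat."""
    keys = [(r[0], r[1], r[2]) for r in lrecs]
    return len(keys) == len(set(keys))


sample = [
    (0x80, "85000", "BAKER "),
    (0x80, "90000", "SMITH "),
    (0x80, "90000", "ADAMS "),
    (0x80, "95000", "JONES "),
]
```

Calling `organize(sample)` places JONES (95000) first, then ADAMS and SMITH (both 90000, ordered by name), then BAKER (85000). Two records may share a salary, but the full key combination stays unique.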

Defining the DSECT and DBDEF for a B⁺Tree Index File

Use the sample B⁺Tree index file DSECT, SAMTSR, to build your own DSECT. You can add statements to define a B⁺Tree index file with its own characteristics (for example, file ID, WRS size, and so on), but do not change the existing statements. The only DBDEF override values that you can use are:

WRS: Sets the block size of the nodes. WRS can be set to any value.
PF0: Sets the type of pool record used to create node blocks: LS (long-term nonduplicated pool), SS (short-term pool), or LD (long-term duplicate pool). PF0 defaults to LS.
PF1: Sets the type of pool record used to create temporary node blocks. Temporary node blocks are used by B⁺Tree indexing if the number of changed nodes exceeds the number of nodes defined in #TPFNODE in the ACPDBE segment. PF1 defaults to SS.

Figure 68 shows part of the DSECT and DBDEF for B⁺Tree index file IR70SR.

Figure 68. B⁺Tree Index File DSECT and DBDEF

          &SW00WID SETC  'B070'      ** FILE ID
          &SW00RBV SETC  '#TPFDBFF'  ** FILE ALGORITHM
          &SW00OP1 SETC  '00000000'  ** OPT BYTE1
          &SW00OP2 SETC  '00000110'  ** OPT BYTE2
          &SW00OP3 SETC  '00000000'  ** OPT BYTE3
       
          &SW00TQK SETC  '02'        ** HIGHEST TLREC
          &SW00NOC SETA  0           ** NUMBER OF CHAINS -FOR ADD CURRENT ONLY-
          &SW00PIN SETC  '00'        ** ENSURE NODES ARE NEVER PACKED

          IR70SIZ   DS  H            ** SIZE OF VARIABLE LEN LREC
          IR70KEY   DS  X            ** PRIMARY KEY
          IR70ORG   EQU *            ** START OF LOGICAL RECORD DESCRIPTION
          IR70FA1   DS  XL4          ** LOWER LEVEL FADDR
          IR70RC1   DS  XL1          ** RECORD CODE CHECK
          IR70A03   DS  0CL1         ** KEY FIELDS
          IR70E03   EQU *            ** END OF LOGICAL RECORD WITH KEY = X'03'
          IR70FA2   DS  XL4          ** LOWER LEVEL FADDR
          IR70RC2   DS  XL1          ** RECORD CODE CHECK
          IR70A04   DS  0CL1         ** KEY FIELDS
          IR70E04   EQU *            ** END OF LOGICAL RECORD WITH KEY = X'04'


   DBDEF FILE=IR70SR,TRS=0,NODE=YES

Multiple ECB Chain Chasing

The DBDEF shown in Figure 67 includes statements necessary for recoup to perform single ECB chain chasing. This may not be adequate for large data files. As an alternative, you can define the DBDEF for the B⁺Tree data and index files to allow multiple ECB chain chasing. Figure 69 shows one example of how multiple ECB chain chasing can be defined. Depending on the size of the chains and their location in the overall data structure, different methods of chain chasing might be necessary in each customer environment.

Figure 69. DBDEF for Multiple ECB Chain Chasing

The chain chasing of the structure in Figure 69 is as follows:

The prime block of GR91SR is found.
Consider GR91SR as a no forward chain file (PFC=-1).
Evaluate the CHK code. If there are no nodes, chain chase the remaining data blocks via STDFCH using file version 2.
Start chain chasing the nodes via STDSBA by going to X'B070' (IR70SR). If STDSBA is zero (no nodes), node chain chasing stops immediately.
IR70SR invokes file version 1 of the data file with a new ECB whenever a X'04' TLREC is found (that is, in a leaf node).
File version 1 of the data file will only cause the current block to be chain chased because it has no forward chains (PFC=-1).
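The flow described in these steps can be modeled with a small simulation. This is a sketch of the logic only, under heavy simplification: blocks are identified by made-up integers, the creation of a new ECB is reduced to recording the data block, and the function and parameter names are hypothetical rather than part of recoup.

```python
from collections import deque

KEY_NODE = 0x03   # TLREC points to another node block
KEY_DATA = 0x04   # TLREC points to a data block (found in leaf nodes)


def chase(prime_addr, stdsba, data_chain, nodes):
    """Return (visited_nodes, visited_data_blocks).

    data_chain maps a data block address to its forward chain address (0 = end).
    nodes maps a node address to its TLRECs as (primary_key, lower_level_address).
    """
    visited_nodes, visited_data = [], [prime_addr]
    if stdsba == 0:
        # No nodes: follow the forward chain from the prime block instead.
        addr = data_chain.get(prime_addr, 0)
        while addr:
            visited_data.append(addr)
            addr = data_chain.get(addr, 0)
        return visited_nodes, visited_data

    pending = deque([stdsba])
    while pending:
        node = pending.popleft()
        visited_nodes.append(node)
        for key, lower in nodes[node]:
            if key == KEY_NODE:
                pending.append(lower)
            elif key == KEY_DATA and lower not in visited_data:
                # A separate ECB would chase this single block; its forward
                # chain is deliberately not followed (PFC=-1 in the text).
                visited_data.append(lower)
    return visited_nodes, visited_data
```

In this model every data block is reached either through the forward chain (when no nodes exist) or through exactly one X'04' TLREC, which is why the data file can be treated as having no forward chain while the nodes are chased.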
