B+Tree Indexing

B+Tree indexing is a method of accessing and maintaining data. It should be used for large files that have unusual, unknown, or changing distributions because it reduces I/O processing when files are read. Also consider B+Tree indexing for files with long overflow chains.

Note:
Unlike block indexing, where the number of I/Os can increase substantially as the file gets larger, the number of I/Os using B+Tree indexing remains minimal regardless of the size of the file.

A B+Tree file consists of a data file, which contains logical records (LRECs), and an index file, which contains technical logical records (TLRECs). The B+Tree index file, which consists of blocks (also called nodes), is maintained internally by the TPFDF product. The index file has its own file ID, DSECT, and DBDEF statements. The prime block of the B+Tree index file (also called the root node) is pointed to by the header in the prime block of the B+Tree data file.

B+Tree indexing is similar to block indexing but has the following advantages:

Notes:
  1. If the distribution of overflows can be predicted, you may want to use an algorithm. A well-distributed algorithm that has one or two overflows for each subfile will probably outperform a B+Tree index.
  2. You can define a file to use an algorithm and B+Tree indexing. The algorithm accesses the subfile and B+Tree indexing accesses the record in the subfile.
  3. B+Tree files cannot use FARF6 file addresses.

B+Tree Index File Node Blocks

B+Tree index file node blocks contain TLRECs that contain file addresses of the blocks at a lower level of a file. Those lower-level blocks may be other node blocks or data blocks. TLRECs in node blocks have the following layout:

B+Tree Data File Data Blocks

B+Tree data blocks are the same for B+Tree files as they are for other files except the STDSBA field in the prime block header points to the root node of the B+Tree index.

B+Tree Data File Characteristics

To use B+Tree indexing, set the DBDEF macro parameters, equivalent DSECT parameters, or equivalent default values for a B+Tree data file as follows:

Also, the B+Tree data file DBDEF must have the following:

Note:
Before you can implement B+Tree indexing in an ALCS environment, enable C language support. See TPFDF Installation and Customization for more information.

Additional Considerations When Using B+Tree Indexing

If you use B+Tree indexing support, keep the following points in mind:

Structure of a Data File That Uses B+Tree Indexing

A data file that uses B+Tree indexing has a B+Tree index file associated with it. The data file consists of data blocks that contain LRECs. The B+Tree index file consists of node blocks that contain TLRECs.

Figure 66 shows data file, GR91SR, which uses B+Tree index file IR70SR. The figure only shows a portion of the index and data files and is not intended to show a complete B+Tree structure. Data file GR91SR shows 4 data blocks. B+Tree index file IR70SR shows a root node and 4 leaf nodes.

Figure 67 shows the DSECT and the DBDEF statements for GR91SR. Figure 68 shows the DSECT and the DBDEF statements for IR70SR.

Figure 66. Sample B+Tree File
Alternative text description not available.

Defining the DSECT and DBDEF for a Data File That Uses B+Tree Indexing

Figure 67 shows part of the DSECT and DBDEF for data file GR91SR, which uses B+Tree indexing. No matter what data is in an LREC, it is organized according to this definition.

The DBDEF includes statements that are necessary for recoup to perform single-ECB chain chasing. Chain chasing a B+Tree file involves chasing a normal chain of data blocks and a companion chain of node blocks.

Figure 67. B+Tree Data File DSECT and DBDEF
          &SW00WID SETC  'B073'      ** FILE ID
          &SW00TQK SETC  '15'        ** HIGHEST TLREC

          GR91SIZ   DS  H            ** SIZE OF LOGICAL RECORD
          GR91KEY   DS  X            ** LOGICAL RECORD IDENTIFIER
          #GR91K80  EQU   X'80'      ** LOGICAL RECORD KEY X'80'
          GR91ORG   EQU *            ** START OF LOGICAL RECORD DESCRIPTION
          GR91SAL   DS  CL5          ** SALARY
          GR91DPT   DS  CL4          ** DEPARTMENT
          GR91NAM   DS  CL6          ** LAST NAME
          GR91E80   EQU *



   DBDEF FILE=GR91SR,
         NODEID=B070,                ** B+TREE INDEX FILE
         KEYCHECK=YES,               ** REQUIRED FOR A B+TREE FILE
         UNIQUE=YES,                 ** REQUIRED FOR A B+TREE FILE
         (ID3=(CHK0),RID=B070,ADR=STDSBA-STDREC),
         (PKY=#GR91K80,              ** KEY x'80'
         KEY1=(PKY=#GR91K80,UP),     ** UP ORG ON PKY
         KEY2=(R=GR91SAL,DOWN),      ** DOWN ORG ON SALARY
         KEY3=(R=GR91NAM,UP)), ...   ** UP ORG ON LAST NAME

Defining the DSECT and DBDEF for a B+Tree Index File

Use the sample B+Tree index file DSECT, SAMTSR, to build your own DSECT. You can add statements to define a B+Tree index file with its own characteristics (for example, file ID, WRS size, and so on), but do not change the existing statements. The only DBDEF override values that you can use are:

WRS
Sets the block size of the nodes. WRS can be set to any value.
PF0
Sets the type of pool record used to create node blocks; LS (long-term nonduplicated pool), SS (short-term pool), or LD (long-term duplicate pool). PF0 defaults to LS.
PF1
Sets the type of pool record used to create temporary node blocks. Temporary node blocks are used by B+Tree indexing if the number of changed nodes exceeds the number of nodes defined in #TPFNODE in the ACPDBE segment. PF1 defaults to SS.

Figure 68 shows part of the DSECT and DBDEF for B+Tree index file IR70SR.

Figure 68. B+Tree Index File DSECT and DBDEF
          &SW00WID SETC  'B070'      ** FILE ID
          &SW00RBV SETC  '#TPFDBFF'  ** FILE ALGORITHM
          &SW00OP1 SETC  '00000000'  ** OPT BYTE1
          &SW00OP2 SETC  '00000110'  ** OPT BYTE2
          &SW00OP3 SETC  '00000000'  ** OPT BYTE3
       
          &SW00TQK SETC  '02'        ** HIGHEST TLREC
          &SW00NOC SETA  0           ** NUMBER OF CHAINS -FOR ADD CURRENT ONLY-
          &SW00PIN SETC  '00'        ** ENSURE NODES ARE NEVER PACKED

          IR70SIZ   DS  H            ** SIZE OF VARIABLE LEN LREC
          IR70KEY   DS  X            ** PRIMARY KEY
          IR70ORG   EQU *            ** START OF LOGICAL RECORD DESCRIPTION
          IR70FA1   DS  XL4          ** LOWER LEVEL FADDR
          IR70RC1   DS  XL1          ** RECORD CODE CHECK
          IR70A03   DS  0CL1         ** KEY FIELDS
          IR70E03   EQU *            ** END OF LOGICAL RECORD WITH KEY = X'03'
          IR70FA2   DS  XL4          ** LOWER LEVEL FADDR
          IR70RC2   DS  XL1          ** RECORD CODE CHECK
          IR70A04   DS  0CL1         ** KEY FIELDS
          IR70E04   EQU *            ** END OF LOGICAL RECORD WITH KEY = X'04'


   DBDEF FILE=IR70SR,TRS=0,NODE=YES

Multiple ECB Chain Chasing

The DBDEF shown in Figure 67 includes statements necessary for recoup to perform single ECB chain chasing. This may not be adequate for large data files. As an alternative, you can define the DBDEF for the B+Tree data and index files to allow multiple ECB chain chasing. Figure 69 shows one example of how multiple ECB chain chasing can be defined. Depending on the size of the chains and their location in the overall data structure, different methods of chain chasing might be necessary in each customer environment.

Figure 69. DBDEF for Multiple ECB Chain Chasing

The chain chasing of the structure in Figure 69 is as follows: