Pending time overflow

Because all overflows for pending time are lost, RMF does not accurately report certain shared DASD delays. For example, a request delayed for 18 seconds overflows twice; 16.6 seconds are lost. To RMF, the delay appears to be only 1.4 seconds. Therefore, the AVERAGE PENDING TIME and the AVERAGE RESPONSE TIME values are extremely inaccurate.
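
The arithmetic in this example can be sketched in Python. This is only an illustration: it assumes that the pending-time counter wraps every 8.3 seconds (a value inferred from the example, where two overflows lose 16.6 seconds), and the function name and the millisecond representation are hypothetical.

```python
# Illustrative sketch: how an overflowing pending-time counter under-reports a delay.
# Assumption: the counter wraps every 8.3 seconds (inferred from the example above).
OVERFLOW_MS = 8300  # assumed wrap point, in milliseconds


def reported_pending_ms(actual_ms: int) -> tuple:
    """Return (overflow count, milliseconds lost, milliseconds still visible)."""
    overflows, visible_ms = divmod(actual_ms, OVERFLOW_MS)
    lost_ms = overflows * OVERFLOW_MS
    return overflows, lost_ms, visible_ms


overflows, lost_ms, visible_ms = reported_pending_ms(18_000)
print(overflows, lost_ms / 1000, visible_ms / 1000)  # prints: 2 16.6 1.4
```

Integer milliseconds are used so that the remainder is exact and not subject to floating-point rounding.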

For requests with extremely long delays, the missing interrupt handler (MIH) halts the request and reschedules it periodically. MIH estimates the amount of pending time, based on the MIH interval, and adds it to the value RMF reports. Therefore, pending time is lost only for requests that take longer than 8.3 seconds and less than 1.5 times the MIH interval. To increase the accuracy of AVERAGE PENDING TIME and AVERAGE RESPONSE TIME, decrease the MIH interval. An interval of four seconds will ensure that no pending time is lost. However, some performance penalty does occur because of the four-second interval.

Figure 1. Direct Access Device Activity Report
                                     D I R E C T   A C C E S S   D E V I C E   A C T I V I T Y
                                                                                                                           PAGE    2
            z/OS V2R4               SYSTEM ID SYSF             DATE 05/11/2019             INTERVAL 14.59.998
                                     RPT VERSION V2R4 RMF      TIME 06.00.00               CYCLE 1.000 SECONDS

TOTAL SAMPLES =    900   IODF = 00   CR-DATE: 09/14/2017   CR-TIME: 10.31.31     ACT: POR
                                                 DEVICE   AVG  AVG   AVG  AVG  AVG   AVG  AVG  AVG    %      %     %     AVG    %
STORAGE   DEV  DEVICE   NUMBER  VOLUME PAV  LCU  ACTIVITY RESP IOSQ  CMR  DB   INT   PEND DISC CONN   DEV    DEV   DEV   NMBR   ANY
 GROUP    NUM  TYPE     OF CYL  SERIAL           RATE     TIME TIME  DLY  DLY  DLY   TIME TIME TIME   CONN   UTIL  RESV  ALLOC  ALLOC
XTEST    02208 33903      3339  TRXSX9   1  0032   0.001  .384 .000  .128 .000 .123  .256 .000 .128   0.00   0.00   0.0   0.0   100.0
XTEST    02209 33903      3339  TRXSXA   1  0032   0.001  .256 .000  .000 .000 .135  .256 .000 .000   0.00   0.00   0.0   0.0   100.0
         0220A 33909     10017  TRXT01   1  0032   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0   0.0   100.0
         0220B 33909     10017  TRXT02   1  0032   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0   0.0   100.0
         0220C 33909     10017  TRXT03   1  0032   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0   0.0   100.0
         0220D 33909     10017  TRXT04   1  0032   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0   0.0   100.0
         0220E 33909     10017  TRXT05   1  0032   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0   0.0   100.0
         0220F 33909     32760  TRXT06   1  0032   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0   0.0   100.0
         02210 33909     32760              0032 OFFLINE
         02211 33909     32760              0032 OFFLINE

For DASD devices that are actually used for synchronous I/O, the Synchronous I/O Device Activity report shows detailed IBM zHyperLink activity data:

Figure 2. Synchronous I/O Device Activity
                                    S Y N C H R O N O U S   I/O   D E V I C E   A C T I V I T Y 
                                                                                                                           PAGE    1
            z/OS V2R4               SYSTEM ID SYSF             DATE 05/11/2019             INTERVAL 14.59.998
                                     RPT VERSION V2R4 RMF      TIME 06.00.00               CYCLE 1.000 SECONDS

TOTAL SAMPLES =    304   IODF = 00   CR-DATE: 09/14/2017   CR-TIME: 10.31.31     ACT: ACTIVATE
                                     - DEVICE ACTIVITY RATE -  -- AVG RESP TIME --  AVG SYNCH I/O  %       %     %     %          
STORAGE  DEV   DEVICE   VOLUME  LCU  -- SYNCH I/O --   ASYNCH  -SYNCH I/O - ASYNCH  TRANSFER RATE  REQ     LINK  CACHE --REJECTS--
GROUP   NUM   TYPE     SERIAL          READ   WRITE      I/O   READ  WRITE    I/O   READ   WRITE  SUCCESS  BUSY  MISS  READ WRITE 
        02180 33909    SYST10  001C   0.345   0.404    1.024  0.002  0.002  0.384  1.078   0.999    99.85  0.11  0.00  0.00  0.00
        02181 33909    SYST11  001C   0.702   0.491    0.073  0.001  0.001  0.112  0.500   0.456    99.94  0.00  0.01  0.00  0.00
                        LCU    001C   1.047   0.895    1.097  0.001  0.001  0.379  1.578   1.455    99.84  0.11  0.01  0.00  0.00 

The reports for communication equipment, character reader devices, graphic devices, and unit record devices have the same format. The Communication Equipment Activity report is shown as an example in Figure 3.

Figure 3. Communication Equipment Activity Report
                                 C O M M U N I C A T I O N   E Q U I P M E N T   A C T I V I T Y
                                                                                                                           PAGE    1
            z/OS V2R4               SYSTEM ID   SYSF           DATE 05/11/2019           INTERVAL 15.00.000
                                    RPT VERSION V2R4 RMF       TIME 06.15.00             CYCLE 1.000 SECONDS

TOTAL SAMPLES =    900   IODF = 00   CR-DATE: 09/14/2017   CR-TIME: 10.31.31     ACT: ACTIVATE
                                     DEVICE   AVG  AVG   AVG  AVG  AVG   AVG  AVG  AVG    %      %     %     AVG    %      %     %
          DEV  DEVICE   VOLUME  LCU  ACTIVITY RESP IOSQ  CMR  DB   INT   PEND DISC CONN   DEV    DEV   DEV   NMBR   ANY    MT    NOT
          NUM  TYPE     SERIAL       RATE     TIME TIME  DLY  DLY  DLY   TIME TIME TIME   CONN   UTIL  RESV  ALLOC  ALLOC  PEND  RDY
         00120                  0001   0.129  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0         100.0          0
         00121                  0001   0.129  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0         100.0          0
         09D5D                  007C   1.482  .291 .000  .017 .000 .001  .177 .086 .028   0.00   0.02   0.0         100.0          0
         09E5D                  007C   1.702   588 .000  .018 .000 .003  .171  587 .057   0.01  99.97   0.0         100.0          0
                         LCU    007C   3.184   314 .000  .018 .000 .002  .173  314 .044   0.01  50.00   0.0         100.0          0

The following figure shows the Magnetic Tape Device Activity report.

Figure 4. Magnetic Tape Device Activity Report
                                    M A G N E T I C   T A P E   D E V I C E   A C T I V I T Y
                                                                                                                          PAGE    1
           z/OS V2R4               SYSTEM ID SYSE             DATE 05/11/2019            INTERVAL 15.00.027
                                    RPT VERSION V2R4 RMF      TIME 23.45.00              CYCLE 1.000 SECONDS

TOTAL SAMPLES =    810   IODF = 00   CR-DATE: 09/14/2017   CR-TIME: 12.03.30     ACT: ACTIVATE

                                     DEVICE   AVG  AVG   AVG  AVG  AVG   AVG  AVG  AVG    %      %     %     NUMBER   AVG      TIME
          DEV  DEVICE   VOLUME  LCU  ACTIVITY RESP IOSQ  CMR  DB   INT   PEND DISC CONN   DEV    DEV   DEV   OF       MOUNT    DEVICE
          NUM  TYPE     SERIAL       RATE     TIME TIME  DLY  DLY  DLY   TIME TIME TIME   CONN   UTIL  RESV  MOUNTS   TIME     ALLOC
         00660 3590             0015   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0      0         0        0
         00661 3590             0015   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0      0         0        0
         00662 3590             0015   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0      0         0        0
         00663 3590             0015   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0      0         0        0
                         LCU    0015   0.000  .000 .000  .000 .000 .000  .000 .000 .000   0.00   0.00   0.0      0         0        0
Table 1. Fields in the Device Activity Reports
Field Heading Meaning
IODF = xx The IODF number where xx is the suffix of the IODF data set name.
CR-DATE: mm/dd/yyyy The creation date of the IODF.
CR-TIME: hh.mm.ss The creation time of the IODF.
ACT: text The configuration state where text indicates how the IODF was activated.
STORAGE GROUP The name of the storage group to which the device belongs. Your storage administrator assigns the names. These names are available on the direct access device report only.
DEV NUM The five-digit hexadecimal device number that identifies a physical I/O device. The first digit represents the ID of the subchannel set to which the I/O device is physically configured.
DEVICE TYPE The device type on which the data set resides.
NUMBER OF CYL The DASD volume capacity (in cylinders).
VOLUME SERIAL The volume serial number (for direct access and magnetic tape reports) of the volume mounted on the device at the end of the reporting interval.
PAV The number of parallel access volumes (base and alias) which were available at the end of the reporting interval.

If the number has changed during the reporting interval, it is followed by an '*'.

If the device is a HyperPAV base device, the number is followed by an 'H', for example, 5.4H. The value is the average number of HyperPAV volumes (base and alias) in that interval.

                                Accumulated # of HPAV devices
 Average # of HPAV devices =  ---------------------------------
                                     Number of Samples
LCU The number of the logical control unit (LCU) to which the device belongs.

An LCU is a set of devices attached to the same physical control unit (or a group of physical control units with one or more devices in common). The IOP, which is part of the channel subsystem, manages and schedules I/O work requests.

This field can be blank for two reasons:
  • RMF encountered an error while gathering data. Check the operator console for messages.
  • This is a non-dedicated device in a z/VM guest system environment.
DEVICE ACTIVITY RATE The rate at which start subchannel (SSCH) instructions to the device completed successfully.
            # Successful SSCH Instructions
ACTV RATE = ------------------------------
                   Interval Time

This formula applies to the activity rate measured during asynchronous I/O processing. For devices using suspended channel programs, resume I/O requests are included in the SSCH counts.

Character ‘S’ appended to the DEVICE ACTIVITY RATE value of a device shown in the Direct Access Device Activity report indicates that the device performed synchronous I/O requests and that detailed synchronous I/O performance measurements for this device are available in the Synchronous I/O Device Activity report section.

For easy comparison, the Synchronous I/O Device Activity report lists the calculated asynchronous I/O device activity rate (ASYNCH I/O) next to the columns that show:
  • the rate of successfully completed SYNCH I/O READ requests and
  • the rate of SYNCH I/O WRITE requests which completed successfully during the interval.
The synchronous I/O activity rate is calculated as
            # Successful Synch I/O read (respectively write) requests          
ACTV RATE = --------------------------------------------------------- 
                        Interval Time

In the LCU summary line, this field contains the sum of the rates for each individual device.
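
The rate formulas and the LCU summation can be sketched as follows. The function names are hypothetical, and rounding to three decimals is an assumption that mirrors the report layout. The sample values are the SYNCH I/O READ rates of devices 02180 and 02181 in Figure 2.

```python
# Illustrative sketch of the activity-rate formulas and the LCU summary line.


def activity_rate(successful_requests: int, interval_seconds: float) -> float:
    """Successful requests (SSCH, or synchronous read/write) per second."""
    return successful_requests / interval_seconds


def lcu_activity_rate(device_rates) -> float:
    """LCU summary value: the sum of the rates of the individual devices."""
    return round(sum(device_rates), 3)  # three decimals assumed, as in the report


# SYNCH I/O READ rates of devices 02180 and 02181 (Figure 2).
print(lcu_activity_rate([0.345, 0.702]))  # prints: 1.047
```

The result matches the READ value in the LCU 001C summary line of Figure 2.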

If the device has been deleted during the last interval, DEVICE DYNAMICALLY DELETED appears in the field instead of the measurement data.

If the device has changed from static to dynamic, or was deleted and a new device added with the same device number, DEVICE DYNAMICALLY CHANGED appears in the field instead of the measurement data.

AVG RESP TIME The average number of milliseconds the device required to complete an asynchronous I/O request. This value reflects the total hardware service time and the front end software queuing time involved for the average I/O request to the device. The channel measures active time, which starts at the acceptance of an SSCH instruction (indicated by a condition code 0) and ends at the acceptance of the channel end (primary status pending). It does not, however, include the time required to process the interruption. The IOS queue length is factored in to reflect the front end queuing time.
                   Device Active Time
AVG ACT TIME  = ------------------------
                Measurement Event Count

AVG RESP TIME = AVG ACT TIME + AVG IOSQ TIME

The active time is the sum of connect, disconnect, and pending time as described later.

In the LCU summary line, this field contains the weighted average of the individual average response times for each device.
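
A minimal sketch of these relationships follows. The function names are hypothetical. Weighting the LCU average by the device activity rate is an assumption, because the text does not name the weight; it does reproduce the LCU 007C line in Figure 3. Returning zero for an idle LCU is also an assumption.

```python
# Illustrative sketch: response time per device and for the LCU summary line.


def avg_resp_time(pend: float, disc: float, conn: float, iosq: float) -> float:
    """AVG RESP TIME = AVG ACT TIME + AVG IOSQ TIME, all in milliseconds."""
    avg_act_time = conn + disc + pend  # active time = connect + disconnect + pending
    return avg_act_time + iosq


def lcu_avg_resp_time(devices) -> float:
    """devices: iterable of (activity rate, avg response time) pairs.

    Assumption: the weight is the device activity rate.
    """
    devices = list(devices)
    total_rate = sum(rate for rate, _ in devices)
    if total_rate == 0:
        return 0.0  # assumption: an idle LCU reports zero
    return sum(rate * resp for rate, resp in devices) / total_rate


# Device 02208 in Figure 1: PEND .256, DISC .000, CONN .128, IOSQ .000
print(round(avg_resp_time(pend=0.256, disc=0.0, conn=0.128, iosq=0.0), 3))  # prints: 0.384

# Devices 09D5D and 09E5D of LCU 007C in Figure 3
print(round(lcu_avg_resp_time([(1.482, 0.291), (1.702, 588.0)])))  # prints: 314
```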

For easy comparison, the Synchronous I/O Device Activity report lists the calculated asynchronous I/O average response time (ASYNCH I/O) next to the columns that show:
  • the average processing time (in milliseconds) per successful SYNCH I/O READ request and
  • the average processing time (in milliseconds) per successful SYNCH I/O WRITE request.
AVG SYNCH I/O TRANSFER RATE
READ
The number of megabytes per second read during synchronous I/O processing on the device.
WRITE
The number of megabytes per second written during synchronous I/O processing on the device.
% REQ SUCCESS Percentage of synchronous I/O requests that completed successfully.
% LINK BUSY Percentage of synchronous I/O requests that hit a link busy condition when trying to use a synchronous I/O link.
% CACHE MISS Percentage of synchronous I/O read requests that hit a cache miss condition.
% REJECTS
READ
The percentage of synchronous I/O read requests that were rejected for reasons other than a link busy condition or a read cache miss.
WRITE
The percentage of synchronous I/O write requests that were rejected for reasons other than a link busy condition.
AVG IOSQ TIME The average number of milliseconds an I/O request must wait on an IOS queue before an SSCH instruction can be issued.
                   Total IOSQ Time
AVG IOSQ TIME = ----------------------
                Start Subchannel Count
AVG CMR DLY The average number of milliseconds of delay that a successfully initiated start or resume function needs until the first command is indicated as accepted by the device. This value helps to distinguish real hardware errors from workload spikes (contention in the fabric and at the destination port).
                   Initial Command Response Time
AVG CMR DLY = --------------------------------------
              # I/O Operations Accepted on that Path
AVG DB DLY The average number of milliseconds of delay that I/O requests to this device encountered because the device was busy. Device busy might mean:
  • Another system is using the volume
  • Another system reserved the device
  • Head of string busy conditions caused contention
  • Some combination of these three conditions has occurred.
              Device Busy Delay Time
AVG DB DLY = -----------------------
             Measurement Event Count
AVG INT DLY The average interrupt delay time in units of milliseconds encountered for I/O requests to this device. For each I/O request, the time is measured from when the I/O operation is complete to when the operating system begins to process the status.
              Device Interrupt Delay Time
AVG INT DLY = ---------------------------
                Measurement Event Count
AVG PEND TIME The average number of milliseconds an I/O request must wait in the hardware. This value reflects the time between acceptance of the SSCH function by the channel subsystem (SSCH-function pending) and acceptance of the first command associated with the SSCH function at the device (subchannel active). This value also includes the time waiting for an available channel path and control unit as well as the delay due to shared DASD contention.
If the value is high, refer to the device's LCU entry in the I/O queuing activity report for an indicator of the major cause of the delay.
                   Device Pending Time
AVG PEND TIME = -------------------------
                 Measurement Event Count
AVG DISC TIME The average number of milliseconds the device was disconnected while processing an SSCH instruction. This value reflects the time when the device was in use but not transferring data. It includes the overhead time when a device might disconnect to perform positioning functions such as SEEK/SET SECTOR, as well as any reconnection delay.
                 Device Disconnect Time
AVG DISC TIME = -----------------------
                Measurement Event Count

The measurement event count is the same as the number of SSCH instructions issued, unless there has been a timer overflow error in the channel.

AVG CONN TIME The average number of milliseconds the device was connected to a channel path and actually transferring data between the device and central storage. Typically, this value measures data transfer time but also includes the search time needed to maintain channel path, control unit, and device connection.
                   Device Connect Time
AVG CONN TIME = -------------------------
                 Measurement Event Count
% DEV CONN The percentage of time during the interval when the device was connected to a channel path.
             Device Connect Time
% DEV CONN = ------------------- * 100
                Interval Time
% DEV UTIL The percentage of time during the interval when the device was in use. This percentage includes both the time when the device was involved in I/O operations (connect and disconnect time) and the time when it was reserved but not involved in an I/O operation.

The percentage reported represents the time during the interval when the device was tied up and could not be used to service a request from another system. Some small portion of device busy (reserved) time is missed when the device is reserved but the I/O request is pending in the channel.

               (CON + DISC)/PAV   RSV
% DEV UTIL = ( ---------------- + --- ) * 100
                      INT          N
CON
Device connect time
DISC
Device disconnect time
PAV
Number of parallel access volumes (base and alias); in case of non-PAV devices, PAV is set to 1
RSV
Number of samples when the device was reserved but not involved in an I/O operation
INT
Interval time (seconds)
N
Total number of samples
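
The % DEV UTIL formula can be sketched as follows. The function and parameter names are hypothetical, and the input values are invented for illustration; they are not taken from the sample reports.

```python
# Illustrative sketch of the % DEV UTIL formula.


def pct_dev_util(con_s: float, disc_s: float, pav: int,
                 rsv_samples: int, interval_s: float, total_samples: int) -> float:
    """((CON + DISC) / PAV) / INT plus RSV / N, expressed as a percentage."""
    busy_fraction = ((con_s + disc_s) / pav) / interval_s
    reserved_fraction = rsv_samples / total_samples
    return (busy_fraction + reserved_fraction) * 100


# Hypothetical 900-second interval with 900 samples:
# 45 s connect time, 135 s disconnect time, reserved-only in 90 samples.
print(f"{pct_dev_util(45.0, 135.0, 1, 90, 900.0, 900):.2f}")  # prints: 30.00

# The same measurements spread over four parallel access volumes.
print(f"{pct_dev_util(45.0, 135.0, 4, 90, 900.0, 900):.2f}")  # prints: 15.00
```

Dividing by PAV spreads the connect and disconnect time across the base and alias volumes, while the reserved share is not divided.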
% DEV RESV The percentage of time during the interval when a shared device was reserved by the processor on which RMF was started.
At each RMF cycle, RMF checks to see if a device is reserved, and a counter is kept of all such samples. At the end of the interval, the percentage is computed.
             # Device-reserved Samples
% DEV RESV = ------------------------- * 100
                 # Samples
AVG NMBR ALLOC The average number of data control blocks (DCBs) and access method control blocks (ACBs) concurrently allocated for each volume. This field is reported only for direct access storage devices.

At each RMF cycle, a counter is increased to reflect the number of data sets concurrently allocated. At the end of the interval, the average is calculated by dividing the total number of allocated data sets for all samples by the total number of samples.

% ANY ALLOC The percentage of time during the reporting interval when the device was allocated to one or more data sets. Permanently mounted direct access devices show a 100% allocation, regardless of whether or not a data set was actually allocated.
To determine the value, RMF keeps a count of whether or not the device was allocated or permanently resident at each cycle. At the end of the interval, the percentage is computed.
              # Samples when the Device was Allocated
% ANY ALLOC = --------------------------------------- * 100
                     # Samples
% MT PEND The percentage of time during the interval when a mount was pending for the device. This field is reported only for direct access devices and magnetic tape devices.
At each cycle, RMF updates a counter when it detects a mount pending condition. At the end of the interval, the percentage is computed.
            Counter for Mount-Pending Condition
% MT PEND = ----------------------------------- * 100
                       # Samples
%NOT RDY The percentage of time during the reporting interval when the device was not ready for use, for example, when a tape has just been mounted but is not yet ready to be used by the system. This field is not reported for direct access devices. However, the value is recorded in the corresponding field of the SMF record, should your installation need the information.
At each RMF cycle, a counter is updated when the status of the device indicates that it is not ready. At the end of the interval, the percentage is computed.
           # Samples when the Device was not Ready
%NOT RDY = --------------------------------------- * 100
                       # Samples
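
The sample-based percentages (% DEV RESV, % ANY ALLOC, % MT PEND, and %NOT RDY) share one pattern: a counter is updated at each cycle and divided by the number of samples at the end of the interval. A minimal sketch of that pattern follows. The function name and the input data are hypothetical, and returning zero when there are no samples is an assumption.

```python
# Illustrative sketch of a sample-based percentage.


def pct_of_samples(condition_per_cycle) -> float:
    """condition_per_cycle: one boolean per RMF cycle, True if the condition held."""
    flags = list(condition_per_cycle)
    samples = len(flags)
    if samples == 0:
        return 0.0  # assumption: no samples yields zero
    counter = sum(1 for held in flags if held)  # updated once per cycle
    return counter / samples * 100


# Hypothetical interval of 900 cycles in which the device was not ready in 45.
not_ready = [True] * 45 + [False] * 855
print(f"{pct_of_samples(not_ready):.1f}")  # prints: 5.0
```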
NUMBER OF MOUNTS The number of tape mounts, shown as an integer value, detected by RMF.

If the tape mount was pending at the first cycle of the interval, an asterisk is placed before the numerical value of the tape mount. If the tape mount was pending at the last cycle of the interval, an asterisk is placed immediately following the numerical value of the tape mount.

If a mount-pending condition is detected at the first cycle of the interval, the mount count for the interval increments by one.

In the LCU summary line, this field contains the sum of all mount counts.

This field is reported only for magnetic tape devices.

Note: Because the tape mount count is a sampled value, it might not include all subsecond mounts of VTS devices.
AVG MOUNT TIME The average mount time pending for every device, expressed in the form of HH:MM:SS.
                     # Samples Tape Mount was Pending * Interval
                     -------------------------------------------
                                     # Samples
AVG MOUNT TIME =  -------------------------------------------------
                                     # Mounts

If the mount count or the sample count is zero, the result is zero.

This field is reported only for magnetic tape devices.

TIME DEVICE ALLOC The total time the device was allocated during the interval, expressed in the form of HH:MM:SS.
                    # Samples Tape Device was Allocated * Interval
TIME DEVICE ALLOC = ----------------------------------------------
                                   # Samples

If the sample count is zero, the result is zero.

This field is reported only for magnetic tape devices.
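
The AVG MOUNT TIME and TIME DEVICE ALLOC formulas, including their zero rules, can be sketched as follows. The function names and input values are hypothetical, and rounding to whole seconds before formatting as HH:MM:SS is an assumption.

```python
# Illustrative sketch of AVG MOUNT TIME and TIME DEVICE ALLOC.


def to_hhmmss(seconds: float) -> str:
    """Format seconds as HH:MM:SS (rounded to whole seconds, an assumption)."""
    total = int(round(seconds))
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def avg_mount_time(pending_samples: int, samples: int,
                   interval_s: float, mounts: int) -> float:
    """(pending samples * interval / samples) / mounts; zero if either count is zero."""
    if samples == 0 or mounts == 0:
        return 0.0
    return (pending_samples * interval_s / samples) / mounts


def time_device_alloc(alloc_samples: int, samples: int, interval_s: float) -> float:
    """allocated samples * interval / samples; zero if the sample count is zero."""
    if samples == 0:
        return 0.0
    return alloc_samples * interval_s / samples


# Hypothetical 900-second interval with 900 samples.
print(to_hhmmss(avg_mount_time(60, 900, 900.0, 4)))   # prints: 00:00:15
print(to_hhmmss(time_device_alloc(450, 900, 900.0)))  # prints: 00:07:30
```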