I/O probe manager

The I/O probe manager provides capabilities to trace I/O operation events in various layers of the AIX® I/O stack. Use the syscall probe manager to trace an application I/O request that is triggered by a read/write system call. Use the I/O probe manager to probe deeper, into the layers below the syscall layer.

Use the I/O probe manager to analyze the response time of I/O operations on a block device by separating the service time from the queuing delay.

The following layers are supported:

  • Logical File System (LFS)
  • Virtual File System (VFS)
  • Enhanced Journaled File Systems (JFS2)
  • Logical Volume Manager (LVM)
  • Small Computer System Interface (SCSI) disk driver
  • Generic block devices

The primary use cases for I/O probe manager are as follows:

  • Identify the following patterns of I/O usage of a device in a specified time period. A valid device can be a disk, logical volume, volume group, or file system (type or mount path):
    • I/O operation count
    • Size of I/O operations
    • Type of I/O operation (read/write)
    • Sequential or random nature of I/O
  • Get process or thread-wise usage information of a file system (type or mount path), logical volume, volume group, or disk.
  • Get an end-to-end mapping of I/O flow among various layers (wherever possible).
  • Monitor a specific I/O resource usage. For example:
    • Trace any write operations to the /etc/passwd file.
    • Trace read operation on block 0 of the hdisk0 device.
    • Trace when a new logical volume is opened in root volume group (rootvg).
  • For Multipath I/O (MPIO) disks, get path-specific information by the following actions:
    • Get path-wise usage and response time information.
    • Identify path switching or path failure.
  • For I/O errors, get more details about the error in disk driver layer.

Probe specification

I/O probes must be specified in the following format in Vue script:

@@io:sub_type:io_event:operation_type:filter[|filter …]
This specification consists of five tuples that are separated by colon (:). The first tuple is always @@io.

Probe sub type

The second tuple signifies the sub type of the probe that indicates the layer of AIX I/O stack that contains the probe. This tuple can have one of the following values:

Table 1. Second tuple for probes
Second tuple (sub type) Description
disk This probe starts for disk driver events. Currently, the I/O probe manager supports only the scsidisk driver.
lvm This probe starts for Logical Volume Manager (LVM) events.
bdev This probe starts for any block I/O device. Disks, CD-ROMs, and diskettes are examples of block devices. This sub type is used only when no other sub type is applicable. For example, if a block device is not a disk, volume group, or logical volume, this sub type is applicable.
jfs2 This probe starts for JFS2 file system events.
vfs This probe starts for any read/write operation on a file.
Note: The second tuple cannot have a value of asterisk (*).

For a disk type of second tuple, the third tuple can have the following values:

Table 2. Disk second tuple: Third tuple values
Sub type (Second tuple) I/O event (Third Tuple) Description
disk entry This probe starts whenever disk driver receives an I/O request to process.
iostart This probe starts when the disk driver picks up an I/O request from its ready queue and sends it down to lower layer (for example, adapter driver). A single original I/O request to disk driver can send multiple command requests (some might be driver-related task management command requests) to lower layer. However, sometimes the driver can combine multiple original requests and send a single request to lower layer.
iodone This probe starts when the lower layer (for example, adapter driver) returns an I/O request (successful or failed) to disk driver.
exit This probe starts when disk driver returns an I/O request (successful or failed) to its upper layer.
Note: The following built-in variables are available in the disk probes: __iobuf, __diskinfo, __diskcmd (only in disk:iostart and disk:iodone), and __iopath (only in disk:iostart and disk:iodone).

Every entry event has a corresponding exit event with the same __iobuf->bufid value. An entry event can be followed by multiple iostart events, at least one of which has the same __iobuf->bufid value as the entry event. Every iostart event has a matching iodone event with the same __iobuf->child_bufid value.
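
The event matching rules for disk probes can be used to separate queuing delay from service time. The following Vue script is a minimal sketch under stated assumptions, not a definitive implementation: it assumes a disk named hdisk0, treats the time from entry to the first matching iostart as queuing delay, and treats the time from iostart to iodone as service time. The variable names and the output wording are illustrative. The global counters are updated without any locking, so the totals are approximate on a busy system.

```vue
long long queue_usec, service_usec, queue_cnt, service_cnt;
@@BEGIN {
        queue_usec = service_usec = 0;
        queue_cnt = service_cnt = 0;
}
@@io:disk:entry:*:hdisk0 {
        /* Request reaches the disk driver: remember when queuing began. */
        ts_queued[__iobuf->bufid] = (long long)timestamp();
}
@@io:disk:iostart:*:hdisk0 {
        ts_now = timestamp();
        if (ts_queued[__iobuf->bufid]) {
                /* Same bufid as the entry event: time spent waiting in the driver. */
                queue_usec = queue_usec +
                        (long long)diff_time(ts_queued[__iobuf->bufid], ts_now, MICROSECONDS);
                queue_cnt = queue_cnt + 1;
                ts_queued[__iobuf->bufid] = 0;
        }
        /* Command is sent to the lower layer: remember when service began.
           Driver-generated commands are included here as well. */
        ts_sent[__iobuf->child_bufid] = (long long)timestamp();
}
@@io:disk:iodone:*:hdisk0 {
        if (ts_sent[__iobuf->child_bufid]) {
                /* Same child_bufid as the iostart event: time spent below the driver. */
                ts_now = timestamp();
                service_usec = service_usec +
                        (long long)diff_time(ts_sent[__iobuf->child_bufid], ts_now, MICROSECONDS);
                service_cnt = service_cnt + 1;
                ts_sent[__iobuf->child_bufid] = 0;
        }
}
@@END {
        if (queue_cnt > 0) {
                printf("Average queuing delay: %lld usec over %lld requests\n",
                        queue_usec / queue_cnt, queue_cnt);
        }
        if (service_cnt > 0) {
                printf("Average service time: %lld usec over %lld commands\n",
                        service_usec / service_cnt, service_cnt);
        }
}
```

The averages are printed when the probevue session ends, for example when it is interrupted from the terminal.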

For an LVM type of second tuple, the third tuple can have the following values:

Table 3. LVM second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
lvm entry This probe starts whenever the LVM layer receives an I/O request to process.
iostart This probe starts when LVM picks an I/O request from its ready queue and sends down to the lower layer (usually the disk driver).
iodone This probe starts when the lower layer (for example, disk driver) returns an I/O request (successful or failed) to LVM.
exit This probe starts when LVM returns an I/O request (successful or failed) to its upper layer.
Note: The following built-in variables are available in the LVM probes: __iobuf, __lvol, and __volgrp.

Every entry event has a corresponding exit event with the same __iobuf->bufid value. An entry event can be followed by multiple iostart events, at least one of which has the same __iobuf->bufid value as the entry event. Every iostart event has a matching iodone event with the same __iobuf->child_bufid value.

For generic block device probes, the third tuple can have the following values:

Table 4. Generic block device second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
bdev iostart This probe starts when an I/O request to any block device (for example, a disk, logical volume, or CD-ROM) is initiated, that is, when any code calls the AIX devstrat kernel service.
iodone This probe starts when a block I/O request completes, that is, when any code calls the AIX iodone kernel service.
Note: The following built-in variable is available in the bdev probes: __iobuf. Every iostart event has a matching iodone event that has the same __iobuf->bufid value.
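
As an illustration of the bdev probes, the following Vue script is a minimal sketch that counts read and write requests and the requested bytes for one block device. It assumes a block device named hdisk0. The counter names and output wording are illustrative, and the script does not filter out the invalid bcount value.

```vue
long long rd_cnt, wr_cnt, rd_bytes, wr_bytes;
@@BEGIN {
        rd_cnt = wr_cnt = 0;
        rd_bytes = wr_bytes = 0;
}
@@io:bdev:iostart:*:hdisk0 {
        /* There is no B_WRITE flag: a request without B_READ is a write. */
        if (__iobuf->bflags & B_READ) {
                rd_cnt = rd_cnt + 1;
                rd_bytes = rd_bytes + (long long)__iobuf->bcount;
        } else {
                wr_cnt = wr_cnt + 1;
                wr_bytes = wr_bytes + (long long)__iobuf->bcount;
        }
}
@@END {
        printf("hdisk0 reads:  %lld requests, %lld bytes\n", rd_cnt, rd_bytes);
        printf("hdisk0 writes: %lld requests, %lld bytes\n", wr_cnt, wr_bytes);
}
```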

For JFS2 file system probes, the third tuple can have the following values:

Table 5. JFS2 second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
jfs2 buf_map This probe starts when a logical file extent gets mapped to an I/O buffer and is sent to the underlying logical volume.
Note: The following built-in variable is available in the JFS2 probe: __j2info.

For Virtual file system (VFS) probes, the third tuple can have the following values:

Table 6. VFS second tuple: Third tuple values
Sub type (second tuple) I/O event (third tuple) Description
vfs entry This probe starts when any read/write operation on a file is initiated.
exit This probe starts when any read/write operation on a file is completed (whether success or failure).
Note: The following built-in variable is available in the VFS probes: __file.

For the same thread, every entry is followed by an exit event that has the same __file->inode_id value.

Probe operation type

The fourth tuple indicates the type of I/O operation that is specified by the probe. The fourth tuple can have one of the following values:

Table 7. Fourth tuple for I/O operation
Fourth tuple Description
read The probe starts for only the read operation.
write The probe starts for only the write operation.
* The probe starts for both read and write operations.

Probe filter

The fifth tuple is the filter tuple, which restricts the probe to specific targets. The possible values depend on the sub type. Multiple values can be specified, separated by the | character, and the probe starts if any one of the filters matches. If the value of the fifth tuple is *, no filtering occurs and the probe starts if the other tuples match. If multiple filters are specified and one of them is *, the whole tuple is equivalent to *.

For disk probes, the fifth tuple can have the following values:

Table 8. Disk filter tuple
Filter (fifth tuple) Description
Disk name. For example, hdisk0 The probe action is run only for the particular disk.
Disk type. Allowed symbols: FC, ISCSI, VSCSI, SAS The probe action is run only for disks with matching type. The meanings of the symbols are as follows:
  • FC: Fibre Channel disk
  • ISCSI: iSCSI disk
  • VSCSI: Virtual SCSI disk (on VIOS client)
  • SAS: Serial Attached SCSI disks
Note: The disk name and disk type can be combined as filters. For example, the following probe starts at the disk entry event, for both read and write operations, for hdisk0 or for any FC disk:
@@io:disk:entry:*:hdisk0|FC
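
As a minimal sketch of how a complete probe specification is attached to an action block, the following Vue clause uses the specification shown above and prints one line for each request. It assumes that hdisk0 or at least one FC disk exists on the system. The output wording is illustrative only.

```vue
@@io:disk:entry:*:hdisk0|FC {
        /* Runs for read and write requests that reach the disk driver
           for hdisk0 or for any Fibre Channel disk. */
        op_type = (__iobuf->bflags & B_READ) ? "READ" : "WRITE";
        printf("%s: %s block=%lld bytes=%lld\n",
                __diskinfo->name, op_type, __iobuf->blknum, __iobuf->bcount);
}
```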

For Logical Volume Manager (LVM) probes, the fifth tuple can have the following values:

Table 9. LVM filter tuple
Filter (fifth Tuple) Description
Logical volume name, for example hd5, lg_dumplv The probe action is run only for the particular logical volume.
Volume group name, for example rootvg The probe action is run only for those logical volumes that belong to a particular volume group.

The following probe starts for any logical volume that belongs to either root volume group (rootvg), or test volume group (testvg) (at iostart event, for write operation only):

@@io:lvm:iostart:write:rootvg|testvg

For generic block device probes, the fifth tuple can have the following values:

Table 10. Generic block device filter tuple
Filter (fifth tuple) Description
Block device name, for example: hdisk0, hd5, cd0 The probe action is run only for the particular block device.

Consider the following examples for generic block device probes:

@@io:bdev:iostart:*:cd0

@@io:bdev:iodone:read:hdisk3|hdisk5

For JFS2 file system probes, the fifth tuple can have following values:

Table 11. JFS2 filter tuple
Filter (fifth tuple) Description
File system mount path, for example: /usr The probe action is run only for the file system with the particular mount path. It must be a JFS2 file system; otherwise, ProbeVue rejects the probe specification.

Consider the following example for the JFS2 file system probes:

@@io:jfs2:buf_map:*:/usr|/tmp

For Virtual file system (VFS) probes, the fifth tuple can have following values:

Table 12. VFS filter tuple
Filter (fifth Tuple) Description
File system mount path. For example, /tmp The probe action is run for files that belong to the file system.
File system type. The allowed symbols are JFS2, NAMEFS, NFS, JFS, CDROM, PROCFS, SFS, CACHEFS, NFS3, AUTOFS, POOLFS, VXFS, VXODM, UDF, NFS4, RFS4, CIFS, PMEMFS, AHAFS, STNFS, ASMFS The probe action is run for files of the particular file system. The symbols correspond to the AIX file systems defined in the exported header file sys/vmount.h.

Consider the following examples for the Virtual file system (VFS) probes:

@@io:vfs:entry:read:JFS2

@@io:vfs:exit:*:/usr|JFS

I/O probe related built-in variables for Vue scripts

__iobuf built-in variable

You can use the special __iobuf built-in variable to access various information about the I/O buffer that is employed in the current I/O operation. It is accessible in probes of sub types: disk, lvm, and bdev. Its member elements can be accessed by using the __iobuf->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. This value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value, num_pagefaults, is either 0 or not sufficient.
  • The memory location that is containing the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

__iobuf built-in variable has the following members:

Table 13. The __iobuf built-in variable members
Member name Type Description Invalid Value
blknum unsigned long long Starting block number of the I/O request. 0xFFFFFFFFFFFFFFFF
bcount unsigned long long Requested number of bytes in the I/O operation. 0xFFFFFFFFFFFFFFFF
bflags unsigned long long The flags that are associated with the I/O operation. The following symbols are available: B_READ, B_ASYNC, B_ERROR. The symbols can be used along with the bflags value to see whether it is set. For example, if (__iobuf->bflags & B_READ) is true, then it is a read operation.
Note: There is no B_WRITE flag. If the B_READ flag is not set, it is considered to be write operation.
0
devnum unsigned long long The device number of the target device that is associated with the I/O operation. It has the device major number and minor number that is embedded in it. 0
major_num int The major number of the target device of the I/O operation. -1
minor_num int The minor number of the target device of the I/O operation. -1
error int In case of any error in the I/O operation, this value is the error number. This value is defined in the exported errno.h header file. -1
residue unsigned long long The remaining number of bytes from the original request that might not be read or written. On the I/O completion events, this value is ideally zero. But for read operation, a nonzero value might mean that you are trying to read more than what is available, which is acceptable. This value is considered only when error value is nonzero. 0xFFFFFFFFFFFFFFFF
bufid unsigned long long A unique number that is associated with the I/O request. While the I/O is in progress, the bufid value uniquely identifies the I/O request in all the events of a particular sub type. For example, if the __iobuf->bufid value matches in disk:entry, disk:iostart, disk:iodone, and disk:exit, it is the same I/O request at various stages. 0
parent_bufid unsigned long long If the value is not 0, this value provides the bufid of the upper layer buffer that is associated with this I/O request. You can now link the current I/O operation with the upper layer I/O request. For example, in a disk I/O request, the corresponding LVM I/O can be determined.
Note: The parent_bufid field is not set in all code paths, and hence it is not always useful. Use the child_bufid field to link I/O requests between two adjacent layers.
0
child_bufid unsigned long long If the value is not 0, this value provides the bufid of the new I/O request that is sent to the lower layer. The best events in which to record it are disk:iostart, lvm:iostart, and bdev:iostart. You can identify the I/O in the lower adjacent layer by matching the __iobuf->bufid value in that layer to this child_bufid value. For example, in lvm:iostart, you can record the __iobuf->child_bufid value. Then, in disk:entry, you can match it with __iobuf->bufid to identify the corresponding I/O request. 0
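
The child_bufid matching that is described for Table 13 can be sketched as the following Vue script, which links an LVM request to the disk request that it produces. This is an illustrative sketch under stated assumptions: it assumes that the volume group is rootvg, the lvm_parent array name is hypothetical, and the output wording is only an example.

```vue
@@io:lvm:iostart:*:rootvg {
        if (__iobuf->child_bufid != 0) {
                /* Remember which LVM request produced this lower-layer request. */
                lvm_parent[__iobuf->child_bufid] = (long long)__iobuf->bufid;
        }
}
@@io:disk:entry:*:* {
        /* The bufid seen by the disk driver is the child_bufid recorded by LVM. */
        if (lvm_parent[__iobuf->bufid]) {
                printf("disk %s: request %lld (block=%lld, bytes=%lld) came from LVM request %lld\n",
                        __diskinfo->name, __iobuf->bufid, __iobuf->blknum,
                        __iobuf->bcount, lvm_parent[__iobuf->bufid]);
                lvm_parent[__iobuf->bufid] = 0;
        }
}
```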

__file built-in variable

You can use the __file special built-in variable to get various information about file operation. It is available in probes of sub type VFS. Its member elements can be accessed by using the __file->member syntax.

Note: Whenever the actual value cannot be obtained, the value that is marked as invalid is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location, which contains the value, is paged out.
  • Any other severe system error such as invalid pointer, or corrupted memory.

The __file built-in variable has the following members:

Table 14. The __file built-in variable members
Member name Type Description Invalid Value
f_type int Specifies the type of the file. It can match one of the following built-in constant values:
  • F_REG (regular file)
  • F_DIR (directory)
  • F_BLK (block device file)
  • F_CHR (character device file)
  • F_LNK (file link)
  • F_SOCK (socket)
Note: The value might not match any of the built-in constants because the list does not include every possible file type, but only the most useful ones.
-1
fs_type int Specifies the type of the file system to which this file belongs. It can match one of the following built-in constant values:
  • FS_JFS2
  • FS_NAMEFS
  • FS_NFS
  • FS_JFS
  • FS_CDROM
  • FS_PROCFS
  • FS_SFS
  • FS_CACHEFS
  • FS_NFS3
  • FS_AUTOFS
  • FS_POOLFS
  • FS_VXFS
  • FS_VXODM
  • FS_UDF
  • FS_NFS4
  • FS_RFS4
  • FS_CIFS
  • FS_PMEMFS
  • FS_AHAFS
  • FS_STNFS
  • FS_ASMFS

The built-in constants correspond to the AIX file system types defined in the exported sys/vmount.h header file.

-1
mount_path char * Specifies the path where the associated file system is mounted. null string
devnum unsigned long long Specifies the device number of the associated block device of the file. Both the major and minor numbers are embedded in it. If there is no associated block device, then it is 0. 0
major_num int Specifies the major number of the associated block device of the file. -1
minor_num int Specifies the minor number of the associated block device of the file. -1
offset unsigned long long Specifies the current read/write byte offset of the file. 0xFFFFFFFFFFFFFFFF
rw_mode int Specifies the read/write mode of the file. It matches one of the built-in constant values: F_READ or F_WRITE. -1
byte_count unsigned long long At the vfs:entry event, byte_count provides the byte count of the read or write request. At the vfs:exit event, it provides the number of bytes that remained unfulfilled. The difference between the values at these two events is therefore the number of bytes that were processed in the operation. 0xFFFFFFFFFFFFFFFF
fname char * Specifies the name of the file (only base name, not path). null string
inode_id unsigned long long Specifies a system-wide unique number that is associated with the file.
Note: It is different from file inode number.
0
path path_t (new data type in VUE) Specifies the complete file path. It can be printed by using printf() and the format specifier %p. null string as file path
error int If the read/write operation failed, the error number as defined in the exported errno.h header file. If there is no error, it is 0. -1
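
The byte_count arithmetic and the entry/exit pairing for VFS probes can be sketched as follows. This minimal Vue script assumes a file system that is mounted on /tmp. It pairs the entry and exit events of a thread by using the general ProbeVue built-in variable __tid, which is not described in this topic. The array names and output wording are illustrative.

```vue
@@io:vfs:entry:*:/tmp {
        /* On one thread, every entry is followed by its exit, so the
           thread ID is used as the key. */
        req_inode[__tid] = (long long)__file->inode_id;
        req_bytes[__tid] = (long long)__file->byte_count;
}
@@io:vfs:exit:*:/tmp {
        if (req_inode[__tid] != 0 && req_inode[__tid] == (long long)__file->inode_id) {
                op_type = (__file->rw_mode == F_READ) ? "READ" : "WRITE";
                /* Processed bytes = requested bytes - bytes left unfulfilled. */
                printf("%s %s: requested=%lld processed=%lld error=%d\n",
                        op_type, __file->fname, req_bytes[__tid],
                        req_bytes[__tid] - (long long)__file->byte_count,
                        __file->error);
                req_inode[__tid] = 0;
        }
}
```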

__lvol built-in variable

You can use the __lvol special built-in variable to get various information about the logical volume in an LVM operation. It is available in probes of sub type lvm. Its member elements can be accessed by using the __lvol->member syntax.
Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __lvol built-in variable has the following members:
Table 15. The __lvol built-in variable members
Member name Type Description Invalid Value
name char * The name of the logical volume. null string
devnum unsigned long long The device number of the logical volume. It has both major number and minor number that is embedded in it. 0
major_num int The major number of the logical volume. -1
minor_num int The minor number of the logical volume. -1
lv_options unsigned int The options that are related to the logical volume. The following values are defined as built-in constants:
  • LV_RDONLY (read-only logical volume)
  • LV_NOMWC (no mirror write consistency checking)
  • LV_ACTIVE_MWC (active mirror write consistency)
  • LV_PASSIVE_MWC (passive mirror write consistency)
  • LV_SERIALIZE_IO (I/O is serialized)
  • LV_DMPDEV (This LV is a dump device)

You can check whether one of these values is set by having condition such as __lvol->lv_options & LV_RDONLY.

Note: Not all possible values are defined, so the value might contain other options.
0xFFFFFFFF

__volgrp built-in variable

You can use the __volgrp special built-in variable to get various information about the volume group in an LVM operation. It is available in probes of sub type lvm. Its member elements can be accessed by using the __volgrp->member syntax.
Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __volgrp built-in variable has the following members:

Table 16. The __volgrp built-in variable members
Member name Type Description Invalid Value
name char * The name of the volume group. null string
devnum unsigned long long The device number of the volume group. It has major number and minor number that is embedded in it. 0
major_num int The major number of the volume group. -1
minor_num int The minor number of the volume group.
Note: For volume group, AIX always assigns 0 as the minor number.
-1
num_open_lvs int The number of open logical volumes that belong to this volume group. -1
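
As an illustration of the __lvol and __volgrp built-in variables, the following Vue clause is a minimal sketch that reports every write request that the LVM layer receives for a volume group. It assumes that the volume group is rootvg. The output wording is illustrative. The flag test does not handle the invalid lv_options value, in which all bits are set.

```vue
@@io:lvm:entry:write:rootvg {
        printf("LVM write: vg=%s (open LVs=%d) lv=%s block=%lld bytes=%lld\n",
                __volgrp->name, __volgrp->num_open_lvs, __lvol->name,
                __iobuf->blknum, __iobuf->bcount);
        /* Option flags are bit values, so test them with the & operator. */
        if (__lvol->lv_options & LV_SERIALIZE_IO) {
                printf("  I/O to %s is serialized\n", __lvol->name);
        }
}
```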

__diskinfo built-in variable

You can use the __diskinfo special built-in variable to get various information about the disk in a disk I/O operation. It is available in probes of sub type disk. Its member elements can be accessed by using the __diskinfo->member syntax.
Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __diskinfo built-in variable has the following members:
Table 17. The __diskinfo built-in variable members
Member name Type Description Invalid Value
name char * The name of the disk. null string.
devnum unsigned long long The device number of the disk. It has major number and minor number that are embedded in it. 0
major_num int The major number of the disk. -1
minor_num int The minor number of the disk. -1
lun_id unsigned long long The Logical Unit Number (LUN) for the disk. 0xFFFFFFFFFFFFFFFF
transport_type int The transport type of the disk. It can match one of the following built-in constant values:
  • T_FC (Fibre Channel)
  • T_ISCSI (iSCSI)
  • T_VSCSI (Virtual SCSI)
  • T_SAS (Serial Attached SCSI)
-1
queue_depth int The queue depth of the disk. It indicates the maximum number of simultaneous I/O requests that the disk driver can pass on to the lower layer (for example, the adapter). If the number of incoming I/O requests exceeds queue_depth, the disk driver holds the extra requests in its wait queue until the lower layer responds to at least one of the outstanding I/O requests. -1
cmds_out int Number of outstanding I/O command requests to the lower layer (for example, adapter). -1
path_count int Number of MPIO paths of the disk (Only if the disk is MPIO capable, else it is 0). -1
reserve_policy int The SCSI reservation policy of the disk. It matches one of the following built-in constant values:
  • DK_NO_RESERVE (no_reserve)
  • DK_SINGLE_PATH (single_path)
  • DK_PR_EXCLUSIVE (PR_exclusive)
  • DK_PR_SHARED (PR_shared)

Refer to AIX MPIO documentation to know more about the reservation policies.

-1
scsi_flags int The SCSI flags of the disk. The following built-in flag values are defined:
  • SC_AUTOSENSE_ENABLED (On error, target sends sense data in the response. Initiator need not send a request sense command.)
  • SC_NACA_1_ENABLED (Normal ACA is enabled and the target goes to ACA state if it is returning check condition.)
  • SC_64BIT_IDS (64-bit SCSI ID and logical unit number (LUN))
  • SC_LUN_RESET_ENABLED (LUN reset command can be sent.)
  • SC_PRIORITY_SUP (Device supports I/O priority.)
  • SC_CACHE_HINT_SUP (Device supports cache hints.)
  • SC_QUEUE_UNTAGGED (Device supports queuing of untagged commands.)
Note: Not all flag values are defined, so the value might contain other flags.
0

__diskcmd built-in variable

You can use the __diskcmd special built-in variable to get various information about the SCSI I/O command for the current operation. It is available in probes of sub type disk (but only in iostart and iodone events). Its member elements can be accessed by using the __diskcmd->member syntax.
Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __diskcmd built-in variable has the following members:

Table 18. The __diskcmd built-in variable members
Member name Type Description
cmd_type int The type of the SCSI command (both type and subtype are merged together). The following built-in constant values are available as command type:
  • DK_BUF (normal I/O read/write)
  • DK_IOCTL (ioctl)
  • DK_REQSNS (Request sense)
  • DK_TGT_LUN_RST (target or LUN reset)
  • DK_TUR (Test unit ready)
  • DK_INQUIRY (Inquiry)
  • DK_RESERVE (SCSI-2 RESERVE, 6-byte version)
  • DK_RELEASE (SCSI-2 RELEASE, 6-byte version)
  • DK_RESERVE_10 (SCSI-2 RESERVE, 10-byte version)
  • DK_RELEASE_10 (SCSI-2 RELEASE, 10-byte version)
  • DK_PR_RESERVE (SCSI-3 Persistent Reserve, RESERVE)
  • DK_PR_RELEASE (SCSI-3 Persistent Reserve, RELEASE)
  • DK_PR_CLEAR (SCSI-3 Persistent Reserve, CLEAR)
  • DK_PR_PREEMPT (SCSI-3 Persistent Reserve, PREEMPT)
  • DK_PR_PREEMPT_ABORT (SCSI-3 Persistent Reserve, PREEMPT AND ABORT)
  • DK_READCAP (READ CAPACITY, 10-byte version)
  • DK_READCAP16 (READ CAPACITY, 16-byte version)
Note: The built-in constants are bit position values and hence their presence must be checked by using ‘&’ operator (the ‘==’ operator must not be used). For example: __diskcmd->cmd_type & DK_IOCTL.
retry_count int Indicates whether the I/O command was retried after a failure.
Note: A value of 1 means that it is the first attempt. Any larger value indicates that the command was retried.
path_switch_count int It indicates how many times the path was changed for this particular I/O operation (usually indicates some I/O path failure, either transient or permanent).
status_validity int In case of any error, this value indicates whether it is a SCSI error or adapter error. It can match one of the following built-in constant values: SC_SCSI_ERROR or SC_ADAPTER_ERROR. If there is no error, then it is 0.
scsi_status int If the status_validity field is set to SC_SCSI_ERROR, this field gives more details about the error. It can match one of the built-in constant values:
  • SC_GOOD_STATUS (Task is completed successfully)
  • SC_CHECK_CONDITION (Some error, sense data provides more information)
  • SC_BUSY_STATUS (LUN is busy, cannot accept command)
  • SC_RESERVATION_CONFLICT (Violation of existing SCSI reservation.)
  • SC_COMMAND_TERMINATED (The device ended the command.)
  • SC_QUEUE_FULL (The device queue is full.)
  • SC_ACA_ACTIVE (The device is in Auto Contingent Allegiance state.)
  • SC_TASK_ABORTED (The device stopped the command.)
Note: Not all possible values are defined. Hence, scsi_status can have a value that does not match any of the built-in values. In that case, look up the corresponding SCSI command response code.
adapter_status int If the status_validity field is set to SC_ADAPTER_ERROR, this field provides more information about the error. It can match one of the following built-in constant values:
  • ADAP_HOST_IO_BUS_ERR (Host I/O bus error)
  • ADAP_TRANSPORT_FAULT (transport layer error)
  • ADAP_CMD_TIMEOUT (I/O command was timed out)
  • ADAP_NO_DEVICE_RESPONSE (no response from the device)
  • ADAP_HDW_FAILURE (adapter hardware failure)
  • ADAP_SFW_FAILURE (adapter microcode failure)
  • ADAP_TRANSPORT_RESET (adapter detected an external SCSI bus reset)
  • ADAP_TRANSPORT_BUSY (transport layer is busy)
  • ADAP_TRANSPORT_DEAD (transport layer is inoperative)
  • ADAP_TRANSPORT_MIGRATED (transport layer is migrated)
  • ADAP_FUSE_OR_TERMINAL_PWR (adapter blown fuse or bad electrical termination)
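
The error-related members of __diskcmd can be combined with __iobuf to report details of failed requests. The following Vue clause is a minimal sketch under stated assumptions: it assumes that a failed request has the B_ERROR flag set when the lower layer returns it, and it decodes only a few of the status constants. The output wording is illustrative.

```vue
@@io:disk:iodone:*:* {
        if (__iobuf->bflags & B_ERROR) {
                printf("I/O error on %s: errno=%d block=%lld retry_count=%d path_switch_count=%d\n",
                        __diskinfo->name, __iobuf->error, __iobuf->blknum,
                        __diskcmd->retry_count, __diskcmd->path_switch_count);
                if (__diskcmd->status_validity == SC_SCSI_ERROR) {
                        /* scsi_status is meaningful only for SCSI errors. */
                        if (__diskcmd->scsi_status == SC_CHECK_CONDITION) {
                                printf("  SCSI error: check condition\n");
                        } else {
                                printf("  SCSI error: scsi_status=%d\n", __diskcmd->scsi_status);
                        }
                }
                if (__diskcmd->status_validity == SC_ADAPTER_ERROR) {
                        /* adapter_status is meaningful only for adapter errors. */
                        if (__diskcmd->adapter_status == ADAP_CMD_TIMEOUT) {
                                printf("  adapter error: command timed out\n");
                        } else {
                                printf("  adapter error: adapter_status=%d\n", __diskcmd->adapter_status);
                        }
                }
        }
}
```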

__iopath built-in variable

You can use the __iopath special built-in variable to get various information about the I/O path for the current operation. It is available in probes of sub type disk for iostart and iodone events only. Its member elements can be accessed by using the __iopath->member syntax.
Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

The __iopath built-in variable has the following members:

Table 19. The __iopath built-in variable members
Member name Type Description Invalid Value
path_id int The ID of the current path (starting from 0). -1
scsi_id unsigned long long The SCSI ID of the target on this path. 0xFFFFFFFFFFFFFFFF
lun_id unsigned long long The Logical Unit Number (LUN) on this path. 0xFFFFFFFFFFFFFFFF
ww_name unsigned long long The worldwide name of the target port on this path. 0
cmds_out int The number of I/O commands outstanding on this path. -1
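
As an illustration of path-specific tracing for MPIO disks, the following Vue clause is a minimal sketch that reports a command whose path was changed before it was sent to the lower layer. It assumes an MPIO-capable disk named hdisk0. The output wording is illustrative.

```vue
@@io:disk:iostart:*:hdisk0 {
        /* path_count is 0 for disks that are not MPIO capable. */
        if (__diskinfo->path_count > 1 && __diskcmd->path_switch_count > 0) {
                printf("%s: request %lld switched path %d time(s), now on path %d (%d commands outstanding on this path)\n",
                        __diskinfo->name, __iobuf->bufid,
                        __diskcmd->path_switch_count, __iopath->path_id,
                        __iopath->cmds_out);
        }
}
```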

__j2info built-in variable

The __j2info is a special built-in variable that you can use to get various information about JFS2 file system operation. It is available in probes of sub type jfs2. Its member elements can be accessed by using the __j2info->member syntax.
Note: Whenever the actual value cannot be obtained, the value that is marked as Invalid Value is returned. The invalid value is returned because of one of the following reasons:
  • Page fault context is required, but the current probevctrl tunable value num_pagefaults is either 0 or not sufficient.
  • The memory location that contains the value is paged out.
  • Any other severe system error such as invalid pointer or corrupted memory.

__j2info has the following members:

Table 20. The __j2info built-in variable members
Member name Type Description Invalid Value
inode_id unsigned long long A system-wide unique number that is associated with the file of current operation.
Note: It is different from the file inode number.
0
f_type int Type of the file. The __file->f_type description provides possible values. -1
mount_path char * The path where the file system is mounted. null string.
devnum unsigned long long The device number of the underlying block device of the file system. It has both major number and minor number embedded. 0
major_num int The major number of the underlying block device of the file system. -1
minor_num int The minor number of the underlying block device of the file system. -1
l_blknum unsigned long long The logical block number for this file operation. 0xFFFFFFFFFFFFFFFF
l_bcount unsigned long long The requested byte count between the logical blocks in this operation. 0xFFFFFFFFFFFFFFFF
child_bufid unsigned long long The bufid of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->bufid. 0
child_blknum unsigned long long The block number of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->blknum. 0xFFFFFFFFFFFFFFFF
child_bcount unsigned long long The byte count of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->bcount. 0xFFFFFFFFFFFFFFFF
child_bflags unsigned long long The flags of the I/O request buffer that is sent down to the lower layer (for example, LVM). In that layer, it appears as __iobuf->bflags. 0
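
The mapping from a JFS2 file extent to the request that is sent to the lower layer can be printed with a clause such as the following. This is a minimal sketch that assumes a JFS2 file system mounted on /usr. The output wording is illustrative.

```vue
@@io:jfs2:buf_map:*:/usr {
        op_type = (__j2info->child_bflags & B_READ) ? "READ" : "WRITE";
        /* The child_* members appear in the lower layer as the
           corresponding __iobuf members. */
        printf("JFS2 %s on %s: inode_id=%lld logical block=%lld bytes=%lld\n",
                op_type, __j2info->mount_path, __j2info->inode_id,
                __j2info->l_blknum, __j2info->l_bcount);
        printf("  lower-layer request: bufid=%lld block=%lld bytes=%lld\n",
                __j2info->child_bufid, __j2info->child_blknum,
                __j2info->child_bcount);
}
```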

Example scripts for I/O probe manager

  1. Script to trace any write operation to the /etc/passwd file:
    
    int write(int, char *, int);
    @@BEGIN  {
            target_inodeid = fpath_inodeid("/etc/passwd");
    }
    @@syscall:*:write:entry {
            if (fd_inodeid(__arg1) == target_inodeid) {
                    printf("write on /etc/passwd: timestamp=%A, pid=%lld, pname=[%s], uid=%lld\n",
                            timestamp(), __pid, __pname, __uid);
            }
    }
    If the script is in a Vue file named etc_passwd.e, it can be run as:
    # probevue etc_passwd.e
    In another terminal, if the user (root) runs:
    # mkuser user1
    Then probevue displays an output similar to the following example:
    write on /etc/passwd: timestamp=Mar/03/15 16:10:07, pid=14221508, pname=[mkuser], uid=0
    
    
  2. Script to find the maximum and minimum I/O operation time for a disk (for example, hdisk0) in a period. Also, find the block number, requested byte count, time of operation and type of operation (read or write) corresponding to the maximum or minimum time.
    long long min_time, max_time;
    @@BEGIN {
            min_time = max_time = 0;
    }
    @@io:disk:entry:*:hdisk0 {
            ts_entry[__iobuf->bufid] = (long long)timestamp();
    }
    @@io:disk:exit:*:hdisk0 {
            if (ts_entry[__iobuf->bufid]) { /* only if we recorded entry time */
                    ts_now = timestamp();
                    op_type = (__iobuf->bflags & B_READ) ? "READ" : "WRITE";
                    dt = (long long)diff_time(ts_entry[__iobuf->bufid], ts_now, MICROSECONDS);
                    if (min_time == 0 || dt < min_time) {
                            min_time = dt;
                            min_blknum = __iobuf->blknum;
                            min_bcount = __iobuf->bcount;
                            min_ts = ts_now;
                            min_optype = op_type;
                    }
                    if (max_time == 0 || dt > max_time) {
                            max_time = dt;
                            max_blknum = __iobuf->blknum;
                            max_bcount = __iobuf->bcount;
                            max_ts = ts_now;
                            max_optype = op_type;
                    }
                    ts_entry[__iobuf->bufid] = 0;
            }
    }
    @@END {
            printf("Maximum and minimum IO operation time for [hdisk0]:\n");
            printf("Max: %lld usec, block=%lld, byte count=%lld, operation=%s, time of operation=[%A]\n",
                    max_time, max_blknum, max_bcount, max_optype, max_ts);
            printf("Min: %lld usec, block=%lld, byte count=%lld, operation=%s, time of operation=[%A]\n",
                    min_time, min_blknum, min_bcount, min_optype, min_ts);
    }

    If the script is in a Vue file named disk_min_max_time.e, it can be run as:
    # probevue disk_min_max_time.e
    Generate some I/O activity on hdisk0 (for example, by using the dd command).
    After a few minutes, if the probevue command is terminated (by pressing Ctrl+C), it displays an output similar to the following example:
    ^CMaximum and minimum IO operation time for [hdisk0]:
    Max: 48174 usec, block=6927976, byte count=4096, operation=READ, time of operation=[Mar/04/15 03:31:07]
    Min: 133 usec, block=6843288, byte count=4096, operation=READ, time of operation=[Mar/04/15 03:31:03]