Monitoring IBM i instances

After you install the Instana host agent, the IBM i instances sensor is automatically installed. You can view metrics that are related to IBM i instances in the Instana UI after you complete the configuration that is outlined in the Configuring section.

Instana supports both remote and local monitoring of IBM i instances. For more information about individual metrics availability, see IBM i services. To install the host agent for IBM i, see Installing the host agent on IBM i.

Supported information

Supported versions for local monitoring

For local monitoring, Instana supports IBM i 7.4 and later.

Supported versions for remote monitoring

For remote monitoring, Instana supports IBM i 7.2 and later.

The IBM i versions that Instana supports are listed in the following table:

Sensor Support policy Latest version Last supported version
IBMI 45 days 7.5 7.5

Configuring

Configuring for local monitoring

To monitor IBM i locally, you must first install the Instana host agent on IBM i. For more information, see Installing the host agent on IBM i.

When you install and run the host agent, the agent automatically discovers the processes and starts the sensor with the default configurations. For more information about host agent configurations, see Configuring host agent.

The following configuration is optional. Enable it to use custom events and custom polling. A sample configuration for local monitoring is shown in the following example:

com.instana.plugin.ibmiseries:
  enabled: true
  local: # Single configuration only
    poll_rate_configuration: #Values are in seconds.
      os_poll_rate: 15 #This is default OS poll rate. Provide the values in seconds.
      db2_poll_rate: 15 #This is default DB2 poll rate. Provide the values in seconds.
      disabled_components:
        components: 'Names of the Components/Grids to be disabled in comma(,) separated way' # Example - 'HISTORY_LOGS, JOB_QUEUES'
      custom_poll_rate: #Multiple Poll Rate is supported. (Optional)
        poll_rate_1:
          polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 30
          components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
          events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
        poll_rate_2:
          polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 60
          components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
          events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
    user_specification: # For user inputs (Optional)
      activeJobs:
        jobs: 'comma separated list of job names. You can use * as a wild card character in any part of the job name' # example - '*/QUSER/QZDASOINIT, 311353/QLWISVR/ADMIN, 10034*/QUSER/*'
        event:
          identicalJobs:
            jobName/user: 'Provide JOB_NAME_SHORT/JOB_USER values in comma(,) separated way. Use * in user part as wild card character' # example - 'CT_AGENT/QAUTOMON,QZDASOINIT/*'
            threshold: 5 # Event will be triggered if jobs with same name & user in the running status is less than threshold value.
          runningStatus: 'Provide JobStatus/Subsystem values in comma(,) separated way. You can use * in subsystem part as wild card' # example - 'SIGW/QHTTPSVR,DSC/*,DEQW/QAUTOMON'
          inactiveJobs:  'Provide JOB_NAME in comma(,) separated way. You can use * as wild card character in any part of the JOB_NAME'  # example - '*/QSYS/QAUTOMON , 137640/QSYS/QBATCH , */QWEBADMIN/* , */QSYS/QAUTO*'
          enableInactiveJOBQStatus: #Allowed value 'true' or 'false'. Alert will be triggered for the inactive jobs if JOB_QUEUE_STATUS is 'RELEASED' or 'SCHEDULED' and JOB_STATUS is 'JOBQ'.
      diskStatus:
        operationalState: 'State of the disk which is expected, in comma(,) separated way' #Example 'ACTIVE, BUSY'
      messageQueue:
        filter: # User defined filter for Message Queue table
          library/queueName: 'Lib-1/queueName1,Lib-2/queueName2'  ## Provide values in comma(,) separated way. (Default Value : 'QSYS/QSYSOPR')
          timeFrame: '10 HOURS'  ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
        event: # User defined Event for Message Queue
          messageQueueIdEvent:
            Lib-1/queueName1:
              Event_Name_1 : 'messageId-1,messageId-2'  ##Example: Hyper Swap Alerts : 'CPC1E1D, CPI1E23'
              Event_Name_2 : 'messageId-3,messageId-4'  ##Example: Error in Device BackUp: 'CPI1E92, AMQ89*'. You can provide * in case you want to provide partial message id
            Lib-2/queueName2:
              Event_Name_3 : 'messageId-5'              ## You can provide the Event_Name as per relevance
          timeFrame: '15 MINUTES'  ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
          messageQueueTextEvent:
            Lib-1/queueName1:
              Event_Name_1 : 'Provide fully or partial message text' ## example: Clean up : 'messages deleted'
              Event_Name_2 : 'Provide fully or partial message text' ## example: IBM MQ Issue : 'Queued Publish/Subscribe Daemon'
            Lib-2/queueName2:
              Event_Name_3 : 'Provide fully or partial message text' ## You can provide the Event_Name as per relevance
          timeFrameText: '15 MINUTES'  ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
      historyLog:
        filter:
          timeFrame: '1 DAYS'  ## Format is : {value} MINUTES / HOURS / DAYS (Default Value : '10 MINUTES')
      subsystem:
        subsystem_list: 'Provide SUBSYSTEM_DESCRIPTION_LIBRARY/SUBSYSTEM_DESCRIPTION in comma(,) separated way'   # example- 'QDEVELOP/RATS,QINMEDIA/QBASE'
      netstatEventInfo:
        port/address: 'Provide LOCAL_PORT/LOCAL_ADDRESS in comma(,) separated way. You can use * in LOCAL_ADDRESS as wild card.'  #example- '427/9.5.105.61 ,38695/* ,8475/9.51.151.81' [Alert will be triggered if TCP state is not LISTEN or Null]
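
As an illustration, a filled-in local configuration might look like the following sketch. The keys and their nesting follow the sample above; the specific values (the disabled components, the expected disk states, and the message queue time frame) are assumptions chosen for the example, not recommended settings.

```yaml
# Illustrative values only; adjust them to your environment.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    poll_rate_configuration:
      os_poll_rate: 15   # seconds (the documented default)
      db2_poll_rate: 15  # seconds (the documented default)
      disabled_components:
        components: 'HISTORY_LOGS, JOB_QUEUES'  # grids that are not shown in the UI
    user_specification:
      diskStatus:
        operationalState: 'ACTIVE, BUSY'  # disk states that are expected
      messageQueue:
        filter:
          library/queueName: 'QSYS/QSYSOPR'  # the documented default queue
          timeFrame: '30 MINUTES'            # assumed value; the default is '10 MINUTES'
```

Any user_specification subsection that you leave out is simply not configured, because the whole section is marked as optional in the sample.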

Configuring for remote monitoring

To start monitoring IBM i instances remotely, you must first install the Instana agent based on your operating system. For more information, see Installing host agents. Then, configure the following agent configuration file <agent_install_dir>/etc/instana/configuration.yaml.

  • See the User Authorization section for the required authorities for the Instana user profile.

  • The field sslEnabled is optional. It is only necessary when you want to establish a secure connection with the host component.

  • If sslEnabled is set to true, you need to import your trusted certificate in your JRE's cacerts (jvm/jre/lib/security/cacerts) by using the keytool command:

    keytool -import -alias ALIAS_NAME -keystore "/path/to/jre/cacerts" -file YOUR_CERTIFICATE_NAME.crt
    
  • If you are asked for a password, enter the default password changeit.

See the following configuration example for remote monitoring:

com.instana.plugin.ibmiseries:
  enabled: true
  remote: # multiple configurations supported
    - host: 'remote.host-2.com'
      #For a SSL connection set sslEnabled to either 'true' or 'false'.
      sslEnabled: 'true/false'
      user: 'username'
      password: 'password'
      availabilityZone: 'IBM i Remote Monitoring'
      poll_rate_configuration: #Values are in seconds.
        os_poll_rate: 15 #This is default OS poll rate. Provide the values in seconds.
        db2_poll_rate: 15 #This is default DB2 poll rate. Provide the values in seconds.
        disabled_components:
          components: '<Names of the components or grids to be disabled>' # Use comma (,) to separate them. Example - 'HISTORY_LOGS, JOB_QUEUES'.
        custom_poll_rate: #Multiple Poll Rate is supported. (Optional)
          poll_rate_1:
            polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 30
            components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
            events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
          poll_rate_2:
            polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 60
            components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
            events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
      user_specification: # For user inputs (Optional)
        activeJobs:
          jobs: 'comma separated list of job names. You can use * as a wildcard character in any part of the job name' # example - '*/QUSER/QZDASOINIT, 311353/QLWISVR/ADMIN, 10034*/QUSER/*'
          event:
            identicalJobs:
              jobName/user: 'Provide JOB_NAME_SHORT/JOB_USER values in comma(,) separated way. Use * in user part as wildcard character' # example - 'CT_AGENT/QAUTOMON,QZDASOINIT/*'
              threshold: 5 # Event will be triggered if jobs with same name & user in the running status is less than threshold value.
            runningStatus: 'Provide JobStatus/Subsystem values in comma(,) separated way. You can use * in subsystem part as wildcard' # example - 'SIGW/QHTTPSVR,DSC/*,DEQW/QAUTOMON'
            inactiveJobs:  'Provide JOB_NAME in comma(,) separated way. You can use * as wildcard character in any part of the JOB_NAME'  # example - '*/QSYS/QAUTOMON , 137640/QSYS/QBATCH , */QWEBADMIN/* , */QSYS/QAUTO*'
            enableInactiveJOBQStatus: #Allowed value 'true' or 'false'. Alert will be triggered for the inactive jobs if JOB_QUEUE_STATUS is 'RELEASED' or 'SCHEDULED' and JOB_STATUS is 'JOBQ'.
        diskStatus:
          operationalState: 'State of the disk which is expected, in comma(,) separated way' #Example 'ACTIVE, BUSY'        
        messageQueue:
          filter: # User defined filter for Message Queue table
            library/queueName: 'Lib-1/queueName1,Lib-2/queueName2'  ## Provide values in comma(,) separated way. (Default Value : 'QSYS/QSYSOPR')
            timeFrame: '10 HOURS'  ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
          event: # User defined Event for Message Queue
            messageQueueIdEvent:
              Lib-1/queueName1: ## example:  QSYS/QSYSOPR
                Event_Name_1 : 'messageId-1,messageId-2'  ##Example: Hyper Swap Alerts : 'CPC1E1D, CPI1E23'
                Event_Name_2 : 'messageId-3,messageId-4'  ##Example: Error in Device BackUp: 'CPI1E92, AMQ89*'. You can provide * in case you want to provide partial message id
              Lib-2/queueName2:
                Event_Name_3 : 'messageId-5'              ## You can provide the Event_Name as per relevance
            timeFrame: '15 MINUTES'  ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
            messageQueueTextEvent:
              Lib-1/queueName1: ## example:  QSYS/QSYSOPR
                Event_Name_1 : 'Provide fully or partial message text' ## example: Clean up : 'messages deleted'
                Event_Name_2 : 'Provide fully or partial message text' ## example: IBM MQ Issue : 'Queued Publish/Subscribe Daemon'
              Lib-2/queueName2:
                Event_Name_3 : 'Provide fully or partial message text' ## You can provide the Event_Name as per relevance
            timeFrameText: '15 MINUTES'  ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
        historyLog:
          filter:
            timeFrame: '1 DAYS'  ## Format is : {value} MINUTES / HOURS / DAYS (Default Value : '10 MINUTES')
        subsystem:
          subsystem_list: 'Provide SUBSYSTEM_DESCRIPTION_LIBRARY/SUBSYSTEM_DESCRIPTION in comma(,) separated way'   # example -'QDEVELOP/RATS,QINMEDIA/QBASE'
        netstatEventInfo:
          port/address: 'Provide LOCAL_PORT/LOCAL_ADDRESS in comma(,) separated way. You can use * in LOCAL_ADDRESS as wildcard.'  #example- '427/9.5.105.61, 38695/*, 8475/9.5.151.81' [Alert will be triggered if TCP state is not LISTEN or Null]
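
Because the remote section is a list, one agent can monitor several IBM i instances. The following sketch shows two entries. The host names, user profile, passwords, and availability zone names are hypothetical placeholders, and the sslEnabled values are quoted strings only because the sample above shows them that way.

```yaml
# Hypothetical hosts and credentials; replace them with your own.
com.instana.plugin.ibmiseries:
  enabled: true
  remote:
    - host: 'ibmi-prod.example.com'
      sslEnabled: 'true'   # needs the trusted certificate imported with keytool
      user: 'INSTANA'
      password: 'secret'
      availabilityZone: 'IBM i Production'
      poll_rate_configuration:
        os_poll_rate: 15
        db2_poll_rate: 15
    - host: 'ibmi-test.example.com'
      sslEnabled: 'false'
      user: 'INSTANA'
      password: 'secret'
      availabilityZone: 'IBM i Test'
      poll_rate_configuration:
        os_poll_rate: 15
        db2_poll_rate: 15
```

Each entry carries its own credentials, availability zone, and poll rates, so the two instances can be tuned independently.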

Custom poll rate

You can configure multiple poll rates in the previously mentioned configuration file. The poll_rate_configuration section consists of the following four fields:

  • os_poll_rate: The default poll rate of the sensor for OS metrics. This field is mandatory.
  • db2_poll_rate: The default poll rate of the sensor for Db2 metrics. This field is mandatory.
  • custom_poll_rate: A custom poll rate value together with the components (each grid is treated as a component) and events (supported custom events) that it applies to. This field is optional.
  • disabled_components: The list of components that you don't want to display in the Instana UI. Use a comma (,) to separate each component. This field is optional.
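
The fields above can be combined as in the following sketch for local monitoring. The component and event names are taken from this page; the polling values and the grouping of components with events are assumptions made for the example.

```yaml
# A sketch only: two custom poll rates alongside the mandatory defaults.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    poll_rate_configuration:
      os_poll_rate: 15    # mandatory default for OS metrics, in seconds
      db2_poll_rate: 15   # mandatory default for Db2 metrics, in seconds
      custom_poll_rate:
        poll_rate_1:
          polling_rate: 30   # assumed value, in seconds
          components: 'TOP_ACTIVE_JOBS'
          events: 'IDENTICAL_JOBS_EVENT, INACTIVE_JOBS_EVENT'
        poll_rate_2:
          polling_rate: 60   # assumed value, in seconds
          components: 'ACTIVE_SUBSYSTEMS'
          events: 'SUBSYSTEM_STATUS'
```

Components that are not listed under any custom poll rate presumably keep the default poll rate that is shown in the Default poll rate table.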

Converting event to alert

You can configure Instana to send alerts for events based on the criteria set in the Dynamic Focus query. To configure Instana to send alerts for events, complete the following steps:

  1. Log in to the Instana UI, and click Settings > Alerts > New Alert.
  2. Select the event types as Alert on Event Type(s).
  3. Select Warning as Event types.
  4. Select Scope as Selected Entities Only (Dynamic Focus Query).
  5. Define the criteria in the Dynamic Focus Query section. See the following examples:
    • To get alerts for all the defined custom events, specify the following criteria:
      entity.ibmi.os.hostname:XYZ
      
      Where XYZ is the host name of the server.
    • To get alerts for a specific custom event on a specific host, specify the following criteria:
      (entity.ibmi.os.hostname:XYZ) AND (event.text:'EventName-ABC')
      
      Where ABC is the event name as defined for the custom poll rate events configuration.
  6. Add the preferred Alert Channels in the Alerting section.

For more information about dynamic filtering, see Syntax.

The configured remote IBM i instance will then be shown as a separate box in the specified availabilityZone.

JDBC connection ports

The Instana remote sensor for IBM i uses a remote JDBC connection and a Java Toolkit remote connection. These connections use the following ports:

Ports Description
8470 Port 8470 is used for host code page translation tables and licensing functions.
8471 Port 8471 is used for database access.
8475 Port 8475 is used to check for application administration restrictions.
8476 Port 8476 is used for checking signon verification to authenticate.
449 Port 449 is used to look up service by name and return the port number.
446 Port 446 is used for Distributed data management (DDM)/Distributed Relational Database Architecture (DRDA) to make a remote connection to Db2 for IBM i by using JDBC.

Default poll rate

Custom provider name UI table description Default poll rate
AUXILIARY_STORAGE_POOLS Auxiliary Storage pools 15 seconds
HARDWARE_DISK_DRIVES Hardware Disk Drives (HDD) Info 60 seconds
HISTORY_LOGS History Log 15 seconds
JOB_QUEUES Job Queue 15 seconds
TOP_ACTIVE_JOBS Top Active Jobs 15 seconds
MEMORY_POOLS Active Memory Pools 15 seconds
MESSAGE_QUEUES Message Queue 15 seconds
NETWORK_CONNECTIONS_TOP_RECEIVERS Network Connections (Top Receivers) 60 seconds
NETWORK_CONNECTIONS_TOP_SENDERS Network Connections (Top Senders) 60 seconds
NETSTAT_INTERFACES Netstat Interfaces 120 seconds
NON_VOLATILE_MEMORY Non Volatile Memory Express(NVMe) Information 60 seconds
OUTPUT_QUEUES Output Queues 15 seconds
SOLID_STATE_DISK Solid State Disk (SSD) Info 60 seconds
SPOOL_SPACE Total Spool Space 120 seconds
ACTIVE_SUBSYSTEMS Active Subsystems 15 seconds
SYSTEM_STATUS All KPIs (CPU rate, utilization, thread, active Jobs) 15 seconds
SYSTEM_DISK_STATUS System Overall Disk Information 60 seconds

User authorization

The user profile that is specified within the user configuration parameter must have the *JOBCTL authority. You need to grant the following authorities to the IBM i user profile that Instana uses:

  • *USE authority on the QSYS/CHKPFRCOL (Check Performance Collection) command
  • Job Control (*JOBCTL) special authority
  • *USE authority on the QSYS/WRKPTFGRP (Work with Program Temporary Fix Groups) command
  • *USE authority on the QPMCCDATA authorization list

Metrics collection

To view the metrics, select Infrastructure in the sidebar of the Instana UI, and then click a specific monitored host. A host dashboard opens that shows all the collected metrics and monitored processes.

Configuration data

  • Host name
  • OS Version
  • Total CPU
  • Total Memory
  • Configured CPU
  • Configured Memory
  • Partition ID
  • Number of partitions
  • Restricted state

Performance metrics

System Metrics

Component Name: SYSTEM_STATUS (for custom poll rate component configuration)

Metric Description Granularity
CPU Rate The average CPU rate expressed as a percentage, where 100% indicates that the processor is running at its nominal frequency. A value above or below 100% indicates how much the processor has been sped up (turbo) or slowed down (throttled) relative to the nominal frequency for the processor model. For instance, a value of 120% indicates that the processor is running 20% faster than its nominal speed. 15 seconds
Average CPU Utilization The average CPU utilization for all the active processors. 15 seconds
Min CPU Utilization The CPU utilization of the processor that reported the minimum amount of CPU utilization. 15 seconds
Max CPU Utilization The CPU utilization of the processor that reported the maximum amount of CPU utilization. 15 seconds
Active Jobs The number of jobs active in the system (jobs that have been started, but have not yet ended), including both user and system jobs. 15 seconds
Interactive Jobs The percentage of interactive performance assigned to this logical partition. This value is a percentage of the total interactive performance available to the entire physical system. 15 seconds
Total Jobs The total number of user and system jobs that are currently in the system. The total includes: all jobs on job queues waiting to be processed, all jobs currently active (being processed), all jobs that have completed running but still have output on output queues to be produced. 15 seconds
Max Jobs The maximum number of jobs that are allowed on the system. When the number of jobs reaches this maximum, you can no longer submit or start more jobs on the system. The total includes: all jobs on job queues waiting to be processed, all jobs currently active (being processed), all jobs that have completed running but still have output on output queues to be produced. 15 seconds
Used Auxiliary Storage Pool The percentage of the system storage pool (ASP number 1) currently in use. 15 seconds
Capacity of Auxiliary Storage Pool The storage capacity of the system auxiliary storage pool (ASP number 1) in millions of bytes. This value represents the amount of space available for storage of both permanent and temporary objects. 15 seconds
Current Temporary Storage The current amount of storage, in millions of bytes, in use for temporary objects. 15 seconds
Maximum Temporary Storage Used The largest amount of storage, in millions of bytes, used for temporary objects at any one time since the last IPL. 15 seconds
Active Threads The number of initial and secondary threads in the system (threads that have been started, but have not yet ended), including both user and system threads. 15 seconds
Total Spool Space The total spool space consumed by the output queue in bytes. 15 seconds

Active Memory Pool Metrics

Component Name: MEMORY_POOLS (for custom poll rate component configuration)

Metric Description Granularity
Storage Used The amount of main storage, in megabytes, in the pool. 15 seconds
Storage Reserved The amount of storage, in megabytes, in the pool that is reserved for system use. For example, the pool for save or restore operations. 15 seconds
Storage Defined The size of the pool, in megabytes, as defined in the shared pool, subsystem description, or system value QMCHPOOL. Contains the null value for a pool without a defined size. 15 seconds
Active Threads The number of threads that are currently using the pool. 15 seconds
Ineligible Threads The number of ineligible threads in the pool. 15 seconds
Max Threads The maximum number of threads that can be active in the pool at any time. 15 seconds
Elapsed Database Faults The number of page faults per second against pages that contain database access. 15 seconds
Elapsed Total Faults The total database and non-database page faults per second. 15 seconds
Elapsed Non Database Faults The number of page faults per second against non-database access. 15 seconds

Output Queue Metrics

Component Name: OUTPUT_QUEUES (for custom poll rate component configuration)

Metric Description Granularity
Queue Name The name of the output queue. 15 seconds
Library Name The name of the library that contains the output queue. 15 seconds
Status The status of the output queue. 15 seconds
Files in Queue The total number of spooled files currently on this output queue. 15 seconds
Writer Job Name The qualified job name of the writer job. If more than one writer is started, this is the name of the first writer. Contains the null value if a writer job is not started for this queue. 15 seconds
Writer Job Status The status of the writer job. If more than one writer is started, this is the status of the first writer. 15 seconds

Top Spool Space Consumption

Top 20 users consuming the spool space.

Component Name: SPOOL_SPACE (for custom poll rate component configuration)

Metric Description Granularity
User The name of the user profile that produced the Spool files. 120 seconds
Spool Space The size of the user's spooled files, in bytes. 120 seconds

Total Spool Space Consumption

Component Name: SPOOL_SPACE (for custom poll rate component configuration)

Metric Description Granularity
Total Spool Space The total spool space consumed by the output queue in bytes. 120 seconds

Top Active Jobs

Top 20 active jobs that are currently running in the system, along with the job names matching the values that are specified in user_specification:activeJobs:jobs.

Component Name: TOP_ACTIVE_JOBS (for custom poll rate component configuration)

Custom events:

  • Event Name: IDENTICAL_JOBS_EVENT (for custom poll rate events configuration)
    You can specify the jobName/user and threshold values in the user_specification:activeJobs:event:identicalJobs section of the Instana agent configuration file. Then, if the count of running jobs with the same name and user is less than the specified threshold value at any time, an event is triggered for the jobs that are defined in the configuration.
    • Wildcard Support: To check a particular job against the threshold value irrespective of the user, use * for the user part. For example, 'QZDASOINIT/*'
      Note: The IDENTICAL_JOBS_EVENT event is not supported on IBM i 7.2.
  • Event Name: RUNNING_JOB_STATUS_EVENT (for custom poll rate events configuration)
    You can specify the runningStatus field, which is a combination of JobStatus/Subsystem, in the user_specification:activeJobs:event:runningStatus section of the Instana agent configuration file. Then, an event is triggered for the jobs that match the defined criteria.
    • Wildcard Support: To check for a specific job status in all the available subsystems, use * in the subsystem part.
      Note: The RUNNING_JOB_STATUS_EVENT event is not supported on IBM i 7.2.
  • Event Name: INACTIVE_JOBS_EVENT (for custom poll rate events configuration)
    You can specify JOB_NAME values in the user_specification:activeJobs:event:inactiveJobs section of the Instana agent configuration file. Then, an event is triggered for the specified jobs that are not in Active state in the system at any given point in time.
    • Wildcard Support: To trigger an inactive job event based on the JOB_NAME, use * as a wildcard in any part of the JOB_NAME.
  • Event Name: INACTIVE_JOBS_IN_JOBQ_EVENT (for custom poll rate events configuration)
    You can enable or disable this event with the enableInactiveJOBQStatus field in the user_specification:activeJobs:event:enableInactiveJOBQStatus section of the Instana agent configuration file. If the enableInactiveJOBQStatus field is enabled, an event is triggered for the jobs whose JOB_QUEUE_STATUS is RELEASED or SCHEDULED and whose JOB_STATUS is JOBQ at any given point in time.
    Note: The enableInactiveJOBQStatus field is not supported on IBM i 7.2.

Metric Description Granularity
Job Name The qualified job name. 15 seconds
User Name The user profile under which the initial thread is running at this time. For jobs that swap user profiles, this user profile name and the user profile that initiated the job can be different. 15 seconds
Elapsed CPU Percentage The percent of processing unit time attributed to this job during the measurement time interval. 15 seconds
Temporary Storage The amount of temporary storage, in kilobytes, that is currently allocated to the job. 15 seconds
Job Status The status of the initial thread of the job. 15 seconds
Job Type Type of active job. 15 seconds
Thread Count The number of active threads in the job. 15 seconds
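
The job-related events in this section are all driven by the user_specification:activeJobs section. The following sketch for local monitoring reuses the example values from the sample configuration; the threshold and the quoted 'true' value are assumptions, so adjust them to your system.

```yaml
# Example values are copied from the sample configuration comments.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    user_specification:
      activeJobs:
        jobs: '*/QUSER/QZDASOINIT'          # also list these jobs in the Top Active Jobs grid
        event:
          identicalJobs:
            jobName/user: 'QZDASOINIT/*'    # any user; drives IDENTICAL_JOBS_EVENT
            threshold: 5                    # event if fewer than 5 such jobs are running
          runningStatus: 'DSC/*'            # job status DSC in any subsystem; drives RUNNING_JOB_STATUS_EVENT
          inactiveJobs: '*/QSYS/QAUTOMON'   # drives INACTIVE_JOBS_EVENT
          enableInactiveJOBQStatus: 'true'  # drives INACTIVE_JOBS_IN_JOBQ_EVENT
```

For remote monitoring, the same user_specification block goes under the corresponding entry in the remote list.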

Auxiliary Storage Pools

Information about auxiliary storage pools (ASPs).

Component Name: AUXILIARY_STORAGE_POOLS (for custom poll rate component configuration)

Metric Description Granularity
ASP Number A unique identifier for an ASP. Possible values are 1 through 255. 15 seconds
Device Description Name The name of the device description that brought the independent ASP (IASP) to varyon/active state. 15 seconds
ASP Type The use that is assigned to the ASP. 15 seconds
ASP State The device configuration status of an ASP. 15 seconds
Number Of Disk Units The total number of disk units in the ASP. If mirroring is active for disk units within the ASP, the mirrored pair of units is counted as one. 15 seconds
Total Capacity The total number of used and unused megabytes in the ASP. A special value of -2 is returned if the size of this field is exceeded. 15 seconds
Total Capacity Utilization Utilization Percentage of the Total Capacity in the ASP. 15 seconds
Protected Capacity The total number of used and unused megabytes in the ASP that are protected by mirroring or device parity. A special value of -2 is returned if the value was too big to return. Contains the null value if the capacity cannot be determined. 15 seconds
Protected Capacity Utilization Utilization Percentage of the Protected Capacity in the ASP. 15 seconds
Unprotected Capacity The total number of used and unused megabytes in the ASP that are not protected by mirroring or device parity. A special value of -2 is returned if the value was too big to return. Contains the null value if the capacity cannot be determined. 15 seconds
Unprotected Capacity Utilization Utilization Percentage of the Unprotected Capacity in the ASP. 15 seconds

Active Subsystems

Information about Active Subsystems

Component Name: ACTIVE_SUBSYSTEMS (for custom poll rate component configuration)

  • Event Name: SUBSYSTEM_STATUS (for custom poll rate events configuration)
    You can specify the subsystem_list field in the user_specification:subsystem section of the Instana agent configuration file. Then, an event is triggered if any of the specified subsystems is not in Active status.

Metric Description Granularity
Name The name of the subsystem about which information is being returned. 15 seconds
Library Name The name of the library in which the subsystem description resides. 15 seconds
Active Jobs The number of jobs currently active in the subsystem. This number includes held jobs but excludes jobs that are disconnected or suspended because of a transfer secondary job or a transfer group job. If STATUS is INACTIVE, returns 0. 15 seconds
Max Active Jobs The maximum number of jobs that can run or use resources in the subsystem at one time. Contains the null value if the subsystem description specifies *NOMAX, indicating that there is no maximum. 15 seconds
Description The text description of the subsystem description. 15 seconds
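
A minimal sketch of the subsystem list for the SUBSYSTEM_STATUS event, assuming local monitoring. The two library/description pairs are the example values from the sample configuration, not subsystems that necessarily exist on your system.

```yaml
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    user_specification:
      subsystem:
        # SUBSYSTEM_DESCRIPTION_LIBRARY/SUBSYSTEM_DESCRIPTION pairs, comma separated
        subsystem_list: 'QDEVELOP/RATS,QINMEDIA/QBASE'
```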

Job Queue

Information about job queue.

Component Name: JOB_QUEUES (for custom poll rate component configuration)

Metric Description Granularity
Job Queue Name The name of the job queue. 15 seconds
Job Queue Library The name of the library that contains the job queue. 15 seconds
Subsystem Name The name of the subsystem that can receive jobs from this job queue. Contains the null value if this job queue is not associated with an active subsystem. 15 seconds
Subsystem Library Name The library in which the subsystem description resides. Contains the null value if this job queue is not associated with an active subsystem. 15 seconds
Number Of Jobs The number of jobs in the queue. 15 seconds
Active Jobs The current number of jobs that are active that came through this job queue entry. Contains the null value if this job queue is not associated with an active subsystem. 15 seconds
Maximum Active Jobs The maximum number of jobs that can be active at the same time through this job queue entry. A value of -1 indicates *NOMAX, no maximum number of jobs is defined. Contains the null value if this job queue is not associated with an active subsystem. 15 seconds
Job Queue Status The status of the job queue. HELD : The queue is held. RELEASED : The queue is released. 15 seconds
Text Description Text that describes the job queue. Contains the null value if there is no text description for the job queue. 15 seconds
Held Jobs The current number of jobs that are in *HELD status. This is the sum of the 10 HELD_JOBS_PRIORITY_n columns. 15 seconds
Released Jobs The current number of jobs that are in *RELEASED status. This is the sum of the 10 RELEASED_JOBS_PRIORITY_n columns. 15 seconds
Scheduled Jobs The current number of jobs that are in *SCHEDULED status. This is the sum of the 10 SCHEDULED_JOBS_PRIORITY_n columns. 15 seconds

Network interfaces

Information about IPv4 and IPv6 interfaces

Component Name: NETSTAT_INTERFACES (for custom poll rate component configuration)

Metric Description Granularity
Internet Address The internet address of the interface. 120 seconds
Subnet Mask The subnet mask for the network, subnet, and host address fields of the internet address that defines the subnetwork for an interface. Contains null if this is an IPv6 connection. 120 seconds
Connection Type The type of connection (IPV4,IPV6). 120 seconds
Interface Line Type The type of line used by the interface. 120 seconds
Line Description The name of the communications line description that identifies the physical network associated with an interface. 120 seconds
VLAN ID The virtual LAN to which this interface belongs. 120 seconds
Status The current status of the logical interface. 120 seconds

Status value Mapping

Metric Value Status
0 ENDING
1 ACTIVE
2 FAILED
3 FAILED_TCP
4 INACTIVE
5 RCYCNL
6 RCYPND
7 STARTING
8 ACQUIRING
9 ACQUIRING
10 ACQUIRING

Network connections (Top Receivers)

Netstat Info For Bytes Received Locally

Component Name: NETWORK_CONNECTIONS_TOP_RECEIVERS (for custom poll rate component configuration)

  • Event Name: ACTIVE_PORTS_LISTENING_STATUS (for custom poll rate events configuration)
    You can specify LOCAL_PORT/LOCAL_ADDRESS values in the port/address field in the user_specification:netstatEventInfo section of the Instana agent configuration file. Then, if the TCP state of any defined port is not LISTEN or Null, an event is triggered for that port.
    • Wildcard Support: To check a particular port number irrespective of the local address, use * for the address part. For example, '38695/*'

Metric Description Granularity
Remote Port & Address This column is combination of remote Port and remote Address. Remote Port : The remote host port number. A value of 0 means that the connection is a listening or UDP socket, so this field does not apply. Remote Address : The internet address of the remote host. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. A value of 0 means that the connection is a listening or UDP socket so this field does not apply. For IPv6: The address is in IPv6 address format. A value of :: means that the connection is a listening socket so this field does not apply. 60 seconds
Bind User The user profile of the job on the local system which first performed a sockets API bind() of the socket. 60 seconds
Local Port & Address This column is a combination of the local port and the local address. Local Port : The local system port number. Local Address : The local address of this connection on this system. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. For IPv6: The address is in IPv6 address format. A value of :: means that the local application specified that any local internet address can be used. 60 seconds
Remote Port Name The remote host well-known port name or the name from the service table entry. Contains null if there is no well-known port name. 60 seconds
Local Port Name The local system well-known port name or the name from the service table entry. Contains null if there is no well-known port name. 60 seconds
Bytes Sent Remotely The number of bytes sent to the remote host. 60 seconds
Bytes Received Locally The number of bytes received from the remote host. 60 seconds
Protocol Identifies the type of connection protocol. TCP : A Transmission Control Protocol (TCP) connection or socket. UDP : A User Datagram Protocol (UDP) socket. 60 seconds
TcpState The state of the connection. CLOSED : This connection has ended. CLOSE-WAIT : Waiting for an end connection request from the local user. CLOSING : Waiting for an end connection request acknowledgment from the remote host. ESTABLISHED : The normal state in which data is transferred. FIN-WAIT-1 : Waiting for the remote host to acknowledge the local system request to end the connection. FIN-WAIT-2 : Waiting for the remote host request to end the connection. LAST-ACK : Waiting for the remote host to acknowledge an end connection request. LISTEN : Waiting for a connection request from any remote host. SYN-RECEIVED : Waiting for a confirming connection request acknowledgment. SYN-SENT : Waiting for a matching connection request after having sent a connection request. TIME-WAIT : Waiting to allow the remote host enough time to receive the local system's acknowledgment to end the connection. Contains null if PROTOCOL is UDP. 60 seconds

Network connections (Top Senders)

Netstat Info For Bytes Sent Remotely

Component Name: NETWORK_CONNECTIONS_TOP_SENDERS (for custom poll rate component configuration)

Metric Description Granularity
Remote Port & Address This column is a combination of the remote port and the remote address. Remote Port : The remote host port number. A value of 0 means that the connection is a listening or UDP socket, so this field does not apply. Remote Address : The internet address of the remote host. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. A value of 0 means that the connection is a listening or UDP socket, so this field does not apply. For IPv6: The address is in IPv6 address format. A value of :: means that the connection is a listening socket, so this field does not apply. 60 seconds
Bind User The user profile of the job on the local system which first performed a sockets API bind() of the socket. 60 seconds
Local Port & Address This column is a combination of the local port and the local address. Local Port : The local system port number. Local Address : The local address of this connection on this system. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. For IPv6: The address is in IPv6 address format. A value of :: means that the local application specified that any local internet address can be used. 60 seconds
Remote Port Name The remote host well-known port name or the name from the service table entry. Contains null if there is no well-known port name. 60 seconds
Local Port Name The local system well-known port name or the name from the service table entry. Contains null if there is no well-known port name. 60 seconds
Bytes Sent Remotely The number of bytes sent to the remote host. 60 seconds
Bytes Received Locally The number of bytes received from the remote host. 60 seconds
Protocol Identifies the type of connection protocol. TCP : A Transmission Control Protocol (TCP) connection or socket. UDP : A User Datagram Protocol (UDP) socket. 60 seconds
TcpState The state of the connection. CLOSED : This connection has ended. CLOSE-WAIT : Waiting for an end connection request from the local user. CLOSING : Waiting for an end connection request acknowledgment from the remote host. ESTABLISHED : The normal state in which data is transferred. FIN-WAIT-1 : Waiting for the remote host to acknowledge the local system request to end the connection. FIN-WAIT-2 : Waiting for the remote host request to end the connection. LAST-ACK : Waiting for the remote host to acknowledge an end connection request. LISTEN : Waiting for a connection request from any remote host. SYN-RECEIVED : Waiting for a confirming connection request acknowledgment. SYN-SENT : Waiting for a matching connection request after having sent a connection request. TIME-WAIT : Waiting to allow the remote host enough time to receive the local system's acknowledgment to end the connection. Contains null if PROTOCOL is UDP. 60 seconds

Message Queue

Information about each message in a message queue. An Instana event is created whenever a message in a message queue matches the specifications (queue library, queue name, and message ID) that the user provides in the configuration.yaml file.

Component Name: MESSAGE_QUEUES (for custom poll rate component configuration)

Custom Event:

Event Name: MESSAGE_QUEUE_ID_EVENT

You can specify multiple message ID values, separated by commas, in the user_specification:messageQueue:event section of the Instana agent configuration file. The event is triggered with the defined event name if a message in the defined message library and message queue has any of the defined message ID values. Consider the following example definition:

messageQueueIDEvent:
    QSYS/QSYSOPR:
        Hyper Swap Alerts : 'CPC1E1D, CPI1E23'

In this example, an event is triggered with the event name Hyper Swap Alerts if a message in the message library QSYS and message queue QSYSOPR has either of the message IDs CPC1E1D or CPI1E23. The following conditions apply to the message event:

  • The message is in the library and queue (defined in the configuration.yaml file).
  • The message ID is listed in the library/queue definition.

Event Name: MESSAGE_QUEUE_TEXT_EVENT

You can specify the message text value in the user_specification:messageQueue:event section of the Instana agent configuration file. The event is triggered with the defined event name if a message in the defined message library and message queue contains the defined message text. Consider the following example definition:

messageQueueTextEvent:
    QSYS/QSYSOPR:
        IBM MQ Issue : 'queue disconnected'

In this example, an event is triggered with the event name IBM MQ Issue if a message in the message library QSYS and message queue QSYSOPR contains the message text "queue disconnected". The following conditions apply to the message event:

  • The message is in the library and queue (defined in the configuration.yaml file).
  • The message text is listed in the library/queue definition.
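
The two event conditions above might be sketched as follows, using the values from the example definitions. This is an illustration under stated assumptions, not the sensor's logic: the QueueMessage type, the dictionary layout, and the case-sensitive substring comparison for message text are hypothetical.

```python
from typing import List, NamedTuple, Optional


class QueueMessage(NamedTuple):
    """Hypothetical view of one message queue row."""
    library: str
    queue: str
    message_id: Optional[str]
    text: Optional[str]


# "LIBRARY/QUEUE" -> {event name: configured value}, mirroring the examples above.
ID_EVENTS = {"QSYS/QSYSOPR": {"Hyper Swap Alerts": "CPC1E1D, CPI1E23"}}
TEXT_EVENTS = {"QSYS/QSYSOPR": {"IBM MQ Issue": "queue disconnected"}}


def triggered_events(msg: QueueMessage) -> List[str]:
    """Return the event names that a message would trigger."""
    key = f"{msg.library}/{msg.queue}"
    names = []
    # ID events: the message ID must be one of the comma-separated values.
    for name, raw_ids in ID_EVENTS.get(key, {}).items():
        ids = [value.strip() for value in raw_ids.split(",") if value.strip()]
        if msg.message_id in ids:
            names.append(name)
    # Text events: the message text must contain the configured text
    # (case-sensitive substring match is an assumption).
    for name, wanted_text in TEXT_EVENTS.get(key, {}).items():
        if msg.text is not None and wanted_text in msg.text:
            names.append(name)
    return names
```

A message in a different library or queue triggers nothing, even if its ID or text matches a configured value.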
Metric Description Granularity
Message Id The message ID for this message. Contains the null value if this is an impromptu message or MESSAGE_TYPE is REPLY. 15 seconds
Message Type Type of message. Values are: COMPLETION, DIAGNOSTIC, ESCAPE, INFORMATIONAL, INQUIRY, NOTIFY, REPLY, REQUEST, SENDER. 15 seconds
Severity The severity assigned to the message. 15 seconds
Message Queue Library The name of the library containing the message queue. 15 seconds
Message Queue Name The name of the message queue containing the message. 15 seconds
Message Timestamp The timestamp when the message was sent. 15 seconds
Message Text The first level text of the message including tokens, or the impromptu message text. Contains the null value if MESSAGE_TYPE is REPLY or if the message file could not be accessed. 15 seconds
Message Second Level Text The second level text of the message including tokens. Contains the null value if MESSAGE_ID is null or if the message has no second level text or if the message file could not be accessed. 15 seconds
Message Key The key that is assigned to the message. The key is assigned by the command or API that sends the message. For details, see Message Types and Message Keys in the QMHRCVM API. 15 seconds

History Logs

Information about each message in the history log.

Component Name: HISTORY_LOGS (for custom poll rate component configuration)

Metric Description Granularity
Message Id The message ID for this message. Contains the null value if this is an impromptu message or MESSAGE_TYPE is REPLY. 15 seconds
Message Type Type of message. Values are COMPLETION, DIAGNOSTIC, ESCAPE, INFORMATIONAL, INQUIRY, NOTIFY, REPLY, REQUEST, or SENDER. 15 seconds
Severity The severity that is assigned to the message. 15 seconds
User The current user of the job when the message was sent. 15 seconds
Job The qualified job name when the message was sent. 15 seconds
Program The program that sent the message. 15 seconds
Message Timestamp The timestamp when the message was sent. 15 seconds
Message Text The first level text of the message including tokens, or the impromptu message text. Contains the null value if MESSAGE_ID is null or if the message file could not be accessed. 15 seconds
Message Second Level Text The second level text of the message including tokens. Contains the null value if MESSAGE_ID is null or if the message has no second level text or if the message file could not be accessed. 15 seconds

Hard Disk Info (Advanced)

The following table covers information on hard disks for IBM i OS versions 7.3 (Level-22), 7.4 (Level-10), and later:

Component Name: HARDWARE_DISK_DRIVES (for custom poll rate component configuration)

Metric Description Granularity
Unit Number Unit number of the disk. 60 seconds
Resource Name The unique system-assigned name of the disk unit. 60 seconds
ASP Number Specifies the storage pool (ASP) number. 60 seconds
Disk Type Disk type number of the disk. 60 seconds
Unit Media Capacity Gb The storage capacity of the unit in billions of bytes. 60 seconds
Percent Used The percentage of the disk unit that has been consumed. 60 seconds
Disk Model The model number of the disk. 60 seconds
Elapsed Percent Busy The estimated percentage of time that the disk unit is being used during the elapsed time. 60 seconds

Hard Disk Info (Basic)

The following table covers information on hard disks for IBM i OS versions 7.3 (Level-22), 7.4 (Level-10), and later:

Component Name: HARDWARE_DISK_DRIVES (for custom poll rate component configuration)

Metric Description Granularity
Unit Number Unit number of the disk. 60 seconds
ASP Number Specifies the storage pool (ASP) number. 60 seconds
Disk Type Disk type number of the disk. 60 seconds
Unit Storage Capacity Unit storage capacity has the same value as the unit media capacity for configured disk units. This value is 0 for non-configured units. 60 seconds
Percent Used The percentage of space that is used on the disk unit. 60 seconds

Solid-state disk information (advanced)

The following table covers information on solid-state disks with IBM i OS versions 7.3 (Level-22), 7.4 (Level-10), and later:

Component Name: SOLID_STATE_DISK (for custom poll rate component configuration)

Metric Description Granularity
Unit Number The unit number of the disk. 60 seconds
Resource Name The unique system-assigned name of the disk unit. 60 seconds
Storage Capacity The storage capacity of the disk unit. 60 seconds
Percent Used The percentage of the disk unit that has been consumed. 60 seconds
Serial Number The serial number of the disk unit. 60 seconds
ASP Number The storage pool (ASP) number. 60 seconds
SSD Remaining Life The remaining lifetime of the SSD device in percentage. 60 seconds
SSD Power On days The number of days that the SSD device has been active on a system. 60 seconds
SSD Supported Bytes Written The number of gigabytes that the SSD is expected to physically write over its lifetime. 60 seconds
SSD Bytes Written The number of gigabytes that have been physically written to the NAND memory in this SSD disk unit over its lifetime. 60 seconds
SSD Read Write Protected Indicates whether the device is read-protected or write-protected. 60 seconds
SSD PFA Warning The predictive failure analysis warning message that is logged. 60 seconds
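
As an illustration of how the two write counters in the table relate, the following sketch derives a wear percentage from SSD Bytes Written and SSD Supported Bytes Written. This derived value and the function name are assumptions made for the example; the source does not state that Instana computes or reports it.

```python
def ssd_wear_percent(bytes_written_gb: float, supported_bytes_written_gb: float) -> float:
    """Return written gigabytes as a percentage of the supported lifetime writes.

    Both inputs are in gigabytes, matching the metric descriptions above.
    """
    if supported_bytes_written_gb <= 0:
        raise ValueError("supported bytes written must be positive")
    return 100.0 * bytes_written_gb / supported_bytes_written_gb
```

For example, a unit that has written 250 GB against a supported 1000 GB has used a quarter of its rated write endurance.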

Solid-state disk information (basic)

The following table covers information on solid-state disks for IBM i OS versions 7.2, 7.3 (Level-22), and 7.4 (Level-10):

Component Name: SOLID_STATE_DISK (for custom poll rate component configuration)

Metric Description Granularity
Unit Number The unit number of the disk. 60 seconds
ASP Number The number of the storage pool (ASP). 60 seconds
Disk Type The disk type number of the disk. 60 seconds
Unit Storage Capacity For configured disk units, the unit storage capacity has the same value as the unit media capacity. This value is 0 for the non-configured units. 60 seconds
Percent Used The disk unit usage in percentage. 60 seconds

Non-volatile memory express

The following table covers information on non-volatile memory with IBM i OS versions 7.4 (Level-10) and later:

Component Name: NON_VOLATILE_MEMORY (for custom poll rate component configuration)

Metric Description Granularity
Resource Name The resource name of the NVMe device. 60 seconds
Model Number The model number assigned by the device manufacturer. 60 seconds
Life Remaining The remaining percentage of the NVMe device life that is assigned by the manufacturer. 60 seconds
Spare Capacity The percentage (0 to 100) of the remaining spare capacity that is available for this NVMe device. 60 seconds
Spare Capacity Threshold The threshold percentage (0 to 100) for the spare capacity of this NVMe device. 60 seconds
Namespace Used The number of namespaces that are in use. 60 seconds
Power Cycles The number of times the NVMe device is powered on and off. 60 seconds
Power On Hours The number of hours during which the NVMe device is powered on. 60 seconds
Media Errors The number of occurrences where the controller detected an unrecovered data integrity error. 60 seconds
Unsafe Shutdowns The number of times a power loss occurs without a shutdown notification being sent. 60 seconds
Firmware Level The level of code running in the NVMe device. 60 seconds

Overall disk status

The following table covers information on the overall disk status with IBM i OS versions 7.4 and later:

Component Name: SYSTEM_DISK_STATUS (for custom poll rate component configuration)

Custom Event:

Event Name: DISK_STATUS_EVENT

You can specify multiple expected disk status values, separated by commas, in the user_specification:diskStatus:operationalState section of the Instana agent configuration file. The event is triggered with the defined event name if any of the disks (HDD or SSD) in the partition is not in one of the expected states.

Note: This event applies to IBM i OS versions 7.4 and later.
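
The disk status check described above might be sketched as follows. This is a hedged illustration only: the function names, the case-insensitive comparison, and the state names STATE_A, STATE_B, and STATE_C are placeholders, because the source does not list the valid operational state values.

```python
from typing import Dict, List, Set


def parse_expected_states(raw: str) -> Set[str]:
    """Split a comma-separated list of expected states into a normalized set."""
    return {value.strip().upper() for value in raw.split(",") if value.strip()}


def disks_in_unexpected_state(expected_raw: str, disk_states: Dict[str, str]) -> List[str]:
    """Return the resource names of disks whose state is not an expected one.

    Case-insensitive comparison is an assumption made for this sketch.
    """
    expected = parse_expected_states(expected_raw)
    return sorted(
        name for name, state in disk_states.items() if state.upper() not in expected
    )
```

With expected states 'STATE_A, STATE_B', a disk that reports STATE_C is returned, and a non-empty result corresponds to the event being triggered.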

Metric Description Granularity
Resource Name The unique system-assigned name of the disk unit. 60 seconds
Percent Used The percentage of the disk unit that has been consumed. 60 seconds
Disk Type The disk type number. 60 seconds
Unit Number The unit number of the disk. 60 seconds
ASP Number The storage pool (ASP) number. 60 seconds
Elapsed Percent Busy The estimated percentage of time that the disk unit is being used during the elapsed time. 60 seconds
Type Of Disk Unit The type of disk unit (SSD, HDD). 60 seconds