Monitoring IBM i instances
After you install the Instana host agent, the IBM i instances sensor is automatically installed. You can view metrics that are related to IBM i instances in the Instana UI after you configure the sensor as outlined in the Configuring section.
Instana supports both remote and local monitoring of IBM i instances. For more information about individual metrics availability, see IBM i services. To install the host agent for IBM i, see Installing the host agent on IBM i.
- Supported information
- Configuring
- Custom poll rate
- Converting event to alert
- JDBC connection ports
- Default poll rate
- User authorization
- Metrics collection
- Configuration data
- Performance metrics
- System Metrics
- Active Memory Pool Metrics
- Output Queue Metrics
- Top Spool Space Consumption
- Total Spool Space Consumption
- Top Active Jobs
- Auxiliary Storage Pools
- Active Subsystems
- Job Queue
- Network interfaces
- Network connections (Top Receivers)
- Network connections (Top Senders)
- Message Queue
- History Logs
- Hard Disk Info (Advance)
- Hard Disk Info (Basic)
- Solid-state disk information (advanced)
- Solid-state disk information (basic)
- Non-volatile memory express
- Overall disk status
Supported information
Supported versions for local monitoring
For local monitoring, Instana supports IBM i 7.4 and later.
Supported versions for remote monitoring
For remote monitoring, Instana supports IBM i 7.2 and later.
The IBM i versions that Instana supports are listed in the following table:
Sensor | Support policy | Latest version | Last supported version |
---|---|---|---|
IBMI | 45 days | 7.5 | 7.5 |
Configuring
Configuring for local monitoring
To monitor IBM i locally, you must first install the Instana host agent on IBM i. For more information, see Installing the host agent on IBM i.
When you install and run the host agent, the agent automatically discovers the processes and starts the sensor with the default configurations. For more information about host agent configurations, see Configuring host agent.
The following configuration is optional. Add it only if you want to use custom events and custom poll rates. A sample configuration for local monitoring is shown in the following example:
com.instana.plugin.ibmiseries:
  enabled: true
  local: # Single configuration only
    poll_rate_configuration: #Values are in seconds.
      os_poll_rate: 15 #This is default OS poll rate. Provide the values in seconds.
      db2_poll_rate: 15 #This is default DB2 poll rate. Provide the values in seconds.
      disabled_components:
        components: 'Names of the Components/Grids to be disabled in comma(,) separated way' # Example - 'HISTORY_LOGS, JOB_QUEUES'
      custom_poll_rate: #Multiple Poll Rate is supported. (Optional)
        poll_rate_1:
          polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 30
          components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
          events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
        poll_rate_2:
          polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 60
          components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
          events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
    user_specification: # For user inputs (Optional)
      activeJobs:
        jobs: 'comma separated list of job names. You can use * as a wild card character in any part of the job name' # example - '*/QUSER/QZDASOINIT, 311353/QLWISVR/ADMIN, 10034*/QUSER/*'
        event:
          identicalJobs:
            jobName/user: 'Provide JOB_NAME_SHORT/JOB_USER values in comma(,) separated way. Use * in user part as wild card character' # example - 'CT_AGENT/QAUTOMON,QZDASOINIT/*'
            threshold: 5 # Event will be triggered if jobs with same name & user in the running status is less than threshold value.
          runningStatus: 'Provide JobStatus/Subsystem values in comma(,) separated way. You can use * in subsystem part as wild card' # example - 'SIGW/QHTTPSVR,DSC/*,DEQW/QAUTOMON'
          inactiveJobs: 'Provide JOB_NAME in comma(,) separated way. You can use * as wild card character in any part of the JOB_NAME' # example - '*/QSYS/QAUTOMON , 137640/QSYS/QBATCH , */QWEBADMIN/* , */QSYS/QAUTO*'
          enableInactiveJOBQStatus: #Allowed value 'true' or 'false'. Alert will be triggered for the inactive jobs if JOB_QUEUE_STATUS is 'RELEASED' or 'SCHEDULED' and JOB_STATUS is 'JOBQ'.
      diskStatus:
        operationalState: 'State of the disk which is expected, in comma(,) separated way' #Example 'ACTIVE, BUSY'
      messageQueue:
        filter: # User defined filter for Message Queue table
          library/queueName: 'Lib-1/queueName1,Lib-2/queueName2' ## Provide values in comma(,) separated way. (Default Value : 'QSYS/QSYSOPR')
          timeFrame: '10 HOURS' ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
        event: # User defined Event for Message Queue
          messageQueueIdEvent:
            Lib-1/queueName1:
              Event_Name_1 : 'messageId-1,messageId-2' ##Example: Hyper Swap Alerts : 'CPC1E1D, CPI1E23'
              Event_Name_2 : 'messageId-3,messageId-4' ##Example: Error in Device BackUp: 'CPI1E92, AMQ89*'. You can provide * in case you want to provide partial message id
            Lib-2/queueName2:
              Event_Name_3 : 'messageId-5' ## You can provide the Event_Name as per relevance
            timeFrame: '15 MINUTES' ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
          messageQueueTextEvent:
            Lib-1/queueName1:
              Event_Name_1 : 'Provide fully or partial message text' ## example: Clean up : 'messages deleted'
              Event_Name_2 : 'Provide fully or partial message text' ## example: IBM MQ Issue : 'Queued Publish/Subscribe Daemon'
            Lib-2/queueName2:
              Event_Name_3 : 'Provide fully or partial message text' ## You can provide the Event_Name as per relevance
            timeFrameText: '15 MINUTES' ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
      historyLog:
        filter:
          timeFrame: '1 DAYS' ## Format is : {value} MINUTES / HOURS / DAYS (Default Value : '10 MINUTES')
      subsystem:
        subsystem_list: 'Provide SUBSYSTEM_DESCRIPTION_LIBRARY/SUBSYSTEM_DESCRIPTION in comma(,) separated way' # example- 'QDEVELOP/RATS,QINMEDIA/QBASE'
      netstatEventInfo:
        port/address: 'Provide LOCAL_PORT/LOCAL_ADDRESS in comma(,) separated way. You can use * in LOCAL_ADDRESS as wild card.' #example- '427/9.5.105.61 ,38695/* ,8475/9.51.151.81' [Alert will be triggered if TCP state is not LISTEN or Null]
Configuring for remote monitoring
To start monitoring IBM i instances remotely, you must first install the Instana agent based on your operating system. For more information, see Installing host agents. Then, configure the agent configuration file <agent_install_dir>/etc/instana/configuration.yaml.
- See the User Authorization section for the required authorities for the Instana user profile.
- The field sslEnabled is optional. It is only necessary when you want to establish a secure connection with the host component.
- If sslEnabled is set to true, you need to import your trusted certificate into your JRE's cacerts (jvm/jre/lib/security/cacerts) by using the keytool command:
  keytool -import -alias ALIAS_NAME -keystore "/path/to/jre/cacerts" -file YOUR_CERTIFICATE_NAME.crt
- If you are asked for a password, enter the default password changeit.
See the following configuration example for remote monitoring:
com.instana.plugin.ibmiseries:
  enabled: true
  remote: # multiple configurations supported
    - host: 'remote.host-2.com'
      #For a SSL connection set sslEnabled to either 'true' or 'false'.
      sslEnabled: 'true/false'
      user: 'username'
      password: 'password'
      availabilityZone: 'IBM i Remote Monitoring'
      poll_rate_configuration: #Values are in seconds.
        os_poll_rate: 15 #This is default OS poll rate. Provide the values in seconds.
        db2_poll_rate: 15 #This is default DB2 poll rate. Provide the values in seconds.
        disabled_components:
          components: '<Names of the components or grids to be disabled>' # Use comma (,) to separate them. Example - 'HISTORY_LOGS, JOB_QUEUES'.
        custom_poll_rate: #Multiple Poll Rate is supported. (Optional)
          poll_rate_1:
            polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 30
            components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
            events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
          poll_rate_2:
            polling_rate: 'Custom Poll Rate value' #Values in seconds. Example- 60
            components: 'Components/Grid Names in comma(,) separated way' #Refer documentation for components name.
            events: 'Names of the events in comma(,) separated way' #Refer documentation for events name.
      user_specification: # For user inputs (Optional)
        activeJobs:
          jobs: 'comma separated list of job names. You can use * as a wildcard character in any part of the job name' # example - '*/QUSER/QZDASOINIT, 311353/QLWISVR/ADMIN, 10034*/QUSER/*'
          event:
            identicalJobs:
              jobName/user: 'Provide JOB_NAME_SHORT/JOB_USER values in comma(,) separated way. Use * in user part as wildcard character' # example - 'CT_AGENT/QAUTOMON,QZDASOINIT/*'
              threshold: 5 # Event will be triggered if jobs with same name & user in the running status is less than threshold value.
            runningStatus: 'Provide JobStatus/Subsystem values in comma(,) separated way. You can use * in subsystem part as wildcard' # example - 'SIGW/QHTTPSVR,DSC/*,DEQW/QAUTOMON'
            inactiveJobs: 'Provide JOB_NAME in comma(,) separated way. You can use * as wildcard character in any part of the JOB_NAME' # example - '*/QSYS/QAUTOMON , 137640/QSYS/QBATCH , */QWEBADMIN/* , */QSYS/QAUTO*'
            enableInactiveJOBQStatus: #Allowed value 'true' or 'false'. Alert will be triggered for the inactive jobs if JOB_QUEUE_STATUS is 'RELEASED' or 'SCHEDULED' and JOB_STATUS is 'JOBQ'.
        diskStatus:
          operationalState: 'State of the disk which is expected, in comma(,) separated way' #Example 'ACTIVE, BUSY'
        messageQueue:
          filter: # User defined filter for Message Queue table
            library/queueName: 'Lib-1/queueName1,Lib-2/queueName2' ## Provide values in comma(,) separated way. (Default Value : 'QSYS/QSYSOPR')
            timeFrame: '10 HOURS' ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
          event: # User defined Event for Message Queue
            messageQueueIdEvent:
              Lib-1/queueName1: ## example: QSYS/QSYSOPR
                Event_Name_1 : 'messageId-1,messageId-2' ##Example: Hyper Swap Alerts : 'CPC1E1D, CPI1E23'
                Event_Name_2 : 'messageId-3,messageId-4' ##Example: Error in Device BackUp: 'CPI1E92, AMQ89*'. You can provide * in case you want to provide partial message id
              Lib-2/queueName2:
                Event_Name_3 : 'messageId-5' ## You can provide the Event_Name as per relevance
              timeFrame: '15 MINUTES' ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
            messageQueueTextEvent:
              Lib-1/queueName1: ## example: QSYS/QSYSOPR
                Event_Name_1 : 'Provide fully or partial message text' ## example: Clean up : 'messages deleted'
                Event_Name_2 : 'Provide fully or partial message text' ## example: IBM MQ Issue : 'Queued Publish/Subscribe Daemon'
              Lib-2/queueName2:
                Event_Name_3 : 'Provide fully or partial message text' ## You can provide the Event_Name as per relevance
              timeFrameText: '15 MINUTES' ## Format is : {value} MINUTES/HOURS/DAYS (Default Value : '10 MINUTES')
        historyLog:
          filter:
            timeFrame: '1 DAYS' ## Format is : {value} MINUTES / HOURS / DAYS (Default Value : '10 MINUTES')
        subsystem:
          subsystem_list: 'Provide SUBSYSTEM_DESCRIPTION_LIBRARY/SUBSYSTEM_DESCRIPTION in comma(,) separated way' # example -'QDEVELOP/RATS,QINMEDIA/QBASE'
        netstatEventInfo:
          port/address: 'Provide LOCAL_PORT/LOCAL_ADDRESS in comma(,) separated way. You can use * in LOCAL_ADDRESS as wildcard.' #example- '427/9.5.105.61, 38695/*, 8475/9.5.151.81' [Alert will be triggered if TCP state is not LISTEN or Null]
Custom poll rate
You can configure multiple poll rates in the previously mentioned configuration file. The poll rate configuration consists of the following four fields:
- os_poll_rate: The default poll rate of the sensor for OS metrics. This field is mandatory.
- db2_poll_rate: The default poll rate of the sensor for Db2 metrics. This field is mandatory.
- custom_poll_rate: A custom poll rate value together with the components (each grid is treated as a component) and events (supported custom events) that it applies to. This field is optional.
- disabled_components: The list of components that you don't want to display in the Instana UI. Use a comma (,) to separate each component. This field is optional.
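As an illustration of the fields described above, the following sketch polls two grids and one custom event at a slower custom rate and hides one grid from the Instana UI. The component and event names come from this document. The nesting of the keys under poll_rate_configuration and the chosen values are assumptions for illustration only, so verify them against your own configuration.yaml before use.

```yaml
# Illustrative sketch only: key nesting and values are assumptions.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    poll_rate_configuration:
      os_poll_rate: 15                 # default OS poll rate in seconds (mandatory)
      db2_poll_rate: 15                # default Db2 poll rate in seconds (mandatory)
      disabled_components:
        components: 'HISTORY_LOGS'     # grid that is not displayed in the Instana UI
      custom_poll_rate:
        poll_rate_1:
          polling_rate: 30             # custom poll rate in seconds
          components: 'JOB_QUEUES, OUTPUT_QUEUES'
          events: 'IDENTICAL_JOBS_EVENT'
```

Components that are not listed under custom_poll_rate would presumably keep their default poll rates.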
Converting event to alert
You can configure Instana to send alerts for events based on the criteria set in the Dynamic Focus query. To configure Instana to send alerts for events, complete the following steps:
- Log in to the Instana UI, and click Settings > Alerts > New Alert.
- Select the event types as Alert on Event Type(s).
- Select Warning as Event types.
- Select Scope as Selected Entities Only (Dynamic Focus Query).
- Define the criteria in the Dynamic Focus Query section. See the following examples:
  - To get alerts for all the defined custom events, specify the following criteria:
    entity.ibmi.os.hostname:XYZ
    Where XYZ is the host name of the server.
  - To set specific custom events for a specific host, specify the following criteria:
    (entity.ibmi.os.hostname:XYZ) AND (event.text:'EventName-ABC')
    Where ABC is the event name as defined for the custom poll rate events configuration.
- Add the preferred Alert Channels in the Alerting section.
For more information about dynamic filtering, see Syntax.
The configured remote IBM i instance is then shown as a separate box in the specified availabilityZone.
JDBC connection ports
The Instana remote sensor for IBM i uses a remote JDBC connection and a Java Toolkit remote connection. These connections use the following ports:
Ports | Description |
---|---|
8470 | Port 8470 is used for host code page translation tables and licensing functions. |
8471 | Port 8471 is used for database access. |
8475 | Port 8475 is used to check for application administration restrictions. |
8476 | Port 8476 is used for checking signon verification to authenticate. |
449 | Port 449 is used to look up service by name and return the port number. |
446 | Port 446 is used for Distributed data management (DDM)/Distributed Relational Database Architecture (DRDA) to make a remote connection to Db2 for IBM i by using JDBC. |
Default poll rate
Custom provider name | UI table description | Default poll rate |
---|---|---|
AUXILIARY_STORAGE_POOLS | Auxiliary Storage pools | 15 seconds |
HARDWARE_DISK_DRIVES | Hardware Disk Drives (HDD) Info | 60 seconds |
HISTORY_LOGS | History Log | 15 seconds |
JOB_QUEUES | Job Queue | 15 seconds |
TOP_ACTIVE_JOBS | Top Active Jobs | 15 seconds |
MEMORY_POOLS | Active Memory Pools | 15 seconds |
MESSAGE_QUEUES | Message Queue | 15 seconds |
NETWORK_CONNECTIONS_TOP_RECEIVERS | Network Connections (Top Receivers) | 60 seconds |
NETWORK_CONNECTIONS_TOP_SENDERS | Network Connections (Top Sender) | 60 seconds |
NETSTAT_INTERFACES | Netstat Interfaces | 120 seconds |
NON_VOLATILE_MEMORY | Non Volatile Memory Express(NVMe) Information | 60 seconds |
OUTPUT_QUEUES | Output Queues | 15 seconds |
SOLID_STATE_DISK | Solid State Disk (SSD) Info | 60 seconds |
SPOOL_SPACE | Total Spool Space | 120 seconds |
ACTIVE_SUBSYSTEMS | Active Subsystems | 15 seconds |
SYSTEM_STATUS | All KPIs (CPU rate, utilization, thread, active Jobs) | 15 seconds |
SYSTEM_DISK_STATUS | System Overall Disk Information | 60 seconds |
Metrics collection
To view the metrics, select Infrastructure in the sidebar of the Instana UI, and then click a specific monitored host. The host dashboard shows all the collected metrics and monitored processes.
Configuration data
- Host name
- OS Version
- Total CPU
- Total Memory
- Configured CPU
- Configured Memory
- Partition ID
- Number of partitions
- Restricted state
Performance metrics
System Metrics
Component Name: SYSTEM_STATUS (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
CPU Rate | The average CPU rate expressed as a percentage, where 100% indicates that the processor is running at its nominal frequency. A value above or below 100% indicates how much the processor has been sped up (turbo) or slowed down (throttled) relative to the nominal frequency for the processor model. For instance, a value of 120% indicates that the processor is running 20% faster than its nominal speed. | 15 seconds |
Average CPU Utilization | The average CPU utilization for all the active processors. | 15 seconds |
Min CPU Utilization | The CPU utilization of the processor that reported the minimum amount of CPU utilization. | 15 seconds |
Max CPU Utilization | The CPU utilization of the processor that reported the maximum amount of CPU utilization. | 15 seconds |
Active Jobs | The number of jobs active in the system (jobs that have been started, but have not yet ended), including both user and system jobs. | 15 seconds |
Interactive Jobs | The percentage of interactive performance assigned to this logical partition. This value is a percentage of the total interactive performance available to the entire physical system. | 15 seconds |
Total Jobs | The total number of user and system jobs that are currently in the system. The total includes: all jobs on job queues waiting to be processed, all jobs currently active (being processed), all jobs that have completed running but still have output on output queues to be produced. | 15 seconds |
Max Jobs | The maximum number of jobs that are allowed on the system. When the number of jobs reaches this maximum, you can no longer submit or start more jobs on the system. The total includes: all jobs on job queues waiting to be processed, all jobs currently active (being processed), all jobs that have completed running but still have output on output queues to be produced. | 15 seconds |
Used Auxiliary Storage Pool | The percentage of the system storage pool (ASP number 1) currently in use. | 15 seconds |
Capacity of Auxiliary Storage Pool | The storage capacity of the system auxiliary storage pool (ASP number 1) in millions of bytes. This value represents the amount of space available for storage of both permanent and temporary objects. | 15 seconds |
Current Temporary Storage | The current amount of storage, in millions of bytes, in use for temporary objects. | 15 seconds |
Maximum Temporary Storage Used | The largest amount of storage, in millions of bytes, used for temporary objects at any one time since the last IPL. | 15 seconds |
Active Threads | The number of initial and secondary threads in the system (threads that have been started, but have not yet ended), including both user and system threads. | 15 seconds |
Total Spool Space | The total spool space consumed by the output queue in bytes. | 15 seconds |
Active Memory Pool Metrics
Component Name: MEMORY_POOLS (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Storage Used | The amount of main storage, in megabytes, in the pool. | 15 seconds |
Storage Reserved | The amount of storage, in megabytes, in the pool that is reserved for system use. For example, the pool for save or restore operations. | 15 seconds |
Storage Defined | The size of the pool, in megabytes, as defined in the shared pool, subsystem description, or system value QMCHPOOL. Contains the null value for a pool without a defined size. | 15 seconds |
Active Threads | The number of threads that are currently using the pool. | 15 seconds |
Ineligible Threads | The number of ineligible threads in the pool. | 15 seconds |
Max Threads | The maximum number of threads that can be active in the pool at any time. | 15 seconds |
Elapsed Database Faults | The number of page faults per second against pages that contain database access. | 15 seconds |
Elapsed Total Faults | The total database and non-database page faults per second. | 15 seconds |
Elapsed Non Database Faults | The number of page faults per second against non-database access. | 15 seconds |
Output Queue Metrics
Component Name: OUTPUT_QUEUES (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Queue Name | The name of the output queue. | 15 seconds |
Library Name | The name of the library that contains the output queue. | 15 seconds |
Status | The status of the output queue. | 15 seconds |
Files in Queue | The total number of spooled files currently on this output queue. | 15 seconds |
Writer Job Name | The qualified job name of the writer job. If more than one writer is started, this is the name of the first writer. Contains the null value if a writer job is not started for this queue. | 15 seconds |
Writer Job Status | The status of the writer job. If more than one writer is started, this is the status of the first writer. | 15 seconds |
Top Spool Space Consumption
Top 20 users consuming the spool space.
Component Name: SPOOL_SPACE (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
User | The name of the user profile that produced the spooled files. | 120 seconds |
Spool Space | The size of the user's spooled files, in bytes. | 120 seconds |
Total Spool Space Consumption
Component Name: SPOOL_SPACE (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Total Spool Space | The total spool space consumed by the output queue in bytes. | 120 seconds |
Top Active Jobs
Top 20 active jobs that are currently running in the system, along with the jobs whose names match the values that are specified in user_specification:activeJobs:jobs.
Component Name: TOP_ACTIVE_JOBS (for custom poll rate component configuration)
Custom Event:
Event Name: IDENTICAL_JOBS_EVENT (for custom poll rate events configuration)
You can specify the jobName/user and threshold in the user_specification:activeJobs:event:identicalJobs section of the Instana agent configuration file. Then, if the count of active jobs is less than the specified threshold value at any time, an event is triggered for the jobs that are defined in the configuration.
- Wildcard Support: To validate a particular job against the threshold value irrespective of the user, use * for the user part. For example, 'QZDASOINIT/*'.
Note: The IDENTICAL_JOBS_EVENT event is not supported on IBM i 7.2.
Event Name: RUNNING_JOB_STATUS_EVENT (for custom poll rate events configuration)
You can specify the runningStatus field, which is a combination of JobStatus/Subsystem, in the user_specification:activeJobs:event:runningStatus section of the Instana agent configuration file. Then, an event is triggered for the jobs that match the defined criteria.
- Wildcard Support: To validate a specific job status in all the available subsystems, use * in the Subsystem part.
Note: The RUNNING_JOB_STATUS_EVENT event is not supported on IBM i 7.2.
Event Name: INACTIVE_JOBS_EVENT (for custom poll rate events configuration)
You can specify JOB_NAME values in the inactiveJobs field in the user_specification:activeJobs:event:inactiveJobs section of the Instana agent configuration file. Then, an event is triggered for the jobs that are not in Active state in the system at any given point of time.
- Wildcard Support: To trigger an inactive job event based on the JOB_NAME, use * as a prefix or suffix within any part of the JOB_NAME.
Event Name: INACTIVE_JOBS_IN_JOBQ_EVENT (for custom poll rate events configuration)
You can enable or disable the enableInactiveJOBQStatus field in the user_specification:activeJobs:event:enableInactiveJOBQStatus section of the Instana agent configuration file. If the enableInactiveJOBQStatus field is enabled, an event is triggered for jobs whose JOB_QUEUE_STATUS is RELEASED or SCHEDULED and whose JOB_STATUS is JOBQ at any given point of time.
Note: The enableInactiveJOBQStatus field is not supported on IBM i 7.2.
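The four job events described above can be combined in one activeJobs section. The following sketch reuses the example values from the sample configuration. The nesting follows the user_specification:activeJobs:event paths named in this section, but the exact indentation and the quoting of the true value are assumptions, and other keys such as poll_rate_configuration are omitted for brevity.

```yaml
# Illustrative sketch only: nesting and values are assumptions.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    user_specification:
      activeJobs:
        event:
          identicalJobs:
            jobName/user: 'CT_AGENT/QAUTOMON,QZDASOINIT/*'   # * matches any user
            threshold: 5                                      # event when fewer than 5 matching jobs are active
          runningStatus: 'SIGW/QHTTPSVR,DSC/*'                # JobStatus/Subsystem, * matches any subsystem
          inactiveJobs: '*/QSYS/QAUTOMON,137640/QSYS/QBATCH'  # event when these jobs are not active
          enableInactiveJOBQStatus: 'true'                    # allowed values are 'true' or 'false'
```

To have an event evaluated at a custom poll rate, list its event name (for example, IDENTICAL_JOBS_EVENT) in the events field of a custom_poll_rate entry.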
Metric | Description | Granularity |
---|---|---|
Job Name | The qualified job name. | 15 seconds |
User Name | The user profile under which the initial thread is running at this time. For jobs that swap user profiles, this user profile name and the user profile that initiated the job can be different. | 15 seconds |
Elapsed CPU Percentage | The percent of processing unit time attributed to this job during the measurement time interval. | 15 seconds |
Temporary Storage | The amount of temporary storage that is currently allocated to the job, in kilobytes. | 15 seconds |
Job Status | The status of the initial thread of the job. | 15 seconds |
Job Type | Type of active job. | 15 seconds |
Thread Count | The number of active threads in the job. | 15 seconds |
Auxiliary Storage Pools
Information about auxiliary storage pools (ASPs).
Component Name: AUXILIARY_STORAGE_POOLS (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
ASP Number | A unique identifier for an ASP. Possible values are 1 through 255. | 15 seconds |
Device Description Name | The name of the device description that brought the independent ASP (IASP) to varyon/active state. | 15 seconds |
ASP Type | The use that is assigned to the ASP. | 15 seconds |
ASP State | The device configuration status of an ASP. | 15 seconds |
Number Of Disk Units | The total number of disk units in the ASP. If mirroring is active for disk units within the ASP, the mirrored pair of units is counted as one. | 15 seconds |
Total Capacity | The total number of used and unused megabytes in the ASP. A special value of -2 is returned if the size of this field is exceeded. | 15 seconds |
Total Capacity Utilization | Utilization Percentage of the Total Capacity in the ASP. | 15 seconds |
Protected Capacity | The total number of used and unused megabytes in the ASP that are protected by mirroring or device parity. A special value of -2 is returned if the value was too big to return. Contains the null value if the capacity cannot be determined. | 15 seconds |
Protected Capacity Utilization | Utilization Percentage of the Protected Capacity in the ASP. | 15 seconds |
Unprotected Capacity | The total number of used and unused megabytes in the ASP that are not protected by mirroring or device parity. A special value of -2 is returned if the value was too big to return. Contains the null value if the capacity cannot be determined. | 15 seconds |
Unprotected Capacity Utilization | Utilization Percentage of the Unprotected Capacity in the ASP. | 15 seconds |
Active Subsystems
Information about Active Subsystems
Component Name: ACTIVE_SUBSYSTEMS (for custom poll rate component configuration)
Event Name: SUBSYSTEM_STATUS (for custom poll rate events configuration)
You can specify the subsystem_list in the user_specification:subsystem section of the configuration file. Then, an event is triggered if the specified subsystems are not in Active status.
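A minimal sketch of the subsystem event configuration follows. It uses the example subsystem names from the sample configuration. The nesting under user_specification is an assumption based on the path named above, and other keys are omitted for brevity.

```yaml
# Illustrative sketch only: nesting and values are assumptions.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    user_specification:
      subsystem:
        # SUBSYSTEM_DESCRIPTION_LIBRARY/SUBSYSTEM_DESCRIPTION pairs, comma separated
        subsystem_list: 'QDEVELOP/RATS,QINMEDIA/QBASE'
```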
Metric | Description | Granularity |
---|---|---|
Name | The name of the subsystem about which information is being returned. | 15 seconds |
Library Name | The name of the library in which the subsystem description resides. | 15 seconds |
Active Jobs | The number of jobs currently active in the subsystem. This number includes held jobs but excludes jobs that are disconnected or suspended because of a transfer secondary job or a transfer group job. If STATUS is INACTIVE, returns 0. | 15 seconds |
Max Active Jobs | The maximum number of jobs that can run or use resources in the subsystem at one time. Contains the null value if the subsystem description specifies *NOMAX, indicating that there is no maximum. | 15 seconds |
Description | The text description of the subsystem description. | 15 seconds |
Job Queue
Information about job queues.
Component Name: JOB_QUEUES (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Job Queue Name | The name of the job queue. | 15 seconds |
Job Queue Library | The name of the library that contains the job queue. | 15 seconds |
Subsystem Name | The name of the subsystem that can receive jobs from this job queue. Contains the null value if this job queue is not associated with an active subsystem. | 15 seconds |
Subsystem Library Name | The library in which the subsystem description resides. Contains the null value if this job queue is not associated with an active subsystem. | 15 seconds |
Number Of Jobs | The number of jobs in the queue. | 15 seconds |
Active Jobs | The current number of jobs that are active that came through this job queue entry. Contains the null value if this job queue is not associated with an active subsystem. | 15 seconds |
Maximum Active Jobs | The maximum number of jobs that can be active at the same time through this job queue entry. A value of -1 indicates *NOMAX, no maximum number of jobs is defined. Contains the null value if this job queue is not associated with an active subsystem. | 15 seconds |
Job Queue Status | The status of the job queue. HELD : The queue is held. RELEASED : The queue is released. | 15 seconds |
Text Description | Text that describes the job queue. Contains the null value if there is no text description for the job queue. | 15 seconds |
Held Jobs | The current number of jobs that are in *HELD status. This is the sum of the 10 HELD_JOBS_PRIORITY_n columns. | 15 seconds |
Released Jobs | The current number of jobs that are in *RELEASED status. This is the sum of the 10 RELEASED_JOBS_PRIORITY_n columns. | 15 seconds |
Scheduled Jobs | The current number of jobs that are in *SCHEDULED status. This is the sum of the 10 SCHEDULED_JOBS_PRIORITY_n columns. | 15 seconds |
Network interfaces
Information about IPv4 and IPv6 interfaces
Component Name: NETSTAT_INTERFACES (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Internet Address | The internet address of the interface. | 120 seconds |
Subnet Mask | The subnet mask for the network, subnet, and host address fields of the internet address that defines the subnetwork for an interface. Contains null if this is an IPv6 connection. | 120 seconds |
Connection Type | The type of connection (IPV4,IPV6). | 120 seconds |
Interface Line Type | The type of line used by the interface. | 120 seconds |
Line Description | The name of the communications line description that identifies the physical network associated with an interface. | 120 seconds |
VLAN ID | The virtual LAN to which this interface belongs. | 120 seconds |
Status | The current status of the logical interface. | 120 seconds |
Status value Mapping
Metric Value | Status |
---|---|
0 | ENDING |
1 | ACTIVE |
2 | FAILED |
3 | FAILED_TCP |
4 | INACTIVE |
5 | RCYCNL |
6 | RCYPND |
7 | STARTING |
8 | ACQUIRING |
9 | ACQUIRING |
10 | ACQUIRING |
Network connections (Top Receivers)
Netstat Info For Bytes Received Locally
Component Name: NETWORK_CONNECTIONS_TOP_RECEIVERS (for custom poll rate component configuration)
Event Name: ACTIVE_PORTS_LISTENING_STATUS (for custom poll rate events configuration)
You can specify the LOCAL_PORT/LOCAL_ADDRESS field in the user_specification:netstatEventInfo section of the Instana agent configuration file. Then, if any of the defined ports is not in LISTEN or Null state, an event is triggered for the ports that are defined in the configuration.
- Wildcard Support: To validate a particular Port Number irrespective of the Local Address, use * for the address part. For example, '38695/*'.
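A minimal sketch of the port listening event configuration follows. It uses example values from the sample configuration. The nesting under user_specification is an assumption based on the path named above, and other keys are omitted for brevity.

```yaml
# Illustrative sketch only: nesting and values are assumptions.
com.instana.plugin.ibmiseries:
  enabled: true
  local:
    user_specification:
      netstatEventInfo:
        # LOCAL_PORT/LOCAL_ADDRESS pairs, comma separated; * matches any local address
        port/address: '427/9.5.105.61,38695/*'
```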
Metric | Description | Granularity |
---|---|---|
Remote Port & Address | This column is combination of remote Port and remote Address. Remote Port : The remote host port number. A value of 0 means that the connection is a listening or UDP socket, so this field does not apply. Remote Address : The internet address of the remote host. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. A value of 0 means that the connection is a listening or UDP socket so this field does not apply. For IPv6: The address is in IPv6 address format. A value of :: means that the connection is a listening socket so this field does not apply. | 60 seconds |
Bind User | The user profile of the job on the local system which first performed a sockets API bind() of the socket. | 60 seconds |
Local Port & Address | This column is a combination of the local port and the local address. Local Port : The local system port number. Local Address : The local address of this connection on this system. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. For IPv6: The address is in IPv6 address format. A value of :: means the local application specified that any local internet address can be used. | 60 seconds |
Remote Port Name | The remote host well-known port name or the name from the service table entry. Contains null if there is no well-known port name. | 60 seconds |
Local Port Name | The local system well-known port name or the name from the service table entry. Contains null if there is no well-known port name. | 60 seconds |
Bytes Sent Remotely | The number of bytes sent to the remote host. | 60 seconds |
Bytes Received Locally | The number of bytes received from the remote host. | 60 seconds |
Protocol | Identifies the type of connection protocol. TCP : A Transmission Control Protocol (TCP) connection or socket. UDP : A User Datagram Protocol (UDP) socket. | 60 seconds |
TcpState | The state of the connection. CLOSED : This connection has ended. CLOSE-WAIT : Waiting for an end connection request from the local user. CLOSING : Waiting for an end connection request acknowledgment from the remote host. ESTABLISHED : The normal state in which data is transferred. FIN-WAIT-1 : Waiting for the remote host to acknowledge the local system request to end the connection. FIN-WAIT-2 : Waiting for the remote host request to end the connection. LAST-ACK : Waiting for the remote host to acknowledge an end connection request. LISTEN : Waiting for a connection request from any remote host. SYN-RECEIVED : Waiting for a confirming connection request acknowledgment. SYN-SENT : Waiting for a matching connection request after having sent a connection request. TIME-WAIT : Waiting to allow the remote host enough time to receive the local system's acknowledgment to end the connection. Contains null if PROTOCOL is UDP. | 60 seconds |
Network connections (Top Senders)
Netstat Info For Bytes Sent Remotely
Component Name
: NETWORK_CONNECTIONS_TOP_SENDERS (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Remote Port & Address | This column is a combination of the remote port and the remote address. Remote Port : The remote host port number. A value of 0 means that the connection is a listening or UDP socket, so this field does not apply. Remote Address : The internet address of the remote host. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. A value of 0 means that the connection is a listening or UDP socket, so this field does not apply. For IPv6: The address is in IPv6 address format. A value of :: means that the connection is a listening socket, so this field does not apply. | 60 seconds |
Bind User | The user profile of the job on the local system which first performed a sockets API bind() of the socket. | 60 seconds |
Local Port & Address | This column is a combination of the local port and the local address. Local Port : The local system port number. Local Address : The local address of this connection on this system. For IPv4: The address is in IPv4 address format. A value of 0.0.0.0 indicates that either the system is waiting for a connection to open or that a UDP socket is being used. For IPv6: The address is in IPv6 address format. A value of :: means the local application specified that any local internet address can be used. | 60 seconds |
Remote Port Name | The remote host well-known port name or the name from the service table entry. Contains null if there is no well-known port name. | 60 seconds |
Local Port Name | The local system well-known port name or the name from the service table entry. Contains null if there is no well-known port name. | 60 seconds |
Bytes Sent Remotely | The number of bytes sent to the remote host. | 60 seconds |
Bytes Received Locally | The number of bytes received from the remote host. | 60 seconds |
Protocol | Identifies the type of connection protocol. TCP : A Transmission Control Protocol (TCP) connection or socket. UDP : A User Datagram Protocol (UDP) socket. | 60 seconds |
TcpState | The state of the connection. CLOSED : This connection has ended. CLOSE-WAIT : Waiting for an end connection request from the local user. CLOSING : Waiting for an end connection request acknowledgment from the remote host. ESTABLISHED : The normal state in which data is transferred. FIN-WAIT-1 : Waiting for the remote host to acknowledge the local system request to end the connection. FIN-WAIT-2 : Waiting for the remote host request to end the connection. LAST-ACK : Waiting for the remote host to acknowledge an end connection request. LISTEN : Waiting for a connection request from any remote host. SYN-RECEIVED : Waiting for a confirming connection request acknowledgment. SYN-SENT : Waiting for a matching connection request after having sent a connection request. TIME-WAIT : Waiting to allow the remote host enough time to receive the local system's acknowledgment to end the connection. Contains null if PROTOCOL is UDP. | 60 seconds |
Message Queue
Information about each message in a message queue. An Instana event is created whenever a message in a message queue matches the specifications (Queue Library, Queue Name, Message ID) that the user provides in the configuration.yaml file.
Component Name
: MESSAGE_QUEUES (for custom poll rate component configuration)
Custom Event:
Event Name
: MESSAGE_QUEUE_ID_EVENT
You can specify multiple Message Id values, separated by commas, in the user_specification:messageQueue:event section of the Instana agent configuration file. The event is triggered with the defined Event Name if a message in the defined message library and message queue has any of the defined message ID values. Consider the following example definition:
messageQueueIDEvent:
QSYS/QSYSOPR
Hyper Swap Alerts : 'CPC1E1D, CPI1E23'
In this example, an event is triggered with the event name Hyper Swap Alerts if a message in the message library QSYS and message queue QSYSOPR has either of the message IDs CPC1E1D or CPI1E23. The following conditions are applicable for the message event:
- The message is in the library and queue (defined in the configuration.yaml file).
- The message ID is listed in the library/queue definition.
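The two conditions for the message ID event can be sketched in Python as follows. This is only an illustration of the matching rule: the function names, the nested dictionary that stands in for the parsed definition, and the message field names (`library`, `queue`, `message_id`) are assumptions, not the agent's internal data model.

```python
def parse_message_ids(value: str) -> set:
    """Split a comma-separated ID list such as 'CPC1E1D, CPI1E23'."""
    return {part.strip() for part in value.split(",") if part.strip()}


def id_event_names(message: dict, definition: dict) -> list:
    """Return the event names whose ID list contains the message ID,
    considering only entries defined for the message's library/queue."""
    key = f'{message["library"]}/{message["queue"]}'
    events = definition.get(key, {})
    return [
        name
        for name, ids in events.items()
        if message["message_id"] in parse_message_ids(ids)
    ]


# Hypothetical in-memory form of the example definition.
definition = {"QSYS/QSYSOPR": {"Hyper Swap Alerts": "CPC1E1D, CPI1E23"}}
```

A message with ID CPI1E23 in QSYS/QSYSOPR yields the event name Hyper Swap Alerts; the same ID in a different library or queue yields no event.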
Event Name
: MESSAGE_QUEUE_TEXT_EVENT
You can specify the Message Text value in the user_specification:messageQueue:event section of the Instana agent configuration file. The event is triggered with the defined Event Name if a message in the defined message library and message queue contains the defined message text. Consider the following example definition:
messageQueueTextEvent:
QSYS/QSYSOPR
IBM MQ Issue : 'queue disconnected'
In this example, an event is triggered with the event name IBM MQ Issue if a message in the message library QSYS and message queue QSYSOPR contains the message text "queue disconnected". The following conditions are applicable for the message event:
- The message is in the library and queue (defined in the configuration.yaml file).
- The message text is listed in the library/queue definition.
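For the text event, the check is a containment test on the message text rather than an exact ID match. A minimal Python sketch follows; it assumes a case-sensitive substring comparison and treats a null Message Text as empty, neither of which is confirmed by this page. The function and field names are hypothetical.

```python
def text_event_names(message: dict, definition: dict) -> list:
    """Return the event names whose configured text occurs in the message
    text, considering only entries defined for the message's library/queue."""
    key = f'{message["library"]}/{message["queue"]}'
    text = message.get("message_text") or ""  # null text never matches
    return [
        name
        for name, needle in definition.get(key, {}).items()
        if needle in text  # assumption: case-sensitive substring match
    ]


# Hypothetical in-memory form of the example definition.
text_definition = {"QSYS/QSYSOPR": {"IBM MQ Issue": "queue disconnected"}}
```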
Metric | Description | Granularity |
---|---|---|
Message Id | The message ID for this message. Contains the null value if this is an impromptu message or MESSAGE_TYPE is REPLY. | 15 seconds |
Message Type | Type of message. Values are: COMPLETION, DIAGNOSTIC, ESCAPE, INFORMATIONAL, INQUIRY, NOTIFY, REPLY, REQUEST, SENDER. | 15 seconds |
Severity | The severity assigned to the message. | 15 seconds |
Message Queue Library | The name of the library containing the message queue. | 15 seconds |
Message Queue Name | The name of the message queue containing the message. | 15 seconds |
Message Timestamp | The timestamp when the message is sent. | 15 seconds |
Message Text | The first level text of the message including tokens, or the impromptu message text. Contains the null value if MESSAGE_TYPE is REPLY or if the message file could not be accessed. | 15 seconds |
Message Second Level Text | The second level text of the message including tokens. Contains the null value if MESSAGE_ID is null or if the message has no second level text or if the message file could not be accessed. | 15 seconds |
Message Key | The key that is assigned to the message. The key is assigned by the command or API that sends the message. For details, see Message Types and Message Keys in the QMHRCVM API. | 15 seconds |
History Logs
Information about each message in the history log.
Component Name
: HISTORY_LOGS (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Message Id | The message ID for this message. Contains the null value if this is an impromptu message or MESSAGE_TYPE is REPLY. | 15 seconds |
Message Type | Type of message. Values are COMPLETION, DIAGNOSTIC, ESCAPE, INFORMATIONAL, INQUIRY, NOTIFY, REPLY, REQUEST, or SENDER. | 15 seconds |
Severity | The severity that is assigned to the message. | 15 seconds |
User | The current user of the job when the message was sent. | 15 seconds |
Job | The qualified job name when the message was sent. | 15 seconds |
Program | The program that sent the message. | 15 seconds |
Message Timestamp | The timestamp when the message is sent. | 15 seconds |
Message Text | The first level text of the message including tokens, or the impromptu message text. Contains the null value if MESSAGE_ID is null or if the message file could not be accessed. | 15 seconds |
Message Second Level Text | The second level text of the message including tokens. Contains the null value if MESSAGE_ID is null or if the message has no second level text or if the message file could not be accessed. | 15 seconds |
Hard Disk Info (Advance)
The following table covers information on hard disks for IBM i OS versions 7.3 (Level-22), 7.4 (Level-10), and later:
Component Name
: HARDWARE_DISK_DRIVES (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Unit Number | Unit number of the disk. | 60 seconds |
Resource Name | The unique system-assigned name of the disk unit. | 60 seconds |
ASP Number | Specifies the storage pool (ASP) number. | 60 seconds |
Disk Type | Disk type number of the disk. | 60 seconds |
Unit Media Capacity Gb | The storage capacity of the unit in billions of bytes. | 60 seconds |
Percent Used | The percentage of the disk unit that has been consumed. | 60 seconds |
Disk Model | The model number of the disk. | 60 seconds |
Elapsed Percent Busy | The estimated percentage of time that the disk unit is being used during the elapsed time. | 60 seconds |
Hard Disk Info (Basic)
The following table covers information on hard disks for IBM i OS versions 7.3 (Level-22), 7.4 (Level-10), and later:
Component Name
: HARDWARE_DISK_DRIVES (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Unit Number | Unit number of the disk. | 60 seconds |
ASP Number | Specifies the storage pool (ASP) number. | 60 seconds |
Disk Type | Disk type number of the disk. | 60 seconds |
Unit Storage Capacity | Unit storage capacity has the same value as the unit media capacity for configured disk units. This value is 0 for non-configured units. | 60 seconds |
Percent Used | The used space on the disk unit in percentage. | 60 seconds |
Solid-state disk information (advanced)
The following table covers information on solid-state disks with IBM i OS versions 7.3 (Level-22), 7.4 (Level-10), and later:
Component Name
: SOLID_STATE_DISK (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Unit Number | The unit number of the disk. | 60 seconds |
Resource Name | The unique system-assigned name of the disk unit. | 60 seconds |
Storage Capacity | The storage capacity of the disk unit. | 60 seconds |
Percent Used | The percentage consumed by the disk unit. | 60 seconds |
Serial Number | The serial number of the disk unit. | 60 seconds |
ASP Number | The storage pool (ASP) number. | 60 seconds |
SSD Remaining Life | The remaining lifetime of the SSD device in percentage. | 60 seconds |
SSD Power On days | The number of days that the SSD device remains active on a system. | 60 seconds |
SSD Supported Bytes Written | The lifetime number of bytes in gigabytes, which the SSD is expected to physically write. | 60 seconds |
SSD Bytes Written | The lifetime number of bytes in gigabytes, which are physically written to the NAND memory in this particular SSD disk unit. | 60 seconds |
SSD Read Write Protected | Indicates whether the device is read-protected or write-protected. | 60 seconds |
SSD PFA Warning | The predictive failure analysis warning message that is logged. | 60 seconds |
Solid-state disk information (basic)
The following table covers information on solid-state disks for IBM i OS versions 7.2, 7.3 (Level-22), and 7.4 (Level-10):
Component Name
: SOLID_STATE_DISK (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Unit Number | The unit number of the disk. | 60 seconds |
ASP Number | The number of the storage pool (ASP). | 60 seconds |
Disk Type | The disk type number of the disk. | 60 seconds |
Unit Storage Capacity | For configured disk units, the unit storage capacity has the same value as the unit media capacity. This value is 0 for the non-configured units. | 60 seconds |
Percent Used | The disk unit usage in percentage. | 60 seconds |
Non-volatile memory express
The following table covers information on non-volatile memory with IBM i OS versions 7.4 (Level-10) and later:
Component Name
: NON_VOLATILE_MEMORY (for custom poll rate component configuration)
Metric | Description | Granularity |
---|---|---|
Resource Name | The resource name of the NVMe device. | 60 seconds |
Model Number | The model number assigned by the device manufacturer. | 60 seconds |
Life Remaining | The percentage of the manufacturer-assigned NVMe device life that remains. | 60 seconds |
Spare Capacity | The percentage (0 to 100) of the remaining spare capacity that is available for this NVMe device. | 60 seconds |
Spare Capacity Threshold | The threshold percentage (0 to 100) for the spare capacity of this NVMe device. | 60 seconds |
Namespace Used | The quantity of namespaces that is used. | 60 seconds |
Power Cycles | The number of times the NVMe device is powered on and off. | 60 seconds |
Power On Hours | The number of hours during which the NVMe device is powered on. | 60 seconds |
Media Errors | The number of occurrences where the controller detected an unrecovered data integrity error. | 60 seconds |
Unsafe Shutdowns | The number of times a power loss occurs without a shutdown notification being sent. | 60 seconds |
Firmware Level | The level of code running in the NVMe device. | 60 seconds |
Overall disk status
The following table covers information on the overall disk status with IBM i OS versions 7.4 and later:
Component Name
: SYSTEM_DISK_STATUS (for custom poll rate component configuration)
Custom Event:
Event Name
: DISK_STATUS_EVENT
You can specify multiple expected Disk Status values, separated by commas, in the user_specification:diskStatus:operationalState section of the Instana agent configuration file. The event is triggered with the defined Event Name if any of the disks (HDD/SSD) in the partition is not in the desired state.
Note: This event is applicable to IBM i OS version 7.4 and later.
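The disk status rule can be sketched in Python as follows: the configured value is a comma-separated list of acceptable states, and any disk whose state is outside that list would cause the event. The state labels (ACTIVE, FAILED), the resource names, and the field names used here are placeholders chosen for the sketch; this page does not list the real status values.

```python
def parse_expected_states(value: str) -> set:
    """Split a comma-separated list of expected disk states."""
    return {part.strip() for part in value.split(",") if part.strip()}


def disks_outside_expected_state(disks: list, expected_value: str) -> list:
    """Return the resource names of disks whose state is not expected.

    A non-empty result corresponds to the condition under which the
    DISK_STATUS_EVENT would be triggered.
    """
    expected = parse_expected_states(expected_value)
    return [d["resource_name"] for d in disks if d["state"] not in expected]


# Placeholder data; real state labels come from the IBM i system.
disks = [
    {"resource_name": "DD001", "state": "ACTIVE"},
    {"resource_name": "DD002", "state": "FAILED"},
]
```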
Metric | Description | Granularity |
---|---|---|
Resource Name | The unique system-assigned name of the disk unit. | 60 seconds |
Percent Used | The percentage that the disk unit has been consumed. | 60 seconds |
Disk Type | The disk type number. | 60 seconds |
Unit Number | The unit number of the disk. | 60 seconds |
ASP Number | The storage pool (ASP) number. | 60 seconds |
Elapsed Percent Busy | The estimated percentage of time that the disk unit is being used during the elapsed time. | 60 seconds |
Type Of Disk Unit | The type of disk unit (SSD, HDD). | 60 seconds |