IBM Netcool Operations Insight Version 1.4.1

Troubleshooting Event Analytics

Use the following troubleshooting information to resolve problems with your Event Analytics configuration.

If your problem is not listed in this topic, then refer to the Release notes for additional issues.

Improving Event Analytics performance due to large search results

If you are upgrading Event Analytics from an earlier version, the upgrade repopulates the existing data from the previous version and aligns this data with the new schema, tables, and views. As a result, the performance of Event Analytics operations might degrade. Examples of degraded performance include, but are not limited to:

  • Reports can hang.
  • Reports complete, but no data is displaying for seasonal events.

To remedy degraded performance of Event Analytics operations after an upgrade to 1.3.1 or later releases, run the SE_CLEANUPDATA policy as follows:

  1. Log in to the server where IBM Tivoli Netcool/Impact is stored and running. You must log in as the administrator (that is, you must be assigned the ncw_analytics_admin role).
  2. Navigate to the policies tab and search for the SE_CLEANUPDATA policy.
  3. Open this policy by double-clicking it.
  4. Select to run the policy by using the run button on the policy screen toolbar.

The SE_CLEANUPDATA policy cleans up the data. Specifically, the SE_CLEANUPDATA policy:

  • Does not remove or delete any data from the results tables. The results tables hold all the original information about the analysis.
  • Provides some additional views and tables on top of the original tables to enhance performance.
  • Combines some information from related events, seasonal events, rules, and statistics.
  • Cleans up only the additional tables and views.

The Seasonal Event Report stops running before completion

The Seasonal Event Report does not complete running. No errors are displayed. The report progress does not increase.

This problem occurs if a Seasonal Event Report is running when the Netcool/Impact back-end server goes offline while the Impact UI server is still available. No errors are displayed in the Impact UI and no data is displayed in the widgets/dashboards.

To resolve this problem, ensure that the Netcool/Impact servers are running. Edit and rerun the Seasonal Event Report.

The Netcool/Impact back-end server fails when you run multiple Seasonal Event Reports

The Seasonal Event Reports do not complete running. No errors are displayed. The report progress does not increase.

This problem occurs when multiple Seasonal Event Reports are run simultaneously without increasing the default heap size settings for Netcool/Impact. The default heap size setting for Netcool/Impact is 1200 MB. If the heap size is exceeded, the Netcool/Impact back-end server fails.

To resolve this problem, increase the heap size settings. As a guideline, increase the heap size settings to 80% of the free memory on your system.
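As a rough illustration of the 80% guideline, the following shell sketch converts a free-memory figure in kilobytes into a candidate heap size in megabytes. The function name suggest_heap_mb is hypothetical, and the commented example assumes a Linux host where /proc/meminfo reports MemAvailable. Neither comes from the Netcool/Impact documentation, and the sketch only calculates a value. It does not change any Netcool/Impact setting.

```shell
#!/bin/sh
# Sketch only: suggest a heap size that is 80% of the given free memory.
# Argument: free memory in kilobytes. Output: heap size in whole megabytes.
suggest_heap_mb() {
    free_kb=$1
    # Integer arithmetic: take 80%, then convert KB to MB (rounds down).
    echo $(( free_kb * 80 / 100 / 1024 ))
}

# Example on Linux (assumes /proc/meminfo exposes MemAvailable):
#   free_kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
#   suggest_heap_mb "$free_kb"
```

For example, 4194304 KB (4 GB) of free memory yields a suggested heap size of 3276 MB.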

For more information, see the Increasing the memory for the Java virtual machine on the Impact profile and Setting the memory for the Java virtual machine on the Impact profile Netcool/Impact topics. You can access these publications from the IBM® Tivoli® Network Management IBM Knowledge Center (http://www-01.ibm.com/support/knowledgecenter/SSSHYH/).

Error displaying Seasonal Event Graphs in Microsoft Internet Explorer browser

The Seasonal Event Graphs do not display in a Microsoft Internet Explorer browser.

This problem happens because Microsoft Internet Explorer requires the Microsoft Silverlight plug-in to display the Seasonal Event Graphs.

To resolve this problem, install the Microsoft Silverlight plug-in.

Submitted Seasonal Event Report remains at 0%

The Seasonal Event Report does not run. It remains at 0%.

This problem occurs when Event Analytics cannot access the ProcessSeasonalityEvents Service. You cannot create a Seasonal Event Report without access to the ProcessSeasonalityEvents Service.

To resolve this issue, ensure that the ProcessSeasonalityEvents Service is running on the Impact Server.

Creating a Seasonal Event Report displays error message Error creating report. Seasonality configuration is invalid

The Seasonal Event Report does not run. An error message is displayed.
Error creating report. 
Seasonality configuration is invalid. Verify settings and retry.

This problem occurs when Event Analytics is not correctly configured before you run a Seasonal Event Report.

To resolve this problem, review the Event Analytics installation and configuration guides to ensure that all of the prerequisites and configuration steps are complete. Also, if you use a table name that is not the standard REPORTER_STATUS, you must verify the settings that are documented in the following configuration topics.

Missing Event Analytics files and directories

The stand-alone Netcool/Impact GUI server contains incorrect column names and untranslated text strings.

This problem occurs when a stand-alone Netcool/Impact GUI server is installed. Some of the Event Analytics files and directories are not installed correctly.

To resolve this problem, copy the files and directories in the following directory in the backend server to the stand-alone Netcool/Impact GUI server:
$IMPACT_HOME/uiproviderconfig

The seasonality report times out when you use large data sets

Before the seasonality policy starts to process a report, the seasonality policy issues a database query to find out how many rows of data need to be processed. This database query can time out when the database contains many rows and the database is not tuned to process the query. The following message is displayed in the <impact install>/logs/impact_server.log file.
02 Sep 2014 13:00:28,485 ERROR [JDBCVirtualConnectionWithFailOver] JDBC Connection Pool recieved error trying to connect to data source at: jdbc:db2://localhost:50000/database
02 Sep 2014 13:02:28,500 ERROR [JDBCVirtualStatement] JDBC execute failed twice.
com.micromuse.common.util.NetcoolTimeoutException: TransBlock [Executing SQL query: select count(*) as COUNT from DB2INST1.PRU_REPORTER where ((Severity >= 4) AND ( FIRSTOCCURRENCE > '2007-09-02 00:00:00.000' )) AND ( FIRSTOCCURRENCE < '2014-09-02 00:00:00.000')] timed out after 120000ms.

Check that you have indexes for the FIRSTOCCURRENCE field and any additional filter fields that you specified, for example, Severity. Use a database tuning utility, refresh the database statistics, or contact your database administrator for help. You can also increase the impact.server.timeout property to a value greater than the default of 120 seconds (120000 ms). For more information, see http://www-01.ibm.com/support/docview.wss?uid=swg21621488.
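To judge how often the count query times out before you tune the database, you can search the server log for the exception shown above. The following shell sketch is one way to do that. The function names are hypothetical, and the sketch assumes only that each timeout is logged on a line that contains the text NetcoolTimeoutException, as in the sample message.

```shell
#!/bin/sh
# Sketch only: count query timeouts recorded in a Netcool/Impact server log.
# Argument: path to the log file. Output: number of matching lines.
count_query_timeouts() {
    log_file=$1
    # grep -c prints 0 (and exits non-zero) when nothing matches,
    # so the exit status is deliberately ignored here.
    grep -c 'NetcoolTimeoutException' "$log_file" || true
}

# Print the matching log lines so that the filter fields named in the
# timed-out queries can be compared against the existing indexes.
list_query_timeouts() {
    grep 'NetcoolTimeoutException' "$1"
}

# Example (the log path follows the location named in this section):
#   count_query_timeouts "<impact install>/logs/impact_server.log"
```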

The seasonality report stays at 0% complete and does not progress

Within <install>/impact/logs/NCO_policylogger.log, the following trace entry is visible with no later trace entries.
12 Sep 2014 11:23:08,817: [ConfigureResults][pool-3-thread-18]Parser log: About to Add a new Data

This problem occurs if the seasonality services are not started.

To resolve this problem, complete the following steps.
  1. In the Netcool/Impact UI, select the Seasonality project.
  2. Within the Seasonality project, select the Services tab.
  3. In the Services tab, start the following services.
    • StartSeasonalityProcessing
    • ProcessSeasonalityEvents

Seasonality reports or related events configurations hang, with error ATKRST132E logged

When you start cluster members, replication starts and the Netcool/Impact database goes down. Any running seasonality reports or related events configurations hang and this error message is logged in the Netcool/Impact server log.
ATKRST132E An error occurred while transferring a request to the following remote 
provider: 'Impact_NCICLUSTER.server.company.com'. Error Message is 
'Cannot access data provider - Impact_NCICLUSTER.server.company.com'.

To resolve this problem, do a manual restart or a scheduled restart of the affected reports or configurations.

Within the event viewer, you are unable to view seasonal events and error ATKRST103E is logged

If you complete steps like the following, the seasonal events cannot be viewed in the event viewer and error ATKRST103E is logged.
  1. Open the event viewer and select to edit the widget from the widget menu.
  2. From the list on the edit screen, select the Impact Cluster data provider.
  3. Select to view the seasonality report and the report name.
  4. Save the configuration.

To resolve the problem, view seasonal events by using the provided seasonal events pages, and view the parent-to-child relationships of related events by using the Tivoli Netcool/OMNIbus data provider.

Configuring Netcool/Impact for ObjectServer failover

Netcool/Impact does not process new events for Event Analytics after ObjectServer failover. Seasonal event rule actions are not applied if the Netcool/Impact server is not configured correctly for ObjectServer failover as new events are processed. For example, if a seasonal event rule creates a synthetic event, the synthetic event does not appear in the event list, or if a seasonal event rule changes the column value for an event, the value is unchanged.

This problem occurs when Netcool/Impact is incorrectly configured for ObjectServer failover.

To resolve this problem, extra Netcool/Impact configuration is required for ObjectServer failover. To correctly configure Netcool/Impact, complete the steps in the Managing the OMNIbusEventReader with an ObjectServer pair for New Events or Inserts topic in the Netcool/Impact V 7.1.0.3 documentation: https://www.ibm.com/support/knowledgecenter/SSSHYH_7.1.0.12/com.ibm.netcoolimpact.doc/common/dita/ts_serial_value_omnibus_eventreader_failover_failback.html

When configured, Netcool/Impact uses the failover ObjectServer to process the event.

Update the Seasonal Event date range after you upgrade from 7.1.0.2 to 7.1.0.3 or later

After you upgrade from 7.1.0.2 to 7.1.0.3 or later for Netcool/Impact and web GUI, you must update the seasonal event configuration date range. In 7.1.0.2 a seasonal event configuration has a fixed date range. In 7.1.0.3 or later, the default date range is relative.

To update the date range after you upgrade to 7.1.0.3 or later, complete the following steps:
  • Select the seasonal event configuration in the Configure Analytics portlet.
  • Click the fixed date range radio button.
  • Click Save to save the configuration without running, or Save & Run to save and run the configuration.
The correct fixed date range values are imported from 7.1.0.2.

Seasonal report missing information after you upgrade to Netcool/Impact 7.1.0.3 or later

After you upgrade from Netcool/Impact 7.1.0.1 or 7.1.0.2 to Netcool/Impact 7.1.0.3 or later, information is missing from columns in the group table in the View Seasonal Events portlet.

Information is missing from specific columns in the group table that is displayed in the View Seasonal Events portlet after you upgrade. The information that is missing in 7.1.0.3 or later was not displayed in earlier versions of Netcool/Impact and Netcool Operations Insight.

To display the missing information in the columns, rerun the migrated configuration in the latest build.
Note: Rerunning the migrated configuration in the latest build overwrites data that existed for the previously run configuration.

Unable to create patterns for configurations created before you upgraded to Netcool Operations Insight 1.4.0.1 or later

Configurations that were created before you upgraded to Netcool Operations Insight 1.4.0.1 or later used the override global event identity setting, which is defined in the analytics configuration file. You cannot create patterns for groups in these configurations.

After you upgrade to Netcool Operations Insight 1.4.0.1 or later, ensure that the event history database is indexed on the SERVERSERIAL and SERVERNAME fields, or on the equivalent fields that you use. If the historical database is created from the default historical database, the index is in place.

To apply patterns to existing configurations, re-create the configurations in Netcool Operations Insight 1.4.0.1 or later using the original configuration settings and the global event identity.

Warning message related to configuring seasonality event analytics

When you are configuring an event configuration and seasonality event analytics is enabled, the following warning message appears in the log file.

WARNING: Follow this instruction: This error is after the analysis is done. 
The last step is to reinsert the data for UI views.

You login to the Impact UI, go to the Policies tab, and execute the 
following policy:  

SE_CLEANUPDATA 

Following these steps corrects the error and reinserts the data.

Typically, the previous warning message appears in the log file when the seasonality configuration is complete and an error occurred.

To correct the error and reinsert the data, run the SE_CLEANUPDATA policy as follows.

Note: Before you run the SE_CLEANUPDATA policy, increase the value of the impact.server.timeout property, which is defined in the $IMPACT_HOME/etc/ServerName_server.props properties file. Specifically, replace impact.server.timeout=120000 with impact.server.timeout=3600000. The value 3600000 allows 60 minutes, which gives the Apache Derby database more time to process the complex query. You must restart the Netcool/Impact server after you edit the impact.server.timeout property.
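The property change that the note describes can also be scripted. The following shell sketch is a minimal illustration, not an IBM-supplied tool. The function name set_impact_timeout is hypothetical, and the sketch assumes that the properties file holds one key=value pair per line. It keeps a .bak copy of the original file, and the server restart is still a separate manual step.

```shell
#!/bin/sh
# Sketch only: set impact.server.timeout in a server properties file.
# Arguments: path to the properties file, timeout in milliseconds.
set_impact_timeout() {
    props_file=$1
    timeout_ms=$2
    # Keep a copy of the original file before changing it.
    cp "$props_file" "$props_file.bak" || return 1
    if grep -q '^impact\.server\.timeout=' "$props_file.bak"; then
        # The property exists: rewrite its value, leave other lines as they are.
        sed "s/^impact\.server\.timeout=.*/impact.server.timeout=$timeout_ms/" \
            "$props_file.bak" > "$props_file"
    else
        # The property is not present: append it.
        echo "impact.server.timeout=$timeout_ms" >> "$props_file"
    fi
}

# Example (ServerName is the placeholder used in this document):
#   set_impact_timeout "$IMPACT_HOME/etc/ServerName_server.props" 3600000
```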

  1. Log in to the server where IBM Tivoli Netcool/Impact is stored and running. You must log in as the administrator (that is, you must be assigned the ncw_analytics_admin role).
  2. Navigate to the policies tab and search for the SE_CLEANUPDATA policy.
  3. Open this policy by double-clicking it.
  4. Select to run the policy by using the run button on the policy screen toolbar.

The SE_CLEANUPDATA policy cleans up the data. Specifically, the SE_CLEANUPDATA policy:

  • Does not remove or delete any data from the results tables. The results tables hold all the original information about the analysis.
  • Provides some additional views and tables on top of the original tables to enhance performance.
  • Combines some information from related events, seasonal events, rules, and statistics.
  • Cleans up only the additional tables and views.

Unable to run SE_CLEANUPDATA policy

If the Netcool/Impact server timeout is reached when you are running a seasonality report, the following message displays on the Configure Analytics portlet:

Finished with Errors

If you then try to run the SE_CLEANUPDATA policy, the policy locks on the Netcool/Impact server. To work around this issue, you must manually unlock the file that contains the SE_CLEANUPDATA policy. Then, run the SE_CLEANUPDATA policy again. To unlock the SE_CLEANUPDATA policy, follow these steps:

  1. Log in to the server where IBM Tivoli Netcool/Impact is stored and running. You must log in as the administrator (that is, you must be assigned the ncw_analytics_admin role).
  2. Navigate to the policies tab and search for the SE_CLEANUPDATA policy.
  3. Right-click the SE_CLEANUPDATA policy and select unlock from the drop-down menu.

Event Analytics configuration Finished with Warnings

The seasonality report or related events configuration completes with a status of Finished with Warnings. This message indicates that a potential problem was detected but it is not of a critical nature. You should review the log file for more information ($NCHOME/logs/impactserver.log). The following is an example of a warning found in impactserver.log:

11:12:38,366 WARN  [NOIProcessRelatedEvents] WARNING: suggested pattern : RE-sqa122-last36months-Sev3-Default_Suggestion4 includes too many types, could be due to configuration of types/patterns. 
The size of the data execeeded the column limit. The pattern will be dropped as invalid.

Event Analytics configuration Finished with Errors

One reason for an Event Analytics configuration to complete with a status of Finished with Errors is that the numbering of the suggested patterns is not sequential. This can happen, for example, when the pattern type found is invalid or the string is too long to be managed by the Derby database. You should review the log file for more information ($NCHOME/logs/impactserver.log).

The pattern displays 0 groups and 0 events

The events pattern that is created and displayed in the Group Sources table in the View Related Events portlet displays 0 groups and 0 events.

The pattern displays 0 groups and 0 events for one of the following reasons.
  • The pattern creation process is not finished. The pattern creation process can take a long time to complete due to large datasets and high numbers of suggested patterns.
  • The pattern creation process was stopped before it completed.
To confirm the reason that the pattern displays 0 groups and 0 events, complete the following steps.
  1. To confirm that the process is running:
    1. Append the policy name to the policy logger file from the Services tab, Policy Logger service. For more information about configuring the Policy logger, see https://www.ibm.com/support/knowledgecenter/SSSHYH_7.1.0.12/com.ibm.netcoolimpact.doc/user/policy_logger_service_window.html.
    2. Check the following log file.
      $IMPACT_HOME/logs/<serverName>_policylogger_PG_ALLOCATE_PATTERNS_GROUPS.log

    If the log file shows that the process is running, wait for the process to complete. If the log file shows that the process stopped without completing, proceed to step 2.

  2. To force reallocation for all configurations and patterns, run the PG_ALLOCATE_PATTERNS_GROUPS_FORCE policy from Global projects, with no parameters, from the UI.
  3. Monitor the $IMPACT_HOME/logs/<serverName>_policylogger_PG_ALLOCATE_PATTERNS_GROUPS_FORCE.log log file to track the completion of the process.
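The policy logger files in this section follow the naming pattern <serverName>_policylogger_<policyName>.log. As a convenience, the following shell sketch builds that path so that the log can be followed while the process runs. The function name policy_log_path is hypothetical, and the server name NCI in the example is a placeholder.

```shell
#!/bin/sh
# Sketch only: build the policy logger file path for a server and policy.
# Arguments: Impact home directory, server name, policy name.
policy_log_path() {
    impact_home=$1
    server_name=$2
    policy_name=$3
    echo "$impact_home/logs/${server_name}_policylogger_${policy_name}.log"
}

# Example: follow the forced reallocation log as it is written
# (NCI is a placeholder server name):
#   tail -f "$(policy_log_path "$IMPACT_HOME" NCI PG_ALLOCATE_PATTERNS_GROUPS_FORCE)"
```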

Incomplete, stopped, and uninitiated configurations

Configurations do not complete, are stalled on the Configure Analytics portlet, or fail to start.

These problems occur if the services are not started after Event Analytics is installed, or the Netcool/Impact server is restarted.

To resolve these problems, complete the following steps.
  1. In the Netcool/Impact UI, select the Impact Services tab.
  2. Ensure that each of the following services is started. To start a service, right-click the service and select Start.
    • LoadRelatedEventPatterns
    • ProcessClosedPatternInstances
    • ProcessPatternGroupsAllocation
    • ProcessRelatedEventConfig
    • ProcessRelatedEventPatterns
    • ProcessRelatedEventTypes
    • ProcessRelatedEvents
    • ProcessSeasonalityAfterAction
    • ProcessSeasonalityConfig
    • ProcessSeasonalityEvents
    • ProcessSeasonalityNonOccurrence
    • UpdateSeasonalityExpiredRules

Event Analytics: Reports fail to run due to event count queries that take too long

Reports fail to run when large or unoptimized datasets cause the Netcool/Impact server to time out before the reports complete.

To resolve this issue, increase the Netcool/Impact server timeout value to ensure that the Netcool/Impact server processes these events before it times out. As a result of increasing this server timeout value, the Netcool/Impact server waits for the events to be counted, thus ensuring that the reports complete and display in the appropriate portlet.

Edit the value of the Netcool/Impact impact.server.timeout property in the following file:
$IMPACT_HOME/etc/ServerName_server.props

By default, the impact.server.timeout property is set to 120000 milliseconds, which is equal to 2 minutes. The recommendation is to specify a server timeout value of at least 5 minutes. If the issue continues, increase the server timeout value until the reports successfully complete and display in the appropriate portlet.
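Because the property is expressed in milliseconds, it is easy to misjudge whether a configured value meets the five-minute recommendation. The following shell sketch converts minutes to milliseconds and checks a properties file against that minimum. The function names are hypothetical, and the sketch assumes one key=value pair per line. When the property is absent, it falls back to the documented default of 120000.

```shell
#!/bin/sh
# Sketch only: helpers for checking the impact.server.timeout value.

# Convert minutes to the milliseconds that impact.server.timeout uses.
minutes_to_ms() {
    echo $(( $1 * 60 * 1000 ))
}

# Print the timeout configured in a properties file, or the documented
# default of 120000 when the property is not set.
current_timeout_ms() {
    value=$(sed -n 's/^impact\.server\.timeout=\([0-9][0-9]*\).*$/\1/p' "$1" | tail -n 1)
    echo "${value:-120000}"
}

# Succeed when the configured timeout is at least five minutes.
timeout_meets_recommendation() {
    [ "$(current_timeout_ms "$1")" -ge "$(minutes_to_ms 5)" ]
}

# Example (ServerName is the placeholder used in this document):
#   timeout_meets_recommendation "$IMPACT_HOME/etc/ServerName_server.props" \
#       || echo "Increase impact.server.timeout to at least $(minutes_to_ms 5)"
```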

Back up the Apache Derby database before upgrading to Event Analytics 1.4.0.1

Before you upgrade Event Analytics to Netcool Operations Insight 1.4.0.1, create a backup of the Apache Derby database.

To back up the Apache Derby database, complete the following steps.
  1. Stop the ImpactDatabase service.
  2. Back up the files from the $NCHOME/db/<SERVER_NAME>/derby directory.
  3. Start the ImpactDatabase service again.
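The file copy in step 2 of the backup procedure can be done with any tool. As one possibility, the following shell sketch archives the derby directory into a time-stamped tar.gz file. The function name backup_directory, the archive name, and the /backups destination are all hypothetical. The sketch does not stop or start the ImpactDatabase service, so run it only between steps 1 and 3.

```shell
#!/bin/sh
# Sketch only: archive a directory into a time-stamped tar.gz file.
# Arguments: directory to back up, directory to write the archive to.
# Output: path of the archive that was created.
backup_directory() {
    src_dir=$1
    dest_dir=$2
    archive="$dest_dir/derby_backup_$(date +%Y%m%d%H%M%S).tar.gz"
    mkdir -p "$dest_dir" || return 1
    # -C stores paths relative to the parent directory, so the archive
    # contains "derby/..." rather than an absolute path.
    tar -czf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    echo "$archive"
}

# Example, run only while the ImpactDatabase service is stopped
# (<SERVER_NAME> and /backups are placeholders):
#   backup_directory "$NCHOME/db/<SERVER_NAME>/derby" /backups
```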
To restore the Apache Derby database from a backup file, complete the following steps.
  1. Stop the ImpactDatabase service. Make a copy of the existing ImpactDB database before you restore the backup file.
  2. Copy the backup file to the $NCHOME/db/<SERVER_NAME>/derby directory.
  3. Start the ImpactDatabase service again.
To restore an older version of the Apache Derby database after a failure, complete the following steps, based on the upgrade version.
  1. Change to the following directory:
    cd $IMPACT_HOME/add-ons/NOI/db
  2. List the files that reside in the $IMPACT_HOME/add-ons/NOI/db directory. For example:
    ls
    .
    .
    .
    noi_derby_updatefp03.sql
    noi_derby_updatefp04.sql
    noi_derby_updatefp05.sql
    noi_derby_upgrade_71fp2.sql
    .
    .
    .
  3. Make a backup copy of each file before editing.
  4. Using a text editor, open each file for editing and find the following line:
    connect 'jdbc:derby://__PRIMARY_HOST__:__PRIMARY_PORT__/__PRIMARY_DB__;user=__DBUSER__;password=__DBPASSWORD__;';
  5. Change the connection parameters in each of the files.
  6. Write and close each file.
  7. Run the following command:
    $IMPACT_HOME/bin/nci_db connect -sqlfile <one of the sql files>
    Note: Run the SQL files in the following order, through Fix Pack 5, starting with the file that matches the fix pack that you are upgrading from:
    • noi_derby_upgrade_71fp2.sql -- Start here if you are upgrading from Fix Pack 1.
    • noi_derby_updatefp03.sql -- Start here if you are upgrading from Fix Pack 2.
    • noi_derby_updatefp04.sql -- Start here if you are upgrading from Fix Pack 3.
    • noi_derby_updatefp05.sql
  8. Restart the Netcool/Impact server.
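The order in the note under step 7 can be captured in a small helper so that no file is skipped. The following shell sketch is an illustration under stated assumptions: the function name derby_upgrade_files is hypothetical, and it covers only the three starting points that the note names (Fix Pack 1, 2, and 3). The commented loop reuses the nci_db command from step 7 and assumes that the connection parameters in each file are already edited.

```shell
#!/bin/sh
# Sketch only: print the SQL files to run, in order, for the fix pack
# that is currently installed. Argument: 1, 2, or 3.
derby_upgrade_files() {
    case "$1" in
        1) echo "noi_derby_upgrade_71fp2.sql noi_derby_updatefp03.sql noi_derby_updatefp04.sql noi_derby_updatefp05.sql" ;;
        2) echo "noi_derby_updatefp03.sql noi_derby_updatefp04.sql noi_derby_updatefp05.sql" ;;
        3) echo "noi_derby_updatefp04.sql noi_derby_updatefp05.sql" ;;
        *) echo "unsupported starting fix pack: $1" >&2; return 1 ;;
    esac
}

# Example: run every file for an upgrade that starts at Fix Pack 2.
#   cd "$IMPACT_HOME/add-ons/NOI/db"
#   for sql_file in $(derby_upgrade_files 2); do
#       "$IMPACT_HOME/bin/nci_db" connect -sqlfile "$sql_file" || break
#   done
```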

Event pattern with the same criteria already exists (error message)

An error message is displayed if you create a pattern whose criteria duplicate the criteria of an existing pattern. Check the following log file to determine which pattern is the duplicate:
$IMPACT_HOME/logs/<serverName>_policylogger_PG_SAVEPATTERN.log

Related Event Details page is slow to load

To avoid this problem, create an index on the Event History Database for the SERVERSERIAL and SERVERNAME columns.
create index myServerIndex on DB2INST1.REPORTER_STATUS (SERVERSERIAL , SERVERNAME )
It is the responsibility of the database administrator to construct (and maintain) appropriate indexes on the REPORTER history database. The database administrator should review the filter fields for the reports as a basis for an index, and should also review if an index is required for Identity fields.

Export of large Related Event configuration fails

The export of a configuration with more than 2000 Related Event groups fails. An error message is displayed.
Export failed.
An invalid response was received from the server.

To resolve this issue, increase the Java Virtual Machine memory heap size settings from the default values. For Netcool/Impact the default value of the Xmx is 2400 MB. In JVM, Xmx sets the maximum memory heap size. To improve performance, make the heap size larger than the default setting of 2400 MB. For details about increasing the JVM memory heap size, see https://www.ibm.com/support/knowledgecenter/SSSHYH_7.1.0.12/com.ibm.netcoolimpact.doc/admin/imag_monitor_java_memory_status_c.html.

Configuration run time differences between Netcool/Impact fix pack versions

In comparison to previous fix packs, improvements are noticeable in Netcool/Impact V7.1 fix pack 12 to both run time and heap memory use for seasonal event and related event configurations. Apply the latest available fix packs to upgrade to the latest version of Netcool Operations Insight.

Export of Event Analytics reports causes log out of DASH

If Netcool/Impact and DASH are installed on the same server, a user might be logged out of DASH when exporting Event Analytics reports from DASH. The problem occurs when the Download export result link is clicked in DASH. A new browser tab is opened and the DASH user is logged out from DASH.

To avoid this issue, configure SSO between DASH and Netcool/Impact. For more information, see https://www.ibm.com/support/knowledgecenter/SSSHYH_7.1.0.12/com.ibm.netcoolimpact.doc/admin/imag_configure_single_signon.html.

Two or more returned seasonal events appear to be identical

It is possible for events to have the same Node, Summary, and Alert Group but a different Identifier. In this scenario, the event details of two (or more) events can appear to be identical because the Identifier is not displayed in the details.

Display of a Seasonal Event's historical events screen appears to hang or take a long time

Review the Table Description tab of the SQL Data Type Config settings found on the NOI Data Model : ObjectServerHistory<databaseType>ForNOI and remove any columns that are not required by Event Analytics reports. It is possible that the Refresh Fields button on that tab has been selected and, as a result, additional (unwanted) columns are marked for selection/retrieval from the database.
  1. Switch to the NOI Data Model.
  2. Expand the ObjectServerHistory collapsible section appropriate to your historical database type.

    For example, if you are using DB2 as the historical events database then expand the ObjectServerHistoryDB2ForNOI collapsible section.

  3. Edit the SE_HISTORICALEVENTS_DB2 datasource to show the SQL Data Type Config settings.

    If you are using a database other than DB2 then select the appropriate datasource, for example SE_HISTORICALEVENTS_ORACLE for Oracle.

  4. Select the Table Description tab and review the available columns.
  5. To remove a column: select the check box to the left of the column name under the ID column in Table Description and select Delete Selection.
  6. Save any changes.

Event Isolation and Correlation (EIC) page does not load if a non-default cluster name is used

If you use a non-default cluster name, for example NCICLUSTER_Z, the Event Isolation and Correlation page might not load successfully. Complete the following steps to resolve this problem:
  1. Stop the Netcool/Impact GUI server: $IMPACT_HOME/bin/stopGUIServer.sh.
  2. Copy or rename the file with the correct cluster name. For example:
    cp $IMPACT_HOME/opview/displays/NCICLUSTER-EIC_configure.html $IMPACT_HOME/opview/displays/NCICLUSTER_Z-EIC_configure.html
  3. Restart the Netcool/Impact GUI server: $IMPACT_HOME/bin/startGUIServer.sh
  4. Go to the Event Isolation and Correlation page and confirm that the page displays without errors.
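Step 2 can be wrapped in a small helper that derives the target file name from the cluster name. The following shell sketch is one way to do this. The function name copy_eic_display is hypothetical, and the sketch deliberately leaves an existing target file untouched. Stopping and restarting the GUI server remain separate steps.

```shell
#!/bin/sh
# Sketch only: create the EIC display file for a non-default cluster name.
# Arguments: displays directory, cluster name (for example NCICLUSTER_Z).
# Output: path of the display file for that cluster.
copy_eic_display() {
    displays_dir=$1
    cluster_name=$2
    source_file="$displays_dir/NCICLUSTER-EIC_configure.html"
    target_file="$displays_dir/${cluster_name}-EIC_configure.html"
    if [ ! -f "$source_file" ]; then
        echo "missing $source_file" >&2
        return 1
    fi
    # Leave an existing target untouched rather than overwriting it.
    if [ ! -e "$target_file" ]; then
        cp "$source_file" "$target_file" || return 1
    fi
    echo "$target_file"
}

# Example:
#   copy_eic_display "$IMPACT_HOME/opview/displays" NCICLUSTER_Z
```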