IBM Support

QRadar: Troubleshooting performance for expensive custom rules in 7.5.0 UP2 and later

Troubleshooting


Problem

Custom rules that are not properly tuned can cause performance issues. This article explains how to troubleshoot rule performance issues by using the findExpensiveCustomRules.sh script.

Cause

Rules are processed in the order in which their conditions are listed in the Rule Editor, with the exception of match count rules, which are processed last. A common cause of this issue is rules in the QRadar deployment that need to be tuned. If a rule takes too long to process through the pipeline, it might cause a performance issue. As a result, events can be dropped or routed directly to storage.

Examples of expensive rules:

  • Payload-related tests that use regex-based calls.
  • The X-Force database is large, so suboptimal search conditions might be expensive.
  • The order of conditions in Rule Editor can make the difference between a good rule and an expensive rule.
  • Matching multiple values against a reference set that contains over 100,000 elements.
  • If the asset and port vulnerability databases are large, Host and Port Profiles might be expensive.

Diagnosing The Problem

When events are being dropped by the pipeline, you might see messages similar to the following in the qradar.log file:
 
[[type=com.eventgnosis.system.ThreadedEventProcessor][parent=hostname:ecs-ep/EP/Processor2]] com.q1labs.semsources.cre.CRE: [WARN] [NOT:0080004101][x.x.x.x/- -] [-/- -]Custom Rule Engine has sent a total of 3185672 event(s) directly to storage. 120868 event(s) were sent in the last 60 seconds. Queue is at 2 percent capacity.
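To see how often this warning occurs and how quickly the counters grow, the numbers can be pulled out of the log lines. The following is a minimal sketch, not an IBM-provided tool. The function name storage_warnings is hypothetical, and the /var/log/qradar.log path in the usage comment is an assumption about where the log resides on the appliance. The pattern is based only on the message format shown above.

```shell
#!/bin/sh
# Hypothetical helper: reads qradar.log lines on stdin and prints one summary
# line per "directly to storage" warning from the Custom Rule Engine.
# Lines that do not match the warning format are ignored.
storage_warnings() {
    sed -n 's/.*sent a total of \([0-9][0-9]*\) event(s) directly to storage\. \([0-9][0-9]*\) event(s) were sent in the last \([0-9][0-9]*\) seconds.*/total=\1 recent=\2 window_s=\3/p'
}

# Example usage (log path is an assumption, adjust for your system):
#   storage_warnings < /var/log/qradar.log | tail -n 20
```

A steadily rising recent= value across consecutive warnings suggests that the rule load is ongoing and not a short burst.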

 

Resolving The Problem

What is the findExpensiveCustomRules script?
The findExpensiveCustomRules.sh script is designed to query the QRadar data pipeline and report on the processing statistics from the Custom Rules Engine (CRE). The script monitors metrics and collects statistics on how many events hit each rule and how long it takes to process a rule, along with total, maximum, and average timings. When the script completes, it turns off these performance metrics.
The findExpensiveCustomRules script is a useful tool for creating on-demand reports for rule performance; it is not a tool for tracking historical rule data in QRadar. The core functions of this script are often run when users see drops in events or events routed to storage between components in QRadar.

Part 1: How to run the findExpensiveCustomRules script
  1. SSH in to the QRadar Console as the root user.
  2. Optional. Open an SSH session to the QRadar appliance where the ECS-EP process runs. The following appliance types run ECS-EP, and the log files show the hostname of the appliance that is reporting the issue:
    • QRadar 16xx Event Processor appliances
    • QRadar 17xx Flow Processor appliances
    • QRadar 18xx Combination Event and Flow appliances
    • QRadar 21xx Log Manager appliance
    • QRadar 31xx Consoles
  3. Run the findExpensiveCustomRules script to identify any expensive rules, and tune them as required:
    /opt/qradar/support/findExpensiveCustomRules.sh -d /root
    Report created
    Note: The -d option specifies the directory where findExpensiveCustomRules.sh writes its output.
  4. Use SCP to move the CustomRule-{timestamp} file to your local laptop or workstation.
  5. Use a compression utility to extract the CustomRule-yyyy-mm-dd-seconds.tar.gz file to a .tar file.
  6. Extract the resulting .tar file to access the Expensive Custom Rules report text file. The extracted output contains two files and a reports folder.
    List of Files
  7. Open the CustomRule-yyyy-mm-dd-seconds.txt file in any spreadsheet program as a CSV file.
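
Steps 4 to 6 can also be done from a command line on the workstation. The following is a minimal sketch, assuming scp and tar are available locally. The helper name extract_report and the console-host placeholder are hypothetical, and the /root path matches the -d /root example in step 3. Because tar -xzf decompresses and unpacks in one pass, the intermediate .tar file is not written to disk.

```shell
#!/bin/sh
# Hypothetical helper: unpack a CustomRule-*.tar.gz archive into a directory.
#   $1 = path to the archive
#   $2 = destination directory (defaults to the current directory)
extract_report() {
    archive=$1
    dest=${2:-.}
    # -x extract, -z gunzip on the fly, -f archive file, -C change to dest first
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Example usage (replace <console-host> and the file name with your own):
#   scp root@<console-host>:/root/CustomRule-yyyy-mm-dd-seconds.tar.gz .
#   extract_report CustomRule-yyyy-mm-dd-seconds.tar.gz ./customrule-report
```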
     
Part 2: What to look for in the CustomRule report

QRadar 7.5.0 Update Package 2 and later

  1. Sort the AverageActionsTime column, AverageTestTime column, and AverageResponseTime column to look for large values. These values identify which rules, on average, take more time to run than others.
  2. The values are in milliseconds.
  3. A rule with a value of 0.01 or greater is considered potentially expensive and needs to be reviewed.
  4. If AverageTestTime is high, but the FiredCount is low, the rule might not be what is causing the issue.
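
The same review can be done without a spreadsheet. The following is a minimal sketch, not part of the IBM tooling. It assumes that the report is a simple delimited text file with a header row, that the delimiter is a comma (pass a different one as the third argument if your file differs), that no field contains the delimiter, and that the first column holds the rule name. The function name flag_expensive_rules is hypothetical. The column names and the 0.01 threshold are taken from the steps above.

```shell
#!/bin/sh
# Hypothetical helper: list rules whose largest average timing meets a threshold.
#   $1 = report file, $2 = threshold in ms (default 0.01), $3 = delimiter (default ,)
# Output: worst average time, FiredCount, first column (assumed to be the rule
# name), tab-separated and sorted with the slowest rule first.
flag_expensive_rules() {
    report=$1
    threshold=${2:-0.01}
    sep=${3:-,}
    LC_ALL=C awk -F"$sep" -v thr="$threshold" '
        NR == 1 {
            # Map header names to column positions.
            for (i = 1; i <= NF; i++) col[$i] = i
            if (!("AverageTestTime" in col) || !("AverageActionsTime" in col) ||
                !("AverageResponseTime" in col) || !("FiredCount" in col)) {
                print "expected column not found in header" > "/dev/stderr"
                exit 1
            }
            next
        }
        {
            # Take the largest of the three average timings for this rule.
            worst = $(col["AverageTestTime"]) + 0
            a = $(col["AverageActionsTime"]) + 0
            r = $(col["AverageResponseTime"]) + 0
            if (a > worst) worst = a
            if (r > worst) worst = r
            if (worst >= thr)
                printf "%.4f\t%s\t%s\n", worst, $(col["FiredCount"]), $1
        }
    ' "$report" | LC_ALL=C sort -t "$(printf '\t')" -k1,1nr
}

# Example usage:
#   flag_expensive_rules CustomRule-yyyy-mm-dd-seconds.txt 0.01
```

FiredCount is printed next to the timing so that a rule with a high average time but a low fired count can be deprioritized, as described in step 4.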

Report

QRadar 7.5.0 Update Package 1 and earlier

For earlier versions, follow this link to see the instructions.

Part 3: What to do next
  1. Find the rule in the Offenses tab and disable it. Do not modify or delete the rule until it is proven to be the cause of the issue.
  2. Ensure that the Dashboard notifications no longer indicate warnings.
  3. If the notifications are still occurring, recheck the CustomRule report for any other entries that look suspicious.
  4. If the disabled rule proves to be the cause, modify it to be less expensive or delete it.

Note: The sequence in which the rule tests are laid out can make a difference in performance. Before using a payload test, limit the data you search as much as you can.
       
Part 4: Other issues that can cause Custom Rule Performance Degradation
  • Verify whether any rules are configured as "Global" rules. In some cases, "Global" rules can cause excessive events to be processed by the Console, which causes the Magistrate Process Core (MPC) queue to fill. In this case, change any Global rules to Local rules, if possible.
  • If you are not careful, payload tests can end up scanning every event. Filter on log source, log source type, and possibly an IP address before running payload tests. The order of tests in a rule makes a difference to the CRE; see Regular expression (regex) cases and support policies.
  • If the asset and port vulnerability databases are large, Host and Port Profiles might be expensive. For more information about host and port profiles, see the tuning guide.
  • X-Force lookups can be expensive if the X-Force data feed is enabled.


Document Information

Modified date: 05 April 2024

UID: ibm16953105