Overview: What affects the performance of a Decision Server application

The factors that affect the performance and scalability of an application include infrastructure and hardware, data access, architecture and design, engine scalability, and the choice of rule engine configuration and execution mode.

Satisfactory performance of any application relies on its ability to scale. Scalability is the ability of a system to adapt to changing workloads while maintaining system efficiency. A scalable system can handle a growing amount of work by allocating more resources, typically hardware, to the system. Scalability is often more important than initial performance to the business production cycle.

The following figure illustrates the possibilities to improve performance and scalability.

Diagram showing the four areas to improve performance and scalability

To take advantage of the hardware and to improve the system scalability, consider the following performance aspects:

Infrastructure and hardware

Adding servers, CPUs, and system memory improves system performance.

Architecture and design

In terms of architecture and design, you achieve scalability by using multiple threads, optionally in conjunction with rule engine pools. With rule engine pooling, you can maintain a fixed maximum number of rule engines on the server side. This approach ensures that the performance of the server does not deteriorate drastically when a sudden surge in the number of requests occurs. The requests that are not served by any engine are put on hold and the waiting time for these requests is not significant if the time taken by the engine to execute the rules is negligible.

Pools of rule engines also help shorten the response time to a request because the rules used for the engine pool are typically preloaded and cached by the time the engines in the pool are listening for requests from the application. Time is not spent on loading the rules after a request arrives and the rules can be executed immediately.

Note: Rule Execution Server and the rule session API use a pool of rule engines that work on the classic rule engine and on the decision engine.
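
The pooling behavior described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Rule Execution Server implementation: the `Engine` and `EnginePool` classes and the ruleset name are hypothetical stand-ins. The pool creates a fixed number of engines up front (standing in for preloaded rules), and a request that arrives while all engines are busy waits until one is returned.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class EnginePoolSketch {

    /** Hypothetical stand-in for a rule engine whose ruleset is already loaded. */
    static final class Engine {
        private final String rulesetName;

        Engine(String rulesetName) {
            this.rulesetName = rulesetName;
        }

        String execute(String request) {
            return rulesetName + " handled " + request;
        }
    }

    /** Fixed-size pool: all engines are created before any request arrives. */
    static final class EnginePool {
        private final BlockingQueue<Engine> idle;

        EnginePool(int size, String rulesetName) {
            idle = new ArrayBlockingQueue<>(size);
            for (int i = 0; i < size; i++) {
                idle.add(new Engine(rulesetName));
            }
        }

        /** Waits for an idle engine, so a surge of requests queues up instead of creating engines. */
        String execute(String request) {
            Engine engine;
            try {
                engine = idle.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for an engine", e);
            }
            try {
                return engine.execute(request);
            } finally {
                idle.add(engine); // always return the engine to the pool
            }
        }

        int idleCount() {
            return idle.size();
        }
    }

    public static void main(String[] args) {
        EnginePool pool = new EnginePool(2, "loan-ruleset");
        System.out.println(pool.execute("request-1"));
        System.out.println("Idle engines: " + pool.idleCount());
    }
}
```

Because the pool size is fixed, the number of engines never grows with the request rate; only the waiting time does, which stays small when rule execution is fast.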

You can also improve performance by a careful design of business rules deployment. It is common practice to sort the rules on a particular set of objects and group them into rulesets accordingly.

Such design has more than one benefit:

Tests on conditions are more efficient.
Rulesets can be executed independently of each other. It is not necessary to execute all the rules in the application, including irrelevant rules and objects that take up resources.

RetePlus, sequential, and Fastpath execution modes

The rule engine has three execution modes (see Engine execution modes):

RetePlus

RetePlus is designed to optimize the evaluation of large numbers of rules across large numbers of objects. RetePlus filters the tests such that irrelevant tests are not evaluated. Tests can be shared between rules that use similar tests so that they do not need to be re-evaluated for all the rules. The RetePlus algorithm is based on an inference process and supports forward chaining. RetePlus was the default execution mode for ruleflow tasks in Operational Decision Manager V8.5.1 and earlier. Note that the execution mode is always RetePlus if there is no ruleflow.

Sequential

While RetePlus is incremental and stateful, the sequential execution mode implements stateless pattern matching. The performance of an engine in sequential mode significantly improves if it is provided with large rulesets made of basic but test-intensive homogeneous rules.

Fastpath

Fastpath is the default execution mode for ruleflow tasks. The Fastpath execution mode combines the advantages of RetePlus and sequential processing. Rather than using a simpler approach that reuses previously computed test values, this mode uses the relations that exist between the tests to organize the code structurally.

Like the sequential mode, it efficiently matches objects against large numbers of rules that individually perform simple discriminations or join tests.
Like the RetePlus mode, it can filter and share tests to avoid evaluating irrelevant tests.

Rule organization

The performance of the rule engine always depends on the number of rules and objects. The time taken to process the rules increases with the number of rules and objects involved. The process time depends on the access time: the shorter the access time to the rules, the shorter the process time.

Ruleflow structures are another way to improve processing efficiency. You can use a ruleflow to structure your rules. For example, suppose that you have thousands of rules in a rule task and that there is a way to split these rules into three subsets with little or no intersection. In this case, you can replace the large rule task with three smaller ones and some ruleflow control nodes that select the right rule task. Organizing the rules in smaller rule tasks works with all rule algorithms, and it works even better when the rule tasks use the sequential or Fastpath algorithm. On the other hand, splitting rules into very small sets does not yield good results because the ruleflow control part becomes large and slow.
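
The idea of replacing one large rule task with a control node and several smaller tasks can be sketched as follows. This is an illustration only and does not use the ruleflow API: the `Order` and `Rule` classes, the region-based split, and the rule names are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class RuleflowSplitSketch {

    /** Hypothetical fact type; the region decides which rule subset is relevant. */
    static final class Order {
        final String region;
        final double amount;

        Order(String region, double amount) {
            this.region = region;
            this.amount = amount;
        }
    }

    /** Hypothetical rule: a name plus a condition on the order. */
    static final class Rule {
        final String name;
        final Predicate<Order> condition;

        Rule(String name, Predicate<Order> condition) {
            this.name = name;
            this.condition = condition;
        }
    }

    /** A rule task: tests every rule it contains and returns the names that match. */
    static List<String> runTask(List<Rule> task, Order order) {
        List<String> fired = new ArrayList<>();
        for (Rule rule : task) {
            if (rule.condition.test(order)) {
                fired.add(rule.name);
            }
        }
        return fired;
    }

    /** Control node: selects the one task relevant to the order and runs only that task. */
    static List<String> runRuleflow(Map<String, List<Rule>> tasksByRegion, Order order) {
        return runTask(tasksByRegion.getOrDefault(order.region, List.of()), order);
    }

    public static void main(String[] args) {
        Map<String, List<Rule>> tasks = Map.of(
            "EU", List.of(new Rule("eu-large-order", o -> o.amount > 1000)),
            "US", List.of(new Rule("us-large-order", o -> o.amount > 2000)),
            "APAC", List.of(new Rule("apac-large-order", o -> o.amount > 1500)));
        System.out.println(runRuleflow(tasks, new Order("EU", 1200.0)));
    }
}
```

Each order is tested against one subset instead of all rules. The selection step has its own cost, which is why very small subsets with a large control part stop paying off.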

Note: In the classic rule engine, dynamic rule selection in a rule task always results in slower execution than static rule selection. Therefore, use dynamic rule selection only when it is necessary. See Runtime rule selection.

To improve performance, you can order the rules so that the following goals are met:

Common tests on different objects are shared.
The number of tests carried out is minimized.
Performance degrades when a single evaluation contains too many variable definitions and conditions. See the note in Rule engine object.
The test uses less memory.
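
To make the benefit of sharing a common test concrete, the following sketch wraps one condition so that two rules reuse its result instead of re-evaluating it. It is a generic memoization example, not the engine's internal mechanism. The `SharedTest` class and the sample rules are hypothetical, and the per-fact cache assumes that facts do not change between evaluations.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class SharedTestSketch {

    /** Wraps a condition, caches its result per fact, and counts real evaluations. */
    static final class SharedTest<T> implements Predicate<T> {
        private final Predicate<T> delegate;
        private final Map<T, Boolean> cache = new HashMap<>();
        int evaluations = 0;

        SharedTest(Predicate<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public boolean test(T fact) {
            // Assumption: facts are immutable, so a cached result stays valid.
            return cache.computeIfAbsent(fact, f -> {
                evaluations++;
                return delegate.test(f);
            });
        }
    }

    /** Tests every rule against every fact and counts how many rule conditions hold. */
    static <T> int countFired(List<Predicate<T>> rules, List<T> facts) {
        int fired = 0;
        for (T fact : facts) {
            for (Predicate<T> rule : rules) {
                if (rule.test(fact)) {
                    fired++;
                }
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        SharedTest<Integer> atLeast100 = new SharedTest<>(amount -> amount >= 100);
        // Both rules start with the same test on the same object.
        List<Predicate<Integer>> rules = List.of(
            atLeast100.and(amount -> amount % 2 == 0),
            atLeast100.and(amount -> amount > 500));
        int fired = countFired(rules, List.of(50, 200, 800));
        System.out.println(fired + " conditions held, shared test evaluated "
            + atLeast100.evaluations + " times");
    }
}
```

With two rules and three facts, the common test is requested six times but evaluated only once per fact.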

Dynamic bytecode generation

The classic rule engine supports dynamic bytecode generation to speed up the evaluation of the tests for rules in RetePlus execution mode. Rule tests can be translated directly into Java™ bytecode and integrated into the application to improve performance of the rules. Depending upon the rules, bytecode generation can improve processing performance of the engine by a factor ranging from four to more than ten by bypassing Java introspection. The generated bytecode calls Java members directly in the rule tests. Therefore, the more complex the rules and the more objects in working memory, the bigger the gain. Dynamic bytecode generation also reduces the activity of the garbage collector at run time, which enhances performance.

At run time, the RetePlus rule engine can load the rules as interpreted rules or as generated bytecode. Use dynamic bytecode generation for applications that contain many objects and complex rules. The sequential and Fastpath rule engines always use bytecode generation.
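
The difference between evaluating a test through Java introspection and calling the member directly can be illustrated with a small sketch. It does not show how the engine generates bytecode; it only contrasts the two access styles. The `Customer` class and the age test are hypothetical.

```java
import java.lang.reflect.Method;

public class DirectCallSketch {

    /** Hypothetical business object used in a rule test. */
    public static final class Customer {
        private final int age;

        public Customer(int age) {
            this.age = age;
        }

        public int getAge() {
            return age;
        }
    }

    /** Interpreted style: looks the getter up by name and invokes it through reflection. */
    static boolean isAdultReflective(Object fact) {
        try {
            Method getter = fact.getClass().getMethod("getAge");
            return (Integer) getter.invoke(fact) >= 18;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot evaluate test on " + fact, e);
        }
    }

    /** Compiled style: the test calls the Java member directly. */
    static boolean isAdultDirect(Customer fact) {
        return fact.getAge() >= 18;
    }

    public static void main(String[] args) {
        Customer customer = new Customer(30);
        System.out.println(isAdultReflective(customer));
        System.out.println(isAdultDirect(customer));
    }
}
```

Both methods return the same result. The reflective version pays for a method lookup and boxes the returned value on every call, which adds work for the garbage collector, while the direct version avoids both.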

Rule engine configuration

You can also turn on the following optimization options in a rule engine configuration file:

Hashers and autohashing: Relies on a more efficient internal organization of the memory to improve the performance of equality tests. Using hash tables, you can speed up the evaluation of rules and dramatically reduce the number of combinations that must be evaluated.

Restriction: Hashers and autohashing are supported by the classic rule engine only.
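
The effect of hashing on an equality test can be sketched outside the engine. The example below is an assumption-laden illustration, not the hasher configuration itself: it joins hypothetical customer IDs with the customer IDs carried by orders, once by testing every combination and once by indexing the orders in a hash table first.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HashJoinSketch {

    /** Returns {matches, combinationsEvaluated} when every pair is tested for equality. */
    static int[] joinByScanning(List<String> customerIds, List<String> orderCustomerIds) {
        int matches = 0;
        int evaluated = 0;
        for (String customerId : customerIds) {
            for (String orderCustomerId : orderCustomerIds) {
                evaluated++;
                if (customerId.equals(orderCustomerId)) {
                    matches++;
                }
            }
        }
        return new int[] {matches, evaluated};
    }

    /** Returns {matches, combinationsEvaluated} when orders are indexed by customer ID first. */
    static int[] joinByHashing(List<String> customerIds, List<String> orderCustomerIds) {
        Map<String, Integer> ordersPerCustomer = new HashMap<>();
        for (String orderCustomerId : orderCustomerIds) {
            ordersPerCustomer.merge(orderCustomerId, 1, Integer::sum);
        }
        int matches = 0;
        int evaluated = 0;
        for (String customerId : customerIds) {
            // Only orders in the matching bucket are considered at all.
            int candidates = ordersPerCustomer.getOrDefault(customerId, 0);
            evaluated += candidates;
            matches += candidates;
        }
        return new int[] {matches, evaluated};
    }

    public static void main(String[] args) {
        List<String> customers = List.of("c1", "c2", "c3");
        List<String> orders = List.of("c1", "c1", "c2", "c9");
        int[] scan = joinByScanning(customers, orders);
        int[] hash = joinByHashing(customers, orders);
        System.out.println("scan: " + scan[0] + " matches, " + scan[1] + " combinations");
        System.out.println("hash: " + hash[0] + " matches, " + hash[1] + " combinations");
    }
}
```

Both approaches find the same matches, but the scanning version evaluates every customer and order combination, whereas the hashed version touches only the combinations that can satisfy the equality test.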

Performance demands and application types

The following types of applications have demanding performance requirements:

Transaction processing uses asynchronous communication and requires bursts of transactions to be pushed through.
Online services use synchronous communication and have human-system interaction, which requires support of a large number of concurrent users, short response time, and possibly database access.
Batch processing uses offline processing and dedicated machine/time slots for tasks, and involves database access.

The following figure illustrates the performance characteristics of Decision Server with respect to these application types.

Diagram of Decision Server performance depending on application type