The indexing rate is affected by the number of concurrent worker threads that perform text extraction. To achieve an optimal indexing rate, set this number to be as high as system resources permit.
The following table shows the resources that can be overtaxed if the number of concurrent worker threads for a Content Platform Engine server is too high.
Resource | Comment |
---|---|
CPU | Each text extraction worker consumes CPU cycles. Use your operating system tools to monitor CPU consumption by text extraction processes during peak usage. The name of these processes is ibmfndcm. Set the number of concurrent worker threads so that CPU utilization is roughly 70%. |
Memory | Each text extraction worker consumes some amount of memory. The amount depends on the size of the documents that are being indexed. Larger documents require more memory. The consumed memory is free physical memory as opposed to the memory that you allocated to the Content Platform Engine JVM. Free physical memory is used because Content Platform Engine delegates text extraction work to an external process. For information about this external process, see Indexable document types and text extraction. Use your operating system tools to monitor the amount of free physical memory that is available during peak usage. Set the number of concurrent worker threads so that the free physical memory is mostly but not wholly consumed. Important: If the demand for free physical memory by text extraction workers is greater than the available supply, your system can become swamped and unusable. |
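The guidance in the table can be sketched as a small decision helper. This is only an illustration, not part of Content Platform Engine: the function name, the 5-point tolerance band around the 70% CPU target, and the 1024 MB free-memory reserve are assumptions chosen for the example, and the measurements themselves are expected to come from your operating system tools.

```python
def suggest_worker_adjustment(
    cpu_utilization_pct: float,
    free_memory_mb: float,
    *,
    cpu_target_pct: float = 70.0,        # target from the table above
    cpu_tolerance_pct: float = 5.0,      # assumed tolerance band, not an IBM value
    min_free_memory_mb: float = 1024.0,  # assumed safety reserve, not an IBM value
) -> str:
    """Suggest how to change the number of concurrent text extraction workers.

    Inputs are peak-usage measurements taken with operating system tools.
    Returns "decrease", "increase", or "hold".
    """
    # Memory exhaustion can make the system unusable, so check it first.
    if free_memory_mb < min_free_memory_mb:
        return "decrease"
    # CPU is above the target band: too many workers.
    if cpu_utilization_pct > cpu_target_pct + cpu_tolerance_pct:
        return "decrease"
    # CPU is below the target band and memory is still available.
    if cpu_utilization_pct < cpu_target_pct - cpu_tolerance_pct:
        return "increase"
    return "hold"


if __name__ == "__main__":
    # Hypothetical peak-usage measurements.
    print(suggest_worker_adjustment(cpu_utilization_pct=40.0, free_memory_mb=8000.0))
```

Checking memory before CPU reflects the warning in the table: running out of free physical memory is the more damaging failure, so it takes priority over reaching the CPU target.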
If the indexing rate is low even though the number of concurrent worker threads is as high as system resources permit, other tuning might be required. For more information, see Parameters that influence performance.
In the administration console, the following properties determine the maximum number of concurrent text extraction workers per Content Platform Engine server:
Property | Effect on number of workers per server |
---|---|
Maximum worker threads for extracting | This property directly determines the maximum number of concurrent text extraction workers per server. For example, if you set the value of this property to 20, the maximum number of concurrent text extraction workers per server is 20. |
Maximum worker threads per batch for extracting | This property determines the number of concurrent workers for the server in the following way: Concurrent workers = (value of this property) * (number of concurrent batches). For example, suppose that the server is concurrently processing three index batches. If the value of this property is 2, the number of concurrent text extraction workers is 6. Per index area, the number of index batches that the server can concurrently process is limited by the Maximum worker threads for indexing property. The maximum number of concurrent batches for the server is therefore (value of Maximum worker threads for indexing) * (number of index areas). For example, if these two factors are 4 and 3 and the value of this property is 2, the maximum number of concurrent text extraction workers for the server is given by the following calculation: 2 * 4 * 3 = 24 maximum number of concurrent workers. |
As described, these properties determine the maximum number of text extraction workers either directly or indirectly. The lesser maximum determines the actual maximum for the server. For example, if the directly determined maximum is 20, and the indirectly determined maximum is 24, the actual maximum is 20.
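The arithmetic above can be written out as a short sketch. The function and parameter names are hypothetical and do not correspond to any Content Platform Engine API. The assignment of 4 to the per-index-area batch limit and 3 to the number of index areas is also an assumption; the product is the same either way.

```python
def indirect_max_workers(
    threads_per_batch: int,
    max_batches_per_index_area: int,
    index_areas: int,
) -> int:
    """Maximum implied by 'Maximum worker threads per batch for extracting'.

    max_batches_per_index_area stands in for the limit set by
    'Maximum worker threads for indexing'.
    """
    return threads_per_batch * max_batches_per_index_area * index_areas


def effective_max_workers(
    direct_max: int,
    threads_per_batch: int,
    max_batches_per_index_area: int,
    index_areas: int,
) -> int:
    """The lesser of the direct and indirect maximums applies to the server."""
    indirect = indirect_max_workers(
        threads_per_batch, max_batches_per_index_area, index_areas
    )
    return min(direct_max, indirect)


if __name__ == "__main__":
    # Values from the example in the text.
    print(indirect_max_workers(2, 4, 3))        # 24
    print(effective_max_workers(20, 2, 4, 3))   # 20
```

If the direct maximum were raised to 30 in this example, the indirect maximum of 24 would become the limiting value instead.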