IBM Streams 4.3.0

Specifying how operators are fused when you submit a job

When you submit a job, you can optionally specify how the operators in the application are fused into processing elements (PEs). The fusion scheme that you use can affect the runtime performance of the application.

Operators are fused at job submission time, so you can choose the fusion scheme each time you submit a job.

Important: To use these fusion schemes for applications that were compiled with versions earlier than IBM® Streams Version 4.2, you must recompile your applications with IBM Streams Version 4.2 or later. If you do not recompile your applications, they will run using the default behavior for releases earlier than Version 4.2. (The operators will be fused using the legacy scheme.)

You can specify the fusion scheme by using a job configuration overlay file from the IBM Streams Console, IBM Streams Studio, or the streamtool submitjob command. Specify the fusionScheme parameter, which is defined in the deploymentConfig section. To control how parallel regions are fused, specify the fusionType parameter, which is defined in the parallelRegionConfig clause in the deploymentConfig section.

IBM Streams supports the following fusion schemes (fusionScheme):
automatic
If you specify automatic as your fusion scheme, IBM Streams determines the appropriate number of PEs to assign to the job.

Typically, this fusion scheme results in one PE per resource. However, IBM Streams might change the number of PEs produced to avoid creating PEs that contain a very small number of operators, which can have a negative impact on performance.

manual
If you specify manual as your fusion scheme, you can then specify the number of PEs to assign to the job (by specifying a value for the fusionTargetPeCount parameter). The actual number of PEs might vary based on other job configuration constraints, the specifications of the application, and the configuration of the instance where you plan to deploy the application. For example, if you specify a large number of partition ex-location constraints, the resulting application might have more PEs than you expect.
legacy
If you specify legacy as your fusion scheme, the operators are fused the same way that they were before IBM Streams Version 4.2. Typically, each operator is fused into a separate PE if no other placement configuration is specified in the application bundle file.
Restriction: Any fusion scheme that you specify is influenced by the fusion constraints that are specified in the application bundle file and by the fusion constraints that are specified when the job is submitted.

You can specify only one value for the fusion scheme.

If you do not specify a fusion scheme, IBM Streams uses the default fusion scheme from the instance on which you deploy the application. (This is also the fusion scheme that is used if you select Default in the console.) For more information, see Setting the default fusion scheme at the instance or domain level.

For more information about how to specify fusion schemes from the interactive streamtool interface, see streamtool submitjob command.

The following examples show different ways in which you might specify the fusion scheme from the interactive streamtool interface:
  • To use the default fusion scheme that is set at the instance or domain level:
    streamtool submitjob myBundle.sab
  • To set a target of 5 PEs for the job:
    streamtool submitjob -C fusionScheme=manual -C fusionTargetPeCount=5 myBundle.sab
  • To use the fusion scheme from versions before IBM Streams Version 4.2:
    streamtool submitjob -C fusionScheme=legacy myBundle.sab
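
The manual fusion settings from the second example can also be placed in a job configuration overlay file. The following is a minimal sketch, not a definitive reference: the fusionScheme and fusionTargetPeCount parameters and the deploymentConfig section come from this topic, while the top-level jobConfigOverlays array, the --jobConfig option, and the file name fusion-overlay.json are assumptions that you should verify against the overlay and streamtool references for your release.

```shell
#!/bin/sh
# Write an overlay that requests manual fusion with a target of 5 PEs.
# Assumption: overlays are wrapped in a top-level "jobConfigOverlays" array.
cat > fusion-overlay.json <<'EOF'
{
  "jobConfigOverlays": [
    {
      "deploymentConfig": {
        "fusionScheme": "manual",
        "fusionTargetPeCount": 5
      }
    }
  ]
}
EOF

# Submit only when streamtool is available on the PATH.
# Assumption: --jobConfig is the option that names the overlay file.
if command -v streamtool >/dev/null 2>&1; then
  streamtool submitjob --jobConfig fusion-overlay.json myBundle.sab
else
  echo "streamtool not found; wrote fusion-overlay.json only"
fi
```

Keeping the settings in a file makes it easier to reuse the same fusion configuration across submissions. As with the -C form, the actual number of PEs can still differ from the target.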

Specifying how parallel regions are fused

If your application includes parallel regions, you can specify how the operators in the channels in the parallel regions are fused into PEs.
Important:
  • If you specify legacy as your fusion scheme, you cannot specify how parallel regions are fused.
  • If you want the option to change the parallel region width after you submit the job, you must select the channelIsolation option.
IBM Streams supports the following values for the fusionType parameter:
noChannelInfluence (Do not treat parallel regions differently)
(Default) IBM Streams treats the operators in a parallel region the same as the other operators in the application. Inclusion in a parallel region does not have any impact on how the operators are fused.
channelIsolation (Fuse operators in the same channel)
Operators within a channel are fused into a PE only with operators from the same channel. Operators outside the parallel region or from other channels in the same region are fused into different PEs. One or more PEs for a channel can be created depending on the fusion constraints.

You can explicitly colocate an operator in a channel with an operator outside the region or from a different channel in the same region. A separate PE is created if the explicitly colocated operators are not all members of the same channel.

channelExlocation (Prevent fusion across channels)
Operators within a channel are fused with operators from the same channel or with operators from outside the region. Operators from other channels in the same region are not fused into the same PEs. One or more PEs for a channel can be created depending on the fusion constraints.

You can explicitly colocate an operator in a channel with an operator outside the region or from a different channel in the same region. A separate PE is created if the explicitly colocated operators are not all members of the same channel.

If you use this value, you cannot change the width of a parallel region while the job is running. You would need to resubmit the job with the fusionType parameter set to channelIsolation or to noChannelInfluence.
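
As an illustration of where the fusionType parameter sits, the following sketch writes a job configuration overlay file that selects channelIsolation. The nesting of fusionType inside parallelRegionConfig inside deploymentConfig follows the description in this topic. The top-level jobConfigOverlays array, the --jobConfig option, and the file name parallel-fusion-overlay.json are assumptions. The automatic scheme is chosen here only because legacy cannot be combined with a parallel region setting.

```shell
#!/bin/sh
# Write an overlay that fuses operators only with operators from the same channel.
# Assumption: overlays are wrapped in a top-level "jobConfigOverlays" array.
cat > parallel-fusion-overlay.json <<'EOF'
{
  "jobConfigOverlays": [
    {
      "deploymentConfig": {
        "fusionScheme": "automatic",
        "parallelRegionConfig": {
          "fusionType": "channelIsolation"
        }
      }
    }
  ]
}
EOF

# Submit only when streamtool is available on the PATH.
# Assumption: --jobConfig is the option that names the overlay file.
if command -v streamtool >/dev/null 2>&1; then
  streamtool submitjob --jobConfig parallel-fusion-overlay.json myBundle.sab
else
  echo "streamtool not found; wrote parallel-fusion-overlay.json only"
fi
```

To try a different behavior, replace channelIsolation with noChannelInfluence or channelExlocation and resubmit the job.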