What's New in Version 4.2
For updates that are not associated with new features, see Documentation updates for IBM Streams Version 4.2.
New features for Version 4.2 Fix Pack 1 (Version 4.2.0.1)
- Configuring strong encryption for Kerberos: The procedure to enable strong encryption for the streamtool command-line interface, the Domain Manager, and the Streams Console no longer requires that you manually install the Unrestricted SDK Java™ Cryptography Extension (JCE) policy files.
New features for Version 4.2
- New version management and rolling upgrade options for IBM Streams
- New option for setting up the IBM Streams domain controller service as an unregistered service
- Support for restricting access to IBM Streams resources
- Support for Kerberos authentication
- Support for encrypted PE connections
- Support for using Apache Edgent with IBM Streams
- Support for using Hyperstate Accelerator as a checkpoint data store for IBM Streams
- Support for developing IBM Streams applications with Python
- Support for compiling ODM rules into SPL for use in IBM Streams applications
- New submission-time fusion option for better control over how jobs run
- Improved threading model for better performance
- Support for nested user-defined parallelism
- Support for asynchronous non-blocking checkpointing of operator states
- New SPL compiler option for building optimized code
- Toolkit updates
- New and changed streamtool commands
- Serviceability enhancements
New version management and rolling upgrade options for IBM Streams
IBM Streams Version 4.2 provides the foundation for managed version and rolling upgrade support.
Managed version support enables you to upgrade a domain and its instances independent of each other. Rolling upgrade support enables you to upgrade a domain or instance while it is running.
Version 4.2 provides the foundation for version management and rolling upgrade support because it is the earliest supported version for running an instance at a different version than its domain and the first version from which a rolling upgrade can be performed.
New option for setting up the IBM Streams domain controller service as an unregistered service
A domain controller service runs on every resource in an IBM Streams domain and manages all of the other services on that resource. In previous versions of IBM Streams, the only option for running the domain controller service in a highly available environment was as a registered Linux system service.
Beginning in Version 4.2, the domain controller service can run as an unregistered service, which can be started by a root or non-root user. You can set up the domain controller service as a registered Linux system service or an unregistered service when you configure the resources in an IBM Streams domain.
Support for restricting access to IBM Streams resources
Beginning in Version 4.2, IBM Streams provides tags that you can use to restrict access to a resource. This can be helpful if some resources have special capabilities, such as access to a private data source.
Support for Kerberos authentication
Beginning in Version 4.2, you can customize IBM Streams user authentication by using Kerberos.
Kerberos is a network authentication protocol developed by the Massachusetts Institute of Technology (MIT). The Kerberos protocol uses secret-key cryptography to provide secure communications over a non-secure network. Primary benefits are strong encryption and single sign-on (SSO).
Support for encrypted PE connections
Beginning in Version 4.2, you can use the instance.transportSecurityType property to enable or disable encrypted connections between processing elements (PEs). By default, connections between PEs are not encrypted.
Support for using Apache Edgent with IBM Streams
Apache Edgent is an open source programming model and runtime environment for performing analytics on edge devices. An Apache Edgent application can perform simple analytics on an edge device without transmitting unnecessary data to your central analytics engine. To perform more complex analytics, you can connect the Apache Edgent application to your IBM Streams environment.
Support for using Hyperstate Accelerator as a checkpoint data store for IBM Streams
Hyperstate Accelerator is a hardware-accelerated key-value store (KVS) that is included with IBM Streams Version 4.2. You can use Hyperstate Accelerator as a checkpoint data store for IBM Streams domains and instances.
Hyperstate Accelerator is aimed at real-time analytics where high throughput, low latency, and high availability between tasks are required while the state of the system must still be remembered after a system restart. To provide superior performance, Hyperstate Accelerator optionally uses remote direct memory access over Converged Ethernet (RoCE) for fast network access to the data, and IBM FlashSystem® for persisting data on disk to survive system restarts and for failure recovery.
Support for developing IBM Streams applications with Python
Python is a popular language with a large and comprehensive standard library as well as many third-party libraries. The new IBM Streams Python Application API, which is included in the Topology Toolkit, enables you to create streams processing applications using Python callable classes or functions.
The Python Application API supports common operations, such as source, filter, transform, parallel, union, sink, publish, and subscribe.
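The following minimal sketch shows the kind of Python callables that such an application is built from: a source function, a filter written as a callable class, and a transform function. The names readings, AboveThreshold, square, and the application name are hypothetical. The build_topology function assumes that the streamsx package from the Topology Toolkit is installed and that Topology, source, filter, transform, and print behave as in that package's documentation; verify the calls against your toolkit version.

```python
import itertools


def readings():
    """Source callable: returns an iterable of tuples (here, plain integers)."""
    return itertools.islice(itertools.count(1), 10)


class AboveThreshold:
    """Filter written as a callable class so that the threshold is configurable."""

    def __init__(self, threshold):
        self.threshold = threshold

    def __call__(self, value):
        return value > self.threshold


def square(value):
    """Transform callable: maps one input tuple to one output tuple."""
    return value * value


def build_topology():
    """Wire the callables into a topology. Assumes the streamsx package is installed."""
    from streamsx.topology.topology import Topology

    topo = Topology("whats_new_sketch")  # hypothetical application name
    stream = topo.source(readings)
    stream = stream.filter(AboveThreshold(5))
    stream = stream.transform(square)
    stream.print()
    return topo


# Exercise the callables locally, without an IBM Streams installation.
local_result = [square(v) for v in readings() if AboveThreshold(5)(v)]
print(local_result)
```

Because the callables are ordinary Python, you can unit test them in this way before you submit the application to an IBM Streams instance.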
Support for compiling ODM rules into SPL for use in IBM Streams applications
IBM Operational Decision Manager (ODM) allows developers and business analysts to create business rules, construct rule flows, and create and deploy rules applications in ODM. In previous versions of IBM Streams, the Rules Toolkit allowed running ODM rules in an ODM installation against streaming data in an IBM Streams application.
The new Rules Compiler and Rules Compiler Toolkit enable you to convert business rules that are written in ODM into an SPL composite that can be incorporated into IBM Streams applications. This provides superior performance compared to the existing Rules Toolkit and does not require an ODM installation.
In addition, the new rules development support in Streams Studio enables developers and business analysts to create rules, convert them into SPL, and use them in their IBM Streams application from within a single development environment.
New submission-time fusion option for better control over how jobs run
In previous versions of IBM Streams, the placement or fusion of operators into processing elements (PEs) was determined when you compiled your application. If an application contained many operators and each operator was fused into a separate PE, performance could be affected at run time. The only way to change how operators were fused was to change the application source and recompile the application.
By using submission-time fusion, you can control how operators are fused into PEs when you submit a job, which can improve runtime performance. By using new job configuration options, you can also define how job submissions are performed in your specific environment without having to recompile the application. You can use the Streams Console, Streams Studio, and the streamtool previewsubmitjob command to obtain information about a job submission before you submit the job.
If you are using existing placement configs to control fusion in an application, those placement configs are still supported.
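To illustrate the idea behind fusion, the following toy model packs the same list of operators into a different number of PEs without changing the application itself. It is a sketch for illustration only and is not the fusion algorithm that IBM Streams uses; the fuse function and the operator names are hypothetical.

```python
def fuse(operators, pe_count):
    """Assign operators to at most pe_count PEs in contiguous, near-equal groups."""
    if pe_count < 1:
        raise ValueError("pe_count must be at least 1")
    if not operators:
        return []
    # There cannot be more PEs than operators.
    pe_count = min(pe_count, len(operators))
    base, extra = divmod(len(operators), pe_count)
    pes, start = [], 0
    for index in range(pe_count):
        # The first `extra` PEs each take one additional operator.
        size = base + (1 if index < extra else 0)
        pes.append(operators[start:start + size])
        start += size
    return pes


ops = ["Source", "Parse", "Filter", "Enrich", "Aggregate", "Sink"]
print(fuse(ops, 6))  # one operator per PE
print(fuse(ops, 2))  # the same operators fused into two PEs
```

In this model, choosing the PE count at submission time corresponds to calling fuse with a different pe_count on an unchanged operator list.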
Improved threading model for better performance
With the improved threading model, you can improve application performance by manually configuring PE threading or by having the system determine threading behavior automatically.
Support for nested user-defined parallelism
Beginning in Version 4.2, IBM Streams supports nested user-defined parallelism, which allows parallel regions to contain other parallel regions in your IBM Streams applications. Because parallel regions can be nested, toolkit developers can incorporate user-defined parallelism (UDP) into their operators while still allowing those operators to be used inside parallel regions.
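As a rough illustration of how nesting multiplies replicas, the following sketch enumerates the channel coordinates of an operator that sits inside nested parallel regions. It assumes that an inner region is replicated once for each channel of its enclosing region; the channel_paths function is hypothetical and is not part of any IBM Streams API.

```python
from itertools import product


def channel_paths(widths):
    """Enumerate replica coordinates for regions listed outermost first."""
    return list(product(*(range(width) for width in widths)))


# An outer region of width 3 that contains an inner region of width 2.
paths = channel_paths([3, 2])
print(len(paths))
print(paths)
```

Each coordinate pair identifies one replica of an operator in the inner region: the first value is the outer channel and the second value is the inner channel.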
Support for asynchronous non-blocking checkpointing of operator states
In IBM Streams Version 4.2, you can implement asynchronous non-blocking checkpointing in stateful primitive operators. Non-blocking checkpointing of operator state data reduces the time that the tuple flow is blocked during checkpointing.
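The general technique can be sketched as follows: the operator blocks tuple processing only while it copies its state, and then writes the copy to the data store in the background. This is a generic Python illustration of the pattern, not the IBM Streams operator API; the CounterOperator class and its in-memory store are hypothetical stand-ins.

```python
import copy
import threading


class CounterOperator:
    """Toy stateful operator that checkpoints without blocking for the write."""

    def __init__(self):
        self._lock = threading.Lock()
        self.counts = {}
        self.store = {}  # stand-in for a checkpoint data store

    def process(self, key):
        with self._lock:
            self.counts[key] = self.counts.get(key, 0) + 1

    def checkpoint(self, checkpoint_id):
        # Block tuple processing only long enough to copy the state.
        with self._lock:
            snapshot = copy.deepcopy(self.counts)
        # Persist the copy in the background while processing continues.
        writer = threading.Thread(
            target=self._persist, args=(checkpoint_id, snapshot)
        )
        writer.start()
        return writer

    def _persist(self, checkpoint_id, snapshot):
        self.store[checkpoint_id] = snapshot


op = CounterOperator()
for key in ["a", "b", "a"]:
    op.process(key)
writer = op.checkpoint(1)
op.process("a")  # arrives after the snapshot, so it is not in checkpoint 1
writer.join()
print(op.store[1])
print(op.counts)
```

The checkpoint reflects the state at the moment of the snapshot, while the live state continues to advance.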
New SPL compiler option for building optimized code
Beginning in IBM Streams Version 4.2, the SPL compiler builds optimized code by default. Optimizing the code disables SPL assertions.
To disable code optimization, use the new --no-optimized-code-generation option of the sc command. You can still use the -a (--optimized-code-generation) option of the sc command to enable code optimization.
Toolkit updates
The following toolkit updates are included in IBM Streams Version 4.2:
- Rules Compiler Toolkit: You can use this new toolkit and the Rules Compiler to convert business rules that are written in ODM into SPL that can be used in IBM Streams applications.
- Topology Toolkit: You can now use this toolkit to develop IBM Streams applications with Python.
- SPL standard toolkit: The filter parameter on the Import operator supports additional data types: rstring, float64, float32, int64, int32, int16, int8, uint64, uint32, uint16, uint8, and boolean literal value. For more information, see the SPL standard toolkit documentation.
- Requirements and restrictions for several of the specialized toolkits are updated.
- Migration requirements for applications that use the DPS, HBase, and TimeSeries toolkits are added. For more information, see the "Migrating applications" section for your version in the migration guidelines.
New and changed streamtool commands
IBM Streams Version 4.2 includes several new streamtool commands, such as previewsubmitjob, getinstancestate, and history. New commands related to devices and application configurations support the Apache Edgent integration. New commands were also added for the rolling upgrade, Kerberos authentication, and running the domain controller service as an unregistered service.
Additional updates to existing streamtool commands now let you specify a job name. Commands that support specifying a job name include canceljob, getapplicationlog, lsjobs, lspes, lsrestartrecs, restartpe, getjobtopology, and updatepe, among others. The --numresources option was updated to support explicitly requested resources.
There are also new and updated commands for the Version 4.2 serviceability enhancements.
For more information about streamtool commands, see the Command reference or enter streamtool man command-name.
Serviceability enhancements
Enhancement | Learn more... |
---|---|
The minimum Linux user limit (ulimit) value requirements for IBM Streams are increased, and the documentation is improved. | Guidelines for configuring Linux ulimit settings for IBM Streams |
Additional system compatibility checks are added to improve the detection of configuration issues. For example, name resolution and firewall checks are added for environments with multiple resources. | Planning roadmap |
The streamtool getlog command collects more information that can help you debug issues with IBM Streams. For example, the command now collects output that is similar to the output provided by other streamtool commands. | Log and trace services |
IBM Streams provides metrics to help evaluate the health of IBM Streams services, to aid in diagnosing performance issues, and to analyze throughput of requests. You can use the Streams Console or streamtool commands to view the metrics data. | Metrics; Administering a domain in the Streams Console; streamtool checkdomainmetrics command |
The domain.jvmSizeComputationEnabled property is added and is set to true by default. If you do not explicitly set the maximum JVM size, this property controls whether IBM Streams tries to select a maximum JVM size based on system memory usage. The default JVM size for all IBM Streams services is increased to 1024 megabytes. You can increase the JVM size by using domain and instance properties. If you are running the domain controller service as a Linux system service, you can also increase the JVM size for the controller by using the streamtool registerdomainhost, chdomainhostconfig, getdomainhostconfig, and rmdomainhostconfig commands. | Configuration settings for Java memory issues |
You can monitor ZooKeeper performance by using the Streams Console or the streamtool checkzk command. You can reset the ZooKeeper server and connection statistics for the ZooKeeper ensemble by using the Streams Console or the streamtool resetzkstat command. | Administering a domain in the Streams Console |
Documentation for configuring audit logging is improved. | Configuring audit logging for IBM Streams |
Documentation for configuring IBM Streams to log events and messages in the Linux system log is improved. | Logging events and messages in the Linux system log |
Documentation about the information that you need to gather before contacting IBM Support is improved and includes links to additional IBM Streams resources. | A link to the new IBM Streams Problem Must Gather Information Technote is added to the Before contacting IBM Support procedure in the product documentation. |