What's new in IBM Cloud Pak for Data?

See what new features and improvements are available in the latest release of IBM Cloud Pak® for Data.

Version 4.7.4

Released: October 2023

This release of Cloud Pak for Data is primarily focused on defect fixes. However, this release includes new features for DataStage® and Watson™ Assistant.

Restriction: Cloud Pak for Data Version 4.7.4 must be installed on Red Hat® OpenShift® Container Platform Version 4.12.0 or later fixes.

If you are running Red Hat OpenShift Container Platform Version 4.10, you must upgrade your cluster before you upgrade Cloud Pak for Data.

Software Version What does it mean for me?
Cloud Pak for Data platform 4.7.4

Version 4.7.4 includes various fixes to the platform operator.

Cloud Pak for Data command-line interface (cpd-cli) 13.0.4

Version 13.0.4 of the Cloud Pak for Data command-line interface includes various fixes.

For details, see What's new and changed in the Cloud Pak for Data command-line interface.

Related documentation:
IBM/cpd-cli repository on GitHub
Cloud Pak for Data scheduling service 1.17.0

Version 1.17.0 of the scheduling service includes various fixes.

For details, see What's new and changed in the scheduling service.

Data Replication 4.7.4

Version 4.7.4 of the Data Replication service includes various fixes.

For details, see What's new and changed in Data Replication.

Related documentation:
Data Replication
DataStage 4.7.4

The 4.7.4 release of DataStage includes the following features and updates:

View validation messages for SQL Pushdown in the DataStage canvas console
Debug your flows more easily by viewing messages about which stages and properties successfully ran in ELT mode.
Use arguments as local variables in your migrated BASIC functions
Now when you migrate a DataStage flow that contains BASIC functions, the arguments are stored as local variables so that you can use them directly in your rewritten scripts. For more information, see Migrating Basic Routines.
Monitor jobs and view the run metrics in the DataStage canvas
You can now view the run metrics for a DataStage flow and for each of its links and stages. You can use these run metrics to see whether your jobs are complete, in progress, or failed. For more information, see Creating and managing DataStage jobs.
Use the Expression builder in the Modify stage
You can now access conversion functions in the Expression builder when you write specifications for the Modify stage. For more information, see Modify stage.

Version 4.7.4 of the DataStage service includes various fixes.

For details, see What's new and changed in DataStage.

Related documentation:
DataStage
IBM® Match 360 3.2.38

Version 3.2.38 of the IBM Match 360 service includes various fixes.

For details, see What's new and changed in IBM Match 360.

Related documentation:
IBM Match 360 with Watson
Watson Assistant 4.7.4

The 4.7.4 release of Watson Assistant includes the following features and updates:

Session history summary
You can use the session_history variable to store a summary of the recent messages from a conversation for each customer. You might use session history to provide a summary of the conversation during a transfer to a live agent, or you might use it to call a generative AI extension to generate an answer. For more information, see Session history in the watsonx Assistant documentation.
Specify how often to use No action matches
You can use a new global setting for actions to change how often your assistant routes customers to the No action matches action. By setting this threshold, you can specify when the assistant fetches answers from a search integration, triggers the Fallback action, or switches topics. For more information, see When the assistant can't understand your customer's request in the watsonx Assistant documentation.
See who last edited a collection or action
Now you can see who last edited a collection or action. On the Actions page, you can hover on the values in the Last edited column to see the email address of the person who last modified the collection or action.
Change to multilingual downloads for translation
The ID values that are used in the multilingual downloads for translation are changed. If you used the multilingual download before, you need to download a new CSV file to match the IDs in your assistant. For more information, see Using multilingual downloads for translation in the watsonx Assistant documentation.
Algorithm version provides improved intent detection and action matching for more languages

The algorithm version preview now provides improved intent detection and action matching for Arabic, Chinese (Simplified), Chinese (Traditional), Czech, Dutch, Italian, Japanese, and Korean. It includes a new foundation model that is trained using a transformer architecture to improve intent detection and action matching.

For more information, see Algorithm version and training in the watsonx Assistant documentation.

Version 4.7.4 of the Watson Assistant service includes various security fixes.

Related documentation:
Watson Assistant

Version 4.7.3

Released: September 2023

This release of Cloud Pak for Data is primarily focused on defect fixes. However, this release includes new features for services such as DataStage, OpenPages®, Watson Knowledge Catalog, and Watson Query.

This release also includes the following changes:
  • Removal of the Graph feature in Db2® and Db2 Warehouse.
  • Separation of MANTA Automated Data Lineage into a separately installed and upgraded service.
Software Version What does it mean for me?
Cloud Pak for Data platform 4.7.3

Version 4.7.3 of the platform includes various fixes.

For details, see What's new and changed in the platform.

Cloud Pak for Data command-line interface (cpd-cli) 13.0.3

The 13.0.3 release of the Cloud Pak for Data command-line interface includes the following features and updates:

New cpd-cli manage command
You can use the new cpd-cli manage get-k8s-details command to get information about the Kubernetes resources that are associated with your IBM Cloud Pak for Data deployment.

Version 13.0.3 of the Cloud Pak for Data command-line interface includes various fixes.

Related documentation:
IBM/cpd-cli repository on GitHub
Cloud Pak for Data common core services 7.3.0
The 7.3.0 release of the common core services includes changes to support features and updates in Watson Studio and Watson Knowledge Catalog.
Version 7.3.0 of the common core services includes the following features and updates:
Access lakehouse data with the new watsonx.data connection
You can now use the watsonx.data connection to access data from a lakehouse in IBM Cloud or in IBM Cloud Pak for Data. watsonx.data is a governed data lakehouse that is optimized for data and AI workloads. For more information, see watsonx.data connection.
Access additional technical and lineage metadata
You can now use the new Oracle Data Integrator connector to import ETL jobs and their lineage metadata. For more information, see Oracle Data Integrator connection.

Version 7.3.0 of the common core services includes various fixes.

For details, see What's new and changed in the common core services.

If you install or upgrade a service that requires the common core services, the common core services will also be installed or upgraded.

Cloud Pak for Data scheduling service 1.16.0

Version 1.16.0 of the scheduling service includes various fixes.

For details, see What's new and changed in the scheduling service.

AI Factsheets 4.7.3

The 4.7.3 release of AI Factsheets includes the following features and updates:

Changes to OpenPages field group names
The field group names that you use to share governance information between OpenPages and AI Factsheets changed.
  • If you are already sharing governance facts between AI Factsheets and OpenPages, the previous field group names will persist in your use cases.
  • If you create new integrations, the new field group names are used.

For more information, see Configuring AI Factsheets with OpenPages.

Version 4.7.3 of the AI Factsheets service includes various fixes.

Related documentation:
AI Factsheets
Analytics Engine powered by Apache Spark 4.7.3

Version 4.7.3 of the Analytics Engine powered by Apache Spark service includes various fixes.

For details, see What's new and changed in Analytics Engine powered by Apache Spark.

Related documentation:
Analytics Engine powered by Apache Spark
Cognos® Analytics 24.3.0

The 24.3.0 release of Cognos Analytics includes the following features and updates:

Updated software version
This release of the Cognos Analytics service provides Version 11.2.4 Fix Pack 2 + IF 1013 of the Cognos Analytics software. For details, see Release 11.2.4 FP2 - New and changed features in the Cognos Analytics documentation.
Updated JDBC driver version for Db2
This release of the Cognos Analytics service updates the Db2 JDBC driver to Version 4.32.45.

Version 24.3.0 of the Cognos Analytics service includes various fixes.

Related documentation:
Cognos Analytics
Cognos Dashboards 4.7.3

The 4.7.3 release of Cognos Dashboards includes the following features and updates:

Updated software version
This release of the Cognos Dashboards service provides Version 12.0.1 of the Cognos Analytics dashboards software. For details, see Release 12.0.1 - Dashboards in the Cognos Analytics documentation.

Version 4.7.3 of the Cognos Dashboards service includes various fixes.

For details, see What's new and changed in Cognos Dashboards.

Related documentation:
Cognos Dashboards
Data Privacy 4.7.3

Version 4.7.3 of the Data Privacy service includes various fixes.

For details, see What's new and changed in Data Privacy.

Related documentation:
Data Privacy
Data Replication 4.7.3

Version 4.7.3 of the Data Replication service includes various fixes.

For details, see What's new and changed in Data Replication.

Related documentation:
Data Replication
DataStage 4.7.3

The 4.7.3 release of DataStage includes the following features and updates:

Set materialization policy for SQL Pushdown on the DataStage canvas
You can now set the materialization policy for your queries when you use ELT run mode. For more information, see ELT run mode in DataStage.
Connect to Google BigQuery data with ODBC
The ODBC connection now includes the Google BigQuery data source.

For the full list of data sources that are available for the ODBC connection in DataStage, see ODBC connection.

Use a proxy server for the Salesforce (optimized) connection
You can now select a proxy server for the Salesforce (optimized) connection. A proxy server can provide load balancing, increased security, and privacy for the connection. For details, see Salesforce (optimized) connection.

Version 4.7.3 of the DataStage service includes various fixes.

For details, see What's new and changed in DataStage.

Related documentation:
DataStage
Db2 4.7.3

The 4.7.3 release of Db2 includes the following features and updates:

Removal of Db2 Graph
Db2 Graph is not available on Cloud Pak for Data Version 4.7.3. Any existing deployments of Db2 Graph will be removed when you upgrade to Version 4.7.3.

If you want to continue to use Db2 Graph, do not upgrade to Version 4.7.3.

Version 4.7.3 of the Db2 service includes various fixes.

For details, see What's new and changed in Db2.

Related documentation:
Db2
Db2 Big SQL 7.5.3

Version 7.5.3 of the Db2 Big SQL service includes various fixes.

Related documentation:
Db2 Big SQL
Db2 Data Management Console 4.7.3

Version 4.7.3 of the Db2 Data Management Console service includes various fixes.

For details, see What's new and changed in Db2 Data Management Console.

Related documentation:
Db2 Data Management Console
Db2 Warehouse 4.7.3

The 4.7.3 release of Db2 Warehouse includes the following features and updates:

Removal of Db2 Graph
Db2 Graph is not available on Cloud Pak for Data Version 4.7.3. Any existing deployments of Db2 Graph will be removed when you upgrade to Version 4.7.3.

If you want to continue to use Db2 Graph, do not upgrade to Version 4.7.3.

Version 4.7.3 of the Db2 Warehouse service includes various fixes.

For details, see What's new and changed in Db2 Warehouse.

Related documentation:
Db2 Warehouse
Decision Optimization 7.3.0

Version 7.3.0 of the Decision Optimization service includes various fixes.

For details, see What's new and changed in Decision Optimization.

Related documentation:
Decision Optimization
EDB Postgres 15.4, 14.9, 13.12, 12.16

This release of the EDB Postgres service includes various fixes.

Related documentation:
EDB Postgres
Execution Engine for Apache Hadoop 4.7.3

Version 4.7.3 of the Execution Engine for Apache Hadoop service includes various fixes.

For details, see What's new and changed in Execution Engine for Apache Hadoop.

Related documentation:
Execution Engine for Apache Hadoop
MANTA Automated Data Lineage 41.1.0

In previous releases of Cloud Pak for Data, MANTA Automated Data Lineage was an optional component of Watson Knowledge Catalog. Starting in Cloud Pak for Data Version 4.7.3, MANTA Automated Data Lineage is a separately installed service. For more information, see MANTA Automated Data Lineage.

Version 41.1.0 of the MANTA Automated Data Lineage service includes various fixes.

For details, see What's new and changed in MANTA Automated Data Lineage.

Related documentation:
MANTA Automated Data Lineage
OpenPages 9.000.0

The 9.000.0 release of the OpenPages service includes features and updates from OpenPages Version 9.0.0.0. Some of the highlights include:

New service integrations and use cases for custom machine learning models
You can now use the following AI services when you configure your model in OpenPages:
  • Watson Machine Learning on IBM Cloud Pak for Data
  • Watson Machine Learning on IBM Cloud
  • Natural Language Understanding on IBM Cloud

    You can use this integration for features such as classification, entity extraction, and text summarization.

The enhanced integration supports the following use cases:
  • Insight, which displays information for models such as cognitive controls and personally identifiable information (PII).
  • Set Fields, which uses classification models that set fields.
  • Set Tags, which uses models to suggest tags to set on an object.
Custom error messages for workflows
Administrators can now create custom messages that display to users when a validation is triggered within a workflow. Instead of relying on auto-generated messages, administrators can now provide targeted guidance to users about the actions that they must take to advance the workflow.
More options for filtering objects
The Version 9.000.0 release includes many enhancements to object filtering:
  • You can now create ad hoc and private filters for ancestor and descendant objects in dashboards and grid views.
  • You now have more operators to choose from when you create filters.
  • When administrators define Add, Copy Recursive, or Set Primary Parent actions in a task view, they can now prevent users from removing the filters that are set for these actions.
  • With the new shared relationship filter option, administrators can limit users' options for Association or Copy Recursive actions. An administrator can choose to allow users to select only objects with a shared ancestor or descendant to the current object.
New version of the OpenPages REST API and a new Developer Guide
The OpenPages GRC REST API V2 offers the following benefits:
  • Conforms to IBM Cloud API guidelines, which provides consistency with other IBM Cloud products.
  • Compatible with SCIM.
  • Improved ease of use.

The new Developer Guide contains all the documentation and samples that developers need to interact programmatically with OpenPages. The Developer Guide includes detailed information about the following topics:

  • The V1 and V2 versions of the IBM OpenPages GRC REST API
  • IBM OpenPages GRC Java™ API
  • API Samples
  • Triggers

For information about these new features and other updates in OpenPages Version 9.000.0, see New features in Version 9.0.0 in the OpenPages documentation.

Version 9.000.0 of the OpenPages service includes various fixes.

Related documentation:
OpenPages
Planning Analytics 4.7.3

The 4.7.3 release of Planning Analytics includes the following features and updates:

Updated versions of Planning Analytics software
This release of the Planning Analytics service provides the following software versions:
  • TM1® Version 2.0.9.18

    For details about this version of the software, see Planning Analytics 2.0.9.18 in the Planning Analytics documentation.

  • Planning Analytics Workspace Version 2.0.89.

    For details about this version of the software, see 2.0.89 - What's new in the Planning Analytics Workspace documentation.

  • Planning Analytics for Microsoft Excel Version 2.0.89.

    For details about this version of the software, see 2.0.89 - Feature updates in the Planning Analytics for Microsoft Excel documentation.

Version 4.7.3 of the Planning Analytics service includes various fixes.

Related documentation:
Planning Analytics
Product Master 4.2.0

Version 4.2.0 of the Product Master service includes various fixes.

For details, see What's new and changed in Product Master.

Related documentation:
Product Master
Watson Discovery 4.7.3

Version 4.7.3 of the Watson Discovery service includes various fixes.

Related documentation:
Watson Discovery
Watson Knowledge Catalog 4.7.3

The 4.7.3 release of Watson Knowledge Catalog includes the following features and updates:

Legacy features are removed from Watson Knowledge Catalog

For details, see What's new and changed in Watson Knowledge Catalog.

Exclude DataStage job runs when getting ETL job lineage
With the Get ETL job lineage import option, you can decide to include or exclude DataStage job runs in the imported data lineage. You can exclude the job runs to limit the script count for MANTA Automated Data Lineage. For more information, see Advanced import options.
MANTA Automated Data Lineage is now a separately installed service
MANTA Automated Data Lineage is no longer available as an optional feature of Watson Knowledge Catalog. Now, you must install the MANTA Automated Data Lineage service if you want to use advanced metadata import. For more information, see MANTA Automated Data Lineage.
Import metadata from additional data sources
If you install the MANTA Automated Data Lineage service or had the advanced metadata import feature enabled in a previous version, you can now import technical metadata from the following data sources:
  • Oracle Data Integrator
  • watsonx.data
For more information, see Oracle Data Integrator connection and watsonx.data connection.
Data quality monitoring and remediation workflows
To focus quality improvement efforts on the data that is most important for your organization, you can identify critical data elements, define quality expectations, and ensure remediation of data quality issues.
You can now build data quality SLA rules to:
  • Monitor the quality of critical data against specific quality criteria.
  • Trigger remediation workflows if the quality doesn’t meet the expectations.

    You can work with the default remediation workflow or create custom workflows.

You can also view information about SLA rule compliance or violations and the status of remediation tasks on a monitored data asset’s Data quality page.

Quickly find catalogs with name and date sorting
You can now sort the list of catalogs on the All Catalogs page by name or by date created. The sort options enable you to find catalogs faster.
  • Click the Name header to sort the catalogs alphabetically by name.
  • Click the Date created header to sort the catalogs by ascending or descending dates.
View column source types
You can check the source datatype for asset columns from the Overview page of an asset.
Import catalog assets in bulk
You can now add or update catalog assets in bulk by supplying asset metadata in a CSV file. To upload a CSV file for a bulk import, open a catalog and select Metadata import from file from the Add to Catalog drop-down menu. For more information, see Adding assets to a catalog.
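As a minimal sketch of preparing such a file (the column names below are hypothetical stand-ins, not the documented header layout; see Adding assets to a catalog for the required format), a CSV of asset metadata can be generated programmatically:

```python
import csv

# Hypothetical columns -- the actual headers that the catalog bulk-import
# feature expects are defined in the product documentation.
assets = [
    {"name": "sales_2023", "description": "Curated sales extract"},
    {"name": "hr_employees", "description": "Employee master data"},
]

with open("bulk_assets.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "description"])
    writer.writeheader()
    writer.writerows(assets)
```

You would then upload the resulting file through the Metadata import from file option described above.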

Version 4.7.3 of the Watson Knowledge Catalog service includes various fixes.

For details, see What's new and changed in Watson Knowledge Catalog.

Related documentation:
Watson Knowledge Catalog
Watson Knowledge Studio 5.2.0

Version 5.2.0 of the Watson Knowledge Studio service includes various fixes.

Related documentation:
Watson Knowledge Studio
Watson Machine Learning Accelerator 4.3.0

Version 4.3.0 of the Watson Machine Learning Accelerator service includes various fixes.

Related documentation:
Watson Machine Learning Accelerator
Watson Pipelines 4.7.3

The 4.7.3 release of Watson Pipelines includes the following features and updates:

User variable values can be pipeline results

When you specify that the value of a user variable can be used as a result, the value becomes available as an output in any Run Pipeline Job node that runs a job associated with this pipeline. For details, see Configuring global objects for Watson Pipelines.

Share runtime for pipeline activities
Tech preview: Tasks that previously required dedicated pods to execute code in an isolated container can now be delegated to a single deployed runtime. A runtime pod is dedicated to users in a Cloud Pak for Data project on a one-to-one basis. Resource allocation for runtime pods, such as the hardware specification, controller memory size, and number of workers, can be customized.

Version 4.7.3 of the Watson Pipelines service includes various fixes.

Related documentation:
Watson Pipelines
Watson Query 2.1.3

The 2.1.3 release of Watson Query includes the following features and updates:

Log audit events when Watson Query is deployed in a tethered namespace
You can now log audit events when an instance is provisioned in an OpenShift project separate from Cloud Pak for Data (or a "tethered project"). For more information on audit events, see Audit events for Watson Query.
Administrators can now make virtual objects visible to all users
Administrators can now choose to give users a more comprehensive view of the content by making existing virtual objects visible from the Virtualized data page. Data access within those objects continues to adhere to Watson Query authorizations and data protection rules. To enable this feature, administrators need to disable the Restrict visibility setting from Service settings. For more information, see Managing visibility of virtual objects in Watson Query.
Easily format how EXPLAIN information appears for query access plans
You can now use the Cloud Pak for Data web client to format how EXPLAIN information appears when you generate query access plans. You can then run the db2exfmt command from the web client to easily generate and download the EXPLAIN output in text files. For more information about the db2exfmt command, see db2exfmt - Explain table format command in the Db2 documentation.
Use wildcard characters to filter your data sources
Now when you create a virtualized table, you can use the following wildcard characters:
  • % (percent): To represent zero or more characters
  • _ (underscore): To represent a single character

You can use the wildcard characters to customize filters to find the data sources that you need. For more information, see Filtering data in Watson Query.
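To illustrate the matching semantics (a standalone sketch, not Watson Query code), the two wildcards can be translated into a regular expression:

```python
import re

def like_to_regex(pattern: str) -> str:
    """Translate a filter pattern to a regex: '%' matches zero or more
    characters and '_' matches exactly one, as described above."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

def matches(pattern: str, name: str) -> bool:
    return re.fullmatch(like_to_regex(pattern), name) is not None

sources = ["SALES_2022", "SALES_2023", "HR_2023"]
print([s for s in sources if matches("SALES%", s)])  # ['SALES_2022', 'SALES_2023']
print([s for s in sources if matches("%_202_", s)])  # all three match
```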

Configure PEP cache settings from the web client
You can now use the Cloud Pak for Data web client to configure policy enforcement point (PEP) cache settings, such as cache size and cache live time, for data protection rules. For more information on PEP caches, see Enabling enforcement of data protection rules in Watson Query.
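Conceptually (a generic sketch under assumed semantics, not the Watson Query internals), a policy enforcement point cache holds recent rule decisions up to a configured size and live time, so repeated requests avoid re-evaluating the same data protection rules:

```python
import time
from collections import OrderedDict

class PEPCache:
    """Toy decision cache mirroring the 'cache size' and 'cache live
    time' settings described above; the eviction details are assumptions."""

    def __init__(self, max_size, ttl_seconds, clock=time.monotonic):
        self.max_size = max_size
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (decision, inserted_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None                    # miss: caller evaluates the rule
        decision, inserted_at = entry
        if self.clock() - inserted_at > self.ttl:
            del self._entries[key]         # expired: force re-evaluation
            return None
        return decision

    def put(self, key, decision):
        if key in self._entries:
            del self._entries[key]
        elif len(self._entries) >= self.max_size:
            self._entries.popitem(last=False)  # evict the oldest entry
        self._entries[key] = (decision, self.clock())

# Deterministic demo with an injected clock.
now = [0.0]
cache = PEPCache(max_size=2, ttl_seconds=10, clock=lambda: now[0])
cache.put(("alice", "sales.orders"), "ALLOW")
print(cache.get(("alice", "sales.orders")))  # ALLOW
now[0] = 11.0
print(cache.get(("alice", "sales.orders")))  # None (entry expired)
```

A larger cache size reduces rule evaluations at the cost of memory, and a longer live time trades freshness of policy decisions for throughput.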
Watson Query users can publish their own virtual objects
Users with the User role in Watson Query can now publish virtual objects that they created to governed catalogs. For more information, see Publishing virtual data to a catalog with Watson Query.

Version 2.1.3 of the Watson Query service includes various fixes.

For details, see What's new and changed in Watson Query.

Related documentation:
Watson Query
Watson Speech services 4.7.3

Version 4.7.3 of the Watson Speech to Text service includes various fixes.

For details, see What's new and changed in Watson Speech to Text.

Related documentation:
Watson Speech services
watsonx.data 1.0.3

The 1.0.3 release of watsonx.data includes the following features and updates:

Enable fragment result caching
You can now enable and configure fragment result caching in watsonx.data. Fragment result caching:
  • Enables you to virtualize data sources.
  • Improves performance by caching the partially computed results of SQL queries.

For details, see Setting up fragment result cache.
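As a rough mental model only (this is generic memoization, not the watsonx.data implementation), a fragment cache keys the partial result of a deterministic query fragment so that repeated queries sharing that fragment skip recomputation:

```python
from functools import lru_cache

executions = 0  # counts how often a fragment is actually computed

@lru_cache(maxsize=128)
def run_fragment(fragment_sql: str) -> tuple:
    """Pretend to execute a deterministic query fragment.

    Here the cache key is just the fragment text; a real engine would
    also key on the table snapshot so stale results are never served.
    """
    global executions
    executions += 1
    # Stand-in for real execution over in-memory data.
    data = {"SELECT sum(qty) FROM sales": (42,)}
    return data.get(fragment_sql, ())

run_fragment("SELECT sum(qty) FROM sales")
run_fragment("SELECT sum(qty) FROM sales")  # second call is a cache hit
print(executions)  # 1
```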

Enable dereference pushdown feature
You can now use the dereference pushdown feature to push down Parquet dereference expressions in the Apache Iceberg connector. With dereference pushdown, queries read only the nested fields that they need, which reduces CPU usage and query runtime.
New version of Presto
The Presto engine was updated from Version 0.279 to Version 0.282, which includes enhancements and defect fixes.

For details, see Creating an engine.

Version 1.0.3 of the watsonx.data service includes various fixes.

For details, see What's new and changed in watsonx.data.

Related documentation:
watsonx.data

Version 4.7.2

Released: August 2023

This release of Cloud Pak for Data is primarily focused on defect fixes. However, this release includes new features for the Cloud Pak for Data platform and for services such as DataStage, Watson Assistant, Watson Knowledge Catalog, and watsonx.data.

Software Version What does it mean for me?
Cloud Pak for Data platform 4.7.2

Version 4.7.2 of the Cloud Pak for Data platform includes the following features and updates:

Use additional attributes to create dynamic user groups
By default, you can use only four pre-defined attributes to create dynamic user groups. Starting in Cloud Pak for Data Version 4.7.2, you can use one of the following methods to specify additional attributes when you create a dynamic user group. For a high-level comparison of the options, see Specifying additional attributes that can be used to create dynamic user groups.
Adding attributes from your LDAP to the Identity Management Service
If you want to use attributes that are defined in your identity provider, you can configure the Identity Management Service to return the attributes. Then, you can configure Cloud Pak for Data to use the attributes. For more information, see Using additional attributes from your identity provider to create dynamic user groups.
Creating a custom attribute provider
If you want to use attributes that are not defined in your identity provider, you can create a custom attribute provider. With a custom attribute provider, you can provide information that is not part of your company's primary IdP. After you create a custom attribute provider, you can configure Cloud Pak for Data to use the attributes. For more information, see Using a custom attribute provider to specify additional attributes that can be used to create dynamic user groups.
Give a user the minimum role-based access control to install software
A cluster administrator can give another user the minimum role-based access control (RBAC) to install various Cloud Pak for Data components.
Important: This method is recommended only for customers with rigid security requirements. It is not recommended for most customers because it requires additional planning and maintenance.
Minimum RBAC to install the scheduling service
By default, a cluster administrator must install the scheduling service. However, you can optionally give another user the minimum role-based access control (RBAC) that is needed to install the scheduling service. For more information, see Giving a user the minimum RBAC to install the scheduling service.
Minimum RBAC to install an instance of Cloud Pak for Data
If a user other than the cluster administrator will install Cloud Pak for Data, you must give a Red Hat OpenShift Container Platform user the appropriate permissions to install the software in the instance projects. You can use one of the following methods to give the user the required permissions:
Giving the user the admin role (recommended)
To give a user the admin role on the projects associated with the instance, see Authorizing a user to act as an instance administrator.
Giving the user the minimum RBAC
To give a user the required permissions to install the software without giving the user the admin role on the projects associated with the instance, see Giving a user the minimum RBAC to install Cloud Pak for Data components
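For orientation, namespace-scoped grants of this kind are expressed with standard Kubernetes RBAC objects. The manifest below is a hypothetical sketch of the admin-role option (the binding name, project name, and user name are invented stand-ins; the documented procedures produce the authoritative manifests):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cpd-instance-installer      # hypothetical binding name
  namespace: cpd-instance           # hypothetical instance project
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin                       # the documented "admin role" option
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: install-user                # hypothetical OpenShift user
```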

Version 4.7.2 of the platform includes various fixes.

For details, see What's new and changed in the platform.

Cloud Pak for Data command-line interface (cpd-cli) 13.0.2

The 13.0.2 release of the Cloud Pak for Data command-line interface includes the following features and updates:

New cpd-cli manage command
You can use the new cpd-cli manage show-minimum-rbac command to generate YAML files that define the minimum privileges that a user must have to create, modify, and view the resources that are associated with Cloud Pak for Data.

This command provides an alternative method for authorizing a user to act as an instance administrator. However, this method is recommended only if you are not willing to grant the roles described in Authorizing a user to act as an instance administrator.

Version 13.0.2 of the Cloud Pak for Data command-line interface includes various fixes.

For details, see What's new and changed in the Cloud Pak for Data command-line interface.

Related documentation:
IBM/cpd-cli repository on GitHub
Cloud Pak for Data common core services 7.2.0
The 7.2.0 release of the common core services includes changes to support features and updates in Watson Studio and Watson Knowledge Catalog.
Version 7.2.0 of the common core services includes the following features and updates:
Access more technical metadata with new connectors
New connectors are available for importing technical data from the following data sources:
  • Microsoft SQL Server Integration Services
  • Microsoft SQL Server Reporting Services
  • Oracle Business Intelligence Enterprise Edition
  • Qlik Sense

For more information, see Supported data sources for metadata import, metadata enrichment, and data quality rules.

Version 7.2.0 of the common core services includes various fixes.

For details, see What's new and changed in the common core services.

If you install or upgrade a service that requires the common core services, the common core services will also be installed or upgraded.

Cloud Pak for Data scheduling service 1.15.0

Version 1.15.0 of the scheduling service includes various fixes.

For details, see What's new and changed in the scheduling service.

AI Factsheets 4.7.2

The 4.7.2 release of AI Factsheets includes the following features and updates:

Update the model version or approach for a tracked model
You can now move a tracked model from one approach to another and you can edit the version number for a tracked model from the model use case. For more information, see Tracking model versions in use cases.

Version 4.7.2 of the AI Factsheets service includes various fixes.

Related documentation:
AI Factsheets
Analytics Engine powered by Apache Spark 4.7.2

Version 4.7.2 of the Analytics Engine powered by Apache Spark service includes various fixes.

For details, see What's new and changed in Analytics Engine powered by Apache Spark.

Related documentation:
Analytics Engine powered by Apache Spark
Cognos Dashboards 4.7.2

The 4.7.2 release of Cognos Dashboards includes the following features and updates:

Updated software version
This release of the Cognos Dashboards service provides Version 12.0.0 + Interim Fix of the Cognos Analytics dashboards software. For details, see Release 12.0.0 - Dashboards in the Cognos Analytics documentation.

Version 4.7.2 of the Cognos Dashboards service includes various fixes.

Related documentation:
Cognos Dashboards
Data Privacy 4.7.2

Version 4.7.2 of the Data Privacy service includes various fixes.

For details, see What's new and changed in Data Privacy.

Related documentation:
Data Privacy
DataStage 4.7.2

The 4.7.2 release of DataStage includes the following features and updates:

Use new masking, encryption, and regex functions in the Transformer stage
You can call these functions through the Expression Builder in the Transformer stage in your DataStage flows. For more information, see Parallel transform functions.
Drag and drop columns in the Output tab of the Transformer stage
You can now map input columns to output columns in the Output tab of the Transformer stage by selecting input columns and dragging and dropping them into the output column table. For more information, see Transformer stage: Output tab.
Migrate built-in functions that are used in the User variable activity stage to CEL expressions
You can now migrate jobs that use built-in functions in the User variable activity stage from traditional DataStage to DataStage on Cloud Pak for Data. Expressions that use these functions are converted to CEL, and you can access the functions in the Expression Builder. For more information, see CEL expressions in DataStage.
Send emails with multiple attachments in pipelines
You can now use the Send email node to send emails with multiple attachments in your pipeline flows. For more information, see Pipeline components for DataStage.

Version 4.7.2 of the DataStage service includes various fixes.

For details, see What's new and changed in DataStage.

Related documentation:
DataStage
Db2 4.7.2

Version 4.7.2 of the Db2 service includes various fixes.

For details, see What's new and changed in Db2.

Related documentation:
Db2
Db2 Data Gate 4.1.0

Version 4.1.0 of the Db2 Data Gate service includes various fixes.

For details, see What's new and changed in Db2 Data Gate.

Related documentation:
Db2 Data Gate
Db2 Data Management Console 4.7.2

Version 4.7.2 of the Db2 Data Management Console service includes various fixes.

For details, see What's new and changed in Db2 Data Management Console.

Related documentation:
Db2 Data Management Console
Db2 Warehouse 4.7.2

Version 4.7.2 of the Db2 Warehouse service includes various fixes.

For details, see What's new and changed in Db2 Warehouse.

Related documentation:
Db2 Warehouse
Decision Optimization 7.2.0

Version 7.2.0 of the Decision Optimization service includes various fixes.

For details, see What's new and changed in Decision Optimization.

Related documentation:
Decision Optimization
EDB Postgres 14.8, 13.11, 12.15

This release of the EDB Postgres service includes various fixes.

Related documentation:
EDB Postgres
Informix® 6.1.0

Version 6.1.0 of the Informix service includes various fixes.

Related documentation:
Informix
Watson Assistant 4.7.2

The 4.7.2 release of Watson Assistant includes the following features and updates:

Algorithm for improved intent detection and action matching
You can now use the preview version of the Watson Assistant algorithm, which includes a new foundation model that is trained with a transformer architecture. This algorithm provides the following improvements:
  • Improved intent detection and action matching for English, French, German, Portuguese (Brazilian), and Spanish.
  • Improved robustness to variations in user inputs, such as typographical errors and different inflection forms.
  • Reduction in the amount of training data required to reach the same level of performance compared to previous algorithms.

For more information, see Algorithm version and training in the Watson Assistant documentation.

Multiple validation responses
When you edit a validation for a customer response, you can now include several validation responses. For more information, see Customizing validation for a response in the Watson Assistant documentation.
Multilingual downloads
You can download language data files, in CSV format, so that you can translate training examples and assistant responses into other languages for use in other assistants. For more information, see Using multilingual downloads for translation in the Watson Assistant documentation.
Fallback option
The dynamic options response type now includes a fallback static choice, such as None of the above, if the options aren't what the customer wants. You can then add a step that is conditioned on this static option to provide further assistance. For more information, see Dynamic options in the Watson Assistant documentation.
Option to allow change of topic between actions and dialog
If you are using actions and dialog, there is a new setting you can use to ensure that customers can change topics between an action and a dialog node. For more information, see Allow change of topic between actions and dialog in the Watson Assistant documentation.
Unique action and collection name requirement
With this release, each action name must be unique, and each collection name must be unique. If your existing actions or collections have duplicate names, a warning icon will appear in the Status column. For more information, see Overview: Editing actions and Organizing actions in collections in the Watson Assistant documentation.

Version 4.7.2 of the Watson Assistant service includes various fixes.

Related documentation:
Watson Assistant
Watson Knowledge Catalog 4.7.2

The 4.7.2 release of Watson Knowledge Catalog includes the following features and updates:

Legacy features are removed from Watson Knowledge Catalog

For details, see What's new and changed in Watson Knowledge Catalog.

Metadata import enhancements
Import from additional data sources
If the advanced metadata import feature is enabled, you can now import technical metadata from the following data sources:
  • DataStage on Cloud Pak for Data
  • Informatica PowerCenter
  • InfoSphere® DataStage
  • Microsoft SQL Server Analysis Services
  • Microsoft SQL Server Integration Services
  • Microsoft SQL Server Reporting Services
  • Oracle Business Intelligence Enterprise Edition
  • Qlik Sense
  • Statistical Analysis System
  • Talend

For lineage imports from Apache Hive, you can now also provide HiveQL scripts as input.

For lineage imports from Teradata, you can now also provide BTEQ (Basic Teradata Query) scripts as input.

For more information, see Supported data sources for metadata import, metadata enrichment, and data quality rules.

Import without SELECT permission
You can now configure metadata imports from relational databases so that users with access to only the database catalog can run the import. For more information, see Designing metadata imports.
More granular selection of import goals
The create flow for metadata imports now provides a finer-grained selection of import goals. The set of available connections is scoped to those that match the selected goal, which makes picking the right connection easier. For more information, see Designing metadata imports.
Metadata enrichment enhancements
Advanced profiling
To get more exact results for certain metrics, such as frequency distribution and uniqueness of values within a column, you can now run advanced profiling on selected data assets. In addition, you can choose to write detailed information about distinct values to a database table for further processing. For more information, see Advanced data profiles.
Specify which checks are included in quality analysis
To tailor the type of data quality analysis that is run as part of metadata enrichment, you can now specify which predefined data quality checks are run during analysis. For more information, see Enrichment settings.
Export and import profiling results
When you export or import data from a project or by using the cpd-cli, the profiling results for data assets are now included.
Data quality rules can now contribute to more than one quality dimension
When you configure data quality rules with multiple data quality definitions, you can now choose whether the rule contributes to a single data quality dimension or to all dimensions that are set on the individual definitions. The latter can give you a clearer picture of the areas where your data might have quality issues. For more information, see Creating rules from data quality definitions.
Lineage enhancements
Create relationship mapping files
You can now use relationship mapping files and include them in your lineage graph. A lineage relationship mapping file defines a mapping group that contains mappings, where each mapping defines the relationship between two different assets. Lineage relationship mapping files are in CSV format, which makes them easy to create manually and to import into your catalog. For more information, see Importing lineage relationship mapping file.
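Because relationship mapping files are plain CSV, they can be generated with any scripting language. The sketch below, in Python, builds a minimal mapping file; the column names (mapping_group, source_asset, target_asset, relationship) are illustrative placeholders, not the documented schema, so see Importing lineage relationship mapping file for the format that the catalog import actually expects.

```python
import csv
import io

# Hypothetical column layout for a lineage relationship mapping file.
# Check the product documentation for the required schema before importing.
FIELDS = ["mapping_group", "source_asset", "target_asset", "relationship"]

def build_mapping_csv(rows):
    """Render mapping rows as CSV text, one mapping per line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# One illustrative mapping: a landing file feeds a warehouse dimension table.
csv_text = build_mapping_csv([
    {"mapping_group": "sales_etl", "source_asset": "landing/customers.csv",
     "target_asset": "DW.CUSTOMER_DIM", "relationship": "feeds"},
])
print(csv_text)
```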
View the quality score of assets in the lineage graph
When you view the lineage graph, you can now see the quality score for each asset that is visible on the graph. For more information, see Managing business lineage.

Version 4.7.2 of the Watson Knowledge Catalog service includes various fixes.

For details, see What's new and changed in Watson Knowledge Catalog.

Related documentation:
Watson Knowledge Catalog
Watson Machine Learning Accelerator 4.2.0

Version 4.2.0 of the Watson Machine Learning Accelerator service includes various fixes.

Related documentation:
Watson Machine Learning Accelerator
Watson OpenScale 4.7.2

Version 4.7.2 of the Watson OpenScale service includes various fixes.

For details, see What's new and changed in Watson OpenScale.

Related documentation:
Watson OpenScale
Watson Pipelines 4.7.2

The 4.7.2 release of Watson Pipelines includes the following features and updates:

Send multiple attachments with emails
You can now send up to four attachments when you send an email by using the Send email node in your pipeline, and you can optionally compress the attachments into a single ZIP file. For more details, see Configuring pipeline nodes.

Version 4.7.2 of the Watson Pipelines service includes various fixes.

Related documentation:
Watson Pipelines
watsonx.data 1.0.2

The 1.0.2 release of watsonx.data includes the following features and updates:

Presto connections improvement
When you install watsonx.data, a secure re-encrypt route for the Presto server is automatically created. You can use this secure re-encrypt route to access the Presto server from outside the Red Hat OpenShift Container Platform cluster.

For more information, see Exposing a secure route to Presto server.

Access the Hive Metastore (HMS) from external applications
You can now use a NodePort to expose the watsonx.data HMS service so that applications outside of the Red Hat OpenShift Container Platform cluster can access the service.

For more information, see Exposing HMS by using a NodePort.

Customize the watsonx.data TLS certificate
You can now define multiple DNS entries in the Subject Alternative Name (SAN) section of the watsonx.data TLS certificate.

For more information, see Importing self-signed certificates from a Hive Metastore server to a Java truststore.

Install on a FIPS-enabled cluster
You can now run the watsonx.data service on a FIPS-enabled cluster.

Version 1.0.2 of the watsonx.data service includes various fixes.

For details, see What's new and changed in watsonx.data.

Related documentation:
watsonx.data

Version 4.7.1

Released: July 2023

Cloud Pak for Data Version 4.7.1 introduces the watsonx.data service.

The release also includes updates for services such as DataStage, Watson Assistant, Watson Discovery, and Watson Knowledge Catalog.

Version 4.7.1 also introduces support for upgrades from Version 4.5.

Software Version What does it mean for me?
Cloud Pak for Data platform 4.7.1

Version 4.7.1 of the Cloud Pak for Data platform includes the following features and updates:

Upgrades from Version 4.5
Starting with Cloud Pak for Data Version 4.7.1, you can upgrade your existing Version 4.5.x environment to Version 4.7. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.5 to Version 4.7.

Version 4.7.1 of the platform includes various fixes.

Related documentation:
Cloud Pak for Data APIs 4.7.0

Version 4.7.0 of the Cloud Pak for Data APIs includes various fixes.

For details, see What's new and changed in the Cloud Pak for Data APIs.

Related documentation:
Cloud Pak for Data APIs on IBM Cloud Docs
Cloud Pak for Data command-line interface (cpd-cli) 13.0.1

The 13.0.1 release of the Cloud Pak for Data command-line interface includes the following features and updates:

New cpd-cli code-package commands

Version 13.0.1 of the Cloud Pak for Data command-line interface includes various fixes.

For details, see What's new and changed in the Cloud Pak for Data command-line interface.

Related documentation:
IBM/cpd-cli repository on GitHub
Cloud Pak for Data common core services 7.1.0

The 7.1.0 release of the common core services includes changes to support features and updates in Watson Studio and Watson Knowledge Catalog.

Version 7.1.0 of the common core services includes the following features and updates:
Key-pair authentication for Snowflake connections
You can now use key-pair authentication for Snowflake connections. Key-pair authentication is more secure than providing a username and password. For more information on using key-pair authentication, see Snowflake connection.

Version 7.1.0 of the common core services includes various fixes.

For details, see What's new and changed in the common core services.

If you install or upgrade a service that requires the common core services, the common core services will also be installed or upgraded.

Cloud Pak for Data scheduling service 1.14.0

Version 1.14.0 of the scheduling service includes various fixes.

For details, see What's new and changed in the scheduling service.

Related documentation:
AI Factsheets 4.7.1

Version 4.7.1 of the AI Factsheets service includes various fixes.

Related documentation:
AI Factsheets
Analytics Engine powered by Apache Spark 4.7.1

The 4.7.1 release of Analytics Engine powered by Apache Spark includes the following features and updates:

Run Spark workloads for watsonx.data
You can integrate Analytics Engine powered by Apache Spark with watsonx.data so that you can run Spark workloads for the new watsonx.data service. For more information, see Configuring an Analytics Engine powered by Apache Spark instance for watsonx.data.

Version 4.7.1 of the Analytics Engine powered by Apache Spark service includes various fixes.

For details, see What's new and changed in Analytics Engine powered by Apache Spark.

Related documentation:
Analytics Engine powered by Apache Spark
Data Privacy 4.7.1

The 4.7.1 release of Data Privacy includes the following features and updates:

New Data Privacy APIs are available
Grant superuser access to all data in all catalogs
If you are an administrator, you can now create dynamic meta rules that grant unrestricted access to all data assets in all catalogs for specific users. The designated users are not subject to data protection rules that restrict access to data in catalogs. You create dynamic meta rules by adding the DMR governance type ("governance_type_id": "DMR") with the create a rule Watson Data API. For more information, see Enforcing data protection rules with IBM Security Guardium® Data Protection.
Make masking jobs incremental
You can use the Watson Data API (PATCH /v2/jobs/<job_id>) to configure a masking job to be incremental. For more information, see Managing jobs using API in the Managing job performance topic.
Output truncated to accommodate column length restrictions
The column length is the maximum length that is defined for a column in a database for string type data. Previously, the generated masking output did not account for the column length, and the masking flow job would fail if any of the output values surpassed the specified column length. Now, the generated output is truncated to ensure that it doesn't exceed the column length restrictions.

For more information, see Masking data with Masking flow.
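The two Watson Data API capabilities above can be sketched as request bodies. In the Python sketch below, only "governance_type_id": "DMR" and the PATCH /v2/jobs/&lt;job_id&gt; route come from this release; the rule name field and the JSON Patch path are hypothetical placeholders, so check the Watson Data API reference for the exact payloads.

```python
import json

def dmr_rule_payload(name):
    """Body for the Watson Data API create-rule call that grants superuser
    access. Only "governance_type_id": "DMR" is documented in this release;
    the name field is an illustrative placeholder."""
    return {
        "name": name,
        "governance_type_id": "DMR",
    }

def incremental_job_patch():
    """Body sent with PATCH /v2/jobs/<job_id> to make a masking job
    incremental. The JSON Patch path below is a hypothetical placeholder,
    not the documented attribute name."""
    return [{"op": "replace", "path": "/configuration/incremental", "value": True}]

print(json.dumps(dmr_rule_payload("superuser-access")))
print(json.dumps(incremental_job_patch()))
```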

Version 4.7.1 of the Data Privacy service includes various fixes.

For details, see What's new and changed in Data Privacy.

Related documentation:
Data Privacy
Data Replication 4.7.1

Version 4.7.1 of the Data Replication service includes various fixes.

For details, see What's new and changed in Data Replication.

Related documentation:
Data Replication
DataStage 4.7.1

The 4.7.1 release of DataStage includes the following features and updates:

Connect to a new data source in DataStage
You can now include data from Tableau in your DataStage flows.

For the full list of DataStage connectors, see Supported data sources in DataStage.

Additional connector in ELT run mode
You can now use the Db2 Optimized connector in Extract, Load, Transform mode in DataStage. For more information, see ELT run mode in DataStage.
Easily update custom job subroutines after migration
If you migrate a job with custom before-job or after-job subroutines, you can now create a script to update any subroutines that reference the job. For more information, see Migrating DataStage jobs.
Define custom environment variables with PROJDEF
You can now define custom environment variables in the PROJDEF parameter set and add them to your flows. For more information, see PROJDEF parameter set in DataStage.

Version 4.7.1 of the DataStage service includes various fixes.

For details, see What's new and changed in DataStage.

Related documentation:
DataStage
Decision Optimization 7.1.0

Version 7.1.0 of the Decision Optimization service includes various fixes.

For details, see What's new and changed in Decision Optimization.

Related documentation:
Decision Optimization
EDB Postgres 14.8, 13.11, 12.15

This release of the EDB Postgres service includes various fixes.

Related documentation:
EDB Postgres
IBM Match 360 3.1.25

The 3.1.25 release of IBM Match 360 includes the following features and updates:

Help prevent the creation of low-quality entities by defining glue record thresholds
This release of IBM Match 360 introduces a new glue record threshold configuration to help prevent the creation of poor quality entities. When IBM Match 360 forms entities through matching, some low-quality records can act as glue records. Because the records are not very detailed, glue records can appear to match with many different records. This matching behavior inadvertently and incorrectly creates very large entities that have only one low-quality glue record in common.

By setting a glue record threshold in the matching algorithm for each entity type, a data engineer can prevent glue records from causing the formation of large, poorly matched entities. For more information, see Configuring a glue record threshold.

Finer control over which records are considered for matching
You can now define filters that determine which records are considered during the matching process. By default, all records associated with the entity’s record type are considered during matching. However, with record filtering, you can create rules that remove unwanted records from consideration, based on their attribute values.

For example, if you are matching person entity data but want to match only records that are based in the United States, you can define a record filter on the Country attribute so that the matching engine considers only records with US-based addresses.

For more information, see Selecting which records get considered for matching and Matching algorithms > Entity types (bucketing).

More control over how entities select attribute values from member records
With the enhanced attribute composition capabilities, data engineers have more control over how the attribute values for master data entities are selected from the entity's member records. You can now define filter and sorting conditions that control which records are considered and how attributes are selected for an entity. For attributes that can potentially include more than one value, such as a list, you can configure the maximum number of values.

By defining and prioritizing a set of rules, filters, and other conditions, you have more control over which record attribute values are surfaced to the entity. For more information, see Defining attribute composition rules.

Version 3.1.25 of the IBM Match 360 service includes various fixes.

For details, see What's new and changed in IBM Match 360.

Related documentation:
IBM Match 360 with Watson
Planning Analytics 4.7.1

The 4.7.1 release of Planning Analytics includes the following features and updates:

Updated versions of Planning Analytics software
This release of the Planning Analytics service provides the following software versions:
  • Planning Analytics Workspace Version 2.0.88.

    For details about this version of the software, see 2.0.88 - What's new in the Planning Analytics Workspace documentation.

  • Planning Analytics Spreadsheet Services Version 2.0.88.

    For details about this version of the software, see 2.0.88 - Feature updates in the TM1 Web documentation.

Version 4.7.1 of the Planning Analytics service includes various fixes.

Related documentation:
Planning Analytics
Product Master 4.1.0

Version 4.1.0 of the Product Master service includes various fixes.

For details, see What's new and changed in Product Master.

Related documentation:
Product Master
Voice Gateway 1.3.3

Version 1.3.3 of the Voice Gateway service includes various fixes.

Related documentation:
Voice Gateway
Watson Assistant 4.7.1

The 4.7.1 release of Watson Assistant includes the following features and updates:

Edit step titles
You can now add and edit titles for each step, which can help you more easily identify what a step does in an action. For more information, see Editing actions.
Filtering the list of actions
You can locate specific actions by filtering the list of actions by subaction, custom extension, or variable. For more information, see Filtering actions.
See which actions use a specific variable
The Variables page now includes a new Actions count column. You can click the number in the column to see which actions use a variable. For more information, see Creating a session variable.
New method for setting a session variable
Previously, if you wanted to use an expression to set or modify a variable value, you needed to pick an existing variable or create a new variable and select the expression option. Now, you can use the new Expression option to write an expression without first picking a variable. For more information, see Storing a value in a session variable.
Changes to the date and number formats in assistant responses
You might see changes to the date and number formats in assistant responses, such as:
  • Added or removed periods in dates. For example:
    • In Spanish, 18 abr. 2021 changes to 18 abr 2021.
    • In Portuguese, 18 de. abr changes to 18 de abr.
  • Delimiter character changes for numbers in some languages. For example, in French, nonbreaking space (NBSP) changes to narrow no-break space (NNBSP).

These changes are the result of migrating Watson Assistant to Java 17, which uses CLDR 39 for locale formats.

To avoid or minimize the impact of similar changes in the future, use Display formats.

Changes in contextual entity detection for dialog skills with few annotations
If you have 10 to 20 examples of contextual entities in your dialog skill, you might see differences in the entities that are detected. The differences result from updates that were made to address critical vulnerabilities. The changes are limited to newly trained models. Existing models are unaffected.

You can mitigate the impact of the change by annotating more examples. For more information, see Annotation-based method.

Version 4.7.1 of the Watson Assistant service includes various security fixes.

For details, see What's new and changed in Watson Assistant.

Related documentation:
Watson Assistant
Watson Discovery 4.7.1

The 4.7.1 release of Watson Discovery includes the following features and updates:

Optical character recognition V2 is used by default
The latest version of optical character recognition (OCR) is used automatically when you enable OCR for English, German, French, Spanish, Dutch, Brazilian Portuguese, and Hebrew collections.

The newest version of the OCR model is better at extracting text from scanned documents and other images in the following situations:

  • The images are low quality because of incorrect scanner settings, insufficient resolution, poor lighting (such as with mobile capture), loss of focus, misaligned pages, and poor print quality.
  • The documents contain irregular fonts, various colors, different font sizes, or a background.
For more information, see Optical character recognition in the Watson Discovery product documentation.
Improved tool for creating Smart Document Understanding (SDU) user-trained models
The SDU tool that you use to annotate documents was rebuilt to be more responsive and easier to use.

Version 4.7.1 of the Watson Discovery service includes various fixes.

Related documentation:
Watson Discovery
Watson Knowledge Catalog 4.7.1

The 4.7.1 release of Watson Knowledge Catalog includes the following features and updates:

New metadata enrichment options
  • In addition to running relationship analysis manually, you can now include relationship analysis as part of metadata enrichment.
  • The Profile data option now includes primary key analysis.

For more information, see Creating a metadata enrichment asset.

Data quality rules enhancement
You can now include the data quality definition name in your rule output. This way, you can identify which data quality definition an output record was created for, which is especially helpful when your data quality rule includes multiple data quality definitions.

For more information, see Creating rules from data quality definitions.

Enhancement to automatic term assignment
When you run automatic term assignment, you no longer need to choose between a custom machine learning model and the built-in machine learning model. You can now use both models in parallel.

For more information, see Automatic term assignment.

Manage column relationships for data assets in a catalog
You can now create and manage relationships for data asset columns in a catalog. You can create column relationships between:
  • A column and an asset
  • A column and an artifact
  • A column and another column

To add a column relationship, select a column on the Overview page of an asset. In the side pane, select an option from the Related items list.

For more information, see Asset relationships.

Version 4.7.1 of the Watson Knowledge Catalog service includes various fixes.

For details, see What's new and changed in Watson Knowledge Catalog.

Related documentation:
Watson Knowledge Catalog
Watson Knowledge Studio 5.1.0

Version 5.1.0 of Watson Knowledge Studio includes the following update:

Online backup and restore with OADP
You can now use the Cloud Pak for Data OpenShift APIs for Data Protection (OADP) backup and restore utility to do an online backup and restore of Watson Knowledge Studio.

For more information, see Cloud Pak for Data online backup and restore.

Offline backup and restore with OADP is not available for Watson Knowledge Studio.

This release also includes various fixes.

Related documentation:
Watson Knowledge Studio
Watson Machine Learning Accelerator 4.1.0

Version 4.1.0 of the Watson Machine Learning Accelerator service includes various fixes.

Related documentation:
Watson Machine Learning Accelerator
Watson Pipelines 4.7.1

Version 4.7.1 of the Watson Pipelines service includes various fixes.

Related documentation:
Watson Pipelines
Watson Speech services 4.7.1

Version 4.7.1 of the Watson Speech to Text service includes various fixes.

For details, see What's new and changed in Watson Speech to Text.

Related documentation:
Watson Speech services
watsonx.data 1.0.1

The IBM watsonx.data service is a new data source service that is available in IBM Cloud Pak for Data Version 4.7.1.

The watsonx.data service is a separately priced service.

With watsonx.data, you can simplify your data management operations with a flexible data store that's optimized for analytics and AI workloads.

Eliminate data duplication and facilitate better collaboration. The watsonx.data service combines the advantages of data warehouses and data lakes in a single, integrated platform. With watsonx.data, you can collect, store, query, and analyze your structured, semi-structured, and unstructured enterprise data.

Open
Facilitate data access and sharing across applications with open data formats, including Apache Iceberg, a high-performance open data format that supports industry-standard file formats.
Flexible
Connect to and access data remotely in a hybrid cloud. Then, choose from various query and analytics engines, such as Presto and Spark, to process big data efficiently and reliably.
Scalable
Explore, shape, and analyze data at any scale by separating storage and compute resources.
Related documentation:
watsonx.data

What's new in Version 4.7

Cloud Pak for Data 4.7 includes new features for most services on the platform, from AI Factsheets to Watson Studio. Version 4.7 also includes broader adoption of Day 2 operations features like auditing, shut down and restart, backup and restore, and automatic scaling of services.

Cloud Pak for Data 4.7 introduces additional security hardening through FIPS 140-2 compliance, CIS Benchmarks, and the introduction of the private topology.

For more information, review the information in the following sections:

Platform enhancements

The following table lists the new features that were introduced in Cloud Pak for Data Version 4.7.

What's new What does it mean for me?
FIPS 140-2 compliance
Many of the services in IBM Cloud Pak for Data Version 4.7 are FIPS 140-2 compliant.
Important: In previous versions of Cloud Pak for Data, most software could be installed on a FIPS-enabled cluster; however, the software did not meet the FIPS requirements. For example:
  • The software did not use FIPS-certified modules for encryption.
  • Some software implicitly turned off FIPS mode to access modules that were not FIPS-compliant on Red Hat OpenShift Container Platform or on Red Hat Enterprise Linux®.

In Version 4.7, services that are FIPS 140-2 compliant use FIPS-certified modules for encryption and use only modules that are available in FIPS mode. In some situations, this might result in a loss of functionality if the service previously ran on FIPS-enabled clusters without being FIPS 140-2 compliant. For example, some JDBC drivers are not FIPS-compliant, so connections that worked in previous releases of Cloud Pak for Data might not work in Version 4.7. For more information, see Known issues on FIPS-enabled clusters.

For a complete list of the services that are FIPS 140-2 compliant, see Services that support FIPS.

CIS Benchmarks
The CIS Benchmarks, from the Center for Internet Security, are a set of best practices that help security practitioners implement and maintain cybersecurity defenses. The Kubernetes CIS Benchmark includes configuration guidelines for Red Hat OpenShift Container Platform v4.

The IBM Cloud Pak for Data control plane and services are tested against the OpenShift Compliance Operator CIS profiles. For more information, see CIS Benchmark for Red Hat OpenShift Container Platform v4.

Simpler process to back up and restore Cloud Pak for Data
Starting in Cloud Pak for Data Version 4.7, backups that are created with the Cloud Pak for Data OADP backup and restore utility are run at the instance level. With this approach, you can back up and restore all the projects (namespaces) that are associated with an instance of Cloud Pak for Data in a single orchestrated sequence instead of backing up and restoring the projects separately.

For more information, see the following topics:

Estimate the amount of storage for a backup
You can use the cpd-cli oadp du-pv command to estimate how much storage space you need for a backup. Use the command to ensure that you have sufficient space for your backup.

For more information, see What's new and changed in the Cloud Pak for Data command-line interface.
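The idea behind the estimate is simply to total the per-volume usage that the command reports. The sketch below illustrates that aggregation in plain Python; the two-column report format shown is an assumption for illustration only, so check the actual `cpd-cli oadp du-pv` output on your cluster before parsing it.

```python
# Sketch only: sums a per-volume usage report to estimate total backup
# storage. The two-column <volume> <usage> format below is assumed for
# illustration; it is not the documented cpd-cli output format.

def total_backup_gib(report: str) -> float:
    """Sum the Gi values in a two-column <volume> <usage> report."""
    total = 0.0
    for line in report.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].endswith("Gi"):
            total += float(parts[1][:-2])
    return round(total, 1)

sample = """\
pvc-user-home 12.5Gi
pvc-db2-data 140.0Gi
pvc-cc-home 3.2Gi"""

print(total_backup_gib(sample))  # 155.7
```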

Cloud Pak for Data APIs restructured and improved
The Available APIs in Cloud Pak for Data topic includes more links to APIs that you can use. You can also find links to service-specific APIs from service landing pages. For example, you can find the link to the Watson Data API from the Watson Knowledge Catalog service landing page.

The Cloud Pak for Data Platform API in IBM Cloud API Docs now includes the following methods and examples:

For more information, see What's new and changed in Cloud Pak for Data APIs.

More control over instances with the private topology
Starting in IBM Cloud Pak for Data Version 4.7, each instance of Cloud Pak for Data has its own set of operators. The private topology simplifies the process of installing and managing multiple instances of Cloud Pak for Data at different releases on a single cluster.

The private topology replaces the express installation topology and the specialized installation topology.

If you are upgrading to IBM Cloud Pak for Data Version 4.7, you must migrate your existing installation to the private topology.

For more information, see Supported project (namespace) configurations.

Service enhancements

The following table lists the new features that are introduced for existing services in Cloud Pak for Data Version 4.7:

Software Version What does it mean for me?
Cloud Pak for Data common core services 7.0.0
The 7.0.0 release of the common core services includes changes to support features and updates in Watson Studio and Watson Knowledge Catalog.
Version 7.0.0 of the common core services includes the following features and updates:
Authenticate to Google BigQuery with workload identity federation
You can now use workload identity federation to authenticate to Google BigQuery, rather than using your Google service account key. Workload identity federation provides increased security and centralized management.

To use workload identity federation, you must have an identity provider (IdP) that supports one of the following specifications:

  • AWS Signature Version 4
  • OpenID Connect (OIDC)
  • SAML 2.0

For more information, see Google BigQuery connection.
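As background, Google expresses workload identity federation as an `external_account` credential configuration like the sketch below. The uppercase values are placeholders from your identity pool setup, and the fields that the Cloud Pak for Data connection form actually asks for may differ, so treat this as orientation rather than the connection settings themselves.

```json
{
  "type": "external_account",
  "audience": "//iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID",
  "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
  "token_url": "https://sts.googleapis.com/v1/token",
  "credential_source": {
    "file": "/path/to/oidc/token"
  }
}
```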

Access data from IBM Product Master
You can now create a connection to IBM Product Master. For more information, see IBM Product Master connection.
Name change for the IBM Cloud Compose for MySQL connection
The IBM Cloud Compose for MySQL connection was renamed to IBM Cloud® Databases for MySQL. Your previous settings for the connection remain the same. Only the connection name has changed.

Version 7.0.0 of the common core services includes various fixes.

For more information, see What's new and changed in the common core services.

If you install or upgrade a service that requires the common core services, the common core services are also installed or upgraded.

Cloud Pak for Data scheduling service 1.13.0

The 1.13.0 release of the scheduling service includes the following features and updates:

Use node scoring to have more control over pod placement
Starting in Cloud Pak for Data Version 4.7, you can use node scoring to have more control over where pods are scheduled. For example, you can use node scoring to configure the scheduling service to:
  • Schedule pods on nodes that have more allocated memory or vCPU
  • Distribute pods across nodes

For more information, see Configuring node scoring for the scheduling service.
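To make the idea concrete, the sketch below ranks nodes by their free memory and vCPU and places a pod on the best-scoring node. This is a conceptual illustration only, not the scheduling service's configuration format; the node data and weights are hypothetical.

```python
# Conceptual sketch of node scoring: rank nodes by free resources and
# place the pod on the highest-scoring node. The data and weights are
# hypothetical; see "Configuring node scoring for the scheduling
# service" for the real configuration.

def score(node: dict, mem_weight: float = 0.5, cpu_weight: float = 0.5) -> float:
    """Higher score = larger share of free memory and vCPU."""
    free_mem = node["mem_total"] - node["mem_allocated"]
    free_cpu = node["cpu_total"] - node["cpu_allocated"]
    return (mem_weight * free_mem / node["mem_total"]
            + cpu_weight * free_cpu / node["cpu_total"])

nodes = [
    {"name": "worker-0", "mem_total": 64, "mem_allocated": 60,
     "cpu_total": 16, "cpu_allocated": 12},
    {"name": "worker-1", "mem_total": 64, "mem_allocated": 16,
     "cpu_total": 16, "cpu_allocated": 4},
]
best = max(nodes, key=score)
print(best["name"])  # worker-1
```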

Version 1.13.0 of the scheduling service includes various fixes.

For more information, see What's new and changed in the scheduling service.

Related documentation:
You can install and upgrade the scheduling service when you install or upgrade the shared cluster components. For more information, see:
AI Factsheets 4.7.0

The 4.7.0 release of AI Factsheets includes the following features and updates:

Track different model use case solutions with approaches
When you track models in a use case, you can now create one or more approaches to track different methods and model versions for addressing a business problem. For example, you might create two different approaches in a use case to compare how different algorithms affect model performance so you can find the best solution. For more information, see Managing model versions in a use case.
Asset tab showing models being tracked.
Enhanced options for governing external models

You can now use AI Factsheets to govern a wider range of external models, including models developed, deployed, and monitored on a platform other than Cloud Pak for Data. In addition to more comprehensive metadata tracked for external models, the Python client and API commands provide more features for moving models and deployments to different environments to more accurately track the lifecycle for these assets. For details, see Adding an external model to the model inventory.

Exercise more control over attachments
Model inventory administrators can create attachment groups and create attachment definitions so that users can view attachments in a more organized fashion and upload attachments in an approved format. For more information, see Adding and managing attachments for factsheets.
Add branding to your reports
Customize the report templates that you use to create reports from factsheets by adding branding information and a logo. For more information, see Generating reports for factsheets and model use cases.
Automatically scale the AI Factsheets service
You can enable automatic scaling of resources for the AI Factsheets service. AI Factsheets uses the Red Hat OpenShift Horizontal Pod Autoscaler (HPA) to increase or decrease the number of pods in response to CPU or memory consumption. For more information, see Automatically scaling resources for services.
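For orientation, a Kubernetes HorizontalPodAutoscaler that scales a deployment on CPU utilization looks like the fragment below. The deployment name and thresholds here are placeholders; enable autoscaling for AI Factsheets through the documented service configuration rather than by creating this resource yourself.

```yaml
# Illustration of the HPA mechanism only; names and thresholds are
# placeholders, not the AI Factsheets configuration.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-factsheets-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-factsheets-deployment   # placeholder name
  minReplicas: 1
  maxReplicas: 3
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
```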
Shut down and restart AI Factsheets
You can now shut down and restart AI Factsheets. Shutting down services when you don't need them helps you conserve cluster resources. For more information, see Shutting down and restarting services.

Version 4.7.0 of the AI Factsheets service includes various fixes.

For more information, see What's new and changed in AI Factsheets.

Related documentation:
AI Factsheets
Analytics Engine powered by Apache Spark 4.7.0

The 4.7.0 release of Analytics Engine powered by Apache Spark includes the following features and updates:

Deprecation of R 3.6
R 3.6 is deprecated and will be removed in a future release. Use R 4.2 in your Spark applications.
Removal of Spark 3.2
Spark 3.2 was removed. Use Spark 3.3 in your Spark applications.
Deprecation of Spark SQL Cloudant
Spark SQL Cloudant DB Driver is deprecated and will be removed in a future release.
Deprecation of Python 3.9
Python 3.9 is deprecated and will be removed in a future release. Use Python 3.10 in Spark environments and Spark applications. Python 3.10 is set as default in Spark applications.

Version 4.7.0 of the Analytics Engine powered by Apache Spark includes various fixes.

For more information, see What's new and changed in Analytics Engine powered by Apache Spark.

Related documentation:
Analytics Engine powered by Apache Spark
Cognos Analytics 24.0.0

The 24.0.0 release of Cognos Analytics includes the following features and updates:

Data migration with the cpd-cli export-import utility
You can now use the cpd-cli export-import utility to migrate Cognos Analytics data between Cloud Pak for Data instances. For more information, see Migrate Cognos Analytics data between clusters.
Cognos PowerCubes data source
You can now use Cognos PowerCubes as a data source. For more information, see Set up Cognos PowerCubes.
Reduced footprint
Cognos Analytics instances now require fewer vCPUs than in Cloud Pak for Data Version 4.6. The following resources are required for each plan size:

  • Fixed minimum: 10 vCPU (down from 11)
  • Small: 16 vCPU (down from 18)
  • Medium: 18 vCPU (down from 22)
  • Large: 25 vCPU (down from 27)

For more information, see Provisioning the Cognos Analytics service.

For the minimum resources required to install Cognos Analytics, see Hardware requirements.

Updated software version for Cognos Analytics
The 24.0.0 release of the Cognos Analytics service provides Version 11.2.4 Fix Pack 1 + Interim Fix of the Cognos Analytics software. For more information, see Release 11.2.4 FP1 - New and changed features in the Cognos Analytics documentation.

Version 24.0.0 of the Cognos Analytics service includes various fixes.

Related documentation:
Cognos Analytics
Cognos Dashboards 4.7.0

The 4.7.0 release of Cognos Dashboards includes the following features and updates:

Limits on the number of users who can create and view dashboards
The Cloud Pak for Data Standard Edition license and Enterprise Edition license limit the number of users who can create and view dashboards. Specifically, your license entitles you to:
  • Assign up to 5 users with the Create dashboards permission.

    The Create dashboards permission includes permission to view dashboards.

  • Assign up to 20 users with the View dashboards permission.

If you need to exceed these limits, you must purchase a Cognos license.

For more information, see Setting up Cognos Dashboards permissions and Tracking usage of Cognos Dashboards licenses.

Updated dashboard features
Cognos Dashboards includes the latest dashboard features from Cognos Analytics Version 12.0.0. For more information, see Dashboards - New and changed features in the Cognos Analytics documentation.
New dashboard visualizations
Cognos Dashboards now includes the following visualizations:
  • Decision tree.

    For details, see Decision tree in the Cognos Analytics documentation.

  • Driver analysis.

    For details, see Driver analysis in the Cognos Analytics documentation.

  • Spiral.

    For details, see Spiral in the Cognos Analytics documentation.

  • Sunburst.

    For details, see Sunburst in the Cognos Analytics documentation.

Visualization insights and forecasting
Cognos Dashboards now provides insights and forecasting for your dashboard visualizations. For more information, see Insights and forecast in the Cognos Analytics documentation.
New data source connections
You can now connect to the following data sources from Cognos Dashboards:
  • IBM Cloud Data Engine
  • IBM Cloud Databases for MySQL
  • IBM Db2 Big SQL
  • IBM Db2 for i
  • IBM Informix
  • Amazon RDS for MySQL
  • Amazon RDS for Oracle
  • Amazon RDS for PostgreSQL
  • Amazon Redshift
  • Cloudera Impala
  • Dremio
  • MariaDB
  • Microsoft Azure SQL Database
  • MySQL
  • Oracle
  • Snowflake
  • Teradata
  • Uploaded data files in Microsoft Excel file format
Checking and refreshing data sources
You can now check the status of the data sources for your dashboards. You can also refresh the data sources to ensure that the data in your dashboards is up to date. For more information, see Checking and refreshing your dashboard data sources.
Joining tables from a multi-sheet file
Now you can use Cognos Dashboards to join tables from uploaded data files that contain multiple sheets.

For more information, see Creating a relationship between sheets in a multi-sheet file data source.

Migrating dashboards to Cognos Analytics
You can now migrate dashboards from Cognos Dashboards to Cognos Analytics. When you migrate a dashboard to Cognos Analytics, you can use the additional features that Cognos Analytics provides, including enterprise reporting, AI features, and a more powerful dashboard experience. For more information, see Migrating dashboards from Cognos Dashboards to Cognos Analytics.

Version 4.7.0 of the Cognos Dashboards service includes various fixes.

Related documentation:
Cognos Dashboards
Data Privacy 4.7.0
The 4.7.0 release of Data Privacy includes the following features and updates:
Streamlined steps for defining masking in data protection rules
Now you can specify advanced masking options with fewer clicks when you create data protection rules. You no longer need to explicitly enable advanced masking options to redact or obfuscate data.

When you choose your criteria to create a rule, the New data protection rule page prompts you for any applicable data masking options based on your selected data class.

New data protection rule page.

For more information, see Mask data.

Row filtering rules are applied in masking flows
You can create data protection rules that filter rows in the assets that they affect. Starting in Cloud Pak for Data Version 4.7, when you create a masking flow with the masking type Bulk copy, row filtering rules that affect the asset are applied. However, row filtering is not available for masking flows with the masking type Copy related records across tables. If you try to create a masking flow of that type and any row filtering rules affect the asset, the masking flow job fails.

For more information, see Creating masking flows.

Advanced masking options are available for Watson Query
Data protection rules that are defined with advanced masking options are now enforced for Watson Query (Data virtualization). Rules can implement format preserving obfuscation on any of the predefined data classes, except IBAN and URL.

For more information, see Advanced masking options.

NULL values in masked columns are obfuscated
When a column is masked by a data protection rule that is defined to obfuscate values, any NULL values are now masked with random obfuscation. If any errors are encountered while masking either NULL or non-NULL values, all the values in the column are redacted.

For more information, see Obfuscating data method.
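The sketch below illustrates the documented behavior in plain Python; it is not Data Privacy internals. Every value in the column is obfuscated, NULLs get a random obfuscated value, and any masking error causes the whole column to be redacted.

```python
import random

# Conceptual sketch of the documented masking behavior, not the Data
# Privacy implementation.

def obfuscate(value, rng):
    if value is None:
        # NULL values are masked with random obfuscation.
        return "".join(rng.choice("XYZ") for _ in range(4))
    # Format-preserving sketch: keep the original length.
    return "".join(rng.choice("XYZ") for _ in value)

def mask_column(values):
    rng = random.Random(0)  # seeded only to keep the sketch deterministic
    try:
        return [obfuscate(v, rng) for v in values]
    except TypeError:
        # On any masking error, all values in the column are redacted.
        return ["REDACTED"] * len(values)

print(mask_column(["alice", None, "bob"]))
```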

Version 4.7.0 of the Data Privacy service includes various fixes.

For more information, see What's new and changed in Data Privacy.

Related documentation:
Data Privacy
Data Refinery 7.0.0

The 7.0.0 release of Data Refinery includes the following features and updates:

The Calculate operation works on date columns
You can now use the Calculate operation on date data type columns to add or subtract day or month values.
Data Refinery Calculate operation

For more information, see GUI operations in Data Refinery.
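The kind of arithmetic the operation performs can be sketched in plain Python (this is not Data Refinery code): adding days is simple offset arithmetic, while adding months clamps the day when the target month is shorter.

```python
from datetime import date, timedelta
import calendar

# Plain-Python sketch of date arithmetic like the Calculate operation's
# add/subtract day and month options; not Data Refinery code.

def add_days(d: date, days: int) -> date:
    return d + timedelta(days=days)

def add_months(d: date, months: int) -> date:
    # Clamp the day when the target month is shorter (e.g. Jan 31 + 1 month).
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)

print(add_days(date(2023, 1, 15), 30))   # 2023-02-14
print(add_months(date(2023, 1, 31), 1))  # 2023-02-28
```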

Updates for environments for running Data Refinery flow jobs
  • The Default Spark 3.2 & R 3.6 environment was removed.
  • The Default Spark 3.3 & R 3.6 environment is deprecated and will be discontinued in a future update.
  • The Default Spark 3.3 & R 4.2 environment is now available.

    You can select Default Spark 3.3 & R 4.2 when you select an environment for a Data Refinery flow job.

If you are upgrading from a previous version of Cloud Pak for Data and your flow jobs use a discontinued environment, a deprecated environment, or a custom Spark 3.0 environment, update the jobs to use the new Default Spark 3.3 & R 4.2 environment. Use the new environment for new jobs.

For more information, see Data Refinery environments.

The environment change affects the following GUI operations:

  • Split
  • Tokenize

If you are upgrading from a previous version of Cloud Pak for Data and your flow jobs include these GUI operations, you must update the Data Refinery flow. To update a flow, open it and save it. For more information, see Managing Data Refinery flows.

Audit logging
Data Refinery now integrates with the Cloud Pak for Data audit logging service. Auditable events for Data Refinery flows are forwarded to the security information and event management (SIEM) solution that you integrate with.

Version 7.0.0 of the Data Refinery service includes various fixes.

For more information, see What's new and changed in Data Refinery.

Related documentation:
Data Refinery
Data Replication 4.7.0

The 4.7.0 release of Data Replication includes the following features and updates:

Use Apache Avro to serialize data that you write to Apache Kafka
You now have the option to select Apache Avro to serialize the data that you write to an Apache Kafka connection with a schema registry. For more information, see Replicating Apache Kafka data.
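For orientation, a schema stored in a Kafka schema registry is a standard Avro record definition like the sketch below. The record name and fields here are hypothetical; Data Replication manages the schema for you when you select Avro serialization.

```json
{
  "type": "record",
  "name": "CustomerChange",
  "namespace": "example.replication",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "name", "type": ["null", "string"], "default": null},
    {"name": "updated_at", "type": {"type": "long", "logicalType": "timestamp-millis"}}
  ]
}
```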
Import Data Replication assets into deployment spaces
Now you can export Data Replication assets from your project and import them into deployment spaces as read-only assets. You can use deployment spaces to store your Data Replication assets, deploy assets, and manage your deployments. For more information, see Importing spaces and projects into deployment spaces.

Version 4.7.0 of the Data Replication service includes various fixes.

For more information, see What's new and changed in Data Replication.

Related documentation:
Data Replication
DataStage 4.7.0

The 4.7.0 release of DataStage includes the following features and updates:

Use additional transform functions in pipeline flows
You can now use built-in DataStage transforms with the Expression Builder in Watson Pipelines. For more information, see DataStage Functions used in pipelines Expression Builder.
Use ELT run mode with additional stages
You can now use the following stages in ELT run mode in DataStage:
  • Lookup
  • Filter
  • Funnel
  • Amazon Redshift connector

For more information, see ELT run mode in DataStage.

Transform data with the new XML Output stage
You can now use the XML Output stage to transform tables into hierarchical XML data. For more information, see XML Output stage.
Use transform procedures in the Oracle and Teradata connectors
You can now use transform procedures as stored procedures in the Oracle and Teradata connectors. For more information, see Using stored procedures.
Maintain separate environments with deployment spaces
Use deployment spaces for testing and production to maintain a strict separation from the development environment. For more information, see Deployment spaces in DataStage.

Version 4.7.0 of the DataStage service includes various fixes.

For more information, see What's new and changed in DataStage.

Related documentation:
DataStage
Db2 4.7.0

The 4.7.0 release of Db2 includes the following features and updates:

Export audit logs
After you enable audit logging on a Db2 database instance, you can configure the service to stream audit logs to the Cloud Pak for Data audit logging service, which can export audit logs to a security information and event management (SIEM) solution, such as Splunk, Mezmo, or QRadar®.

For more information, see:

Enhanced controller
The Db2 enhanced controller helps simplify administrative tasks. You can automate and manage typical activities for database instances, such as configuring audit log streaming.
Disable database encryption in the user interface
In environments where storage-layer encryption is available, you now have the choice of disabling Db2 native encryption for new deployments. This option is available for users who want to optimize resource efficiency and bypass possible compatibility issues. For more information, see Creating a database deployment on the cluster.
Easy HADR deployment using the Db2uHadr custom resource
When you configure HADR with a single standby in a single IBM Cloud environment, you can now use the Db2uHadr custom resource to deploy HADR instead of using scripts. For more information, see Using the Db2 HADR API.
Role-aware HADR
When you configure HADR by using the Db2uHadr custom resource, you have the option to create an additional Kubernetes service that redirects traffic to the current HADR primary deployment. Instead of configuring Automatic Client Reroute (ACR) on the server or client, applications can connect to Db2 through a single hostname, which always redirects to the primary database in the case of a failover. For more information, see Using the Db2 HADR API.
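A Db2uHadr custom resource is declared like any other Kubernetes resource. The field names in the fragment below are illustrative placeholders only, not the documented schema; see Using the Db2 HADR API for the supported specification.

```yaml
# Illustrative only: field names are placeholders, not the documented
# Db2uHadr schema. See "Using the Db2 HADR API" for the real spec.
apiVersion: db2u.databases.ibm.com/v1
kind: Db2uHadr
metadata:
  name: example-hadr
spec:
  primary:
    db2uCluster: db2-primary      # placeholder deployment name
  standby:
    db2uCluster: db2-standby     # placeholder deployment name
  enableRoleAwareService: true   # hypothetical flag for the role-aware service
```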
Q Replication enhancements
  • You can now deploy Q Replication on PowerLinux systems.
  • You can now use Q Replication with source and target databases that use custom names. If a newly installed source database uses a custom database name, the target database must run on Cloud Pak for Data Version 4.7.0.

Version 4.7.0 of the Db2 service includes various fixes.

For more information, see What's new and changed in Db2.

Related documentation:
Db2
Db2 Big SQL 7.5.0

The 7.5.0 release of Db2 Big SQL includes the following features and updates:

Connect to TLS (SSL) enabled Hadoop clusters with a CA certificate
You can now connect Db2 Big SQL to a Hadoop cluster that uses TLS (SSL) protocols with a secret that contains your company's CA certificate. The CA certificate is used to connect to all Hadoop services, and you no longer need to add a separate certificate for each service. For more information, see Connecting to a TLS (SSL) enabled Hadoop cluster.
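The secret itself is a standard Kubernetes Secret that holds the CA certificate. In the fragment below, the secret name, namespace, and key are placeholders; check the Db2 Big SQL documentation for the names the service expects.

```yaml
# Standard Kubernetes Secret shape; the name, namespace, and key are
# placeholders, not values that Db2 Big SQL requires.
apiVersion: v1
kind: Secret
metadata:
  name: example-hadoop-ca-cert
  namespace: cpd-instance   # placeholder project name
type: Opaque
stringData:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    ...your company's CA certificate...
    -----END CERTIFICATE-----
```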

Version 7.5.0 of the Db2 Big SQL service includes various fixes.

Related documentation:
Db2 Big SQL
Db2 Data Gate 4.0.0
Version 4.0.0 of the Db2 Data Gate service includes the following features and updates:
New synchronization event reporting
Individual synchronization events are now reported on the dashboard of a Db2 Data Gate instance. You can use this event information to analyze failures or bottlenecks in the synchronization process. To learn more, see Monitoring a Db2 Data Gate instance.
Easier Db2 credentials management
Db2 Data Gate can now access Db2 target database services if the Db2 services use certificates that are signed by a certificate authority (CA), and if these certificates are stored in the Cloud Pak for Data vault. To learn more, see:
Faster status retrieval
Status information about a Db2 Data Gate instance is now retrieved and displayed on the dashboard faster than before because most of the information is provided by the instance itself and fewer stored procedure calls are required to obtain it.

Version 4.0.0 of the Db2 Data Gate service includes various fixes.

For more information, see What's new and changed in Db2 Data Gate.

Related documentation:
Db2 Data Gate
Db2 Data Management Console 4.7.0

The 4.7.0 release of Db2 Data Management Console includes updates to the following features:

Alerts
  • The Notification center now displays the time that an alert was closed.
  • You can now refine alert conditions by configuring the following properties to generate an alert based on the number of times an event occurs within a specified time period:
    • Number of occurrences
    • Number of collection intervals

    For more information, see Managing alerts.

Monitoring
  • The In-flight executions page now provides information about the metrics for the rows that were inserted, updated, or deleted in a table for each execution.
  • You can now view the memory usage for instances and databases from the new Memory page.
    • To view the memory consumed at the instance level, go to Monitor > Memory > Instance memory.
    • To view all the memory sets and memory pools within each set for a selected database, go to Monitor > Memory > Database memory.
  • To view the table space utilization information, go to Monitor > Storage > Table space utilization.
  • To view the overall queue activities and resource usage to analyze the division of system resources among service superclasses, go to Monitor > Workload management.

For more information, see Monitoring profile.

Run SQL
You can use the new show or hide feature to control the visibility of system schemas for object lists.

For more information, see Running SQL.

Version 4.7.0 of the Db2 Data Management Console service includes various fixes.

For more information, see What's new and changed in Db2 Data Management Console.

Related documentation:
Db2 Data Management Console
Db2 Warehouse 4.7.0

The 4.7.0 release of Db2 Warehouse includes the following features and updates:

Stream audit logs
After you enable audit logging on a Db2 Warehouse database instance, you can configure the service to stream audit logs to the Cloud Pak for Data audit logging service, which can export audit logs to a security information and event management (SIEM) solution, such as Splunk, Mezmo, or QRadar.

For more information, see:

Enhanced controller
The Db2 Warehouse enhanced controller helps simplify administrative tasks. You can automate and manage typical activities for database instances, such as configuring audit log streaming.
Disable database encryption in the user interface
In environments where storage-layer encryption is available, you now have the choice of disabling Db2 Warehouse native encryption for new deployments. This option is available for users who want to optimize resource efficiency and bypass possible compatibility issues. For more information, see Creating a database deployment on the cluster.
Easy HADR deployment using the Db2uHadr custom resource
When you configure HADR with a single standby in a single IBM Cloud environment, you can now use the Db2uHadr custom resource to deploy HADR instead of using scripts. For more information, see Using the Db2 Warehouse HADR API.
Role-aware HADR
When you configure HADR by using the Db2uHadr custom resource, you have the option to create an additional Kubernetes service that redirects traffic to the current HADR primary deployment. Instead of configuring Automatic Client Reroute (ACR) on the server or client, applications can connect to Db2 Warehouse through a single hostname, which always redirects to the primary database in the case of a failover. For more information, see Using the Db2 Warehouse HADR API.
Q Replication enhancements
  • You can now deploy Q Replication on PowerLinux systems.
  • You can now use Q Replication with source and target databases that use custom names. If a newly installed source database uses a custom database name, the target database must run on Cloud Pak for Data Version 4.7.0.

Version 4.7.0 of the Db2 Warehouse service includes various fixes.

For more information, see What's new and changed in Db2 Warehouse.

Related documentation:
Db2 Warehouse
Decision Optimization 7.0.0

The 7.0.0 release of Decision Optimization includes the following features and updates:

Customize engine parameters for Decision Optimization experiments (Watson Studio)
You can now add an OPL parameter settings (.ops) file in your Decision Optimization experiment. With this file, you can view and customize the engine parameters that are used to solve your model in a new visual editor. You can also import an existing OPL settings file and search for existing settings.
Engine settings file in Visual Editor view with one customized parameter.

For more information, see OPL engine settings.

Export data from Decision Optimization experiments to your project
You can now export tables to your project from either the Prepare data or Explore solution view in your Decision Optimization experiment so that you can reuse your data in other models or services. You can also export data by using the Decision Optimization Python client.
UI to export table to your project

For more information, see Exporting data from Decision Optimization experiments.

New view for saved models in Decision Optimization experiments
When you save models for deployment from experiments, you can now review the input and output schema and environment information before saving the model. For more information, see Deploying a Decision Optimization model by using the user interface.
Python 3.9 is deprecated
Python is used to run and deploy Decision Optimization models formulated in DOcplex in Decision Optimization experiments. Modeling Assistant models also use Python because DOcplex code is generated when models are run or deployed.

The Decision Optimization environment currently supports Python 3.10 and 3.9. The default version is Python 3.10. Python 3.9 is deprecated.

Version 7.0.0 of the Decision Optimization service includes various fixes.

For more information, see What's new and changed in Decision Optimization.

Related documentation:
Decision Optimization
EDB Postgres 4.14.0

The 4.14.0 release of EDB Postgres includes the following features and updates:

TLS certificates are available when you create an instance
When you create your EDB Postgres database instance, you can optionally specify that you want to create a custom TLS certificate. You can specify this option from the database's custom resource. For more information, see Configuring TLS for EDB Postgres.
Backup and restore with OADP
You can now use the Cloud Pak for Data OpenShift APIs for Data Protection (OADP) backup and restore utility to do an online or offline backup and restore of EDB Postgres.

The PostgreSQL backup and restore methods are still available.

With the 4.14.0 release of the EDB Postgres service, you can install the following versions of EDB Postgres:
  • 12.14
  • 13.10
  • 14.7

Versions 14.8, 13.11, and 12.15 of the EDB Postgres service include various fixes.

Related documentation:
EDB Postgres
Execution Engine for Apache Hadoop 4.7.0

The 4.7.0 release of Execution Engine for Apache Hadoop includes the following features and updates:

IBM Spectrum® Conductor clusters are no longer supported
You can no longer set up or select IBM Spectrum Conductor clusters to use Execution Engine for Apache Hadoop in Watson Studio.

You must install and set up Hadoop clusters to use Execution Engine for Apache Hadoop. Watson Studio interacts with Hadoop clusters through WebHDFS, Jupyter Enterprise Gateway, and Livy for Spark services.

Version 4.7.0 of the Execution Engine for Apache Hadoop service includes various fixes.

For more information, see What's new and changed in Execution Engine for Apache Hadoop.

Related documentation:
Execution Engine for Apache Hadoop
IBM Match 360 3.0.55

The 3.0.55 release of IBM Match 360 includes the following features and updates:

New data quality workflow helps data stewards remediate potential match issues
Use the new IBM Match 360 potential matches workflow to fix potential matching issues in your master data. Streamline your data stewards' workflow by defining the range of matching scores that qualifies for clerical review, then create governance tasks to help data stewards make decisions that enhance confidence in your master data.

The potential matches workflow provides the framework that data stewards can use to:

  • Quickly generate governance tasks for potential matching issues in your data or a subset of your data.
  • Review and remediate the generated tasks by making match or no-match decisions on records for which the matching algorithm cannot make a confident matching decision.

For information on configuring the potential matches workflow, see Configuring master data workflows.

For information on identifying, reviewing, and remediating potential match issues, see Remediating potential matches to improve data quality.

New data quality dimension measures entity confidence
IBM Match 360 now contributes a new entity confidence data quality dimension to the Data quality tab for an asset in a project. Entity confidence measures the percentage of master data entities in the system that IBM Match 360 is confident are complete and accurate. You can improve an asset's entity confidence score by tuning your matching algorithm or remediating potential match issues.
Entity confidence score

For more information about entity confidence, see Remediating potential matches to improve data quality.

IBM Match 360 protects sensitive data according to governance rules
When you associate IBM Match 360 with a governed data catalog that uses data protection rules, IBM Match 360 enforces the rules by masking sensitive data.

When you are working with governed data assets in the master data explorer, a shield icon on an attribute name indicates that its values are masked by a data protection rule. Governed data is also protected when it is accessed through the IBM Match 360 API.

For more information about using data protection rules with IBM Match 360, see Working with governed data in IBM Match 360.

Stream change events from your master data to downstream systems
Now IBM Match 360 can, in real time, propagate changes in your record and entity data directly to downstream systems through a connected Apache Kafka server. Streaming ensures that your users and systems always have the most up-to-date master data. Master data streaming is available only through the IBM Match 360 API.

For more information, see Streaming record and entity data changes.
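As a rough illustration of what a downstream consumer of such change events might do, the sketch below routes one event payload. This is a hypothetical example only: the field names (`type`, `operation`, `record_id`, `entity_id`) and message shape are assumptions for illustration, not the actual IBM Match 360 streaming schema.

```python
import json

# Hypothetical master-data change-event handler. The field names ("type",
# "operation", "record_id", "entity_id") are invented for illustration;
# see the IBM Match 360 streaming documentation for the real event schema.
def handle_change_event(raw_message):
    """Decide what a downstream system should do with one change event."""
    event = json.loads(raw_message)
    if event.get("type") == "record" and event.get("operation") == "update":
        return "refresh record " + event["record_id"]
    if event.get("type") == "entity":
        return "recompute entity " + event["entity_id"]
    return "ignore"

# In practice these messages would arrive from the connected Apache Kafka
# topic through a Kafka client library; here one is fed in directly.
message = '{"type": "record", "operation": "update", "record_id": "r-101"}'
print(handle_change_event(message))
# refresh record r-101
```

In a real deployment, a Kafka consumer subscribed to the configured topic would receive these messages and call a handler like this one for each event.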

Define entity attributes that persist with entities
This release introduces a new type of attribute that is saved directly on your master data entities, rather than being composited from their member records. Customize the data model of your entity types to add new entity attribute definitions, then edit individual entities to specify the attribute values.

Rather than relying only on record data to provide entity attribute values, the ability to define entity attributes directly gives organizations the flexibility to capture, store, and manage digital twin attributes for each entity to better track behavior indicators, engagement preferences, and other key customer data points.

For more information about entity attributes, see Data concepts in IBM Match 360.

Version 3.0.55 of the IBM Match 360 service includes various fixes.

For more information, see What's new and changed in IBM Match 360.

Related documentation:
IBM Match 360 with Watson
Informix 6.0.0

Version 6.0.0 of the Informix service includes various fixes.

Related documentation:
Informix
OpenPages 8.302.2

Version 8.302.2 of the OpenPages service includes various fixes.

Related documentation:
OpenPages
Planning Analytics 4.7.0

The 4.7.0 release of Planning Analytics includes the following features and updates:

Updated versions of Planning Analytics software
This release of the Planning Analytics service provides the following software versions:
  • TM1 Version 2.0.9.17

    For details about this version of the software, see Planning Analytics 2.0.9.17 in the Planning Analytics documentation.

  • Planning Analytics Workspace Version 2.0.87.

    For details about this version of the software, see 2.0.87 - What's new in the Planning Analytics Workspace documentation.

  • Planning Analytics Spreadsheet Services Version 2.0.87.

    For details about this version of the software, see 2.0.87 - Feature updates in the TM1 Web documentation.

  • Planning Analytics for Microsoft Excel Version 2.0.88.

    For details about this version of the software, see 2.0.88 - Feature updates in the Planning Analytics for Microsoft Excel documentation.

  • Planning Analytics Engine Version 12.2.

    For details about this version of the software, see What's new in Planning Analytics Engine in the Planning Analytics Engine documentation.

Version 4.7.0 of the Planning Analytics service includes various fixes.

Related documentation:
Planning Analytics
Product Master 4.0.0

The 4.0.0 release of Product Master includes the following features and updates:

OpenSearch-based search
The built-in Free text search feature now uses OpenSearch instead of Elasticsearch. You must install OpenSearch Version 2.6 to use the Free text search feature. For more information, see Installing OpenSearch on Red Hat OpenShift Container Platform.
Data flattening
Product data can now be stored as attribute-value pairs in JSON, where the name of the attribute becomes the column name and the value becomes the row value.
Example
"Product ID": "3912"

You can export this data to external systems for analytics or reporting.
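The flattening described above can be sketched as follows. This is a minimal illustration only; the nested attribute names are invented for the example and are not part of the Product Master data model.

```python
# Minimal sketch of attribute-value flattening: each attribute name becomes
# a column name and each value becomes the row value. The nested attribute
# names below are invented for the example.
def flatten(product, prefix=""):
    """Flatten nested product attributes into one column -> value mapping."""
    row = {}
    for name, value in product.items():
        column = prefix + name
        if isinstance(value, dict):
            # Nested attributes get dotted column names, e.g. "Description.Color".
            row.update(flatten(value, prefix=column + "."))
        else:
            row[column] = value
    return row

product = {
    "Product ID": "3912",
    "Description": {"Short": "Desk lamp", "Color": "Black"},
}
print(flatten(product))
# {'Product ID': '3912', 'Description.Short': 'Desk lamp', 'Description.Color': 'Black'}
```

A flat mapping like this exports cleanly to tabular targets such as CSV files or analytics databases.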

New Magento connector
You can configure the Magento connector with the Product Master application. The Product Master - Magento connector is a downstream connector for publishing items to the Adobe Magento e-commerce platform.

For more information, see Magento connector in the Product Master documentation.

Version 4.0.0 of the Product Master service includes various fixes.

For more information, see What's new and changed in Product Master.

Related documentation:
Product Master
RStudio® Server Runtimes 7.0.0

The 7.0.0 release of RStudio Server Runtimes includes the following features and updates:

New runtime
You can now open RStudio in the Runtime 23.1 on R 4.2 environment to create scripts and Shiny apps.

The Runtime 22.1 on R 3.6 environment is deprecated. Use Runtime 23.1 on R 4.2 instead.

When you launch the RStudio IDE environment, select the RStudio runtime in the dialog that is provided. You can still select the deprecated Runtime 22.1 on R 3.6 environment.

For more information, see RStudio environments.

Deprecation of Spark 3.2 and R 3.6
Spark 3.2 and R 3.6 are deprecated and will be discontinued in a future release. Use Spark 3.3 and R 4.2.

Version 7.0.0 of the RStudio Server Runtimes service includes various fixes.

For more information, see What's new and changed in RStudio Server Runtimes.

Related documentation:
RStudio Server Runtimes
SPSS® Modeler 7.0.0

The 7.0.0 release of SPSS Modeler includes the following features and updates:

Generate nodes without dragging them to the canvas
Instead of dragging filter nodes to the canvas from the node palette, you can now generate filter nodes by clicking the Generate Filter node in the Feature Selection nugget panel for a Modeler flow.
Duplicate, rename, or download assets
You can now duplicate, rename, or download assets from the updated drop-down list in the Assets tab of your Modeler flow project.
Schedule model building as a batch job
Model building can take hours. Now, you can schedule it as a batch job. The batch job creates a copy of the original stream and adds in newly created model nuggets.
Save nodes on the stream canvas as a new flow
While working in the SPSS Modeler flow, you can now select a set of nodes and save them as a new flow.
Run SPSS Modeler flows in pipelines
You can now create SPSS Modeler flow jobs and use them as steps in a Watson Pipelines pipeline. You can also save the output to a database or files to be used by other tools. For more information, see Configuring pipeline nodes.

Version 7.0.0 of the SPSS Modeler service includes various fixes.

For more information, see What's new and changed in SPSS Modeler.

Related documentation:
SPSS Modeler
Voice Gateway 1.0.8

Version 1.0.8 of the Voice Gateway service includes various fixes.

Related documentation:
Voice Gateway
Watson Assistant 4.7.0

The 4.7.0 release of Watson Assistant includes the following features and updates:

The new Watson Assistant experience is available for all new instances
When you create a new instance of Watson Assistant, the new Watson Assistant experience is the default interface to use for building your assistants. The new experience makes it easier to use actions to build customer conversations. If you don't want to use the new experience, you can use the Manage menu to switch to the classic experience.
The new Watson Assistant experience.

For more information, see Welcome to the new Watson Assistant in the Watson Assistant documentation on IBM Cloud.

All languages are now enabled by default
You don't need to add languages during installation. All supported languages are now enabled by default with no increase in footprint. For more information, see Supported languages in the Watson Assistant documentation on IBM Cloud.
New algorithm version provides improved irrelevance detection
A new algorithm version is available. The Latest (20 Dec 2022) version includes a new irrelevance detection implementation to improve off-topic detection. For more information, see Algorithm version and training in the Watson Assistant documentation on IBM Cloud.
Actions templates updated with a new design and new choices
The actions template catalog has a new design. Now you can select multiple templates at the same time. The catalog also has new and updated templates, including starter kits that you can use with external services such as Google and HubSpot.
New template catalog

For more information, see Building actions from a template in the Watson Assistant documentation on IBM Cloud.

Organize actions into collections
You can put actions into collections, which are folder-style groups. You can create collections based on the concepts that are important to your organization. For example, you can create collections to group actions by use case, team, status, and so on.
Collections of actions

For more information, see Organizing actions in collections in the Watson Assistant documentation on IBM Cloud.

Display an iframe inline in the conversation
In the web chat, an assistant can now include an iframe response within the conversation. This new option is useful if you need to include smaller pieces of website content within the context of the conversation.
Include an iframe response

For more information, see Adding an iframe response in the Watson Assistant documentation on IBM Cloud.

New validation choices for date, time, and numeric customer responses
If a customer responds with a number, date, time, currency, or percentage, you can customize the validation to check for a specific answer, such as a range of dates or a limited currency amount.
Date validation example

For more information, see Customizing validation for a response in the Watson Assistant documentation on IBM Cloud.

New options when a customer changes the conversation topic
Confirmation to return to previous action
If a customer changes to a different topic, assistants now ask a yes or no confirmation question to determine whether the customer wants to return to the previous action. Previously, assistants returned to the previous action without asking. New assistants use this confirmation by default.
Confirmation settings

For more information, see Confirmation to return to previous topic in the Watson Assistant documentation on IBM Cloud.

New Never return choice
In some cases, you might not want a customer to return to a previous action after the customer changes the topic. To set up this option, use the new Never return choice in Action settings.

For more information, see Disabling returning to the original topic in the Watson Assistant documentation on IBM Cloud.

Allow changing topics in free text and regex responses
By default, customers can't change topics when the assistant is asking for a free text response or when an utterance matches the pattern in a regex response. Now you can set free text and regex customer responses to allow a customer to digress and change topics.

For more information, see Enabling changing the topic for free text and regex customer responses in the Watson Assistant documentation on IBM Cloud.

Adding and using multiple environments
Each assistant has a draft environment and a live environment. You can now add up to three environments to test your assistant before deployment. You can build content in the draft environment and test versions of your content in the additional environments.
Multiple environments

For more information, see Adding and using multiple environments in the Watson Assistant documentation on IBM Cloud.

Display formats for variables
In the Global settings page for actions, you can use the Display formats tab to specify the display formats for variables that use date, time, numbers, currency, or percentages. You can also choose a default locale to ensure that the variable is displayed correctly in the web chat for your assistant. For example, you can choose to have the output of a time variable use the HH:MM format instead of the HH:MM:SS format. For more information, see Display formats in the Watson Assistant documentation on IBM Cloud.
Debug custom extensions
You can use the new extension inspector in the action editor Preview pane to debug problems with custom extensions. The extension inspector shows detailed information about what data is being sent to and returned from an external API.

For more information, see Debugging failures in the Watson Assistant documentation on IBM Cloud.

New expression choice for setting a session variable
Previously, to use an expression to set or modify a variable value, you needed to pick an existing variable or create a new one and select the expression option. Now you can use a new Expression choice to write an expression without first picking a variable. For more information, see Storing a value in a session variable in the Watson Assistant documentation on IBM Cloud.
Using the Cloud Object Storage importer to migrate chat logs
You can use the Cloud Object Storage importer service to migrate your chat logs from one installation of Watson Assistant to another. For more information, see Using the Cloud Object Storage importer to migrate chat logs in the Watson Assistant documentation on IBM Cloud.
Migration from MinIO to Multicloud Object Gateway
Starting in Cloud Pak for Data Version 4.7, MinIO is replaced by Multicloud Object Gateway. All data that was stored in MinIO will be migrated to Multicloud Object Gateway when you upgrade to Cloud Pak for Data Version 4.7.

Ensure that Multicloud Object Gateway is installed before you install or upgrade Watson Assistant and that you create the secrets that Watson Assistant needs to communicate with Multicloud Object Gateway. For more information, see:

Installs
Upgrades
Backup and restore with OADP
You can now use the Cloud Pak for Data OpenShift APIs for Data Protection (OADP) backup and restore utility to do an online or offline backup and restore of Watson Assistant using CSI snapshots.
Reduced footprint
The vCPU requirements for the Watson Assistant service are less than in Cloud Pak for Data Version 4.6. For the minimum resources required to install Watson Assistant, see Hardware requirements.

Version 4.7.0 of the Watson Assistant service includes various security fixes.

Related documentation:
Watson Assistant
Watson Discovery 4.7.0
Version 4.7.0 of the Watson Discovery service includes the following features and updates:
Change how words are normalized for a collection
You can now configure a collection to use stemming to normalize words in the index and queries. For more information, see Enabling the stemmer for uncurated data in the Watson Discovery documentation on IBM Cloud.
Specify the types of files to add to your collection from crawled sources
When you connect to the local file system or a FileNet® P8 data source to crawl data, you can limit the types of files that are added to the collection. For example, you can choose to add only PDF or JSON files. For more information, see the corresponding topics in the Watson Discovery documentation on IBM Cloud.
Secure Windows File System traffic with TLS
Secure the traffic that is sent between the Windows Agent service and the crawler by configuring your Windows File System collections to use the transport layer security (TLS) protocol. For more information, see Windows File System in the Watson Discovery documentation on IBM Cloud.
Online backup and restore with OADP
You can now use the Cloud Pak for Data OpenShift APIs for Data Protection (OADP) backup and restore utility to do an online backup and restore of Watson Discovery.

For more information, see Cloud Pak for Data online backup and restore.

Offline backup and restore with OADP is not available for Watson Discovery.

Migration from MinIO to Multicloud Object Gateway
Starting in Cloud Pak for Data Version 4.7, MinIO is replaced by Multicloud Object Gateway. All data that was stored in MinIO will be migrated to Multicloud Object Gateway when you upgrade to Cloud Pak for Data Version 4.7.

Ensure that Multicloud Object Gateway is installed before you install or upgrade Watson Discovery and that you create the secrets that Watson Discovery needs to communicate with Multicloud Object Gateway.

For more information about how to install Multicloud Object Gateway and create secrets, complete the required prerequisite steps in the topics that describe how to install and upgrade the service.

API updates
The Collections API has the following enhancements:
  • You can define JSON normalizations for documents.
  • New objects are available that share information about the status of documents that are being enriched or added to a collection.

For more information, see the Collections API reference in the Watson Discovery documentation on IBM Cloud.

Version 4.7.0 of the Watson Discovery service includes various fixes.

For more information, see What's new and changed in Watson Discovery.

Related documentation:
Watson Discovery
Watson Knowledge Catalog 4.7.0

The 4.7.0 release of Watson Knowledge Catalog includes the following features and updates:

Legacy features removed from Watson Knowledge Catalog
Before you upgrade Cloud Pak for Data, you must migrate all data from the legacy components to their replacement features. However, some legacy features do not have replacements in Version 4.7.0. If replacements are not available for the features that you use, postpone your upgrade. Review the guidance in Migrating and removing legacy governance features.
New UI capabilities for creating custom assets and managing custom properties for columns
Catalog collaborators with the Admin or Editor role can now complete the following tasks from the web client:
  • Create custom assets from the catalog. To add a custom asset, select Custom asset from the Add to catalog drop-down menu.
  • Manage custom properties for data asset columns. To manage custom properties, select a column in the Overview of an asset and edit the properties in the side pane.

To learn more about custom properties for data assets, see Custom asset types, properties, and relationships.

Find catalogs easily with search
With the updated Catalogs page, you can now search for a catalog by name, and you can see more catalogs on the page for easier scanning.
Reporting now available for custom assets
You can now create queries, reports, and dashboards based on custom-defined properties for any asset in a project or in a catalog. You can define new custom properties for assets to extend any provided or custom asset types and then create reports based on these relationships. For example, you can create a report on your data quality rules and artifact relationships to extrapolate the accuracy of your data. For more information, see Setting up reporting.
Reporting improvements for data quality rules
  • Receive and manage reports on data quality issues for each data asset in a catalog or a project.
  • Monitor ongoing data quality for data assets in projects and catalogs by using reporting for data quality scores and data quality dimensions scores. The data quality score is based on a weighted average from data quality dimension scores. The data quality dimensions scores are based on results from relevant data quality checks.
  • For data quality rules that include multiple rule definitions, see the data quality check statistics (results) by rule definition in the BI reporting schema.

For more information, see Data model.
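The weighted-average relationship between dimension scores and the overall data quality score can be sketched as follows. The dimension names and weights here are invented for the example; the actual dimensions and weighting are defined by Watson Knowledge Catalog.

```python
# Illustration of a weighted-average quality score. The dimension names and
# weights are invented for the example, not Watson Knowledge Catalog's
# actual weighting scheme.
def quality_score(dimension_scores, weights):
    """Combine per-dimension scores into one weighted-average score."""
    total_weight = sum(weights[d] for d in dimension_scores)
    weighted_sum = sum(dimension_scores[d] * weights[d] for d in dimension_scores)
    return weighted_sum / total_weight

scores = {"completeness": 0.9, "validity": 0.8}
weights = {"completeness": 2.0, "validity": 1.0}
print(round(quality_score(scores, weights), 3))
# 0.867
```

A dimension with a higher weight pulls the overall score toward its own value, which is why tuning individual quality checks can shift the asset-level score.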

Metadata import improvements
Import from additional data sources
You can now import technical and lineage metadata from Cognos Analytics data sources. For more information, see Supported data sources for metadata import, metadata enrichment, and data quality rules.
Additional import options
For metadata imports that use the Discover method, you can now specify options to:
  • Scope imports from relational databases to tables or to views and aliases.
  • Allow incremental imports so that only new or modified data assets are imported when you rerun the import.

For more information, see Designing metadata imports.

Capture lineage of ETL jobs
You can run metadata import to generate lineage information for ETL jobs in MANTA Automated Data Lineage. In the MANTA Automated Data Lineage user interface, you can see how a job moves data across systems and any data transformations that occur along the way. For more information, see Capturing ETL job lineage.
Data quality improvements
Data quality at a glance
Data quality information has a new home. For each data asset in a catalog or a project, a Data quality page is populated with quality information that comes from predefined data quality checks and data quality rules. You can see the applicable data quality dimensions and the results of individual quality checks. You can drill down into the results for each check or even into the results for each column.
Data quality tab in catalogs and projects.

For more information, see Data quality.

Run data quality rules on additional data sources
You can now apply data quality rules to assets from the following data sources:
  • Amazon DynamoDB (through a generic JDBC connection)
  • Apache Kudu (through a generic JDBC connection)
  • IBM Data Virtualization Manager for z/OS®
  • IBM Db2 Warehouse
  • IBM Match 360
  • Presto

For more information, see Supported data sources for data quality rules.

Metadata enrichment improvements
More data quality information for assets and columns
Now you can access the following quality information for assets and columns from metadata enrichment results:
  • Scores for applicable data quality dimensions
  • Results of predefined data quality checks
  • Results of data quality rules
Data quality information for enriched assets.

For more information, see Metadata enrichment results.

Store the results of data quality analysis in a database table
You now have the option to write the output of the predefined data quality checks that are run as part of metadata enrichment to a database. For example, you might want to store this data so that you can use the tables for tracking quality issues and as input to remediation processes. For more information, see Creating a metadata enrichment.
Introducing key-value search for advanced searches
Now, you can use the key:value format in the search bar to search within asset and artifact properties. You can use key-value pairs to search for specific descriptions, tags, custom properties, column names, and so on. For example, to search for assets that have the word customer in the column name, use the following key-value pair: column:customer. You can also save your queries for later use. For more information, see Searching for properties.
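To make the query format concrete, here is a toy parser for the key:value syntax. The actual search is executed by the Watson Knowledge Catalog search service; this sketch only mimics how such a query splits into property filters and free-text terms.

```python
# Illustrative parser for the key:value search syntax described above.
# This mimics the query format only; it is not the Watson Knowledge
# Catalog search implementation.
def parse_query(query):
    """Split a search query into key:value filters and free-text terms."""
    filters, terms = {}, []
    for token in query.split():
        if ":" in token:
            key, _, value = token.partition(":")
            filters[key] = value
        else:
            terms.append(token)
    return filters, terms

print(parse_query("column:customer quarterly sales"))
# ({'column': 'customer'}, ['quarterly', 'sales'])
```

So `column:customer quarterly sales` restricts the column name to `customer` while still searching the remaining words as free text.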
Enhanced Data Privacy content in Knowledge Accelerators
The Knowledge Accelerators Data Privacy content includes a set of classified business terms and data classes to accelerate the discovery and governance of personal information. In addition, sample data privacy policies and rules are available to describe the activities that are related to processing personal information.

The business terms and data classes have classifications to guide the identification of personal information (PI) and sensitive personal information (SPI). You can use metadata enrichment in Watson Knowledge Catalog to assign the business terms to imported data assets to identify assets that contain personal data.

The updated data privacy content includes:

Personal data taxonomy
  • 400-600 business terms that are grouped according to key data privacy concepts and aligned with IBM Personal Information (PI) and Sensitive Personal Information (SPI) classification standards.
  • Categories of business terms related to the GDPR and CCPA regulations.
Restructured and new data classes
  • 30 new data classes that are focused on identifying data that is relevant to data privacy.
  • New category structure to provide a single taxonomy for both the new and existing Watson Knowledge Catalog data classes. This structure also aids in selecting data classes in metadata enrichment.
New Watson Knowledge Catalog policies and rules for Data Privacy
  • Over 30 policy examples that reflect the main processes in Privacy by Design and the main regulatory requirements.
  • Set of sample prebuilt Data Privacy-specific data governance rules related to the policies.
  • Guidance on how to create data protection rules that can enforce governance policies and rules.
Knowledge Accelerators data privacy scope in Watson Knowledge Catalog.

For more information, see Data Privacy scope.

New Synonym terms in Knowledge Accelerators
Synonyms define alternative words and phrases for core business terms and facilitate communication across the business by converging terminology into one vocabulary.

Each of the Knowledge Accelerators now includes various common synonyms for core business vocabulary terms. The synonym terms aid in semantic search and metadata enrichment.

For more information, see Knowledge Accelerators synonyms.

Version 4.7.0 of the Watson Knowledge Catalog service includes various fixes.

For more information, see What's new and changed in Watson Knowledge Catalog.

Related documentation:
Watson Knowledge Catalog
Watson Knowledge Studio 5.0.0

The 5.0.0 release of Watson Knowledge Studio includes the following update:

Migration from MinIO to Multicloud Object Gateway
Starting in Cloud Pak for Data Version 4.7, MinIO is replaced by Multicloud Object Gateway. All data that was stored in MinIO will be migrated to Multicloud Object Gateway when you upgrade to Cloud Pak for Data Version 4.7.

Ensure that Multicloud Object Gateway is installed before you install or upgrade Watson Knowledge Studio and that you create the secrets that Watson Knowledge Studio needs to communicate with Multicloud Object Gateway.

For more information about how to install Multicloud Object Gateway and create secrets, complete the required prerequisite steps in the topics that describe how to install and upgrade the service.

Announcement

Version 4.7 is the last major release of Cloud Pak for Data that includes the Watson Knowledge Studio operator. The operator will be removed from the IBM Watson® Discovery for IBM Cloud Pak for Data cartridge in the next major release of Cloud Pak for Data. In addition, the service will not be displayed in the Services catalog. The change will not impact existing deployments of the operator on Cloud Pak for Data Version 4.7 or earlier releases.

Migrate your solutions to the Watson Discovery service, which has powerful custom natural language processing capabilities. You can import your existing Watson Knowledge Studio rules-based or machine learning models to Watson Discovery and apply them to your data as custom enrichments. You can also use the entity extractor feature in Watson Discovery to label and train new custom entity models.

For more information about these features, see Choose enrichments in the Watson Discovery product documentation on IBM Cloud.

For more information about migrating your solutions, see Migrating Knowledge Studio solutions in the Watson Discovery product documentation on IBM Cloud.

Version 5.0.0 of the Watson Knowledge Studio service includes various fixes.

Related documentation:
Watson Knowledge Studio
Watson Machine Learning 4.7.0

The 4.7.0 release of Watson Machine Learning includes the following features and updates:

Train an AutoAI experiment with large, tabular data in the AutoAI tool
When you train an AutoAI experiment with a large training data set, you can use incremental learning to train pipelines with batches of the training data. After the training is complete, you can review how individual batches affected the resulting pipeline. Now you can complete all training in the AutoAI tool without having to finish the training in a notebook. For more information, see Using incremental learning to train with a large data set.
Predict anomalies in time-series model predictions
Now you can use the AutoAI anomaly prediction feature to predict outlier values that are outside of the expected range in your time-series model predictions. For more information, see Creating a time series anomaly prediction.
Support added to AutoAI for virtualized data tables
You can now use virtualized data tables, created using Watson Query, as input for training or deploying an AutoAI experiment. For more information, see AutoAI overview.
Train AutoAI experiments with a smaller resource allocation
Conserve computing resources by choosing the new small size for training an AutoAI experiment. For more information, see AutoAI overview.
Process federated learning transactions without decrypting data
Now your Federated Learning experiments can use homomorphic encryption to train models with encrypted data without first decrypting the data. This feature provides an extra measure of security for federated data sources. For more information, see Applying homomorphic encryption for security and privacy.
Implement new model types and tuning methods for Federated Learning experiments
You can use new model types for existing frameworks. Classification and regression training model types are available for TensorFlow models, and K-means is available for Scikit-learn models.

For better model tuning, you can specify Epochs and toggle Data Sketch as optional hyperparameters.

For more information, see Frameworks, fusion methods, and Python versions.

Expanded user access for Federated Learning model training
Editors and viewers on Federated Learning experiments have more permissions:
  • Users with the Editor role in a project can edit and start an experiment.
  • Users with the Viewer role in a project can participate in model training.

For more information, see Federated Learning architecture.

Evaluate model deployments from a deployment space
From a deployment space, you can now configure Watson OpenScale monitors to:
  • Evaluate online deployments for fairness
  • Monitor a deployment for drift in accuracy

To use this integrated feature, you must have access to a Watson OpenScale instance. For more information, see Monitoring a deployment for fairness.

Import Spark MLlib, Scikit-learn, XGBoost, TensorFlow, and PyTorch machine learning models
In addition to models in PMML format, you can now import the following types of models to use with Watson Machine Learning: Spark MLlib, Scikit-learn, XGBoost, TensorFlow, and PyTorch. For more information, see Importing models into a deployment space.
Frameworks and software specifications that are based on Python 3.10
Use the latest frameworks and software specifications to train and deploy your machine learning assets. For more information, see Supported frameworks and software specifications.

Version 4.7.0 of the Watson Machine Learning service includes various fixes.

Related documentation:
Watson Machine Learning
Watson Machine Learning Accelerator 4.0.0
Version 4.0.0 of the Watson Machine Learning Accelerator service includes the following features and updates:
New NVIDIA GPU Operator version
You can now use the following versions of the NVIDIA GPU Operator with Watson Machine Learning Accelerator:
  • On Red Hat OpenShift Container Platform Version 4.10, use NVIDIA GPU Operator v23.3.2, v22.9.2, v22.9.1, v22.9.0, 1.11, and 1.10
  • On Red Hat OpenShift Container Platform Version 4.12, use NVIDIA GPU Operator v22.9.2 and v23.3.2
New deep learning libraries
You can now use the following deep learning libraries with Watson Machine Learning Accelerator:
  • Python 3.10.10
  • TensorFlow 2.12.0
  • PyTorch 2.0.0

If you have existing models, update and test your models to use the latest supported frameworks. For more information, see Supported deep learning frameworks in the Watson Machine Learning Accelerator documentation.

Migrating from a tethered namespace
Starting in Cloud Pak for Data Version 4.7, you cannot install the Watson Machine Learning Accelerator service to a tethered namespace.

If you previously provisioned the Watson Machine Learning Accelerator service instance in a tethered project, you must migrate the service instance to the project where the Cloud Pak for Data control plane is installed before you upgrade Watson Machine Learning Accelerator. For details, see: Migrating Watson Machine Learning Accelerator from a tethered namespace.

Using data assets or connections with deep learning experiments
When using deep learning experiments, you can now use training data that is available in your project data assets or connections. You can also customize your specifications for hardware when setting model definition attributes. For details, see: Training neural networks using the deep learning experiment builder.
New parameters added to elastic distributed training
A metrics parameter was added to the FabricModel definition that lists the metrics to be evaluated during model training and testing. A new dataset format was added to support TensorFlow training. For details, see: Elastic distributed training.
New data sources
Watson Machine Learning Accelerator now supports new data sources. For a list of IBM services and third-party services supported by Watson Machine Learning Accelerator, see: Supported data sources for Watson Machine Learning Accelerator.
New hardware specification
Watson Machine Learning Accelerator workloads can now run on CPU devices, MIG devices, and mixed strategy devices.
Using Watson Machine Learning Accelerator notebooks
Starting in Cloud Pak for Data Version 4.7, Watson Machine Learning Accelerator notebooks are no longer available from the Watson Machine Learning Accelerator console.
Watson Machine Learning Accelerator notebooks are now available through Watson Studio and require the Watson Machine Learning Accelerator notebook runtime. After you install Watson Machine Learning Accelerator and Watson Studio, install and configure the notebook runtime. For details, see Working with Watson Machine Learning Accelerator notebooks.
If you have previously used Watson Machine Learning Accelerator notebooks, make sure to export your notebooks before upgrading Watson Machine Learning Accelerator. For details, see Preparing to upgrade Watson Machine Learning Accelerator.
Creating a custom runtime for Watson Machine Learning Accelerator
Starting in Cloud Pak for Data Version 4.7, the conda runtime is no longer available for running workloads.

Use your own custom runtime image to run workloads. For details, see: Creating a custom runtime.

Version 4.0.0 of the Watson Machine Learning Accelerator service includes various fixes.

Related documentation:
Watson Machine Learning Accelerator
Watson OpenScale 4.7.0

The 4.7.0 release of Watson OpenScale includes the following features and updates:

Integrate Watson OpenScale with your deployment spaces
You can now use Watson OpenScale to review model evaluation results and transaction records from Watson Machine Learning deployment spaces. For more information, see Evaluating deployments in spaces.
Configure deployments with a new guided setup
A new setup wizard is available to help you add deployments to the Watson OpenScale Insights dashboard and provide model details. For more information, see Adding deployments for evaluations.
Add multi-target prediction models
When you add deployments in Watson OpenScale, you can now specify multiple prediction columns to provide details about your model output to configure quality evaluations. For more information, see Providing model details.
Configure new drift evaluation to provide more insights
You can configure a new version of the drift evaluation in Watson OpenScale to generate the following new metrics:
  • Output drift
  • Feature drift
  • Model quality drift

For more information, see Configuring drift v2 evaluations.
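Feature drift is commonly quantified by comparing the training-time distribution of a feature with its distribution at scoring time. The sketch below uses the population stability index (PSI), a widely used drift statistic; it illustrates the concept and is not necessarily the exact metric that Watson OpenScale computes.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population stability index between two binned distributions.

    Values near 0 mean no drift; > 0.2 is a common 'significant drift' threshold.
    """
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    total = 0.0
    for e, a in zip(expected_counts, actual_counts):
        # Clamp to eps so empty bins do not produce log(0)
        e_pct = max(e / e_total, eps)
        a_pct = max(a / a_total, eps)
        total += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return total

# Identical distributions: no drift
print(psi([10, 20, 30], [10, 20, 30]))   # 0.0
# Shifted distribution: positive PSI signals drift
print(psi([10, 20, 30], [30, 20, 10]))
```

Output drift and model quality drift follow the same pattern, applied to the model's prediction distribution and to quality-related statistics rather than to input features.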

Understand model performance with model health evaluations
Watson OpenScale now provides new model health evaluations by default to help you understand how efficiently your model processes your transactions. For more information, see Model health metrics.
New fairness metrics for batch deployments
When you add batch deployments in Watson OpenScale, you can now configure new fairness metrics to measure performance.
Run fairness evaluations with unstructured data
You can now enable fairness evaluations on unstructured data types to identify bias. For more information, see Configuring fairness evaluations.
Evaluate batch deployments in preproduction
You can now configure batch processing for preproduction deployments. For more information, see Configuring batch processing.

Version 4.7.0 of the Watson OpenScale service includes various fixes.

For more information, see What's new and changed in Watson OpenScale.

Related documentation:
Watson OpenScale
Watson Pipelines 4.7.0

The 4.7.0 release of Watson Pipelines includes the following features and updates:

Run an SPSS Modeler job in a pipeline
You can now include SPSS Modeler jobs as a supported asset in a pipeline. Configure the new Run SPSS Modeler node to initiate a job so that you can use the results in your automated flow. For more information, see Configuring pipeline nodes.
Import pipelines into deployment spaces
You can import a pipeline as a read-only asset into a deployment space and run the pipeline job from the space as you would run other types of jobs. This capability makes it easier to prepare a flow for production. For more information, see Creating a pipeline.
More options for using cached data
Exercise greater control over how you save and use cached data. For example, you can:
  • Configure more global options for cache behavior
  • Reset the cache at runtime

For more information, see Managing default settings.

Set default environment parameters
By linking variables to values in the PROJDEF parameter set, you can reuse variables across pipelines. For more information, see Configuring global objects.
Data transformation with new expressions
You have more options when you apply transformations to date and string data types, and to utility and conversion functions. For more information, see DataStage functions used in pipelines Expression Builder.
Duplicate pipeline flows
You can duplicate pipeline flows in your project. The process is similar to how you duplicate other project assets.
Automatically scale the Watson Pipelines service
You can enable automatic scaling of resources for the Watson Pipelines service. Watson Pipelines uses the Red Hat OpenShift Horizontal Pod Autoscaler (HPA) to increase or decrease the number of pods in response to CPU or memory consumption. For more information, see Automatically scaling resources for services.
Shut down and restart Watson Pipelines
You can now shut down and restart Watson Pipelines. Shutting down services when you don't need them helps you conserve cluster resources. For more information, see Shutting down and restarting services.

Version 4.7.0 of the Watson Pipelines service includes various fixes.

Related documentation:
Watson Pipelines
Watson Query 2.1.0
Version 2.1.0 of the Watson Query service includes the following features and updates.
Choose your query mode to prioritize either performance or consistency
You can now choose between running queries in Max Pushdown mode or in Max Consistency mode.
  • Max Pushdown mode ignores semantic differences between Watson Query and the data source for single-source queries. Therefore, more single-source queries might be fully pushed down to the data source, improving query performance. For fully pushed-down queries, results are consistent with the data source semantics. Max Pushdown mode does not affect multiple-source queries.
  • Max Consistency mode follows Watson Query semantics to evaluate whether operations can be pushed down to the data source. If the operation that is executed on the data source generates the same result as Watson Query, the operation can be pushed down. Queries in this mode might be fully pushed down if the remote data source has the same semantics as Watson Query.
Pushdown enhancements to improve query performance
Query pushdown is an optimization feature that reduces query times and memory use. This release of Watson Query includes the following enhancements in queries that use pushdown:
  • The following data source connections have been optimized to take advantage of more data source capabilities to improve query performance on single-source tables:
    • Salesforce.com
    • Db2 for i
  • Query performance is improved in pushdown mode in the following situations:
    • When you query string data on remote data sources with the IN predicate. For details about the IN predicate, see IN predicate in the Db2 documentation.
    • When you query data where the total width of the columns in the Select list is greater than 32,000.
    • When you use common sub-expressions (CSE) pushdown capabilities.
    • When you reference numeric data type functions in the query.
    • When you reference date and time type functions in the query.
Use your platform credentials to access Watson Query connections
When you use a platform connection to access Watson Query, you are prompted for your credentials. You can optionally select Use my platform login credentials, rather than entering your personal credentials for the connection. The connection uses your current session JSON Web Token (JWT).
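A session JWT is a standard three-part token (header.payload.signature) whose payload is base64url-encoded JSON. For illustration only, the token below is fabricated, not a real Cloud Pak for Data token; the sketch shows how such a payload can be inspected:

```python
import base64
import json

def jwt_payload(token):
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWTs strip
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# Build a toy token so the example is self-contained (hypothetical claims)
claims = {"sub": "jane.doe", "iss": "cloud-pak-for-data", "exp": 1700000000}
segment = base64.urlsafe_b64encode(json.dumps(claims).encode()).rstrip(b"=").decode()
token = f"header.{segment}.signature"
print(jwt_payload(token)["sub"])   # jane.doe
```

In practice, the platform issues and validates the token; services such as Watson Query simply forward it with the connection request.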
Use advanced data masking on virtualized data
In this release of Watson Query, data masking performance is substantially improved. You can now use the advanced data masking options to avoid exposing sensitive data. See Masking virtual data in Watson Query to learn about the updated masking behavior in this release and for instructions on how to revert to the masking behavior from Cloud Pak for Data version 4.6.x if necessary.
Maintain authorizations when you rename a group
When you rename a group in Watson Query, you can now migrate the group-level authorizations to the new group name by using the MIGRATE_GROUP_AUTHZ stored procedure. For more information, see MIGRATE_GROUP_AUTHZ stored procedure.
Connect to data sources that have Kerberos authentication
You can now connect to data sources that use Kerberos authentication. For more information, see Enabling Kerberos authentication in Watson Query.
Query data in Microsoft Azure Data Lake Storage Gen2 data lakes
You can now connect to Microsoft Azure Data Lake Storage Gen2 data sources. For more information, see Supported data sources in Watson Query.
Manage who can access and perform operations on individual data sources
With data source access restrictions, you can explicitly manage access to individual data source connections that use shared credentials. You can assign users, user groups, and roles as collaborators for a data source connection. Only those collaborators can access the data source connection. You assign specific privileges to the collaborators to manage the actions that they can perform on the data sources. This enables you to separate privileges from roles, so that some users who are assigned a role such as Admin can access and take action on different data source connections than other Admin users.

For more information, see Data source connection access restrictions in Watson Query.

Deploy multiple instances of Watson Query
Previously, you could provision only one Watson Query service instance in a given instance of Cloud Pak for Data. You can now provision multiple Watson Query service instances by using tethered projects.

Each Watson Query service instance must be in a different project. For example, you can provision one service instance to the project where the Cloud Pak for Data control plane is installed, another instance to tethered project A, and a third instance to tethered project B.

Format and save formatted access plans for performance tuning
You can now format and save formatted access plans for performance tuning in Watson Query by using the EXPLAIN_FORMAT stored procedure. Run this procedure to build query access plans and download the generated EXPLAIN output in text files. For more information, see EXPLAIN_FORMAT stored procedure in Watson Query.
Use improved audit logging to monitor user activity and data access
You can monitor user activity with additional Watson Query auditable events in the areas of caching and data source isolation. You also now can monitor data access by using the Db2 audit facility. For more information, see Audit events for Watson Query and Auditing in Watson Query.

Version 2.1.0 of the Watson Query service includes various fixes.

For more information, see What's new and changed in Watson Query.

Related documentation:
Watson Query
Watson Speech services 4.7.0

The 4.7.0 release of the Watson Speech services includes the following features and updates:

Online backup and restore with OADP
You can now use the Cloud Pak for Data OpenShift APIs for Data Protection (OADP) backup and restore utility to do an online backup and restore of the Watson Speech services.

For more information, see Cloud Pak for Data online backup and restore.

Offline backup and restore with OADP is not available for the Watson Speech services.

Shut down and restart the Watson Speech services
You can now shut down and restart the Watson Speech services. Shutting down services when you don't need them helps you conserve cluster resources. For more information, see Shutting down and restarting services.

Version 4.7.0 of the Watson Speech services includes various fixes.

For more information, see What's new and changed in Watson Speech to Text.

Related documentation:
Watson Speech services
Watson Studio 7.0.0

The 7.0.0 release of Watson Studio includes the following features and updates:

Runtime 23.1 with Python and R
You can now use Runtime 23.1, which includes the latest data science frameworks on Python 3.10 and on R 4.2, to run Watson Studio Jupyter notebooks, to train models, and to run Watson Machine Learning deployments.

Runtime 23.1 on R 4.2 in notebooks is supported on x86-64 hardware only.

To change environments, see Changing the environment of a notebook.

Enhanced Natural Language Processing capabilities in Runtime 23.1
Runtime 23.1 contains the new Watson Natural Language Processing library 4.1 and a new set of pre-trained models. The NLP library contains the following enhancements and updates:
  • Many included models are now transformer-based. These models were trained on the Slate large language model (LLM), which was created by IBM. The models are available in two versions:
    • Optimized for CPU-only environments
    • For environments with GPUs or CPUs
  • Many included models for different NLP tasks are now workflow-based instead of block-based, so you can apply the models directly on input text without worrying about preprocessing steps.
  • NLP includes a Slate foundation model that you can use for fine-tuning your NLP tasks. You can use the Slate model or any transformer-based model from Hugging Face as a base to build your own models with Watson NLP.
  • All models provided by IBM are now exclusively trained on unbiased data with state-of-the-art filtering for hate, bias, and profanity.

For more information, see Watson Natural Language Processing library.

New flow for adding data from a project file to a notebook
The notebook toolbar contains a new Code snippets icon that you can use to open the Code snippets pane. From the Code snippets pane, you can read data from a file or connection that was added to the project.

To generate code that loads data into your notebook, you must now click the Code snippets icon, click Read data, and then select the data source from your project.

The Find and load data pane now serves only to upload data to a project; it does not generate any code inside the notebook.

For more information, see Loading and accessing data in a notebook.

Use JupyterLab and Jupyter Notebook extensions to customize and enhance your development environment
You can now install JupyterLab and Jupyter Notebook extensions to customize and enhance your development experience. Extensions can provide themes, editors, file viewers, and more. For more information, see Adding customizations to images.
Create, store, and share machine learning features
You can now speed the development of machine learning models by creating and sharing features. You add a feature group to a data asset in a project to identify the features of that data set. You can share the features with your organization by publishing the data asset to a catalog, which acts as a feature store. For more information, see Managing feature groups.
Improvements for managing your notification settings
You can now turn on Do not disturb to turn off the notifications that appear briefly in the web client.

To enable Do not disturb, click the Notifications icon in the toolbar. Then, click the Settings icon.

When you turn on Do not disturb, you can still see that you have unread notifications on the Notifications icon in the toolbar.

For more information, see Setting your notification preferences.

Use connections from different Cloud Pak for Data instances in Git-based projects
Git-based projects can be imported to multiple instances of Cloud Pak for Data. To ensure that you can access the same data from different instances of Cloud Pak for Data, you can create connections that are based on a copy of a platform connection. A connection that is based on a copy of a platform connection can be used across instances of Cloud Pak for Data. For more information, see Connecting to data sources in a Git-based project.
Removal of Scala environments
All runtime environments based on the Scala programming language have been removed.

Version 7.0.0 of the Watson Studio service includes various fixes.

For more information, see What's new and changed in Watson Studio.

Related documentation:
Watson Studio
Watson Studio Runtimes 7.0.0

The 7.0.0 release of Watson Studio Runtimes includes the following features and updates:

Runtime 23.1 with Python and R
You can now use Runtime 23.1, which includes the latest data science frameworks on Python 3.10 and on R 4.2, to run Watson Studio Jupyter notebooks, to train models, and to run Watson Machine Learning deployments.

Runtime 23.1 on R 4.2 in notebooks is supported on x86-64 hardware only.

To change environments, see Changing the environment of a notebook.

Enhanced Natural Language Processing capabilities in Runtime 23.1
Runtime 23.1 contains the new Watson Natural Language Processing library 4.1 and a new set of pre-trained models. The NLP library contains the following enhancements and updates:
  • Many included models are now transformer-based. These models were trained on the Slate large language model (LLM), which was created by IBM. The models are available in two versions:
    • Optimized for CPU-only environments
    • For environments with GPUs or CPUs
  • Many included models for different NLP tasks are now workflow-based instead of block-based, so you can apply the models directly on input text without worrying about preprocessing steps.
  • NLP includes a Slate foundation model that you can use for fine-tuning your NLP tasks. You can use the Slate model or any transformer-based model from Hugging Face as a base to build your own models with Watson NLP.
  • All models provided by IBM are now exclusively trained on unbiased data with state-of-the-art filtering for hate, bias, and profanity.

For more information, see Watson Natural Language Processing library.

Removal of Scala environments
All runtime environments based on the Scala programming language have been removed.

Version 7.0.0 of the Watson Studio Runtimes service includes various fixes.

For more information, see What's new and changed in Watson Studio Runtimes.

Related documentation:
Watson Studio Runtimes

Installation enhancements

What's new What does it mean for me?
Red Hat OpenShift Container Platform support
You can deploy Cloud Pak for Data Version 4.7 on the following versions of Red Hat OpenShift Container Platform:
  • Version 4.10.0 or later fixes
  • Version 4.12.0 or later fixes
More control over instances with the private topology
Starting in IBM Cloud Pak for Data Version 4.7, each instance of Cloud Pak for Data has its own set of operators. The private topology simplifies the process of installing and managing multiple instances of Cloud Pak for Data at different releases on a single cluster.

The private topology replaces the express installation topology and the specialized installation topology.

If you are upgrading to IBM Cloud Pak for Data Version 4.7, you must migrate your existing installation to the private topology.

For more information, see Supported project (namespace) configurations.

Install or upgrade multiple components in parallel
Starting in IBM Cloud Pak for Data Version 4.7, you can install or upgrade multiple components in parallel. When you run a batch installation or upgrade, the apply-cr command automatically installs or upgrades up to 4 components at a time.

The apply-cr command ensures that the specified components are installed in the correct order. For example, the command ensures that the control plane is installed before any services are installed.

You can adjust the number of components that are installed or upgraded in parallel by specifying the --parallel_num option. For more information, see the manage apply-cr command reference.
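For example, a batch upgrade invocation might look like the following sketch. Only --parallel_num comes from the description above; the other flags and values are illustrative of the typical cpd-cli manage apply-cr pattern and should be checked against the command reference for your release.

```shell
# Install or upgrade several components in one batch.
# apply-cr resolves ordering (control plane before services) automatically;
# --parallel_num caps how many components are processed at once.
cpd-cli manage apply-cr \
    --components=wml,wkc,datastage \
    --release=4.7.4 \
    --cpd_instance_ns=cpd-instance \
    --license_acceptance=true \
    --parallel_num=2
```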

Removals and deprecations

What's changed What does it mean for me?
LDAP integration provided by Cloud Pak for Data
By default, IBM Cloud Pak for Data user records are stored in an internal repository. However, it is strongly recommended that you use an enterprise-grade password management solution, such as single sign-on (SSO) or LDAP.

If you decide to use an LDAP server, you currently have two methods for connecting to your LDAP server:

  • The LDAP integration provided by Cloud Pak for Data.
  • The LDAP integration provided by the IBM Cloud Pak foundational services Identity Management Service.

The LDAP integration provided by Cloud Pak for Data is deprecated and will be removed in a future release.

Installing
If you are installing IBM Cloud Pak for Data for the first time and you want to use an LDAP server to manage access to the platform, use the LDAP integration provided by the Identity Management Service. For more information, see Integrating with the Identity Management Service.
Upgrading
If you are upgrading to IBM Cloud Pak for Data Version 4.7 and you currently use the LDAP integration provided by Cloud Pak for Data, it is strongly recommended that you migrate your existing configuration to the Identity Management Service. To migrate your LDAP configuration:
  1. Integrate with the Identity Management Service.
  2. Change the cpadmin user to admin.
  3. Connect to your current LDAP server from the Identity Management Service (links to the IBM Cloud Pak foundational services documentation).

Repeat these steps for each instance of Cloud Pak for Data that you want to integrate with the Identity Management Service.

Express and specialized installation topologies
Previously, all instances of IBM Cloud Pak for Data on a cluster were managed by a set of shared operators. The location of the operators depended on the installation topology that you chose when you installed IBM Cloud Pak for Data:
Express installation topology
In an express installation, the shared IBM Cloud Pak for Data operators were co-located with the shared IBM Cloud Pak foundational services operators in the ibm-common-services project.
Specialized installation topology
In a specialized installation, the shared IBM Cloud Pak for Data operators were in a separate project from the shared IBM Cloud Pak foundational services operators, which were typically in the ibm-common-services project.

Starting in IBM Cloud Pak for Data Version 4.7, both of these installation topologies are replaced by the private topology. In the private topology, each instance of Cloud Pak for Data has its own set of operators. When you upgrade to IBM Cloud Pak for Data Version 4.7, you must migrate your current installation to the private topology. For more information, see Upgrading Cloud Pak for Data.

Watson Knowledge Catalog legacy features are removed
Starting with IBM Cloud Pak for Data Version 4.7.0, the legacy features of Watson Knowledge Catalog are removed.

These features were part of the InfoSphere Information Server components that were installed with the Watson Knowledge Catalog base configuration. They can no longer be installed, upgraded, or used.

If you want to upgrade a system with the base configuration to Cloud Pak for Data Version 4.7.0, you must migrate all data from the legacy components to their equivalents in the Watson Knowledge Catalog core installation. However, some legacy features do not have replacements in Version 4.7.0. To check whether replacements for your legacy features are available, see Migrating metadata and removing Watson Knowledge Catalog legacy components. If some replacements are not available, work with your IBM representative to determine the best timeline for your upgrade.

IBM Spectrum Conductor clusters are no longer supported (Execution Engine for Apache Hadoop)
You can no longer set up or select IBM Spectrum Conductor clusters to use Execution Engine for Apache Hadoop in Watson Studio.

You must install and set up Hadoop clusters to use Execution Engine for Apache Hadoop. Watson Studio interacts with Hadoop clusters through WebHDFS, Jupyter Enterprise Gateway, and Livy for Spark services.

Discontinued connections (common core services)
The following connections were removed in IBM Cloud Pak for Data Version 4.7:
  • IBM Db2 Event Store
  • IBM Db2 Hosted
Python 3.9
Python 3.9 is deprecated and will be removed in a future release. Start using Python 3.10 instead.

This change affects the following services:

  • Decision Optimization
  • Watson Studio Runtimes
Scala runtime environments removed
All runtime environments based on the Scala programming language have been removed.

This change affects the following services:

  • Watson Studio
  • Watson Studio Runtimes
Spark 3.2
Spark 3.2 is deprecated and will be removed in a future release.

This change affects the following services:

  • RStudio Server Runtimes
  • Watson Studio Runtimes

Spark 3.2 was removed from the following services:

  • Analytics Engine powered by Apache Spark
  • Data Refinery

Start using Spark 3.3 instead.

R 3.6
R 3.6 is deprecated and will be removed in a future release. Start using R 4.2 instead.

This change affects the following services:

  • Data Refinery
  • RStudio Server Runtimes
  • Watson Studio
  • Watson Studio Runtimes
Submitting a request for data
The data requests (Data > Data requests) feature was removed. Use workflows instead.
Offline backup and restore
Offline backup and restore is deprecated and will be removed in a future release. It is recommended that you create online backups. For more information, see Backing up and restoring Cloud Pak for Data.

Previous releases

Looking for information about previous releases? See the following topics in IBM Documentation: