What's new in IBM Cloud Pak for Data?

See what new features and improvements are available in the latest release of IBM Cloud Pak for Data.

What's new in Version 5.0

IBM Cloud Pak for Data Version 5.0 includes new features for the platform and for many services. Version 5.0 includes focused end-user experiences for customers who install multiple solutions on the Cloud Pak for Data control plane, context-aware documentation recommendations, HTTP proxy support, and a standard method for storing and accessing custom certificates. Version 5.0 also includes expanded deployment environment options and the ability to run workloads on remote physical locations.

This release includes enhancements to existing services, new services, and changes to how services are delivered.
New services
  • Data Product Hub enables teams to share governed data assets so that teams can easily access the data that they need.
  • IBM Knowledge Catalog Standard offers basic governance tooling for cataloging and AI-augmented data enrichment.
  • IBM Knowledge Catalog Premium includes the full governance framework with data privacy, data quality, cataloging, and enrichment across the data lifecycle with a generative AI layer for enhanced data enrichment.
  • watsonx Code Assistant for Red Hat® Ansible® Lightspeed helps automation teams create, adopt, and maintain Ansible content more efficiently.
  • watsonx Code Assistant for Z helps developers modernize their mainframe applications using a combination of automation and generative AI.
Renamed services
  • Db2 Data Gate is now Data Gate
  • Watson Pipelines is now Orchestration Pipelines
  • Watson Query is now Data Virtualization

For more information, review the following sections.

Platform enhancements

The following entries list the new features that were introduced in Cloud Pak for Data Version 5.0 and what each feature means for you.
Focused experiences for environments with multiple solutions
IBM Cloud Pak for Data is the foundation for multiple solutions. Starting in Version 5.0, Cloud Pak for Data includes multiple experiences. The experiences that are available in your environment depend on the services that you install.

An experience provides focused access to the tools that you need to complete specific tasks. For example:

  • In the Data Product Hub experience, teams can focus on publishing and sharing data products.
  • In the watsonx experience, teams can focus on training, validating, tuning, and deploying generative AI solutions.

Each experience has a dedicated home page. The cards that are available on the home page help you get started with the solution and give you quick access to the tools that you need.

For more information on the new experiences, see Switching between experiences.

Run Cloud Pak for Data workloads on remote clusters
By default, an instance of IBM Cloud Pak for Data runs in a set of projects (namespaces) on a single Red Hat OpenShift® Container Platform cluster. Starting in Cloud Pak for Data Version 5.0, you can expand your Cloud Pak for Data deployment by installing IBM Cloud Pak for Data agents on a remote cluster to create remote physical locations.

After you set up a remote physical location, you can register the physical location with the instance of Cloud Pak for Data that you want to expand. Then, you can add the physical location to a data plane. A data plane is a logical grouping of one or more physical locations. Users can deploy workloads to a data plane, and each workload is scheduled on one of the physical locations that are associated with the data plane.
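The scheduling model described above can be sketched as follows; the class names and the least-loaded placement policy are illustrative assumptions, not the actual scheduler implementation:

```python
from dataclasses import dataclass, field

@dataclass
class PhysicalLocation:
    """A registered remote physical location (illustrative)."""
    name: str
    running_workloads: int = 0

@dataclass
class DataPlane:
    """A logical grouping of one or more physical locations."""
    name: str
    locations: list = field(default_factory=list)

    def schedule(self, workload: str) -> PhysicalLocation:
        # Pick the least-loaded location; the real placement policy
        # is not documented here and may differ.
        target = min(self.locations, key=lambda loc: loc.running_workloads)
        target.running_workloads += 1
        return target

plane = DataPlane("analytics", [PhysicalLocation("on-prem"), PhysicalLocation("cloud-east")])
chosen = plane.schedule("spark-job-1")
```

Users see only the data plane; which physical location actually runs the workload is decided at scheduling time.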

For more information, see:

New privileged monitors provide more insight into your cluster health
The privileged monitoring service includes new monitors that give Cloud Pak for Data administrators more insight into the health of the cluster where Cloud Pak for Data is deployed:
Cluster operator status check (check-cluster-operator-status)
Checks the status of the cluster operators that comprise the Red Hat OpenShift Container Platform infrastructure to determine whether:
  • All of the operators are AVAILABLE
  • Any of the operators are DEGRADED
Network status check (check-network-status)
Checks the status of the PodNetworkConnectivityCheck objects for cluster resources to determine whether the objects are Reachable.
Node imbalance status check (check-node-imbalance-status)
Checks whether vCPU requests are balanced across nodes or whether one node is supporting a disproportionately high load.
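As an illustration of the kind of analysis that the node imbalance check performs, the following sketch flags a node whose vCPU requests are disproportionately high; the threshold and criteria are assumptions, not the monitor's documented behavior:

```python
def is_imbalanced(vcpu_requests_by_node, threshold=1.5):
    """Return True if any node's vCPU requests exceed `threshold`
    times the cluster mean. Simplified sketch only; the real
    check-node-imbalance-status monitor's criteria may differ."""
    values = list(vcpu_requests_by_node.values())
    mean = sum(values) / len(values)
    return any(v > threshold * mean for v in values)

balanced = {"node1": 10, "node2": 12, "node3": 11}
skewed = {"node1": 30, "node2": 5, "node3": 4}
```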

To install the privileged monitoring service, see Installing privileged monitors for an instance of IBM Cloud Pak for Data.

If you installed the privileged monitoring service in IBM Cloud Pak for Data Version 4.8, see Upgrading the privileged monitoring service.

Collecting telemetry data for Cloud Pak for Data
You can send telemetry data to IBM Software Central, which enables you to view license data and entitlements across your hybrid cloud deployments in one place. IBM Software Central helps you:
  • Remain compliant with your license agreements.
  • Get insight into your use so that you can more accurately predict future software needs and spending.

For information about the data that is collected, see How Metering API works in the IBM Software Central documentation.

To configure Cloud Pak for Data to send telemetry data, see Collecting telemetry data from IBM Cloud Pak for Data.

Standard method for managing custom certificates
Previously, each service used its own method for handling custom certificates. Cloud Pak for Data now provides a standard method for storing and using custom certificates.

If you install the Cloud Pak for Data configuration admission controller, you can create a secret that contains a set of custom certificates that you can use across multiple services. After you create the secret, you can inject the secret into Cloud Pak for Data pods so that they have access to the custom certificates.

For more information, see Creating a secret to store shared custom certificates.
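A secret of this kind might look like the following sketch; the secret name, namespace, and key are illustrative examples, not the names that Cloud Pak for Data requires:

```yaml
# Illustrative only: name, namespace, and key are example values.
apiVersion: v1
kind: Secret
metadata:
  name: custom-ca-certs
  namespace: cpd-instance
type: Opaque
stringData:
  ca.crt: |
    -----BEGIN CERTIFICATE-----
    ...certificate body...
    -----END CERTIFICATE-----
```

After the secret exists, the configuration admission controller can inject it into Cloud Pak for Data pods so that the certificates are available across services.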

Apply cluster-wide HTTP proxy settings to Cloud Pak for Data
If you have a cluster-wide HTTP proxy configuration on your cluster, you can apply the proxy configurations to your Cloud Pak for Data deployment. For more information, see Applying cluster HTTP proxy settings to IBM Cloud Pak for Data.
Context-aware documentation recommendations
You can open the Assist me panel in the Cloud Pak for Data web client to get context-aware documentation recommendations. Assist me uses embedded keyword searches to find relevant documentation on ibm.com. You can also use Assist me to run your own searches.

To get started with Assist me, click the Assist me icon in the web client toolbar.

Validate network connectivity
You can use the cpd-cli health network-connectivity command to run network health checks on resources in your Cloud Pak for Data deployment after installation, before upgrading, and after upgrading.

You can also run the network-connectivity command as part of the runcommand command.

Validate service functionality
You can use the cpd-cli health service-functionality command to validate that your installed services are functioning correctly.
Polish translations available
The IBM Cloud Pak for Data control plane and select services are now available in Polish. For more information, see Language support.

Service enhancements

The following entries list the new features that are introduced for existing services in Cloud Pak for Data Version 5.0, along with the software version of each service and what each feature means for you.
Cloud Pak for Data common core services 9.0.0
This release of Common core services includes the following features:
Use data source definitions to manage and protect data that is accessed from connections
Data source definitions are a new type of asset that you define based on a connection or connected data asset's endpoints. When you create a data source definition, you can monitor where your data is stored across multiple projects, catalogs, or multi-node data sources. You can also apply the correct protection solution (enforcement engine) based on the data source definition.

For details, see Data protection with data source definitions.

Name change for the IBM Watson Query connection
The "IBM Watson Query" connection is renamed to "IBM Data Virtualization." Your previous settings for the connection remain the same. Only the connection name is changed.
New menu terms to open the Platform connections page
Previously the path to the Platform connections page in the navigation menu was Data > Platform connections. The new path is Data > Connectivity. The Connectivity page has a tab for Platform connections.
Access more data with new connectors
You can now work with data from these data sources:
Flight service supported by watsonx.data and Data Product Hub
You can now use the Flight service with watsonx.data and Data Product Hub to load data securely. To see a complete list of services that support the Flight service APIs, see Accessing data sources with the Flight service.

Version 9.0.0 of the common core services includes various fixes.

For details, see What's new and changed in the common core services.

If you install or upgrade a service that requires the common core services, the common core services will also be installed or upgraded.

Cloud Pak for Data scheduling service 1.25.0
This release of scheduling service includes the following features and updates:
Schedule workloads on remote physical locations
If you plan to extend your Cloud Pak for Data deployment with remote physical locations, you must install the scheduling service on the primary Cloud Pak for Data cluster and on the remote physical location. For more information, see:

Version 1.25.0 of the scheduling service includes various fixes.

For details, see What's new and changed in the scheduling service.

Related documentation:
AI Factsheets 5.0.0

Version 5.0.0 of the AI Factsheets service includes various fixes.

For details, see What's new and changed in AI Factsheets.

Related documentation:
AI Factsheets
Analytics Engine powered by Apache Spark 5.0.0
This release of Analytics Engine powered by Apache Spark includes the following features:
Use the Spark labs IDE to develop or debug your own applications
Now you can develop, debug, and run your own applications in the new Spark labs IDE, which is installed as a Visual Studio Code extension. For more information, see Running Spark applications interactively.
Instance credentials are now masked for better security
The credentials that are present in Spark configurations and environment variables are automatically masked to improve security. By default, all V4 APIs mask the secrets and credentials that are passed in Spark configurations and environment variables in the Instance and Application APIs. The change is visible on the Instance details page.
Updated GET API now returns all applications
The Get Applications List API for Analytics Engine now returns all applications by default to allow pagination. You can use the state query parameter and pagination parameters to filter the applications that the API returns.
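As a sketch of how such a request might be built, the following helper assembles the query string; the state parameter comes from the release notes, while the pagination parameter names (limit, start) and the URL path are assumptions for illustration:

```python
from urllib.parse import urlencode

def build_applications_query(base_url, state=None, limit=None, start=None):
    """Build a Get Applications List request URL.

    `state` is documented in the release notes; `limit` and `start`
    are hypothetical pagination parameter names."""
    params = {}
    if state:
        params["state"] = state
    if limit is not None:
        params["limit"] = limit
    if start is not None:
        params["start"] = start
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url

url = build_applications_query(
    "https://cpd.example.com/v4/analytics_engines/INSTANCE_ID/spark_applications",
    state="RUNNING",
    limit=20,
)
```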
Data retention feature for administrators
OCP administrators can now retain or delete a designated number of Spark applications and kernels that are associated with a specific Spark instance from the IBM Analytics Engine metastore.
Auto-scaling Spark workloads
You can now enable the auto-scaling feature for a Spark application by adding the configuration setting ae.spark.autoscale.enable=true to the existing application configuration.

A Spark application that has auto-scaling enabled can automatically determine the number of executors required by the application based on the application's usage.
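A minimal sketch of an application payload with auto-scaling enabled might look like the following; only the ae.spark.autoscale.enable key is taken from the release notes, and the surrounding payload shape is illustrative:

```python
# Illustrative application payload; the ae.spark.autoscale.enable
# configuration key is the documented setting, the rest is a sketch.
application = {
    "application_details": {
        "application": "/myapp/job.py",
        "conf": {
            "ae.spark.autoscale.enable": "true",
        },
    }
}
```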

Separate storage for shuffle data
You can now store shuffle data separately from the compute nodes. Separate storage allows for more efficient resource utilization. Data is stored in a separate shared volume or object store. For more information, see Running Spark applications interactively.

Version 5.0.0 of the Analytics Engine powered by Apache Spark service includes various fixes.

Related documentation:
Analytics Engine powered by Apache Spark
Cognos Analytics 26.0.0
This release of Cognos Analytics includes the following features and updates:
Integration with Planning Analytics

You can now create data server connections to Planning Analytics service instances that are running on Cloud Pak for Data. For details, see Support for Planning Analytics as a Service in the Cognos Analytics documentation.

Cognos Analytics uses CA certificates to connect

You can now use your company's CA certificates on Cloud Pak for Data to validate certificates from your internal servers and connect to Cognos Analytics. Previously, you had to manually copy the certificates to the artifacts shared volume before you could use them to connect to Cognos Analytics. For details, see Creating a secret to store shared custom certificates.

Audit logging

Cognos Analytics now has auditable events that are generated and forwarded by the Audit Logging Service to help you detect and prioritize security threats and data breaches. For details, see Audit events for Cognos Analytics.

New instance plan size

Cognos Analytics now has a new plan size, XSmall, which you can select when you create a service instance. For details, see Creating a service instance for Cognos Analytics.

Updated software version for Cognos Analytics
The 26.0.0 release of the service provides Version 12.0.3 of the Cognos Analytics software. For details, see Release 12.0.3 - New and changed features in the Cognos Analytics documentation.

Version 26.0.0 of the Cognos Analytics service includes various fixes.

Related documentation:
Cognos Analytics
Cognos Dashboards 5.0.0
This release of Cognos Dashboards includes the following features and updates:
Updated software version
This release of the service provides Version 12.0.3 of the Cognos Analytics dashboards software. For details, see Release 12.0.3 - Dashboards in the Cognos Analytics documentation.

Version 5.0.0 of the Cognos Dashboards service includes various fixes.

Related documentation:
Cognos Dashboards
Data Gate 6.0.0
This release of Data Gate includes the following features and updates:
Show the RUNSTAT timestamp for target tables in the web UI
In the Data Gate web UI, you can now view timestamps that show when the most recent RUNSTAT operation occurred for each target table. With this information, you can check to ensure that RUNSTAT is running as expected on the target tables.

Version 6.0.0 of the Data Gate service includes various fixes.

For details, see What's new and changed in Data Gate.

Related documentation:
Data Gate
Data Privacy 5.0.0
This release of Data Privacy includes the following features and updates:
Data protection rules no longer enforced in projects

Data protection rules are now only enforced either in governed catalogs or by a deep enforcement solution. A deep enforcement solution is a protection solution to enforce rules on data that is outside of Cloud Pak for Data when the data source is integrated with one of these services:

  • IBM Data Virtualization
  • IBM Security Guardium Data Protection
  • IBM watsonx.data

Assets that are added into projects from a governed catalog no longer have preview, download, or profiling restricted by data protection rules unless you configured a deep enforcement solution.

You will be reminded of the revised data protection rule enforcement protocols when you:
  • Create a data protection rule
  • Copy an asset from a governed catalog into a project
For details, see Revised protocol for enforcing data protection rules.
Defining a data source definition with a protection solution

A protection solution is a method of enforcing the data protection rules either in governed catalogs or by a deep enforcement solution.

To configure the platform with a deep enforcement solution, you can create a data source definition to set the data source type. The data source type determines which types of connections the data source definition can be associated with and your available protection solution options. For details, see Protection solutions for data source definition.
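The relationship between integrated services and available protection solutions can be sketched as follows; the three deep enforcement services are from this release, while the selection logic and names are illustrative assumptions:

```python
# The three deep enforcement services are documented in this release;
# the mapping logic itself is a hypothetical sketch.
DEEP_ENFORCEMENT_SERVICES = {
    "IBM Data Virtualization",
    "IBM Security Guardium Data Protection",
    "IBM watsonx.data",
}

def protection_solutions(integrated_services):
    """Return the protection solutions available to a data source definition."""
    # Enforcement in governed catalogs is always available on the platform.
    solutions = ["governed catalog"]
    for svc in integrated_services:
        if svc in DEEP_ENFORCEMENT_SERVICES:
            solutions.append(f"deep enforcement via {svc}")
    return solutions
```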

Tracking data protection rule enforcement decisions
You can now track enforcement decisions as audit events when the Send policy evaluations to audit logs check box is selected from the Manage rule settings page. For details, see Audit events for Data Privacy.

Version 5.0.0 of the Data Privacy service includes various fixes.

For details, see What's new and changed in Data Privacy.

Related documentation:
Data Privacy
Data Refinery 9.0.0

Version 9.0.0 of the Data Refinery service includes various fixes.

For details, see What's new and changed in Data Refinery.

Related documentation:
Data Refinery
Data Replication 5.0.0

Version 5.0.0 of the Data Replication service includes various fixes.

For details, see What's new and changed in Data Replication.

Related documentation:
Data Replication
DataStage 5.0.0
This release of DataStage includes the following features and updates:
Run DataStage jobs in multiple locations with a remote data plane

You can now deploy on a remote data plane to run DataStage jobs in multiple locations, including in different geographies or cloud providers, without creating multiple DataStage instances. For more information, see Deploying on a remote data plane.

Import and export selected asset types

You can now select specific asset types to import or export from a .zip file that contains DataStage assets. By default, all asset types are selected.

Set up metrics storage at the project level for your DataStage flows

You can now use the metrics repository to store metrics in a database. For more information, see Storing and persisting DataStage metrics.

Name changes for DataStage connections and connectors
  • "Apache Cassandra (optimized)" is now "Apache Cassandra for DataStage."
  • "IBM Db2 (optimized)" is now "IBM Db2 for DataStage."
  • "IBM Netezza Performance Server (optimized)" is now "IBM Netezza Performance Server for DataStage."
  • "IBM Watson Query" is now "IBM Data Virtualization."
  • "Oracle (optimized)" is now "Oracle Database for DataStage."
  • "Salesforce.com (optimized)" is now "Salesforce API for DataStage."
  • "Teradata (optimized)" is now "Teradata database for DataStage."

Your previous settings for the connections, connectors, and their associated jobs remain the same. Only the connection and connector names are changed.

Connect to more data sources in DataStage
You can now include data from these data sources in your DataStage flows:
  • IBM Planning Analytics
  • Microsoft Azure Databricks
  • MinIO
  • SAP BAPI

For the full list of connectors, see Supported data sources in DataStage.

Version 5.0.0 of the DataStage service includes various fixes.

For details, see What's new and changed in DataStage.

Related documentation:
DataStage
Data Virtualization 3.0.0
This release of Data Virtualization includes the following features and updates:
Watson Query is now Data Virtualization
The Watson Query service was renamed to Data Virtualization, and you will notice some changes in the user interface. The IBM Watson Query connector is also renamed to IBM Data Virtualization. Your previous settings for the connector remain the same. Only the connector name is changed.
Enforce data protection rules across Cloud Pak for Data
You can now use the new Cloud Pak for Data Data Source Definitions (DSD) to enforce IBM Knowledge Catalog data protection rules consistently across Cloud Pak for Data, regardless of whether you query the object through Data Virtualization or preview it in a catalog or project. A DSD is automatically created when you provision or upgrade your Data Virtualization instance to Cloud Pak for Data 5.0. For details, see Data protection with data source definitions. See also Governing virtual data with data protection rules in Data Virtualization.
New supported data source
REST API is now a supported data source in Data Virtualization.
    • REST API is a generic third-party data source that you access by using an API. This type of data source requires that you first create a Model file to map the API outputs to table structures in Data Virtualization.
    For details, see Supported data sources.
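As a rough illustration of what mapping API outputs to a table structure involves (the actual Model file format used by Data Virtualization is not shown here), consider this sketch:

```python
# Hypothetical mapping of a REST API response to a table structure.
# Table and column names are illustrative examples.
mapping = {
    "table": "CUSTOMERS",
    "columns": {
        "CUSTOMER_ID": "id",   # table column -> JSON field
        "FULL_NAME": "name",
    },
}

def rows_from_response(response_json, mapping):
    """Flatten API output records into rows that match the mapped columns."""
    cols = mapping["columns"]
    return [
        {col: record.get(field) for col, field in cols.items()}
        for record in response_json
    ]

sample = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
```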
Updates to supported data sources
  • Generic JDBC driver functionality now supports Databricks using the native driver.
  • Spark SQL is a third-party data source that has two authentication options to set a connection: username and password credentials or Kerberos authentication.
For details, see Supported data sources.
Pushdown enhancements to improve query performance
This release of Data Virtualization improves the performance of queries that use pushdown. Query pushdown is an optimization feature that reduces query times and memory use. Data Virtualization now includes the ability to:
    • Support OLAP functions when you connect to Oracle data sources. This support includes the functions MIN, MAX, SUM, COUNT, COUNT_BIG, ROW_NUMBER/ROWNUMBER, RANK, DENSERANK/DENSE_RANK, STDDEV_SAMP, PERCENTILE_CONT, PERCENTILE_DISC, and PERCENT_RANK when used in the query with the OLAP function specification. For details, see OLAP specification in the IBM Db2 documentation.
    • Push down common subexpressions to Oracle data sources.
    • Use pushdown for various other string functions, including CASTs, TRIM, BITAND, and others.
Query tables from previous Presto and Databricks catalogs with multiple catalog support
Virtual tables that you create from Presto and Databricks catalogs are now fully accessible. You can run queries on these tables regardless of any changes that you make to the catalog filters. This means that you do not need to switch back to previous Presto or Databricks catalogs to ensure the functionality of existing queries. For details on supported data sources, see Supported data sources in Data Virtualization.
Automatically scale Data Virtualization instances
You can now automatically scale Data Virtualization instances to support high-availability or increase processing capacity, rather than manually setting the size, CPU, and memory resource values after you provision instances. For details, see Scaling Data Virtualization.
Mask multibyte characters for enhanced privacy of sensitive data
You can now perform partial redaction and basic obfuscation of multibyte characters such as symbols, characters from non-Latin alphabets like Chinese or Arabic, and special characters that are used in mathematical notation. With the remaining masking methods, multibyte characters are masked with the character "X". For details, see Masking virtual data in Data Virtualization.
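As an illustration, partial redaction that operates on characters rather than bytes handles multibyte alphabets cleanly; the exact rules that the service applies may differ from this sketch:

```python
def partially_redact(value, keep=1, mask_char="X"):
    """Partially redact a string, keeping `keep` leading characters.

    Python strings operate on code points, so multibyte characters
    (for example, Chinese or Arabic) are masked one character at a
    time rather than byte by byte. Illustrative sketch only."""
    if len(value) <= keep:
        return mask_char * len(value)
    return value[:keep] + mask_char * (len(value) - keep)
```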
View the data protection rules that are applied to a user
You can now view details about the data protection rules that apply to a Data Virtualization object for a specific user by using the EXT_AUTHORIZER_EXPLAIN stored procedure. For details, see EXT_AUTHORIZER_EXPLAIN stored procedure.
Data Virtualization connections in catalogs now reference the platform connection
When you publish objects to a catalog, the Data Virtualization connections that are created from that publication now reference the main Data Virtualization connection in Platform connections. This means that information such as personal credentials only needs to be defined or updated one time in the Data Virtualization platform connection. All referenced connections now automatically reflect changes that are made to the main Data Virtualization connection.
Enhanced catalog visibility for Presto and Databricks
The Presto and Databricks web client now displays the name of the catalog that you selected in the breadcrumbs of the Explore view, and beside each schema name in the List view.
Enhanced security for profiling results in Data Virtualization views
To prevent unexpected exposure to value distributions through the profiling results of a view, all users are denied access to profiling results in Data Virtualization views in all catalogs and projects.

Version 3.0.0 of the Data Virtualization service includes various fixes.

For details, see What's new and changed in Data Virtualization.

Related documentation:
Data Virtualization
Db2 5.0.0

Version 5.0.0 of the Db2 service includes various fixes.

For details, see What's new and changed in Db2.

Related documentation:
Db2
Db2 Big SQL 7.7.0
This release of Db2 Big SQL includes the following features:
Query data in Microsoft Azure Data Lake Storage Gen2 data lakes
You can now connect to Microsoft Azure Data Lake Storage Gen2 data sources. For details, see Setting up a connection from Db2 Big SQL to a remote data source.
Improved process to secure Db2 Big SQL instances
Capabilities that are provided by the Cloud Pak for Data platform now make it easier for you to enable TLS and connect Db2 Big SQL instances to TLS-enabled Hadoop clusters. For details, see Connecting to a TLS (SSL)-enabled Hadoop cluster.

Version 7.7.0 of the Db2 Big SQL service includes various fixes.

Related documentation:
Db2 Big SQL
Db2 Data Management Console 5.0.0

Version 5.0.0 of the Db2 Data Management Console service includes various fixes.

For details, see What's new and changed in Db2 Data Management Console.

Related documentation:
Db2 Data Management Console
Db2 Warehouse 5.0.0

Version 5.0.0 of the Db2 Warehouse service includes various fixes.

For details, see What's new and changed in Db2 Warehouse.

Related documentation:
Db2 Warehouse
Decision Optimization 9.0.0
This release of Decision Optimization includes the following features:
Easier table selection and configuration options when saving Decision Optimization models for deployment
When you save a model for deployment from the Decision Optimization user interface, you can now review the input and output schema, and more easily select the tables that you want to include. You can also add, modify, or delete run configuration parameters, and review the environment and the model files that are used.


For more information, see Deploying a Decision Optimization model by using the user interface.

Download intermediate solution statistics for Decision Optimization
If you choose to display intermediate solutions in your run configuration, you can now download the statistics when a Decision Optimization solve is completed. You can view these statistics locally and compare them with other model solutions. You can also view the last 3 intermediate solution KPIs in the Explore solution view.

For more information, see Intermediate solutions in a Decision Optimization experiment.

Use pivot tables to display data aggregated in Decision Optimization experiments
You can now use pivot tables to display both input and output data aggregated in the Visualization view in Decision Optimization experiments.

For more information, see Pivot table widgets

Python 3.11 is now available
In addition to Python 3.10, you can now use Python 3.11 in your Decision Optimization environment to run and deploy Decision Optimization models that are formulated in DOcplex in Decision Optimization experiments. Modeling Assistant models also use Python because DOcplex code is generated when models are run or deployed.
You can update your Python version as follows:

Version 9.0.0 of the Decision Optimization service includes various fixes and updates.

For details, see What's new and changed in Decision Optimization.

Related documentation:
Decision Optimization
EDB Postgres 12.18, 13.14, 14.11, 15.6, 16.2

This release of the EDB Postgres service includes various fixes.

For details, see What's new and changed in EDB Postgres.

Related documentation:
EDB Postgres
Execution Engine for Apache Hadoop 5.0.0

Version 5.0.0 of the Execution Engine for Apache Hadoop service includes various fixes.

For details, see What's new and changed in Execution Engine for Apache Hadoop.

Related documentation:
Execution Engine for Apache Hadoop
IBM Knowledge Catalog 5.0.0
This release of IBM Knowledge Catalog includes the following features and updates:
Additional IBM Knowledge Catalog editions
You can continue to use the classic IBM Knowledge Catalog service, or you can choose one of the two new, separately priced editions of IBM Knowledge Catalog:
IBM Knowledge Catalog Standard Cartridge
This edition offers basic governance tooling for cataloging and AI-augmented data enrichment.
IBM Knowledge Catalog Premium Cartridge
This edition offers the full governance framework with data privacy, data quality, cataloging, and enrichment across the data lifecycle with a generative AI layer for enhanced data enrichment.
In addition to governance capabilities as in the classic IBM Knowledge Catalog service, the cartridges provide semantic and AI-augmented data enrichment:
  • Recommend descriptive names for data assets and columns based on the collected metadata and a predefined glossary.
  • Suggest and assign semantic descriptions for data assets and columns that are easy to understand. The descriptions are generated based on the surrounding columns and the context of the data assets.
  • Generate semantic term assignments for data assets and columns.

For details, see IBM Knowledge Catalog.

Import metadata from additional data sources
You can now import metadata and lineage metadata from the following data sources:
MicroStrategy
Use a new connection to import data. For details, see Supported data sources for metadata import, metadata enrichment, and data quality rules.
OpenLineage
Import the data from a .zip file. For details, see Importing ETL jobs and Getting ETL job lineage.
Data quality enhancements
You can now add data assets or columns to any type of data quality rule with the new relationship type Validates data quality of, so that the quality score and any data quality issues for these items are reported on the Data quality page. With this enhancement, data quality rules with externally managed bindings and SQL-based data quality rules can now also contribute to the quality scores of assets and columns.

For details, see Creating rules from data quality definitions and Creating SQL-based data quality rules.

Data protection rules are no longer enforced in projects
Data protection rules are now only enforced in governed catalogs or by a deep enforcement solution. Assets that are added into projects from a governed catalog no longer have preview, download, or profiling restricted by data protection rules. For more information, see Data protection rules no longer enforced in projects.
Enhanced project list view in catalogs
Now, when you are adding assets from a catalog to a project, you can view more than 100 projects in your project list page and add up to 50 assets at a time to your project. For more information, see Add assets from within the catalog.
Enhancements in governance artifacts
  • You can now make changes to multiple governance artifacts at once. Bulk edits are available when updating tags and stewards. For more information, see Managing governance artifacts.
  • Now you can move any category either to the top level or to any other category as a subcategory. The collaborators are also moved, provided that they have the required permissions on the new parent category. For more information, see Managing categories.
  • You can now add custom properties and relationships for reference data sets. For more information, see Designing reference data sets.
  • Notifications about changes in governance artifacts, for example, when an artifact is added, updated, or deleted, can now be forwarded to external applications or users. For more information, see Forwarding notifications generated by Cloud Pak for Data services.
Knowledge Accelerators
Additional data classes

More than 20 new data classes can be used to identify and classify national identifiers, tax identifiers, and social security identifiers for the additional jurisdictions of Argentina, Egypt, Finland, Greece, Hong Kong, Ireland, Malaysia, New Zealand, Pakistan, Peru, Romania, Thailand, Turkey, and the United Arab Emirates.

These new data classes supplement previously added data classes to provide an enhanced framework for identifying and classifying data of particular relevance to data privacy.

For more information, see Knowledge Accelerators data classes.

Updated business scopes for Relationship Explorer

The Knowledge Accelerators contain a set of predefined business scopes that group the set of business terms that are relevant to a specific business topic. Many of these scopes were reorganized to ensure that they are optimized for viewing in the new Relationship Explorer capability of IBM Knowledge Catalog. Also, new business scopes were added to Financial Services.

In addition, certain term-to-term relationships across the Knowledge Accelerators were simplified to improve clarity when viewing them in Relationship Explorer.

For more information, see Business scopes for Knowledge Accelerators.

Relationship Explorer to visualize your metadata
Relationship Explorer is now available to help you better understand your data. This new feature helps you visualize, explore, and govern your metadata. Discover how your governance artifacts and data assets relate to each other in a single view. For more information, see Relationship Explorer.
Expand DataStage jobs in the lineage graph
When you are viewing a DataStage job in the lineage graph, you can expand the job to view all its stages. For more information, see Lineage.
Enhanced security for profiling results in Data Virtualization and watsonx.data views
To prevent unexpected exposure to value distributions through the profiling results of a view, all users are denied access to profiling results in Data Virtualization and watsonx.data views in all catalogs and projects.

Version 5.0.0 of the IBM Knowledge Catalog service includes various fixes.

For details, see What's new and changed in IBM Knowledge Catalog.

Related documentation:
IBM Knowledge Catalog
IBM Match 360 4.0.23
This release of IBM Match 360 includes the following features and updates:
Use mapping patterns to avoid manually mapping new assets

Now IBM Match 360 alerts you when a new data asset is similar to an existing data asset that is already mapped to your data model. You can save time and avoid manual mapping by using a mapping pattern to map new data assets that share the same structure as an existing, mapped asset. Mapping patterns are automatically created from the mapped data assets in your system. You can manage and apply mapping patterns within your configuration snapshots.

Figure 1. Using mapping patterns to quickly map assets
The Apply mapping patterns screen lets you review and apply mapping selections from existing data assets.

For information about applying a mapping pattern to a data asset, see Adding data and mapping it to your data model. For information about managing mapping patterns and configuration snapshots, see Saving and loading master data configuration snapshots.

Version 4.0.23 of the IBM Match 360 service includes various fixes.

For details, see What's new and changed in IBM Match 360.

Related documentation:
IBM Match 360 with Watson
Informix 8.0.0

Version 8.0.0 of the Informix service includes various fixes.

For details, see Fix list for Informix Server 14.10.xC6 release.

Related documentation:
Informix
MANTA Automated Data Lineage 42.5.4

Version 42.5.4 of the MANTA Automated Data Lineage service includes various fixes.

For details, see What's new and changed in MANTA Automated Data Lineage.

Related documentation:
MANTA Automated Data Lineage
MongoDB 5.0.23-ent, 6.0.12-ent

This release of the MongoDB service includes various fixes.

For details, see What's new and changed in MongoDB.

Related documentation:
MongoDB
OpenPages 9.002.1
This release of OpenPages includes the following features:
Oracle as an external database

Now you can use Oracle as an external database with OpenPages when you do a fresh installation of OpenPages.

For details about using Oracle as an external database, see Setting up an external Oracle database for OpenPages.

Version 9.002.1 of the OpenPages service includes various fixes.

For details, see What's new and changed in OpenPages.

Related documentation:
OpenPages
Orchestration Pipelines 5.0.0
This release of Orchestration Pipelines includes the following features:
IBM Watson Pipelines is now IBM Orchestration Pipelines

The new service name reflects the capabilities for orchestrating parts of the AI lifecycle into repeatable flows.

Migrate DataStage dependencies

With one click in the toolbar, you can now download or upload projects that contain DataStage dependencies. See Running and saving pipelines.

Expanded toolkit for annotation styling
You can now apply more style and formatting options to your pipeline comments and annotations. You can specify font, text color, formatting, and more with HTML or CSS styling. You can also use more Markdown attributes such as marked text. HTML or CSS styled annotations in Orchestration Pipelines flows are preserved when exported, imported, or migrated from a DataStage flow. See Getting started with Orchestration Pipelines for more details.
Recognize text as file paths
You can now enter a valid line of text as a file path for a pipeline parameter for increased versatility. See Configuring global objects for Orchestration Pipelines.
Create visual impact by resizing nodes
You can now increase or decrease heights or widths of Orchestration Pipelines nodes by dragging the corners with your mouse. See Getting started with Orchestration Pipelines for more details.
Better visualization of conditions with color-coded links

Links now support custom color-coding so that you can view the status of all nodes with improved visual organization. See Creating a pipeline.

Easily merge links between nodes
When you delete a node, the links to its preceding and following nodes are automatically merged or deleted. See Creating a pipeline.
Pipeline assets available to add to folders
Orchestration Pipelines flows are now available as assets that can be added to folders in a Watson Studio project for better organization and access. Folders are in beta and are not yet supported for use in production environments. For more information, see Organizing assets with folders (beta).

Version 5.0.0 of the Orchestration Pipelines service includes various fixes.

Related documentation:
Orchestration Pipelines
Planning Analytics 5.0.0
This release of Planning Analytics includes the following features and updates:
Integration with Cognos Analytics

You can now create data server connections from Cognos Analytics service instances that are running on Cloud Pak for Data. For details, see Support for Planning Analytics as a Service in the Cognos Analytics documentation.

Region and zone selection

You can now select regions and zones for your Planning Analytics service instances to specify the location and distribution of your nodes. For details, see Selecting regions and zones for Planning Analytics service instances.

Updated versions of Planning Analytics software
The 5.0.0 release of the service provides the following software versions:
  • TM1 Version 2.1.1

    For details about this version of the software, see Planning Analytics 2.1.1 in the Planning Analytics documentation.

  • Planning Analytics Workspace Version 2.0.96

    For details about this version of the software, see 2.0.96 - What's new in the Planning Analytics Workspace documentation.

  • Planning Analytics Spreadsheet Services Version 2.0.96

    For details about this version of the software, see 2.0.96 - Feature updates in the TM1 Web documentation.

  • Planning Analytics for Microsoft Excel Version 2.0.96

    For details about this version of the software, see 2.0.96 - Feature updates in the Planning Analytics for Microsoft Excel documentation.

  • Planning Analytics Engine Version 12.3.13

    For details about this version of the software, see What's new in Planning Analytics Engine in the Planning Analytics Engine documentation.

Version 5.0.0 of the Planning Analytics service includes various fixes.

Related documentation:
Planning Analytics
Product Master 6.0.0
This release of Product Master includes the following features and updates:
Single sign-on for Identity Access Management (IAM) authentications
You can now use single sign-on to log in to the Product Master application by using an LDAP or AD federation user in the IAM service.

For more information, see Enabling single sign-on.

Use an external MongoDB instance
To use the Digital Asset Management (DAM) or Machine learning (ML) features of the Product Master service, you must now install MongoDB.

For more information, see Installing MongoDB.

Version 6.0.0 of the Product Master service includes various fixes.

For details, see What's new and changed in Product Master.

Related documentation:
Product Master
RStudio® Server Runtimes 9.0.0
This release of RStudio Server Runtimes includes the following features:
New Runtime 24.1 for R
You can now use Runtime 24.1, which includes the latest data science frameworks on R 4.3, to run your code in RStudio. For details on all the available environments, see RStudio environments.
RStudio with Runtime 23.1 on R 4.2 is supported on IBM Power
The RStudio with Runtime 23.1 on R 4.2 runtime is now supported on the IBM Power (ppc64le) platform. For more information, see RStudio environments.

Version 9.0.0 of the RStudio Server Runtimes service includes various fixes.

For details, see What's new and changed in RStudio Server Runtimes.

Related documentation:
RStudio Server Runtimes
SPSS Modeler 9.0.0
This release of SPSS Modeler includes the following features and updates:
SQL pushback is now available for Google BigQuery

You can now use SQL pushback to improve performance when importing data from Google BigQuery.

For more information, see Supported data sources for SPSS Modeler.

Connect to watsonx.data using Presto

You can now use the Presto connector to connect to a Presto engine within IBM watsonx.data.

For more information, see Presto connection.

Connect to more data sources

You can now connect to Dremio and IBM Data Virtualization from SPSS Modeler.

For more information, see Supported data sources for SPSS Modeler.

Define splitting points for decision trees in CHAID nodes

You can now customize the properties for the CHAID node to specify the fields that the CHAID algorithm must choose from when it determines where to split the decision tree. Specifying fields can control how the decision tree grows by reducing the number of possible splitting points present in the data. For more information about the CHAID node, see CHAID node.

You can also set the properties for the CHAID node by using Python scripts in SPSS Modeler or the scripting API for SPSS Modeler. For more information about the node parameters, see chaidnode properties.

Pick actions for nodes from the new context toolbar

A new context toolbar appears when you hover over a node. It shows the most commonly used actions specific to each type of node, such as graph nodes, import nodes, and modeling nodes. More actions are available from the overflow menu.

Script diagnostic messages

You can now create a script that uses the new report API method to generate notifications for error, warning, and information messages about SPSS Modeler flows. These notifications appear in Messages as well as the Run history.

For more information, see Error reporting for flows.

Version 9.0.0 of the SPSS Modeler service includes various fixes.

For details, see What's new and changed in SPSS Modeler.

Related documentation:
SPSS Modeler
Synthetic Data Generator 9.0.0
This release of Synthetic Data Generator includes the following features and updates:
Use differential privacy to protect user data

You can now protect user data from being traced back to individual users in Synthetic Data Generator.

The parameters that control this protection are collectively known as the privacy budget.

For more information, see Using differential privacy.
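The general idea behind a privacy budget can be sketched with the classic Laplace mechanism: noise scaled by sensitivity/epsilon is added to a value, so a smaller epsilon (a tighter budget) means more noise and stronger privacy. This is a generic textbook illustration, not the actual Synthetic Data Generator implementation.

```python
# Textbook Laplace mechanism sketch; NOT the Synthetic Data Generator
# implementation. A smaller epsilon (tighter privacy budget) produces
# a larger noise scale and therefore stronger privacy.
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Return true_value plus Laplace(0, sensitivity/epsilon) noise."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5  # Uniform(-0.5, 0.5)
    # Inverse-CDF sampling of the Laplace distribution.
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_value + noise

rng = random.Random(42)
noisy = laplace_mechanism(100.0, sensitivity=1.0, epsilon=0.5, rng=rng)
```

Releasing the noisy value instead of the true value is what prevents an individual record from being traced back through the synthetic output.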

Connect to watsonx.data using Presto

You can now use the Presto connector to connect to a Presto engine within IBM watsonx.data.

For more information, see Presto connection.

Connect to more data sources

You can now connect to Dremio and IBM Data Virtualization from Synthetic Data Generator.

For more information, see Creating synthetic data from imported data.

Version 9.0.0 of the Synthetic Data Generator service includes various fixes.

Related documentation:
Synthetic Data Generator
Voice Gateway 1.4.0

Version 1.4.0 of the Voice Gateway service includes various fixes.

For details, see What's new and changed in Voice Gateway.

Related documentation:
Voice Gateway
Watson Discovery 5.0.0
This release of Watson Discovery includes the following features and updates:
Quickly understand the extracted data with the new Intelligent Document Processing (IDP) project type

You can now use the new IDP project type to quickly understand what data is extracted from your documents in a rich document preview. If the extracted data does not meet your requirements, you can also apply enrichments to improve the data. For details, see Creating projects.

You can now use an API endpoint to send a webhook event for documents ingested in Watson Discovery

The Create collection and Update collection APIs can now send a webhook event to an external application when the status of ingested documents becomes available or failed. The webhook event allows you to take the next relevant action on your documents, without getting the document status first with the Get document details API. For details, see Document status webhook.
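The resulting event-driven flow can be sketched as a small handler that reacts to the document status in the webhook payload. This sketch is illustrative only: the payload field names ("data", "status") and the returned action names are assumptions, not the documented event schema.

```python
# Hypothetical handler for a Watson Discovery document-status webhook
# event. The payload shape ("data" -> "status") is an assumption for
# illustration; check the Document status webhook documentation for
# the actual event schema.

def handle_discovery_event(payload: dict) -> str:
    """Decide the next action for an ingested document."""
    status = payload.get("data", {}).get("status")
    if status == "available":
        # Document is ready: for example, start querying it.
        return "query"
    if status == "failed":
        # Ingestion failed: for example, re-upload or alert an operator.
        return "retry"
    return "ignore"

print(handle_discovery_event({"data": {"status": "available"}}))  # query
```

The point of the webhook is that this decision happens when the event arrives, with no polling of the Get document details API.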

You can now use an API endpoint to annotate documents with a model of your choice

The Create an enrichment API can now connect to an external enrichment application by using a webhook. The new API allows you to use a model of your choice to annotate documents in Watson Discovery. Through a webhook interface, you can use custom models, advanced foundation models, or other third-party models to enrich your documents in a collection. For details, see External enrichment.

Watson Discovery no longer requires increasing the process ID limit

Starting in Cloud Pak for Data 5.0.0, you do not have to increase the process ID limit on your Red Hat OpenShift Container Platform environment for using Watson Discovery.

Version 5.0.0 of the Watson Discovery service includes various fixes.

For details, see What's new and changed in Watson Discovery.

Related documentation:
Watson Discovery
Watson Machine Learning 5.0.0
This release of Watson Machine Learning includes the following features:
Forecast more steps with an AutoAI time series model
You can now increase the prediction horizon for a time series model created with AutoAI. For example, if your model forecasts weather, you can now predict more steps, such as hours or days, with your model. For more information, see Scoring a time series model.
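What a larger prediction horizon means can be sketched with recursive forecasting: each predicted step is fed back as input for the next step. AutoAI handles this internally when you score a model with a longer horizon; the "model" below is a toy moving average, not an AutoAI pipeline.

```python
# Toy sketch of a multi-step prediction horizon. The moving-average
# "model" is a stand-in for illustration, not an AutoAI time series
# pipeline.

def forecast(history, horizon, window=3):
    series = list(history)
    predictions = []
    for _ in range(horizon):
        next_value = sum(series[-window:]) / window  # toy model step
        predictions.append(next_value)
        series.append(next_value)  # feed the prediction back in
    return predictions

print(forecast([18.0, 20.0, 22.0], horizon=2))
```

Increasing `horizon` from 1 to, say, 24 is the toy analogue of asking the model for more hours or days of forecasts.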
Presto is now available as a data connection for AutoAI models
You can now connect to Presto as a data source for training an AutoAI experiment when deploying an AutoAI model. For more information, see AutoAI overview.
Deploy assets with Runtime 24.1
You can now create assets that use software specifications that are compatible with IBM Runtime 24.1. For more information, see Frameworks and software specifications.
Deploy traditional and generative AI assets with the watsonx.ai Python client library
The Watson Machine Learning Python client library is now part of an expanded library, the watsonx.ai Python client library. Use the watsonx.ai Python client library to work with traditional machine learning and generative AI assets. The Watson Machine Learning library will persist but will not be updated with new features. For more information, see Deploying AI assets programmatically.

Version 5.0.0 of the Watson Machine Learning service includes various fixes.

Related documentation:
Watson Machine Learning
Watson Machine Learning Accelerator 5.0.0
This release of Watson Machine Learning Accelerator includes the following features:
New deep learning libraries
You can now use the following deep learning libraries with Watson Machine Learning Accelerator:
  • Python 3.11.5
  • PyTorch 2.1.2
  • TensorFlow 2.14.1
  • NVIDIA CUDA Toolkit 12.2.0

If you have existing models, update and test your models to use the latest supported frameworks. For more information, see Supported deep learning frameworks in the Watson Machine Learning Accelerator documentation.

New NVIDIA GPU Operator version
You can now use the following version of the NVIDIA GPU Operator with Watson Machine Learning Accelerator:
  • Version 24.3.0

Version 5.0.0 of the Watson Machine Learning Accelerator service includes various fixes.

Related documentation:
Watson Machine Learning Accelerator
Watson OpenScale 5.0.0
This release of Watson OpenScale includes the following features and updates:
New quality metric for binary classification models
You can now configure the Gini coefficient metric when you run quality evaluations for binary classification models. The Gini coefficient metric measures the inequality of model distributions.

For more information, see Quality evaluations.
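The metric can be sketched from its common definition, Gini = 2 × AUC − 1, where AUC is the area under the ROC curve. This mirrors the standard formula; consult the Quality evaluations documentation for how Watson OpenScale computes and thresholds the metric.

```python
# Sketch of the Gini coefficient for a binary classifier via the
# common relation Gini = 2 * AUC - 1. Illustrative only; see the
# Quality evaluations documentation for the service's computation.

def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) formulation."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def gini(labels, scores):
    return 2 * auc(labels, scores) - 1

print(gini([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.5
```

A Gini of 1 corresponds to a perfect ranking of positives over negatives, and 0 corresponds to random ranking.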

Version 5.0.0 of the Watson OpenScale service includes various fixes.

Related documentation:
Watson OpenScale
Watson Speech services 5.0.0
This release of Watson Speech services includes the following features and updates:
Acoustic model component scaling
By default, the AM-patcher microservice remains scaled down when no training is in progress, to optimize cluster resource usage and allocation. You can now use the speech-cr custom resource file to scale the component into different sizes: small, medium, or large. See how to update that Custom Resource (CR) file at Sizing for acoustic model training.

Version 5.0.0 of the Watson Speech to Text service includes various fixes.

For details, see What's new and changed in Watson Speech to Text.

Related documentation:
Watson Speech services
Watson Studio 9.0.0
This release of Watson Studio includes the following features:
Tag projects for easy retrieval

You can now assign tags to projects to make them easier to group or retrieve. Assign tags when you create a new project or from the list of all projects. Filter the list of projects by tag to retrieve a related set of projects. For more information, see Creating a project.

Version 9.0.0 of the Watson Studio service includes various fixes.

For details, see What's new and changed in Watson Studio.

Related documentation:
Watson Studio
Watson Studio Runtimes 9.0.0
This release of Watson Studio Runtimes includes the following features:
Runtime 24.1 is now available for use with Python and R
You can now use Runtime 24.1, which includes the latest data science frameworks on Python 3.11 and on R 4.3, to run your code in Watson Studio Jupyter notebooks and in RStudio. For more information about the available environments, see Environments.
A new version of Jupyter notebooks editor is now available
If you're running your notebook in environments that are based on Runtime 23.1 and 24.1, you can now:
  • Automatically debug your code
  • Automatically generate a table of contents for your notebook
  • Toggle line numbers next to your code
  • Collapse cell contents and use side-by-side view for code and output, for enhanced productivity
For more information, see Jupyter notebook editor.
Runtime 23.1 with R is now available on Power platform
Runtime 23.1 with R is now available in RStudio on the Power platform. For more information, see RStudio environments.

Version 9.0.0 of the Watson Studio Runtimes service includes various fixes.

For details, see What's new and changed in Watson Studio Runtimes.

Related documentation:
Watson Studio Runtimes
watsonx.ai 9.0.0
This release of watsonx.ai includes the following features:
Red Hat OpenShift AI is now a prerequisite for watsonx.ai

Watsonx.ai now requires Red Hat OpenShift AI to be installed as a prerequisite foundation layer on the cluster. Red Hat OpenShift AI provides enhanced support for serving generative AI models and improves the efficiency of prompt tuning.

IBM text embedding support for enhanced text matching and retrieval
You can now use the IBM text embeddings API and IBM embedding models for transforming input text into vectors to more accurately compare and retrieve similar text. You can use the following IBM Slate embedding models:
slate-125m-english-rtrvr
A foundation model provided by IBM that generates embeddings for various inputs such as queries, passages, or documents. The training objective is to maximize cosine similarity between a query and a passage.
slate-30m-english-rtrvr
A foundation model provided by IBM that is trained to maximize the cosine similarity between two text inputs so that embeddings can be evaluated based on similarity later. The slate-30m-english-rtrvr model is a distilled version of the slate-125m-english-rtrvr model.

For details, see Text embedding generation.
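The comparison step that these retriever models are trained for can be sketched as cosine similarity between a query vector and passage vectors. The tiny hand-made vectors below are stand-ins for the model's real embedding output, which has far more dimensions.

```python
# Sketch of comparing text embeddings by cosine similarity, the
# quantity that slate retriever models are trained to maximize for
# matching query/passage pairs. Vectors are illustrative stand-ins.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [0.2, 0.9, 0.1]
passages = {"passage_a": [0.1, 0.8, 0.2], "passage_b": [0.9, 0.1, 0.0]}
best = max(passages, key=lambda p: cosine_similarity(query, passages[p]))
print(best)  # passage_a
```

Ranking passages by this score is the basis of similarity search and retrieval over embedded text.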

Use training data from connected data sources in Tuning Studio
You can now train your foundation models in Tuning Studio by importing training data from a separate data source by using a data connection asset. You can use the following data connection types:
  • Presto connection
  • IBM watsonx.data connection
  • IBM Cloud Object Storage connection
For details, see Data formats for tuning foundation models.
Work with new foundation models in Prompt Lab
You can now use the following foundation models for inferencing from the Prompt Lab in watsonx.ai:
allam-1-13b-instruct
A bilingual large language model for Arabic and English provided by the National Center for Artificial Intelligence and supported by the Saudi Authority for Data and Artificial Intelligence. You can use the allam-1-13b-instruct foundation model for general purpose tasks in the Arabic language, such as classification, extraction, question-answering, and for language translation between Arabic and English.
granite-7b-lab
A foundation model from the IBM Granite family that is tuned with a novel alignment tuning method from IBM Research.
llama-3-8b-instruct
An accessible, open large language model provided by Meta that contains 8 billion parameters and is instruction fine-tuned to support various use cases.
llama-3-70b-instruct
An accessible, open large language model provided by Meta that contains 70 billion parameters and is instruction fine-tuned to support various use cases.
merlinite-7b
A foundation model provided by Mistral AI and tuned by IBM. The merlinite-7b foundation model is a derivative of the Mistral-7B-v0.1 model that is tuned with a novel alignment tuning method from IBM Research.
mixtral-8x7b-instruct-v01
A foundation model that is a pre-trained generative sparse mixture-of-experts network provided by Mistral AI.
For details, see Supported foundation models.
Work with InstructLab foundation models in Prompt Lab
InstructLab is an open-source initiative by Red Hat and IBM that provides a platform for augmenting the capabilities of a foundation model. The following foundation models support knowledge and skills that are contributed from InstructLab:
  • New models:
    • granite-7b-lab
    • merlinite-7b
  • Existing models:
    • granite-13b-chat-v2
    • granite-20b-multilingual
For details, see InstructLab-compatible foundation models.
Create detached deployments for external prompt templates
You can now deploy a prompt template for an LLM hosted by a third-party provider, such as Google Vertex AI, Azure OpenAI, or AWS Bedrock. Use the deployment to explore evaluations for the output generated by the detached prompt template. You can also track the detached deployment and detached prompt template in an AI use case as part of your governance solution. See Creating a detached deployment for an external prompt.
Use the Node.js SDK to add generative AI function to your applications
This beta release of the Node.js SDK helps you to do many generative AI tasks programmatically, including inferencing foundation models. For more information, see Node.js SDK.

Version 9.0.0 of the watsonx.ai service includes various fixes.

For details, see What's new and changed in watsonx.ai.

Related documentation:
watsonx.ai
watsonx Assistant 5.0.0
This release of watsonx Assistant includes the following features:
Conversational search
The new conversational search feature has a built-in retrieval-augmented generation (RAG) solution that helps your watsonx Assistant extract an answer from the highest-ranked query results and return a text response to the user. For more information, see Conversational search in the watsonx Assistant documentation.
Integration of Elasticsearch to the search feature
You can now integrate Elasticsearch with the search feature in your watsonx Assistant. With Elasticsearch, your watsonx Assistant can perform different types of searches, such as metric, structured, unstructured, and semantic searches, with higher accuracy and relevance by making use of enterprise content. The data analytics engine in Elasticsearch expands the scope of search integration to larger data sets in watsonx Assistant. For more information about Elasticsearch search integration, see Elasticsearch search integration setup in the watsonx Assistant documentation.
Behavioral tuning for conversational search
You can now optimize your conversational search behavior with the Tendency to say “I don’t know” option in the conversational search settings. This option can help to reduce Large Language Model (LLM) hallucinations and provide higher fidelity answers for conversational search by tuning your assistant's tendency to fall back to the “I don’t know” answer. For more information, see Behavioral tuning in the watsonx Assistant documentation.
Streaming response for conversational search
You can now use streaming response in your watsonx Assistant for conversational search. With the help of watsonx.ai capabilities, streaming response can provide continuous and real-time responses. For more information, see Streaming response in the watsonx Assistant documentation.
Overwrite all or skip all when you copy actions to another watsonx Assistant
You can now choose to overwrite all references or skip all references when you copy actions from one watsonx Assistant into another. For more information, see Copying an action to another assistant in the watsonx Assistant documentation.
Add a custom result filter for the Watson Discovery search integration
You can now filter your search result in the Watson Discovery search integration by adding custom text strings in the Custom result filter field in Search integration. For more information, see Configure the search for Watson Discovery in the watsonx Assistant documentation.
Configure search routing
You can configure the search routing for your watsonx Assistant when no matches are available for the customer response. For more information, see Configuring the search routing when no action matches in the watsonx Assistant documentation.
Conversational skills
You can now use conversational skills in your watsonx Assistant to begin tasks or workflows. Register a pro-code conversational skill provider on your watsonx Assistant instance, and then begin building skill-backed actions to fit your use cases. For more information, see the Conversational skills API documentation.
Service monitors
Your watsonx Assistant can now use service monitors to monitor the health of your watsonx Assistant instances. For more information, see Installing service monitors.

Version 5.0.0 of the watsonx Assistant service includes various security fixes.

For details, see What's new and changed in watsonx Assistant.

Related documentation:
watsonx Assistant
watsonx.data 2.0.0
This release of watsonx.data includes the following features and updates:
Azure Data Lake Storage Gen2 (ADLS), Azure Blob and Google Cloud Storage
You can now use the following storage types:
  • You can now add Azure Blob, Azure Data Lake Storage Gen2 (ADLS), and Google Cloud Storage to watsonx.data.
  • You can now use Azure Data Lake Storage (ADLS) Gen1 and Gen2 to store your data while submitting Spark applications.

For more information, see Adding a storage-catalog pair.

New Arrow Flight service based data sources

You can now use the following data sources with Arrow Flight service:

  • Greenplum
  • Salesforce
  • MariaDB
  • Apache Derby

For more information, see Arrow Flight service.

New data sources

You can now use the following data sources:

  • Cassandra
  • BigQuery
  • ClickHouse
  • Apache Pinot

For more information, see Adding a database-catalog pair.

New page for Bring Your Own JAR (BYOJ) process for SAP HANA data source
You can now use the new Driver manager section on the Configurations page to manage drivers for the SAP HANA data source. Each driver undergoes a series of validation checks.

For more information, see SAP HANA.

Apache Ranger policies
IBM watsonx.data now supports Apache Ranger policies to allow integration with multiple governance tools and engines.

For more information, see Apache Ranger policies.

Provision Spark as a native engine
In addition to registering external Spark engines, you can now provision a native Spark engine in watsonx.data. With the native Spark engine, you can manage the Spark engine configuration, manage access to Spark engines, and view applications by using REST API endpoints from watsonx.data.

For more information, see Native Spark engine.

Query Optimizer to improve query performance
You can now use Query Optimizer to improve the performance of queries that are processed by the Presto (C++) engine. If Query Optimizer determines that optimization is feasible, the query is rewritten; otherwise, the native engine optimization takes precedence.

For more information, see Query Optimizer overview.

New name for Presto engine in watsonx.data
Presto is renamed to Presto (Java).
New engine (Presto C++) in watsonx.data
You can provision a Presto (C++) engine (version 0.286) in watsonx.data to run SQL queries on your data source and fetch the queried data.

For more information, see Presto (C++) overview.

API Customization feature
You can now use catalog and engine API Customization for Presto (Java) and Presto (C++) engines in watsonx.data.

For more information, see IBM API docs.

Mixed case feature flag for Presto (Java) engine
The mixed case feature flag, which lets you switch between case-sensitive and case-insensitive behavior in Presto (Java), is now available. The flag is set to OFF by default and can be set to ON during the deployment of watsonx.data.

For more information, see Presto mixed-case behavior.

Using CAS proxy to access S3 and S3 compatible buckets
External applications and query engines can access the S3 and S3 compatible buckets that are managed by watsonx.data through the CAS proxy.

For more information, see Using CAS proxy to access S3 and S3 compatible buckets.

Semantic automation for data enrichment
Semantic automation for data enrichment uses generative AI with IBM Knowledge Catalog to understand your data on a deeper level and enhance it with automated enrichment, making it more valuable for analysis.

For more information, see Semantic automation for data enrichment in watsonx.data.

Version 2.0.0 of the watsonx.data service includes various fixes.

Related documentation:
watsonx.data
watsonx.governance 2.0.0
This release of watsonx.governance includes the following features:
Assess use cases for EU AI Act applicability
By using the new EU AI Act applicability assessment, you can complete a simple questionnaire to assess your AI use cases and determine whether they are within the scope of the EU AI Act. The assessment can also help you to identify the risk category that your use cases align to: prohibited, high, limited, or minimal.
For more information, see Applicability Assessment in Solution components in Governance Console.
Create detached deployments for governing prompts for externally hosted large language models (LLMs)
A detached prompt template is a new asset for evaluating a prompt template for an LLM that is hosted by a third-party provider, such as Google Vertex AI, Azure OpenAI, or AWS Bedrock. The inferencing that generates the output for the prompt template is done on the remote model, but you can evaluate the prompt template output by using watsonx.governance metrics. You can also track the detached deployment and detached prompt template in an AI use case as part of your governance solution.
For more information, see:
New metrics for evaluating prompt templates
When you evaluate prompt templates in your watsonx.governance deployment spaces or projects, you can now run generative AI quality evaluations to measure how well your model performs retrieval-augmented generation (RAG) tasks with the following new metrics:
  • Faithfulness
  • Answer relevance
  • Unsuccessful requests
Results from these new evaluations are captured in factsheets in AI use cases.

For more information, see Generative AI quality evaluations.
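To build intuition for what a faithfulness-style metric measures, the following toy token-overlap score is illustrative only; it is not the watsonx.governance implementation, whose metric definitions are described in the Generative AI quality evaluations documentation.

```python
# Illustrative only: a naive token-overlap score in the spirit of a
# faithfulness metric -- what fraction of the generated answer's tokens
# are supported by the retrieved context. The real watsonx.governance
# metrics are computed differently.

def naive_faithfulness(answer, context):
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    supported = answer_tokens & context_tokens
    return len(supported) / len(answer_tokens)

context = "the warranty covers parts and labor for two years"
print(naive_faithfulness("warranty covers two years", context))   # 1.0
print(naive_faithfulness("warranty covers five years", context))  # 0.75
```

A fully grounded answer scores 1.0; an answer that introduces claims not found in the retrieved context scores lower.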

Version 2.0.0 of the watsonx.governance service includes various fixes.

For details, see What's new and changed in watsonx.governance.

Related documentation:
watsonx.governance
watsonx Orchestrate 2.0.0
This release of watsonx Orchestrate includes the following features and updates:
New AI-assisted conversational skills
You can now integrate any of your watsonx Orchestrate skills with a conversation-based AI assistant. You provide input through conversation, and the skill starts the action for your requested tasks. Your administrator must first connect apps in the team skill sets for AI assistants. Then, builders can use the AI assistant builder to create actions to be used in AI assistants. For details, see Conversational skills.
Conversational search
The new conversational search feature is a built-in retrieval-augmented generation (RAG) solution, powered by watsonx.ai, that helps watsonx Orchestrate extract an answer from the highest-ranked query results and return a text response to the user. For details, see Conversational search in the watsonx Assistant documentation and the watsonx Orchestrate features documentation.
Automatically completing data in tables for skill inputs
When you start a skill, if the skill input is a table, now you can upload an .xls or .csv file to automatically complete the data in the table. For details, see Filling tables in skill inputs automatically.
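A minimal sketch of the idea, with hypothetical data and field names (this is not Orchestrate's internal format): a .csv file is parsed into a header of column names plus one record per row, the shape that a table-style skill input needs.

```python
import csv
import io

# Hypothetical .csv content; in practice this comes from the uploaded file.
csv_text = "name,role\nAda,Engineer\nGrace,Admiral\n"

reader = csv.reader(io.StringIO(csv_text))
header = next(reader)                          # column names from the first row
rows = [dict(zip(header, r)) for r in reader]  # one record per data row

print(header)  # ['name', 'role']
print(rows)    # [{'name': 'Ada', 'role': 'Engineer'}, {'name': 'Grace', 'role': 'Admiral'}]
```

The same parsing applies to an .xls file after conversion to rows; the key point is that the first row supplies the table's column names and each subsequent row completes one table entry.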
Number of suggested skills increased to 10
When you add a new skill to a skill flow, you now get up to 10 suggestions for the next best skill to add.
Mapping inputs of skills that dynamically generate a list of options
You can map which values are shown in a list of options based on the value that you provide for another input field. For example, if you have a skill that lists the projects in a project management application for specific organizations, you can use the x-ibm-ui-extension property to dynamically retrieve the projects and show them as a list of options based on the organization that users enter in another field. For details, see Mapping inputs of the skill that generate the list of options.

Version 2.0.0 of the watsonx Orchestrate service includes various fixes.

Related documentation:
watsonx Orchestrate

New services

Category Service Pricing What does it mean for me?
Data governance Data Product Hub Separately priced

Data Product Hub is a self-service solution that enables organizations to share data products among their teams. A data product contains a governed collection of data assets that is curated to be accessible and reusable. Data Product Hub enables producers to easily create, share, and govern data products with consumers, ensuring that teams can quickly access the data that they need.

Accessing data quickly

Consumers can find the content they need by using the governed inventory of data products on Data Product Hub or by requesting a new, unique data product.

Resolving data silos

Producers can use metadata to create data products from both IBM and third-party tools. This integration optimizes accessibility and prevents silos because consumers can access all data in one secure location.

Empowering trust and compliance

All data products are associated with a data contract that outlines the terms and conditions of usage. These contracts provide assurance on data security and compliance for both producers and consumers as data products are shared across many regions and business domains.

Related documentation:
Data Product Hub
Data governance IBM Knowledge Catalog Premium Separately priced
IBM Knowledge Catalog Premium is a generative AI enabled service that includes a complete governance framework with data privacy, data quality, cataloging, and automated metadata enrichment. IBM Knowledge Catalog Premium uses trusted Slate and Granite foundation models to:
  • Recommend descriptive names for tables and columns based on the contents of the tables and columns.
  • Suggest and assign semantic descriptions for the contents of tables and columns based on the context and content of the columns.
  • Complete semantic term assignment for tables and columns.
In addition, IBM Knowledge Catalog Premium includes:
  • Enhanced data protection features to help control risk and support compliance with privacy regulations.
  • Extensive data quality features designed to deliver trusted data to the enterprise and support regulatory compliance requirements.
Related documentation:
IBM Knowledge Catalog
Data governance IBM Knowledge Catalog Standard Separately priced
IBM Knowledge Catalog Standard is a generative AI enabled service that is designed to support foundational data governance use cases. It includes core features such as a glossary, catalogs, workflows, data discovery, natural language search, profiling, and automated metadata enrichment. IBM Knowledge Catalog Standard uses generative AI to:
  • Recommend descriptive names for tables and columns based on the contents of the tables and columns.
  • Suggest and assign semantic descriptions for the contents of tables and columns based on the context and content of the columns.
  • Complete semantic term assignment for tables and columns.
Related documentation:
IBM Knowledge Catalog
AI watsonx Code Assistant for Red Hat Ansible Lightspeed Separately priced
IBM watsonx Code Assistant for Red Hat Ansible Lightspeed is a new generative AI service engineered to help automation teams create, adopt, and maintain Ansible content more efficiently. Use watsonx Code Assistant for Red Hat Ansible Lightspeed to:
  • Write Ansible Playbooks with AI-generated recommendations
  • Use IBM foundational models to get code recommendations in your Visual Studio Code development environment
  • Create task prompts from natural language requests
  • Tune the IBM base code model on your data so that it generates code suggestions that are customized for your enterprise standards
Related documentation:
watsonx Code Assistant for Red Hat Ansible Lightspeed
AI watsonx Code Assistant for Z Separately priced
IBM watsonx Code Assistant for Z is a new service that helps developers to modernize their mainframe applications using a combination of automation and generative AI. Use watsonx Code Assistant for Z to:
  • Analyze applications with IBM Application Discovery and Delivery Intelligence
  • Refactor monolithic COBOL applications into services
  • Generate Java services based on COBOL, including classes and methods
  • Generate JUnit tests to validate that the Java service is semantically equivalent to the COBOL
Related documentation:
watsonx Code Assistant for Z

Installation enhancements

What's new What does it mean for me?
Red Hat OpenShift Container Platform support
You can deploy Cloud Pak for Data Version 5.0 on the following versions of Red Hat OpenShift Container Platform:
  • Version 4.12 or later fixes
  • Version 4.14 or later fixes
  • Version 4.15 or later fixes
Additional deployment environments
Cloud Pak for Data Version 5.0 can be deployed on the following environments:
New options for on-premises deployments
You can now install Cloud Pak for Data on:
  • IBM Storage Fusion HCI System
  • Hosted control planes
    Restriction: Services with the following prerequisites cannot be installed on hosted control planes:
    • Multicloud Object Gateway
    • Graphics Processing Units (GPUs)
  • Single node OpenShift (SNO)
New options for Amazon Web Services (AWS)
You can now install Cloud Pak for Data on:
  • AWS GovCloud (US)
  • Red Hat OpenShift Service on AWS (ROSA) on hosted control planes
    Restriction: Services with the following prerequisites cannot be installed on hosted control planes:
    • Multicloud Object Gateway
    • Graphics Processing Units (GPUs)
New options for Google Cloud
You can now install Cloud Pak for Data on Red Hat OpenShift Dedicated on Google Cloud, the managed-OpenShift offering on Google Cloud.
IPv4/IPv6 dual-stack networking
Cloud Pak for Data Version 5.0 can run on an IPv4/IPv6 dual-stack network. For more information about enabling dual-stack networking, see Converting to IPv4/IPv6 dual-stack networking in the Red Hat OpenShift Container Platform documentation.

If you are upgrading to Cloud Pak for Data Version 5.0, you can enable dual-stack networking before you upgrade. When the pods come back up after upgrade, the pods are dual-stack enabled.

More control over how your usage is reported for licensing
You are required to keep a record of the size of deployments to report to IBM as requested. If you plan to install multiple solutions on a single instance of Cloud Pak for Data, you can use node pinning to ensure that you are compliant with your license terms. Node pinning uses node affinity to determine where the pods for each solution can be placed.

To determine whether node pinning is appropriate for your environment, see Node planning.
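As a rough sketch of the mechanism, a node-affinity rule restricts a pod to nodes that carry a given label, which is how node pinning keeps each solution's pods on its dedicated nodes. The label key and values below are hypothetical; the exact labels and the supported procedure are described in the Node planning documentation.

```yaml
# Hypothetical example of a Kubernetes node-affinity rule that pins
# pods to nodes labeled for one solution. The actual label keys and
# procedure are described in the Cloud Pak for Data documentation.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: example.com/cpd-solution   # hypothetical label key
          operator: In
          values:
          - watsonx-ai
```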

Previous releases

Looking for information about previous releases? See the following topics in IBM® Documentation: