Limitations and known issues in Data Virtualization

The following limitations and known issues apply to Data Virtualization.

This page covers issues in the following categories:
  • General issues
  • Data source issues (see also Supported data sources in Data Virtualization)
  • User and group management issues
  • Data governance issues
  • File issues
  • Resolved issues

See also Troubleshooting the Data Virtualization service.

General issues

Virtualizing many tables simultaneously might fail

When you virtualize many tables at the same time, for example more than 15, your cart might fail to virtualize and you might see error messages that indicate a failure to load tables.

Workaround: Reduce the number of tables to virtualize or specify a schema filter to speed up loading of metadata and tables. For more information, see Filtering data in Data Virtualization.

Applies to: 4.5.1 and later.

When the Data Virtualization service instance is shut down but the service is not, you see an incorrect user interface

If the Data Virtualization service instance is shut down but the service is not, and you navigate to the Data Virtualization instance, you see the provisioning status page, which says Hang tight, we're deploying Data Virtualization. The user interface cannot show the correct page because the service instance is shut down. To avoid seeing the provisioning page, shut down the Data Virtualization service as well.

Applies to: 4.5.0 and later.

The Data sources page might fail to load data sources when remote connectors are added, edited, or removed

If you remove a connection to a remote connector or the remote connector becomes unavailable because credentials expire, the Data sources page fails to load this connection.

Applies to: 4.5.0 and later.

Preview of tables to be virtualized might not be correct when the table contains a column named id

If you try to virtualize a table that contains a column that is named id, the preview shows only the column values of the first row in the table. This issue occurs because the keyword id is reserved by the user interface that displays the table.

In addition, if the id column contains identical values, all rows that share a value in the id column display the values of the first row that has that value.

For example, see the following table.

  id   a2           a3
  1    C1-sjis-ok   A0_*&
  1    abc          123

This table appears as follows because the second row is replaced with the data from the first row.

  id   a2           a3
  1    C1-sjis-ok   A0_*&
  1    C1-sjis-ok   A0_*&

Workaround: Rename the column to a unique name other than id. If you cannot rename the column, ensure that the values in the id column are unique.
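If the source database supports column renaming, you can rename the column before you virtualize the table. A minimal sketch in Db2 SQL, assuming a hypothetical source table MYSCHEMA.ORDERS:

-- Rename the reserved column name ID to a unique name before virtualizing.
-- MYSCHEMA.ORDERS and ORDER_ID are hypothetical names.
ALTER TABLE MYSCHEMA.ORDERS RENAME COLUMN ID TO ORDER_ID;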

Applies to: 4.5.0 and later.

Online backup of Data Virtualization can affect cache refreshes

If a cache refresh operation executes while a scheduled online backup of Data Virtualization is in progress, the cache write operation can fail.

Online backup in Data Virtualization works by quiescing database write and update operations. Cache refresh is a write operation that updates Data Virtualization caches with fresh data from remote data sources. Both online backup and cache refresh operations can be scheduled to execute periodically, and these schedules can conflict: each time the cache tries to refresh, an online backup might already be in progress or might start while the cache refresh is in progress. In these scenarios, cache refresh operations can fail with an error similar to the following message, and the cache remains active but contains stale data.

The statement failed because a Big SQL component encountered an error. 
Component receiving the error: "BigSQL IO". Component returning the error: "UNKNOWN". 
Log entry identifier: "[BSL-1-b1bf428f]". Reason: "Cannot add block to /warehouse".. 
SQLCODE=-5105, SQLSTATE=58040, DRIVER=4.27.25

Cache auto-refresh might also fail; you can either refresh manually or wait for the next auto-refresh. However, if the online backup schedule conflicts with a cache's refresh schedule, every auto-refresh of that cache fails.

Workaround: Reschedule the cache's refresh or the online backup schedule to avoid conflicts. You can also refresh the cache manually at any time.

Applies to: 4.5.0 and later.

You cannot use special characters in table names

Some characters cannot be part of a table name. For example, you cannot use the number sign (#) in a table name, even if you wrap the character in quotation marks.

Applies to: 4.5.0 and later.

You cannot use special characters such as semicolons in schema names when you virtualize a table in MongoDB on Cloud

You cannot virtualize a table in a MongoDB on Cloud data source when the schema name contains a semicolon. The MongoDB on Cloud data source removes the semicolon from the schema name, and queries of the resulting virtual table fail with error SQL5105N.

Applies to: 4.5.0 and later.

You cannot use special characters in schema names that are used in remote data sources

If a schema name in a remote data source contains special characters, you can virtualize a table in that schema, but querying the virtual table fails with an SQL5105N error. See the following example.

[ERROR] SQL error: [IBM][CLI Driver][DB2/LINUXX8664] SQL5105N The statement failed because a Big SQL component encountered an error. Component receiving the error: "DV-FMP". Component returning the error: "Virtualization Connector". Log entry identifier: "GAI-003-NA". SQLSTATE=58040 SQLCODE=-5105 (fvt_sql.py:157)

Applies to: 4.5.0 and later.

Preview of virtualized tables is limited by row size and number of columns
Data Virtualization supports virtualization of tables with a row size up to 1 MB, and up to 2048 columns in a table. However, the number of columns that Data Virtualization can preview depends on many factors, such as the data types of the columns. Currently, preview is limited to 200 columns.

Applies to: 4.5.0 and later.

Virtualizing large tables might be slow or might fail
When you virtualize multiple large tables from data sources, you might see the following messages in the Virtualize objects dialog box.

Error: CREATE OR REPLACE NICKNAME -- DB2 SQL Error: SQLCODE=-1229, SQLSTATE=40504, SQLERRMC=null, DRIVER=4.29.24

Or

Error: CREATE OR REPLACE NICKNAME -- "<schema>"."<table>" already exists in Server "<server>" (use REPLACE=Y to overwrite)

Or

The table <schema>.<table> already exists. Use REPLACE=Y to overwrite CallStack

This issue occurs even when you do not have tables with the same schema name and table name. The root cause is a timeout and retry sequence that is triggered by HAProxy timeouts. To work around this issue, ensure that your HAProxy settings meet the recommended values. For more information, see Configuring network requirements for Data Virtualization and Changing load balancer timeout settings. If virtualization of large tables takes longer than 5 minutes, increase the HAProxy timeout settings and retry the virtualization.

Applies to: 4.5.0 and later.

Downloading the Linux® driver package fails in an air-gapped environment

On the Configure connection page, when you click Download Linux driver package, the drivers must be downloaded from an external website. Currently, this download is not supported in an air-gapped environment.

Applies to: 4.5.0 and later.

Using custom SSL certificate can cause verification errors

If you use a custom SSL or TLS certificate, you might encounter SSL verification errors, and Data Virtualization might not work. Custom CA-signed certificates are not supported for Data Virtualization; self-signed certificates are supported and do not cause this issue. To work around this issue, revert to the default certificate that was provided with Cloud Pak for Data or use a self-signed certificate.

Applies to: 4.5.0 and later.

Cannot connect to the service in load-balancing environment

If you get a timeout error when you try to connect to Data Virtualization, increase timeout values. To increase timeout values, update the /etc/haproxy/haproxy.cfg file and set the client and server timeout value to 10m. For more information, see Changing load balancer timeout settings.
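A minimal haproxy.cfg sketch with the increased timeouts; the section placement and other settings depend on your environment.

# /etc/haproxy/haproxy.cfg (excerpt)
defaults
    timeout client 10m
    timeout server 10m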

Applies to: 4.5.0 and later.

Data source issues

Some columns are missing after you virtualize tables in a Microsoft Azure SQL Database in Data Virtualization
Some columns might be missing after you virtualize a table from a Microsoft Azure SQL Database data source. The virtualization process in Data Virtualization uses a cache to resolve the remote table structure. If this cached structure is stale and new columns were added to the remote table, the resulting virtual table contains only the old subset of columns.

Workaround: Reload the remote table's cache by using the refresh button in the List view of the Virtualize page.

Applies to: 4.5.0 and later.

Re-adding a connection to a data source in cloud object storage requires the user to edit the connection and update credentials

If you add a connection to a data source in cloud object storage, change your credentials, and then try to re-add the connection, you might see an error that is similar to the following message.

The data source could not be added. From the Platform connections page, modify the parameters of the selected existing connection and try again.

Or, the re-added connection shows a status of Invalid.

Workaround: After the connection is re-added to Data Virtualization, edit the connection, then reenter and validate your credentials.

Applies to: 4.5.0 and later.

After you create some connections in Data Virtualization, you cannot edit that connection from Platform connections
The following connection types cannot be edited from the Platform connections page if they are created on the Data sources page in Data Virtualization.
  • Db2®
  • Denodo
  • SAP HANA
  • SAP S/4HANA
  • Spark

Workaround: Edit these connections from the Data sources page only.

Applies to: 4.5.0 and later.

You cannot create more than one connection to the same data source with the same username

When you try to create a second connection to the same data source with the same username, you see the message The data source could not be added. Ensure that the specified parameters are correct and try again.

Applies to: 4.5.0 and later.

JDBC drivers default to use the TLSv1.3 protocol

Some connections in Cloud Pak for Data 4.5.0 use the TLSv1.3 protocol by default when they initiate SSL connections to remote data sources. If the remote data source does not support TLSv1.3, the SSL handshake fails.

To work around this issue, enter the connection property CryptoProtocolVersion=TLSv1.2 in the Additional properties field when you create a connection to the data source in Data Virtualization.

This change affects the following connection types.

Applies to: 4.5.0 and later.

Connections that use a custom SSL certificate, which has been stored in a vault, must be created in Platform connections first

If a connection uses a custom SSL certificate that has been stored in a vault, the connection must first be created in Platform connections by using the connections service REST API before it can be added on the Data sources page in Data Virtualization with Add connection > Existing platform connection.

The SSL certificate must be entered either as a bare certificate in plain-text format or as a singleton secret reference (not a JSON array) in the following format.

"ssl_certificate": "{\"secretRefID\":\"secret_urn:MySSLCert\",\"key\":\"password\"}"
For example, create the connection in Platform connections with the following API command.
POST "${CPDHOST}/v2/connections?test=false&catalog_id=${CATALOG_ID}" \
-H "content-type: application/json" \
-H "Authorization: Bearer ${CPDTOKEN}" \
-d '{ "datasource_type": "${DATASOURCE_TYPE}",
    "name": "${CONNECTION_NAME}",
    "properties": {
        "host": "mydatabase.mycompany.com",
        "port": "5500",
        "connection_mode": "sid",
        "sid": "mydatabase",
        "username": "{\"secretRefID\":\"secret_urn:MyCreds\",\"key\":\"username\"}",
        "password": "{\"secretRefID\":\"secret_urn:MyCreds\",\"key\":\"password\"}",
        "ssl": "true",
        "properties": "CryptoProtocolVersion=TLSv1.2",
        "ssl_certificate": "{\"secretRefID\":\"secret_urn:MySSLCert\",\"key\":\"password\"}"
    },
    "origin_country": "US",
    "flags": [
        "personal_credentials"
    ] } '

Applies to: 4.5.0 and later.

Cannot create connections to Db2 on Cloud data source with API keys that are stored in vaults
When you add a connection to a Db2 on Cloud data source through a remote connector or a local agent, and you use an API key that is stored in a vault as your credentials, the action fails with an Error SQL4302N "Exception occurred during connection" message.
To work around this issue for a local agent, enter your API key directly, not as a secret reference in a vault. You cannot work around this issue for Db2 on Cloud data sources that are accessed through a remote connector.

Applies to: 4.5.0 and later.

Unable to add a connection to SAP S/4HANA data source with a SAP OData connection
If you try to connect to an SAP S/4HANA data source that contains many tables, the connection might time out and fail. Increasing timeout parameters has no effect.
To work around this issue, run the following commands, replacing the placeholder values.
db2 connect to bigsql
db2 "call DVSYS.setRdbcX('SAPS4Hana', '<Data source IP:port>', '0', '', 'CreateSchema=ForceNew', '<username>', '<password>', '0', '0', '', '', '<internal connector instance and port>', ?,?,?);"
db2 terminate

Applies to: 4.5.0 and later.

Some queries to SAP HANA data sources do not show correct results

When you query virtualized tables in SAP HANA data sources that select certain columns and use the NOT operator as a filter on those columns, you might see incorrect results.

Applies to: 4.5.0 and later.

Tables in a MongoDB data source might be missing when you virtualize

When you create a connection to MongoDB, you see only tables that were created in the MongoDB data source before the connection was added.

For example, if you have 10 tables in your MongoDB data source when you create a connection, you see 10 tables when you start to virtualize. If a user adds new tables to the MongoDB data source after the connection is added and before you click Virtualize, Data Virtualization does not display the new tables on the Virtualize tab.

Workaround: To see recently added MongoDB tables, delete the connection to MongoDB and re-create the connection.

Applies to: 4.5.0 and later.

You cannot connect to a MongoDB data source with special characters in a database name

The current MongoDB JDBC driver does not support connections to databases whose names contain special characters.

Applies to: 4.5.0 and later.

Previewing tables returns incorrect data for TINYINT in SAP HANA data sources

If you preview SAP HANA data sources that contain data with a data type of TINYINT, you see inaccurate data for some rows of type TINYINT. However, you can virtualize the data source. Only preview is affected.

Applies to: 4.5.0 and later.

When you virtualize data that contains LOB (CLOB/BLOB) or Long Varchar data types, the preview might show the columns as empty

After you virtualize the table, in Virtualized data, the data is available for the columns that contain LOB or Long Varchar data types.

Applies to: 4.5.0 and later.

When you preview tables, LONG VARCHAR and LONG VARCHAR for bit data are mapped to CLOB and BLOB
When you preview tables, LONG VARCHAR and LONG VARCHAR for bit data are now mapped in Data Virtualization as follows.
  • Remote LONG VARCHAR is mapped to CLOB in Data Virtualization with a fixed length of 32700.
  • Remote LONG VARCHAR for bit data is mapped to BLOB in Data Virtualization with a fixed length of 32700.

Applies to: 4.5.0 and later.

Errors when you edit an SSL-based connection

You might see a message similar to the following example.

DB_WARNING: ENGINE_JDBC_CONN_ATTEMPT_ERROR: Failed JDBC Connection attempt in XXX ms for: YYY;sslTrustStoreLocation=/path/.dsTempTrustStore;, cause: .... Message: java.lang.RuntimeException: Unexpected error: java.security.InvalidAlgorithmParameterException: the trustAnchors parameter must be non-empty.

To work around this issue, see Errors when you edit an SSL-based connection in Data Virtualization.

Applies to: 4.5.0 and later.

Support of timestamp data type up to nanoseconds

Data Virtualization supports the timestamp data type up to nanosecond precision. When a remote data source's timestamp type has a scale greater than 9 (more precise than nanoseconds), Data Virtualization returns timestamp column values truncated to nanoseconds. Additionally, for timestamp predicates, Data Virtualization compares values only up to nanoseconds.
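A minimal illustration of the truncation in Db2 SQL; the literal value is hypothetical.

-- A remote TIMESTAMP(12) value loses the digits beyond nanosecond precision (scale 9).
VALUES CAST(TIMESTAMP '2023-01-15 10:30:45.123456789123' AS TIMESTAMP(9));
-- Returns 2023-01-15-10.30.45.123456789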

Applies to: 4.5.0 and later.

Cannot preview data from Db2 for i source

When you try to preview data from a Db2 for i data source, an error message is displayed. This issue can occur when Db2 for i is a remote data source and you do not have the required access permissions. For more information, see the Db2 product documentation.

Applies to: 4.5.0 and later.

Limitations for SSL-based data source connections
The following limitations and known issues apply to SSL-based data source connections in Data Virtualization:
Support of only PEM certificate format for built-in data sources
When you add a built-in data source to Data Virtualization and you upload an SSL certificate, Data Virtualization supports SSL certificates in PEM format only. If you use an SSL certificate in a non-PEM format, the connection fails.

You can use openssl to convert your SSL certificates to PEM format before you upload the SSL certificate file to add a data source connection. For more information, see Cannot connect to a data source.
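For example, a DER-encoded certificate can be converted with openssl before you upload it; the file names are hypothetical.

openssl x509 -inform DER -in certificate.der -outform PEM -out certificate.pem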

Issues to upload SSL certificates for connections that require a third-party driver
When you add a data source connection by uploading a third-party driver and you try to use an SSL certificate, you might encounter an error. It is recommended that you use only non-SSL connections for data sources that require third-party drivers.

Applies to: 4.5.0 and later.

Cannot edit additional properties for data source connections

You can specify any additional JDBC properties when you add data source connections to Data Virtualization. However, you cannot change these additional JDBC properties after you add data source connections.

Applies to: 4.5.0 and later.

Limitations for adding Db2 on Cloud data source connections

You can add Db2 on Cloud data source connections to Data Virtualization. However, you cannot specify any port information when you add these data source connections; by default, port 50001 is used. Additionally, you cannot specify whether the port has SSL enabled, so all Db2 on Cloud connections have SSL enabled by default. If you want to add a Db2 on Cloud connection with a different port or without enabling SSL, specify the Db2 connection type.

Applies to: 4.5.0 and later.

Limitations for adding Db2 Hosted, Db2 Event Store, and IBM Cloud® Compose for MySQL data source connections

When you try to add Db2 Hosted, Db2 Event Store, or IBM Cloud Compose for MySQL data source connections, you cannot specify whether to enable SSL. This limitation prevents you from adding these data source connections to Data Virtualization. To solve this issue, specify the Db2 connection type for the Db2 Hosted data source and the MySQL connection type for the IBM Cloud Compose for MySQL data source.

Applies to: 4.5.0 and later.

Query performance issues against Db2 data sources

When you run a query against Db2 data sources, you might encounter performance issues. For example, these performance issues might appear if your SQL statement uses a string comparison, such as c1 = 'abc' or c2 = c3, where c1, c2, and c3 are string data types such as CHAR or VARCHAR. To avoid these performance issues, modify the collating sequence (COLLATING_SEQUENCE) server option of the data source. For more information about this server option, see Collating sequence.
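A minimal sketch of the server option change in Db2 SQL, assuming a hypothetical data source server named DB2SERVER1 whose collating sequence matches Data Virtualization:

-- 'Y' asserts that the remote collating sequence matches the local one,
-- which allows string comparisons to be pushed down to the data source.
ALTER SERVER DB2SERVER1 OPTIONS (SET COLLATING_SEQUENCE 'Y');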

Applies to: 4.5.0 and later.

Remote data sources - Performance issues when you create data source connection
When you try to create a data source connection by searching for a different host, the process can take several minutes to complete. This performance issue occurs only when these two conditions are met:
  • The remote data source is connected to multiple Cloud Pak for Data clusters.
  • Data Virtualization connects to multiple data sources in different Cloud Pak for Data clusters by using the remote connectors.

To solve this issue, ensure that your Data Virtualization connections are on a single Cloud Pak for Data cluster.

Applies to: 4.5.0 and later.

Remote data sources - Errors in the remote connector upgrade script

When you run the script to upgrade a remote connector on a Windows data source, you see some error messages. For example,

'#' is not recognized as an internal or external command, operable program or batch file.
'#export' is not recognized as an internal or external command, operable program or batch file.

You can ignore these error messages; the remote connector is upgraded successfully.

Applies to: 4.5.0 and later.

Remote data sources - Cannot use system junction points
Data Virtualization does not support browsing data on remote data sources by using paths that contain system junction points. System junction points provide compatibility with earlier versions of Windows. For example, on Windows 10, C:\Documents and Settings is a system junction point to C:\Users. Therefore, when you browse files on a remote Windows data source, you cannot enter a path that contains a system junction point, such as C:\Documents and Settings. By default, system junction points are hidden from Windows users.
Note: Data Virtualization does support junction points and symbolic links that are created by Windows users.

Applies to: 4.5.0 and later.

Cannot see list of available tables in the default virtualization mode

In the default virtualization mode (where you can see all tables, irrespective of business term assignments), when you navigate to the Virtualize page, the console might appear to load the table list for a long time if data sources that are added to Data Virtualization have tables with nonstandard types, such as NULL or OTHER. Wait for the loading to complete to see a list of all tables; you can then preview, add to cart, edit columns, and virtualize any of the listed tables. The Refresh button is disabled, but you can refresh the browser page to trigger a reload of the available tables cache.

Applies to: 4.5.0 and later.

Query fails due to unexpectedly closed connection to data source

When your instance runs a continuous workload against virtual tables from a particular data source, Data Virtualization does not deactivate the connection pool for that data source. Instead, Data Virtualization waits for a period of complete inactivity before it deactivates the connection pool. This waiting period can leave stale connections in the connection pool that are closed by the data source service and lead to query failures.

Workaround: Check the properties for persistent connection (keep-alive parameter) for your data sources. You can try two workarounds:

  • Consider disabling the keep-alive parameter on any data sources that receive a continuous workload from Data Virtualization.
  • You can also decrease the settings for corresponding Data Virtualization properties, RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC, as shown in the following examples: 

    CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC', '10', '', ?, ?);        -- default 20s, minimum 5s
    CALL DVSYS.SETCONFIGPROPERTY('RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC', '30', '', ?, ?);    -- default 120s, minimum 5s

    Decreasing the RDB_CONNECTION_IDLE_SHRINK_TIMEOUT_SEC and RDB_CONNECTION_IDLE_DEACTIVATE_TIMEOUT_SEC settings might help if there are small gaps of complete inactivity that were previously too short for the Data Virtualization shrink and deactivate timeouts to take effect.

Applies to: 4.5.0 and later.

User and group management issues

Attribute-based dynamic group assignments are not refreshed until the user signs in to the Cloud Pak for Data web client

In Cloud Pak for Data, you can define attribute-based dynamic group assignments. A known platform limitation is that attributes are synchronized with the external directory (for example, LDAP) only when the user signs in to Cloud Pak for Data by using the web client. Until the user signs in, the attributes that are known to the platform and the attribute-based dynamic group assignments remain static. Users might get unexpected decision outcomes for data protection rules that are based on dynamic groups.

Applies to: 4.5.0 and later.

User group assignment changes might not take effect

A new session (authentication) must be established for the user group assignments to take effect.

Workaround: See User management in Managing users.

Applies to: 4.5.0 and later.

Users and groups must adhere to naming guidelines
Best practices:
  • Group names must be less than or equal to the group name length listed in SQL and XML limits.
  • A username on Windows can contain up to 30 characters.
  • When not using Client authentication, non-Windows 32-bit clients that connect to Windows with a username that is longer than the username length listed in SQL and XML limits are supported when the username and password are specified explicitly.
  • A username must not be USERS, ADMINS, GUESTS, PUBLIC, LOCAL, or any SQL reserved word.
  • A username must not begin with IBM, SQL, or SYS.

Applies to: 4.5.0 and later.

Privileges and authorities that are granted to user groups are not considered when you create views

This limitation is a result of a Db2 limitation on groups. For more information, see Restrictions on the use of group privileges when executing DDL statements or binding packages.

You can also grant public access on your objects for all roles or all Data Virtualization users and then restrict access by using data protection rules that are defined on groups. For more information, see Governing virtual data with data protection rules in Data Virtualization.
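A minimal sketch of such a grant in Db2 SQL, reusing the ADMIN.CUSTOMER table name from the examples later on this page as a stand-in:

-- Grant read access to all Data Virtualization users; data protection
-- rules that are defined on groups then restrict what each user sees.
GRANT SELECT ON TABLE ADMIN.CUSTOMER TO PUBLIC;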

See also SQL0727N error when you query view results in Data Virtualization.

Applies to: 4.5.0 and later.

Data governance issues

You cannot apply business terms when you virtualize files in data sources on the Files tab

When you virtualize files in Data > Data virtualization > Virtualize, the Business terms column is not available for data sources on the Files tab. These data sources do not support business terms.

Applies to: 4.5.0 and later.

You cannot see business term assignments or grouped tables in the Explore view
In the Explore view of the Virtualize page, you cannot see business term assignments or grouped tables.

Workaround: Switch to the List view to see business term assignments and grouped tables.

Applies to: 4.5.0 and later.

Automatic publishing of virtual objects to the catalog is limited to certain objects

Only objects that are created in the user interface are automatically published to the catalog. Objects that are created by using SQL are not published automatically; you must publish them to the catalog manually or by using the API.

Applies to: 4.5.0 and later.

Data Virtualization always enforces data protection rules

Data Virtualization enforces data protection rules even if the data asset is cataloged in a catalog that does not have the Enforce data protection rules option enabled. This behavior is subject to change in future releases. To ensure predictable behavior in future releases, add virtual data assets only to catalogs that have the Enforce data protection rules option enabled.

Applies to: 4.5.0 and later.

Access control issues to preview assets with masked data or filtered data
Tech preview: This is a technology preview and is not supported for use in production environments.
When you preview Data Virtualization data assets in Watson™ services in Cloud Pak for Data (for example, Watson Knowledge Catalog, Watson Studio, and Data Refinery), and in cases when data masking or row-level filtering applies, the preview is subject to the data protection rules and catalog or project access control only.

To avoid double masking when you use preview in Watson services, access control in Data Virtualization is not applied when you preview a data asset (table or view) that comes from Data Virtualization. This behavior applies only when data masking or row-level filtering applies to the preview in Watson services.

You must define your rules to manage access to the catalogs, projects, data assets, or connections for access control in Watson services.

Applies to: 4.5.0 and later.

Cannot see list of available tables in the strict virtualization mode

In the strict virtualization mode (where you can see tables only if they have at least one column with a business term assignment), when you navigate to the Virtualize page, the console appears to load the table list for a while without showing any tables. Loading can be much slower than in the default virtualization mode because the console evaluates which tables are eligible for virtualization, depending on term assignments to data source table and column names.

Applies to: 4.5.0 and later.

Access to a table is denied by policies

You cannot access a table even though, according to the data policies and authorizations, you are authorized to access it. This issue occurs only if Watson Knowledge Catalog policy enforcement is enabled in Data Virtualization. To solve this issue, see Access to a table is denied by policies in Data Virtualization.

Applies to: 4.5.0 and later.

Do not use duplicate catalog assets for the same table

Avoid duplicate assets for the same table across governed catalogs. The policy service cannot decide which of the duplicated assets to use for policy enforcement and does not aggregate the rules, which might lead to unexpected policy enforcement behavior in Data Virtualization.

Applies to: 4.5.0 and later.

Cannot see business term that is assigned to data asset

You are virtualizing a data asset in Watson Knowledge Catalog, but you cannot see a business term that is assigned to this data asset. To solve this issue, see Cannot see business term that is assigned to data asset in Data Virtualization.

Applies to: 4.5.0 and later.

A virtualized object cannot be used in Cognos® Dashboards without credentials and an appropriate role
Using a virtualized object from Data Virtualization in Cognos Dashboards displays an error message if you do not enter credentials for the Data Virtualization connection or if you do not have the correct role.

If you did not enter a username and password when you created the connection to the Data Virtualization data source, you see the following error: Missing personal credentials. If you are not assigned the Admin or Steward role for this object, you see the following error: Unable to access this asset.

To work around this issue, see A virtualized object cannot be used in Cognos Dashboards without credentials and an appropriate role in Data Virtualization.

Applies to: 4.5.0 and later.

File issues

You cannot preview long string values in headings in CSV, TSV, or Excel files
When you use the first row as column headings, the string values in that row must not exceed the maximum Db2 identifier length of 128 characters and must not be duplicated. If the header row contains values that are too long or duplicated, an error message is displayed when you try to preview your file in Data Virtualization.

400: Missing ResultSet:java.sql.SQLSyntaxErrorException: Long column type column or parameter 'COLUMN2' not permitted in declared global temporary tables or procedure definitions.

Column heading names are case-insensitive and are converted to uppercase in the API response that the console exposes. Therefore, a column named ABC is considered the same as a column named abc. However, the columns can be renamed to mixed case when you virtualize your data source.
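For example, the following hypothetical CSV header row fails the duplicate check because Total and TOTAL are treated as the same name.

name,Total,TOTAL
Widget,10,20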

Workaround: Review the first row in the data file that contains the intended column heading names and make the necessary changes to avoid this limitation.

Applies to: 4.5.0 and later.

You might encounter errors when you virtualize large Excel files
You might experience an error when you preview or virtualize large files with Excel format (XLS):

ERROR 400 'Your InputStream was neither an OLE2 stream nor an OOXML stream' was thrown while evaluating an expression.

Workaround: Follow these steps:
  1. Load the file data into a table on a supported data source. For more information, see Supported data sources.
  2. Virtualize the table. For more information, see Creating a virtualized table.
  3. Optionally, change the format of the file, for example to CSV format, before you virtualize it again.

Applies to: 4.5.0 and later.

Encoding detection override for file data with Japanese characters

For text files exposed by remote connectors, Cloud Pak for Data automatically detects the encoding scheme of flat data files, such as CSV and TSV files. However, to avoid decoding issues, it is recommended that you set the encoding scheme manually for flat data files. For more information, see Setting encoding scheme.

Applies to: 4.5.0 and later.

Only UTF-8 character encoding is supported for CSV, TSV, and JSON files in Cloud Object Storage

For text files in Cloud Object Storage data sources with CSV, TSV, and JSON formats, only UTF-8 character encoding is supported in Data Virtualization. Cloud Object Storage binary formats such as Optimized Row Columnar (ORC) or Parquet are unaffected because they transparently encode character types.

Applies to: 4.5.0 and later.

Resolved issues

Applying filters to a data source does not automatically reload tables in List view
If you apply filters to a data source on the Data sources page and then navigate to the Virtualize page, the filters are not applied automatically. You must reload the tables.

Workaround: On the Virtualize page, click the Refresh icon to reload tables.

Applies to: 4.5.0, 4.5.1, and 4.5.2.

Fixed in: 4.5.3.

Assignment to a project fails when the schema name of the virtual object contains periods
If you try to assign virtualized tables to a project that you are a project editor for, the assignment fails if the schema name of the virtual object contains periods. A schema name that contains a period causes an error when it is parsed.

Workaround: Remove periods from the schema name before you virtualize.

Alternatively, go to the project and add the connected asset from within the project by using the Data Virtualization connection.

Applies to: 4.5.1.

Fixed in: 4.5.3.

When tables contain more than 200 business terms, the Virtualize page might crash your browser

When you go to the Virtualize page, Data Virtualization gets related business terms for each table and shows them in the user interface. Your browser might crash when Data Virtualization searches for business terms and the result exceeds 200 tables with business terms in one catalog.

Workaround: Ensure that no more than 200 tables with business terms exist in one catalog.

Applies to: 4.5.0 and 4.5.1.

Fixed in: 4.5.3.

Limitations of row-level filtering
  • Row-level filtering rules must be defined so that they apply only to those objects on which the filter expression is valid.

    For example, if there is an asset UserAsset with columns SSN, FirstName, LastName, but someone defines a filter_include rule with predicate COUNTRY = 'US', you will see an error when UserAsset is queried because there is a reference to a column COUNTRY that does not exist in UserAsset.

  • You cannot use a numeric filter on a column of nonnumeric type, for example, CHAR.

Fixed in: 4.5.2.

A user cannot edit connections if the user is not added to the Platform assets catalog as an editor or administrator
If a user with an Admin role adds a user with an Engineer role but does not add the Engineer to the Platform assets catalog, the Engineer can create a connection; however, the Engineer cannot edit the connection.

Workaround: Add the user with the Engineer role to the Platform assets catalog as an editor or administrator.

Applies to: 4.5.0.

Fixed in: 4.5.1.

Manage access grants to a user's name and not the username

When you grant access to virtualized tables by clicking Manage access, Db2 Data Management Console grants access on the Name attribute of the user instead of the username attribute.

Workaround: Run the following queries in Data > Data virtualization > Run SQL, where CQWWU@CN.IBM.COM is the username.

Note: The username must be entered in all capital letters.
  • To grant table access to a Data Virtualization Engineer, run a query that is similar to the following example.
    GRANT select ON table ADMIN.CUSTOMER TO user "CQWWU@CN.IBM.COM" with grant option
  • To grant table access to a Data Virtualization User, run a query that is similar to the following example.
    GRANT select ON table ADMIN.CUSTOMER TO user "CQWWU@CN.IBM.COM"
If you used Manage access in Db2 Data Management Console to authorize a user to an object, you must remove the authorization that was mistakenly granted to the Name attribute of the user. For example, if the Name of a user is Harry, you must run the following query.
REVOKE SELECT ON table DV_ADMIN1.CONTACTS FROM USER "HARRY";

Applies to: 4.5.0.

Fixed in: 4.5.1.

On the Virtualize page, the Business terms column always shows loading
This issue does not affect virtualization.

Workaround: Add the Access governance artifacts Platform permission for the current user. If you do not have a catalog other than the Platform assets catalog, create a catalog. For more information, see Managing business terms (Watson Knowledge Catalog).

Applies to: 4.5.0.

Fixed in: 4.5.1.