Presto (Java) mixed-case support overview

Case-sensitivity refers to the Presto (Java) engine’s capability to distinguish between uppercase and lowercase letters, treating them as distinct characters. This is important when querying databases, as the case of table and column names can affect the query results.

watsonx.data Developer edition

watsonx.data on Red Hat® OpenShift®

Case-insensitive behavior

Open-source Presto (Java) is case-insensitive. It does not distinguish between uppercase and lowercase characters for schema names, table names, and column names. Presto (Java) converts identifiers to lowercase. The following queries are considered as the same.

SELECT * FROM catalog1.schema1.table;
SELECT * FROM catalog1.schema1.TABLE;
SELECT * FROM catalog1.schema1.TaBle;

Case-sensitive behavior

Presto (Java) engine behavior in IBM® watsonx.data was case-insensitive till version 1.0.3. Case sensitivity was introduced in version 1.1.0. All Presto (Java) engine versions from 1.1.0 and above are case-sensitive by default. The table names in the following examples are stored and fetched separately.

SELECT * FROM catalog1.schema1.table;
SELECT * FROM catalog1.schema1.TABLE;
SELECT * FROM catalog1.schema1.TaBle;

Mixed-case feature flag

From IBM watsonx.data version 2.0.0, a new feature is available to switch between both case-sensitive and case-insensitive behavior in Presto (Java) by using a mixed-case feature flag. The mixed-case feature flag is set to OFF in Presto (Java) by default. The flag can be set to ON or OFF as required during deployment of the Presto (Java) engine. It is advised not to toggle between ON and OFF configurations after the deployment, as it may result in inconsistent system behavior.

To configure the flag, you can either configure it by using the Customization API or reach out to the support team.

Note: You can use the following curl command to set the flag as true or false:

{
  "engine_properties": {
    "global": {
      "enable-mixed-case-support": "true"
    }
  },
  "engine_restart": "force"
}

The following are the two scenarios to illustrate mixed-case support behavior:

Scenario 1: Mixed-case feature flag ON

A. Create multiple tables in the mixed-case feature flag ON cluster for both lowercase and mixed case table names.

The user can access all the tables.

B. Change the setting for mixed-case feature flag from ON to OFF.

If there are multiple tables with same name but in mixed cases, then such tables may not be accessible or can cause data discrepancy depending on the connector used.

Scenario 2: Mixed-case feature flag OFF

A. Create multiple tables in the mixed-case feature flag OFF cluster.

For duplicate table names, only the table that is created first will be fetched.
For unique table names, all tables are created and fetched.

B. Change the setting for mixed-case feature flag from OFF to ON.

The user can access all the tables.

For more information on mixed-case behavior, see Mixed-case behavior based on connectors.