Syncing rules

Tables that are created, altered, or dropped by Hive clients can have the associated catalog changes automatically applied to Db2® Big SQL. Specific rules determine whether or not catalog syncing for a Hive table will occur.

Tables in the Hive INFORMATION_SCHEMA or in schemas whose name begins with SYS cannot be synced. The Hive DEFAULT schema is not synced unless a custom syncing rule has been created to enable automatic syncing on this schema. Use the Db2 Big SQL configuration utility to define additional syncing rules by setting the Db2 Big SQL configuration parameters bigsql.catalog.sync.exclude and bigsql.catalog.sync.include. For more information, see Db2 Big SQL configuration utility. The default value for both of these parameters is the null value. Set these parameters by providing a case-insensitive list of comma- or whitespace-delimited schema or table names. Restart the Db2 Big SQL service so that the changed settings can take effect.

bigsql.catalog.sync.exclude
Contains a list of schemas or tables that should not be synced (in addition to those already excluded). For example, to disable automatic syncing on all tables in schemas A, B, and C and on tables T1 and T2 in schema X, set this property to A, B, C, X.T1, X.T2.

You can use wildcard patterns to concisely specify multiple schemas and tables. Use ? to match any single character and * to match any number of characters. If * or *.* is used to match all schemas and tables, nothing will be synced unless there is an explicit rule to indicate otherwise.

bigsql.catalog.sync.include
Contains a list of schemas or tables that should be synced, but that might otherwise be excluded because of another syncing rule. For example, to enable automatic syncing on tables in the DEFAULT schema, set this property to DEFAULT.

You can use wildcard patterns to concisely specify multiple schemas and tables. Use ? to match any single character and * to match any number of characters.

Syncing rules are evaluated in the following order. The first matching rule determines whether or not a table is synced.
  1. If the schema is SYS, INFORMATION_SCHEMA, or begins with SYS, do not sync the table.
  2. If the table name appears in the exclude list, do not sync the table.
  3. If the table name appears in the include list, sync the table.
  4. If the table name matches a pattern in the exclude list, do not sync the table.
  5. If the table name matches a pattern in the include list, sync the table.
  6. If the schema name appears in the exclude list, do not sync the table.
  7. If the schema name appears in the include list, sync the table.
  8. If the schema name matches a pattern in the exclude list, do not sync the table.
  9. If the schema name matches a pattern in the include list, sync the table (unless the schema name is DEFAULT).
  10. If the schema is the Hive DEFAULT schema, do not sync the table.
  11. If the exclude list contains the pattern * or *.*, do not sync the table.
  12. Otherwise, attempt to sync the table.
The following rules determine the evaluation sequence when conflicts arise. In each of these examples, the table (named schema.table) will not be synced.
  • A rule that specifies an explicit table name will override a more general conflicting rule. For example:
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*, schema.table"
    bigsql-config -set "bigsql.catalog.sync.include=schema.tab*"
  • A table pattern rule will override a schema rule. For example:
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*, schema.tab*"
    bigsql-config -set "bigsql.catalog.sync.include=schema"
  • A rule that specifies an explicit schema name will override a pattern that matches multiple schemas. For example:
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*, schema"
    bigsql-config -set "bigsql.catalog.sync.include=sch*"
  • If a table name matches a pattern (or regular expression) in both lists, the exclude rule will apply. For example:
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*, s*.t*"
    bigsql-config -set "bigsql.catalog.sync.include=schem?.tab*"

Examples

  1. Sync everything except system schemas and tables in the DEFAULT schema.
    
    bigsql-config -set "bigsql.catalog.sync.exclude="
    bigsql-config -set "bigsql.catalog.sync.include="
  2. Sync nothing.
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*"
    bigsql-config -set "bigsql.catalog.sync.include="
  3. Sync only schemas abc and def and table ghi.xyz.
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*"
    bigsql-config -set "bigsql.catalog.sync.include=abc, def, ghi.xyz"
  4. Sync only schemas that start with “user” (for example, user1) and tables in a schema named “xyz” (excluding “xyz.table99”).
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*, xyz.table99"
    bigsql-config -set "bigsql.catalog.sync.include=user*, xyz.table?, xyz.table??"
    If additional flexibility is required, it is also possible to use full regular expressions by enclosing the schema and table name parts in parentheses:
    
    bigsql-config -set "bigsql.catalog.sync.exclude=(.*), (xyz).(table99)"
    bigsql-config -set "bigsql.catalog.sync.include=(user.*), (xyz).(table[0-9]{1,2})"
    For greater conciseness, use regular expression syntax only where necessary:
    
    bigsql-config -set "bigsql.catalog.sync.exclude=*, xyz.table99"
    bigsql-config -set "bigsql.catalog.sync.include=user*, xyz.(table[0-9]{1,2})"