IBM Netezza Replication Services, Version 1.6

Non-deterministic SQL

An SQL statement is non-deterministic in a replication set if it does not return the same result when executed on all replication nodes in the set. If an SQL statement contains a non-deterministic expression, by-value replication of the expression is performed.

Updates to replicated data that use any form of statement time stamp (for example, calls to the CURRENT_TIMESTAMP function and TIMETZ function) are supported. The statement time stamps are replicated as string literals that are evaluated on the master NPS® system that executes the query. Because the literals are interpreted in the time zone of the master, if the master and subordinate time zones differ, the two sites have identical display times that represent different absolute or UTC times. For example, if the time zone of the master is set to EDT, the value is recorded in EDT. Assume that the value is 4:18 p.m. If the time zone of the subordinate is PDT, the time is three hours earlier than that of the master, or 1:18 p.m. Handling time by value, the master sends, and the subordinate receives, the value of 4:18 p.m., which is different from the subordinate’s absolute time.

The following items are always replicated by value:
  • The LIMIT keyword
  • The NEXT VALUE IN <sequence> expression
  • User-defined aggregates (UDAs) and user-defined table functions (UDTFs)
  • Special columns (DatasliceID, RowID, CreateTxID, DeleteTxID, _extentID, and _ pageID), except when the query is selecting from a versioned table
  • User-visible system functions that were detected as non-deterministic
  • The Current_DBID, Current_SID, Current_USERID, and Current_USEROID functions

UDFs, UDAs, and non-determinism

User-defined functions (UDFs) are processed by using the deterministic setting (--nondet) that you specified during UDF registration or for the CREATE FUNCTION or ALTER FUNCTION command. If the function is deterministic, by-SQL replication is applied for that function; if non-deterministic, by-value replication is chosen for the entire transaction. Use of a user-defined aggregate (UDA) in a transaction that modifies non-temporary tables in a replicated database always triggers by-value replication. A by-SQL transaction becomes a by-value transaction if the software detects any single non-deterministic function call or non-replicated data reference.

If you specify that a UDF is deterministic (by not using the --nondet option) and it is not, by-SQL replication is applied. However, the master and subordinate might experience data divergence when the UDF executes differently on the subordinate than on the master. In that case, backup and restore actions are required to correct the problem. For this reason, if you are unsure whether a function contains non-deterministic SQL, the best course of action is to apply the --nondet setting upon registration. For complete details and examples, see IBM Netezza User-Defined Functions.

Non-deterministic view settings

Postgres function information is accessible from the _v_function system view. The following flags within the _v_function view are relevant to non-deterministic SQL detection and handling.
Table 1. _v_function view flags related to determinism
Name Description
display Indicates whether the function is user visible. The nondeterministic SQL check is done only on user-visible functions.
deterministic Indicates whether the function is deterministic. The NPS optimizer uses this flag to choose an evaluation algorithm. The non-deterministic SQL check algorithm references, but is not entirely based on, this flag. A function might pass this test but fail on some other check (for example, when used in combination with another function). If non-deterministic, the function is handled by value.
The following other behaviors apply:
  • All user-visible functions with the deterministic flag set to true, except for CURRENT_DBID, CURRENT_SID, CURRENT_USERID, and CURRENT_USEROID, are deterministic in a replication environment. The exception applies for these four functions because it is not guaranteed that the master and subordinate have the same object IDs.
  • All user-visible functions with the deterministic flag set to false, except for time stamp-related functions, are non-deterministic in a replication environment.
  • All user-visible functions that return a portion or a transformation of the statement time stamp are replicated and deterministic in a replication environment, regardless of the deterministic flag.


Feedback