SQL and DBCS

The basic symbols of keywords and operators in the SQL language are single-byte characters that are part of all character sets supported by the IBM® relational database products. Characters of the language are classified as letters, digits, or special characters.

SQL host identifiers and double-byte characters

A host-identifier is a name declared in the host program. The rules for forming a host-identifier are the rules of the host language, except that DBCS characters cannot be used.

SQL character subtypes and double-byte characters

Each character string is further defined as one of the following subtypes:

  • Bit data:

    Data that is not associated with a coded character set and is never converted. The CCSID for bit data is 65535.

  • SBCS data:

    Data in which every character is represented by a single byte. Each SBCS data character string has an associated CCSID. If necessary, an SBCS data character string is converted before it is used in an operation with a character string that has a different CCSID.

  • Mixed data:

    Data that contains a mixture of characters from a single-byte character set (SBCS) and a double-byte character set (DBCS). Each mixed data character string has an associated CCSID. If necessary, a mixed data character string is converted before an operation with a character string that has a different CCSID. If mixed data contains a DBCS character, it cannot be converted to SBCS data.

The database manager does not recognize subclasses of double-byte characters, and it does not assign any specific meaning to particular double-byte codes. However, if you choose to use mixed data, then two single-byte EBCDIC codes are given special meanings:

  • X'0E', the shift-out character, is used to mark the beginning of a sequence of double-byte codes.
  • X'0F', the shift-in character, is used to mark the end of a sequence of double-byte codes.

In order for the database manager to recognize double-byte characters in a mixed data character string, the following condition must be met:

  • Within the string, the double-byte characters must be enclosed between paired shift-out and shift-in characters.

    The pairing is detected as the string is read from left to right. The code X'0E' is recognized as a shift-out character if X'0F' occurs later; otherwise, it is invalid. The first X'0F' following the X'0E' that is on a double-byte boundary is the paired shift-in character. Any X'0F' that is not on a double-byte boundary is not recognized.

    There must be an even number of bytes between the paired characters, and each pair of bytes is considered to be a double-byte character. There can be more than one set of paired shift-out and shift-in characters in the string.
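The left-to-right pairing rule can be sketched in a few lines of Python. This is an illustration only, not the database manager's implementation: the function name find_dbcs_runs is hypothetical, and the sketch assumes that a shift-out with no shift-in on a double-byte boundary is treated as invalid and that a stray X'0F' outside a double-byte sequence is simply ignored.

```python
SHIFT_OUT = 0x0E  # marks the beginning of a sequence of double-byte codes
SHIFT_IN = 0x0F   # marks the end of a sequence of double-byte codes


def find_dbcs_runs(data: bytes) -> list:
    """Return (shift_out_offset, shift_in_offset) pairs, scanning left to right.

    Hypothetical helper for illustration. Raises ValueError when a shift-out
    has no shift-in on a double-byte boundary (an assumption of this sketch).
    """
    runs = []
    i = 0
    while i < len(data):
        if data[i] != SHIFT_OUT:
            i += 1  # single-byte character (or an unrecognized stray X'0F')
            continue
        # Only positions an even number of bytes after the shift-out are on a
        # double-byte boundary, so step over the enclosed bytes two at a time.
        j = i + 1
        while j < len(data) and data[j] != SHIFT_IN:
            j += 2
        if j >= len(data):
            raise ValueError(f"unpaired shift-out at offset {i}")
        runs.append((i, j))
        i = j + 1  # resume scanning after the paired shift-in
    return runs


# The X'0F' at offset 3 is the second byte of a double-byte code, so it is
# not on a double-byte boundary and is not recognized as a shift-in.
print(find_dbcs_runs(bytes.fromhex("C10E420F42C10FC2")))  # prints [(1, 6)]
```

Stepping two bytes at a time is what guarantees an even number of bytes between each recorded pair.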

The length of a mixed data character string is its total number of bytes, counting two bytes for each double-byte character and one byte for each shift-out or shift-in character.
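As a worked example of this length rule, the following Python sketch adds up the bytes for a string described by its character counts. The function name mixed_length and its parameters are hypothetical and chosen only for illustration.

```python
def mixed_length(sbcs_chars: int, dbcs_run_sizes: list) -> int:
    """Byte length of a mixed data string (hypothetical helper).

    sbcs_chars     -- number of single-byte characters outside any DBCS run
    dbcs_run_sizes -- number of double-byte characters in each
                      shift-out/shift-in delimited run
    """
    # Each run costs 2 bytes per double-byte character, plus 1 byte for the
    # shift-out and 1 byte for the shift-in that enclose it.
    return sbcs_chars + sum(2 * n + 2 for n in dbcs_run_sizes)


# Three single-byte characters followed by one run of two double-byte
# characters: 3 + (2 * 2 + 2) = 9 bytes.
print(mixed_length(3, [2]))  # prints 9
```

Under this rule, a column must be declared long enough to hold the shift characters as well as the double-byte codes themselves.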

When the job CCSID indicates that DBCS is allowed, CREATE TABLE creates character columns as DBCS-Open fields unless FOR BIT DATA, FOR SBCS DATA, or an SBCS CCSID is specified. SQL users see these columns as character fields, but the system database support treats them as DBCS-Open fields.