UNICODE_STR scalar function

The UNICODE_STR function returns a string in Unicode UTF-8 or UTF-16, depending on the specified option. The string represents a Unicode encoding of the input string.

UNICODE_STR ( string-expression [ , UTF8 | UTF16 ] )

The schema is SYSIBM.

string-expression
An expression that returns a value of a built-in character or graphic string.
A character string must not be bit data.
Values that are preceded by a backslash ('\') are treated as Unicode UTF-16 characters (for example, '\0041' is the Unicode UTF-16 representation for 'A').
A double backslash '\\' indicates a backslash in the string.
A backslash that is not part of a double backslash must be followed by four hexadecimal digits (SQLSTATE 42815).
A partial surrogate character in the expression is replaced with a blank.
The argument can also be a numeric data type.
The numeric argument is implicitly cast to a VARCHAR data type.
UTF8 or UTF16
Specifies the Unicode encoding of the result.
  • If UTF8 is specified, the result is returned as a Unicode UTF-8 character string.
  • If UTF16 is specified, the result is returned as a Unicode UTF-16 graphic string.
UTF8 is the default.
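
The argument rules above can be illustrated with a few statements. This is a minimal sketch, assuming a Unicode database and an interactive session in which VALUES can be run directly; the literals are made up for illustration and do not come from the product documentation.

```sql
-- '\0048' is the UTF-16 escape for 'H', so the result is 'Hi'
-- (VARCHAR in UTF-8, because UTF8 is the default).
VALUES UNICODE_STR('\0048i');

-- A double backslash produces one literal backslash: 'C:\temp'.
VALUES UNICODE_STR('C:\\temp');

-- Same input, returned as a UTF-16 graphic string (VARGRAPHIC).
VALUES UNICODE_STR('\0048i', UTF16);

-- A numeric argument is implicitly cast to VARCHAR before conversion.
VALUES UNICODE_STR(12345);

-- A single backslash that is not followed by four hexadecimal digits
-- is expected to fail with SQLSTATE 42815, so it is left commented out.
-- VALUES UNICODE_STR('\41');
```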

Result

The result of the function depends on the second argument:
  • If UTF8 is specified, the result is VARCHAR.
  • If UTF16 is specified, the result is VARGRAPHIC.
The length attribute of the result depends on the second argument (UTF8 or UTF16).
  • If the second argument is UTF8, the length attribute of the result is MIN(n,32672) OCTETS, where n depends on the length attribute and string units of the string-expression as follows:
    • If the string-expression contains the string unit OCTETS, n is the length attribute of the input string.
    • If the string-expression contains the string unit CODEUNITS16, n is twice the length attribute of the input string.
    • If the string-expression contains the string unit CODEUNITS32, n is four times the length attribute of the input string.
  • If the second argument is UTF16, the length attribute of the result is MIN(n,16336) CODEUNITS16, where n depends on the length attribute and string units of the string-expression as follows:
    • If the string-expression contains the string units OCTETS or CODEUNITS16, n is the length attribute of the input string.
    • If the string-expression contains the string unit CODEUNITS32, n is twice the length attribute of the input string.
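
The length rules above can be worked through on concrete inputs. The following sketch assumes a Unicode database at a level that accepts explicit string units in CAST; the lengths are arbitrary illustration values, and the comments give the result type derived from the formulas above rather than captured output.

```sql
-- Input VARCHAR(100 OCTETS), UTF8: n = 100, MIN(100, 32672) = 100
-- Expected result type: VARCHAR(100 OCTETS)
VALUES UNICODE_STR(CAST('abc' AS VARCHAR(100 OCTETS)), UTF8);

-- Input VARCHAR(100 CODEUNITS32), UTF8: n = 4 * 100 = 400
-- Expected result type: VARCHAR(400 OCTETS)
VALUES UNICODE_STR(CAST('abc' AS VARCHAR(100 CODEUNITS32)), UTF8);

-- Input VARCHAR(100 CODEUNITS32), UTF16: n = 2 * 100 = 200
-- Expected result type: VARGRAPHIC(200 CODEUNITS16)
VALUES UNICODE_STR(CAST('abc' AS VARCHAR(100 CODEUNITS32)), UTF16);

-- Input VARCHAR(32672 OCTETS), UTF16: n = 32672, MIN(32672, 16336) = 16336
-- Expected result type: VARGRAPHIC(16336 CODEUNITS16)
VALUES UNICODE_STR(CAST('abc' AS VARCHAR(32672 OCTETS)), UTF16);
```

The last case shows the cap taking effect: the input length attribute exceeds the UTF16 maximum, so the result length attribute is limited to 16336 CODEUNITS16.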

Notes

As a syntax alternative, you can specify UNISTR as a synonym for UNICODE_STR.
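
As a brief sketch of the synonym, assuming an interactive session where VALUES can be run directly, both calls below are expected to return the same value ('Hi'); the literal is an illustration only.

```sql
-- UNISTR is accepted as a synonym, so both columns should hold 'Hi'.
VALUES (UNICODE_STR('\0048i'), UNISTR('\0048i'));
```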

Example

The following example sets the host variable HV1 to a VARCHAR value that represents the Unicode UTF-8 string that corresponds to the argument:

SET :HV1 = UNICODE_STR('Hi, my name is \5CF0');

HV1 is assigned a Unicode UTF-8 string with the following value: 'Hi, my name is 峰'.
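
For comparison, a UTF-16 variant of the same assignment might look like the following sketch. HV2 is a hypothetical host variable, assumed here to be declared with a graphic string type that can hold a VARGRAPHIC value.

```sql
-- HV2 is a hypothetical graphic-string host variable (not in the original example).
-- With UTF16 specified, the result is a VARGRAPHIC value in UTF-16.
SET :HV2 = UNICODE_STR('Hi, my name is \5CF0', UTF16);
```

The characters assigned are the same as in the UTF-8 case; only the encoding and the result data type differ.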