String Operations

The string operations are shown in the following table.

Table 1. String Operations
Operation | Traditional Syntax | Free-Form Syntax
Concatenate | CAT (Concatenate Two Strings) | + operator
Concatenate strings with a separator | | %CONCAT (Concatenate with Separator)
Concatenate array elements with a separator | | %CONCATARR (Concatenate Array Elements with Separator)
Convert to lower case | | %LOWER (Convert to Lower Case)
Convert to upper case | | %UPPER (Convert to Upper Case)
Check | CHECK (Check Characters) | %CHECK (Check Characters)
Check Reverse | CHECKR (Check Reverse) | %CHECKR (Check Reverse)
Create | | %STR (Get or Store Null-Terminated String)
Replace | | %REPLACE (Replace Character String)
Scan | SCAN (Scan String) | %SCAN (Scan for Characters)
Scan Reverse | | %SCANR (Scan Reverse for Characters)
Scan and Replace | | %SCANRPL (Scan and Replace Characters)
Split a string | | %SPLIT (Split String into Substrings)
Substring | SUBST (Substring) | %SUBST (Get Substring)
Translate | XLATE (Translate) | %XLATE (Translate)
Trim Blanks | | %TRIM, %TRIML, or %TRIMR
Return the leftmost characters of a string | | %LEFT (Get Leftmost Characters)
Return the number of bytes for alphanumeric or double bytes for UCS-2 and graphic | | %LEN (Get or Set Length)
Return the number of natural characters | | %CHARCOUNT (Return the Number of Characters)
Return the rightmost characters of a string | | %RIGHT (Get Rightmost Characters)
Work with a null-terminated string | | %STR (Get or Store Null-Terminated String)

The string operations include concatenation, scanning, substringing, translation, and verification. String operations can only be used on character, graphic, or UCS-2 fields.

The CAT operation concatenates two strings to form one.

The CHECK and CHECKR operations verify that each character in factor 2 is among the valid characters in factor 1. CHECK verifies from left to right and CHECKR from right to left.
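The verification performed by CHECK and CHECKR can be modeled outside RPG. The following Python sketch is not RPG code; the function names `check` and `checkr` are hypothetical, and the return convention (the 1-based position of the first invalid character, or 0 when every character is valid) is taken from the individual operation descriptions rather than from this overview.

```python
def check(comparator: str, base: str) -> int:
    """Scan base left to right; return the 1-based position of the first
    character that is not in comparator, or 0 if all characters are valid."""
    for pos in range(1, len(base) + 1):
        if base[pos - 1] not in comparator:
            return pos
    return 0


def checkr(comparator: str, base: str) -> int:
    """Same verification as check, but scanning from right to left."""
    for pos in range(len(base), 0, -1):
        if base[pos - 1] not in comparator:
            return pos
    return 0


digits = "0123456789"
first_bad = check(digits, "12a45b")   # position of 'a'
last_bad = checkr(digits, "12a45b")   # position of 'b'
all_valid = check(digits, "12345")    # no invalid character
```

Here `comparator` plays the role of factor 1 and `base` the role of factor 2.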

The SCAN operation scans the base string in factor 2 for occurrences of another string specified in factor 1.

The SUBST operation extracts a specified string from a base string in factor 2. The extracted string is placed in the result field.

The XLATE operation translates characters in factor 2 according to the from and to strings in factor 1.
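As a rough illustration of the from/to mapping, the following Python sketch models the translation. It is not RPG code; the function name `xlate` is hypothetical, and the sketch assumes that the from string contains no duplicate characters and that from-characters without a counterpart in the to string are ignored.

```python
def xlate(from_chars: str, to_chars: str, base: str) -> str:
    """Replace each character of base found in from_chars with the
    character at the same position in to_chars; leave others unchanged."""
    # zip stops at the shorter string, so unmatched from-characters are ignored.
    table = {ord(f): t for f, t in zip(from_chars, to_chars)}
    return base.translate(table)


translated = xlate("abc", "ABC", "a cab")
```

In this model `from_chars` and `to_chars` are the two parts of factor 1, and `base` is factor 2.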

Note: Figurative constants cannot be used in the factor 1, factor 2, or result fields. Factor 1 and the result field must not overlap in a data structure, and neither must factor 2 and the result field.

In the string operations, factor 1 and factor 2 may have two parts. If both parts are specified, they must be separated by a colon. Both factors can have two parts in all operations except CAT, CHECK, CHECKR, and SUBST, where only factor 2 can have two parts.

If you specify P as the operation extender for the CAT, SUBST, or XLATE operations, the result field is padded on the right with blanks after the operation.
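The effect of the P extender can be pictured with a small model of a fixed-length result field. This Python sketch is not RPG code; the function name `store` is hypothetical, and it assumes that, without P, the positions to the right of the stored value keep their previous contents.

```python
def store(result_field: str, value: str, pad: bool) -> str:
    """Place value left-adjusted into a fixed-length result field."""
    size = len(result_field)
    value = value[:size]              # a fixed-length field cannot grow
    if pad:
        return value.ljust(size)      # P: blank-fill the remainder
    # Assumed behavior without P: the old trailing contents remain.
    return value + result_field[len(value):]


without_p = store("XXXXXX", "ab", pad=False)
with_p = store("XXXXXX", "ab", pad=True)
```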

See each operation for a more detailed explanation.

When using string operations on graphic fields, all data in factor 1, factor 2, and the result field must be graphic. When numeric values are specified for length, start position, and number of blanks for graphic characters, the values represent double-byte characters.

When using string operations on UCS-2 fields, all data in factor 1, factor 2, and the result field must be UCS-2. When numeric values are specified for length, start position, and number of blanks for UCS-2 characters, the values represent double-byte characters.

String operations for data with different character sizes

By default, when using string operations on the graphic part of mixed-mode character data, values such as the start position, length and number of blanks represent single bytes. Preserving data integrity is the user's responsibility.

A similar issue exists for UTF-8 data, UTF-16 data, and mixed SBCS/DBCS ASCII data. See Character Data Type.

You can process string data using the natural size of each character by working in CHARCOUNT NATURAL mode. However, the traditional-syntax operation codes listed above are not supported for this mode of processing strings.

Note: Some string operands always operate in STDCHARSIZE mode.
  • %LEN always works with the number of bytes for alphanumeric expressions and the number of double bytes for UCS-2 and graphic expressions.
  • The length operand for %STR always refers to the number of bytes.
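The difference between the two ways of measuring a string can be seen by counting storage units and characters separately. The following Python sketch is only a model, not RPG code; the helper names `len_units` and `char_count` are hypothetical, and it assumes that one Unicode code point corresponds to one natural character.

```python
def len_units(text: str, encoding: str) -> int:
    """Model of a %LEN-style count: bytes for UTF-8 data,
    double bytes for UTF-16 data."""
    raw = text.encode(encoding)
    unit = 2 if encoding == "utf-16-be" else 1
    return len(raw) // unit


def char_count(text: str) -> int:
    """Model of a %CHARCOUNT-style count (assumed: one code point
    is one natural character)."""
    return len(text)


word = "na\u00efve"        # contains one 2-byte UTF-8 character
emoji = "\U0001F600"       # needs a surrogate pair in UTF-16

utf8_units = len_units(word, "utf-8")
utf16_units = len_units(emoji, "utf-16-be")
```

For `word`, the byte count exceeds the character count by one; for `emoji`, a single natural character occupies two double bytes.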

Right-adjusted assignment of data with different character sizes

For EVALR and parameter passing with OPTIONS(*RIGHTADJ), data is assigned right-adjusted. When data has characters of different sizes, the compiler handles any necessary truncation of data differently depending on the CHARCOUNT mode.

When the value to be assigned is too large for the result, the compiler truncates the initial (leftmost) data and assigns the data that fits in the result.

If the result of the assignment is a data type and CCSID that has characters of different sizes, such as UTF-8, the compiler handles the truncation differently depending on the CHARCOUNT mode.
  • With CHARCOUNT NATURAL mode, the compiler ensures that the first character assigned to the result is complete.
  • With CHARCOUNT STDCHARSIZE mode, the data that is assigned to the result may begin with a partial character.
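The two modes can be compared with a byte-level model of a right-adjusted assignment to a UTF-8 result. This Python sketch is not RPG code; the function name `evalr_truncate` is hypothetical, and padding the result on the left with blanks after a partial character is dropped is an assumption.

```python
def evalr_truncate(value: str, size: int, natural: bool) -> bytes:
    """Assign value right-adjusted into a UTF-8 result of size bytes."""
    data = value.encode("utf-8")
    kept = data[max(0, len(data) - size):]     # keep the rightmost bytes
    if natural:
        # Skip leading continuation bytes (10xxxxxx) so that the first
        # character in the result is complete.
        start = 0
        while start < len(kept) and (kept[start] & 0xC0) == 0x80:
            start += 1
        kept = kept[start:]
    return kept.rjust(size, b" ")              # assumed: blank-pad on the left


source = "a\u00e9\u00e9"                       # 5 bytes: 61 C3 A9 C3 A9
std_result = evalr_truncate(source, 3, natural=False)
nat_result = evalr_truncate(source, 3, natural=True)
```

In the STDCHARSIZE model the result starts with the stray continuation byte `A9`; in the NATURAL model that byte is dropped and the result holds one complete character.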

Assignment of data with different character sizes

When data has characters of different sizes, the compiler handles any necessary truncation of data differently depending on the CHARCOUNT mode.

When the value to be assigned is too large for the result, the compiler truncates the final (rightmost) data and assigns the data that fits in the result.

If the result of the assignment is a data type and CCSID that has characters of different sizes, such as UTF-8, the compiler handles the truncation differently depending on the CHARCOUNT mode.
  • With CHARCOUNT NATURAL mode, the compiler ensures that the final character assigned to the result is complete.
  • With CHARCOUNT STDCHARSIZE mode, the data that is assigned to the result may end with a partial character.
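A byte-level model of a left-adjusted assignment to a UTF-8 result shows the same distinction at the end of the data. This Python sketch is not RPG code; the function name `eval_truncate` is hypothetical, and padding the result on the right with blanks after a partial character is dropped is an assumption.

```python
def eval_truncate(value: str, size: int, natural: bool) -> bytes:
    """Assign value left-adjusted into a UTF-8 result of size bytes."""
    data = value.encode("utf-8")
    end = min(size, len(data))
    if natural and len(data) > size:
        # If the cut falls inside a multi-byte character, back up to the
        # start of that character so the last character kept is complete.
        while end > 0 and (data[end] & 0xC0) == 0x80:
            end -= 1
    return data[:end].ljust(size, b" ")        # assumed: blank-pad on the right


source = "\u00e9\u00e9a"                       # 5 bytes: C3 A9 C3 A9 61
std_result = eval_truncate(source, 3, natural=False)
nat_result = eval_truncate(source, 3, natural=True)
```

In the STDCHARSIZE model the result ends with the lone lead byte `C3`; in the NATURAL model the partial character is left out.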

Concatenation of data with different character sizes

The temporary variable that the compiler uses for a concatenation may not be large enough to hold the entire result of the concatenation expression if the operands are very large or if there are many operands.

When the result of the concatenation is too large, the compiler truncates the last operand that fits only partially in the temporary result and does not append any further operands to the temporary result.

The truncation is handled according to the rules for assignment.
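The rule can be sketched as a loop over the operands with a fixed-capacity temporary. This Python model is not RPG code; the names `truncate_utf8`, `concat_limited`, and `capacity` are hypothetical, the operands are assumed to be UTF-8, and the real size of the compiler's temporary is not stated here.

```python
def truncate_utf8(data: bytes, size: int, natural: bool) -> bytes:
    """Truncate on the right, following the assignment rule."""
    if len(data) <= size:
        return data
    end = size
    if natural:
        # Do not end with a partial character.
        while end > 0 and (data[end] & 0xC0) == 0x80:
            end -= 1
    return data[:end]


def concat_limited(operands, capacity: int, natural: bool) -> bytes:
    """Concatenate operands into a temporary of capacity bytes."""
    temp = b""
    for operand in operands:
        data = operand.encode("utf-8")
        room = capacity - len(temp)
        if len(data) > room:
            temp += truncate_utf8(data, room, natural)
            break                  # no further operands are appended
        temp += data
    return temp


parts = ["ab", "c\u00e9", "zz"]
nat_temp = concat_limited(parts, 4, natural=True)
std_temp = concat_limited(parts, 4, natural=False)
```

In the NATURAL model the third operand is not appended even though one byte of the temporary is still free, because appending stops at the first truncated operand.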

For information on how the compiler determines the type and CCSID of the temporary result of the concatenation, see Determining the Common Type of Multiple Operands.