Characters

The most basic and indivisible unit of the COBOL language is the character. The basic character set includes the letters of the Latin alphabet, digits, and special characters.

In COBOL, individual characters are joined to form character-strings and separators. Character-strings and separators, in turn, form the words, literals, phrases, clauses, statements, and sentences that make up the language.

The basic characters used in forming character-strings and separators in source code are shown in Table 1.

For certain language elements, the basic character set is extended with the EBCDIC Double-Byte Character Set (DBCS).

DBCS characters can be used in forming user-defined words.

The content of alphanumeric literals, comment lines, and comment entries can include any of the characters in the computer's compile-time character set, and can include both single-byte and DBCS characters.

Runtime data can include any characters from the runtime character set of the computer, which can include alphanumeric characters, DBCS characters, and national characters. National characters are represented in UTF-16, a 16-bit encoding form of Unicode.

When the NSYMBOL(NATIONAL) compiler option is in effect, literals identified by the opening delimiter N" or N' are national literals. They can contain any single-byte or double-byte characters, or both, that are valid for the compile-time code page in effect (either the default code page or the code page specified for the CODEPAGE compiler option). Characters contained in national literals are represented as national characters at run time.
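As a minimal sketch of the two preceding paragraphs, the following fixed-format program moves a national literal into a national data item. It assumes that NSYMBOL(NATIONAL) is in effect; the program name NATDEMO and the data name WS-GREETING are hypothetical, chosen only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NATDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * NATDEMO and WS-GREETING are hypothetical names.
      * Each PIC N position of a USAGE NATIONAL item holds one
      * UTF-16 code unit (2 bytes).
       01  WS-GREETING    PIC N(5) USAGE NATIONAL.
       PROCEDURE DIVISION.
      * N"..." is a national literal under NSYMBOL(NATIONAL).
           MOVE N"HELLO" TO WS-GREETING
      * LENGTH OF reports the size of the item in bytes.
           DISPLAY "BYTES IN WS-GREETING: " LENGTH OF WS-GREETING
           GOBACK.
```

Because each national character position occupies two bytes, the byte length of the item is twice its number of character positions.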

For details, see User-defined words with DBCS characters, DBCS literals, and National literals.

Table 1. Basic COBOL character set
Character  Meaning                    Use                                  Example
(space)    Space                      Punctuation character                01 WS-A PIC X(10).
+          Plus sign                  Arithmetic operator                  COMPUTE WS-A = WS-B + WS-C.
                                      Editing character                    01 WS-A PIC +9(3).
-          Minus sign or hyphen       Arithmetic operator                  COMPUTE WS-A = WS-B - WS-C.
                                      Editing character                    01 WS-A PIC -9(3).
                                      Continuation character               01 WS-VAR  PIC X(27) VALUE
                                                                           -      'THIS MULTI-LINE TEXT'.
                                      COBOL word element                   01 WS-A PIC 9(3).
*          Asterisk                   Arithmetic operator                  COMPUTE WS-A = WS-B * WS-C.
                                      Editing character                    01 WS-A PIC **9.
                                      Comment character                    * THIS IS A COMMENT LINE.
/          Forward slash or solidus   Arithmetic operator                  COMPUTE WS-A = WS-B / WS-C.
                                      Editing character                    01 WS-DATE PIC 99/99/99.
                                      Comment character (page eject)       / THIS COMMENT LINE STARTS A NEW LISTING PAGE.
=          Equal sign                 Assignment character                 COMPUTE WS-A = WS-B / WS-C.
                                      Relation character                   IF WS-A = 10
$          Currency sign (1)          Editing character                    01 WS-A PIC $$99.
,          Comma                      Editing character                    01 WS-A PIC 99,999.
                                      Punctuation character                MOVE 10 TO WS-A, WS-B.
;          Semicolon                  Punctuation character                MOVE 10 TO WS-A; WS-B.
.          Decimal point or period    Editing character                    01 WS-A PIC 99.999.
                                      Punctuation character                MOVE 10 TO WS-A, WS-B.
"          Quotation mark (2)         Punctuation character                01 WS-VAR PIC X(5) VALUE "HELLO".
'          Apostrophe                 Punctuation character                01 WS-VAR PIC X(5) VALUE 'HELLO'.
(          Left parenthesis           Punctuation character                IF (WS-A = 10) AND (WS-B = 5)
)          Right parenthesis          Punctuation character                IF (WS-A = 10) AND (WS-B = 5)
>          Greater than               Relation character                   IF WS-A > 10
<          Less than                  Relation character                   IF WS-A < 10
:          Colon                      Delimiter in reference modification  MOVE WS-VAR(1:10) TO WS-VAR1.
_          Underscore                 User-defined word element            01 WS_VAR PIC X(10).
A - Z      Alphabet (uppercase)       Alphabetic characters
a - z      Alphabet (lowercase)       Alphabetic characters
0 - 9      Numeric characters         Numeric characters

In the continuation and comment examples, the hyphen, asterisk, or slash is coded in the indicator area (column 7) of the line. A slash in the indicator area marks a comment line that also starts a new page in the source listing; it does not continue the previous line.
  1. The currency sign is the character with the value X'5B', regardless of the code page in effect. The assigned graphic character can be the dollar sign or a local currency sign.
  2. The quotation mark is the character with the value X'7F'.
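The following fixed-format program is a small sketch that combines several of the characters from Table 1: arithmetic operators, parentheses, editing characters, the colon in reference modification, and relation characters. All program and data names (CHARDEMO, WS-A, WS-B, WS-C, WS-EDITED, WS-TEXT, WS-PART) are hypothetical and chosen only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CHARDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names below are hypothetical, chosen for illustration.
       01  WS-B           PIC 9(3)   VALUE 12.
       01  WS-C           PIC 9(3)   VALUE 4.
       01  WS-A           PIC S9(5).
      * Editing characters: + (sign), comma, and period in PICTURE.
       01  WS-EDITED      PIC +99,999.99.
       01  WS-TEXT        PIC X(10)  VALUE "ABCDEFGHIJ".
       01  WS-PART        PIC X(3).
       PROCEDURE DIVISION.
      * Arithmetic operators, parentheses, and the equal sign.
           COMPUTE WS-A = (WS-B + WS-C) * 2 - WS-B / WS-C
           MOVE WS-A TO WS-EDITED
      * The colon separates the start position from the length.
           MOVE WS-TEXT(1:3) TO WS-PART
      * Relation characters in a condition.
           IF WS-A > 10 AND WS-PART = "ABC"
               DISPLAY "WS-A: " WS-EDITED " WS-PART: " WS-PART
           END-IF
           GOBACK.
```

The period after each data description entry and after GOBACK acts as a separator, while the period inside the PICTURE string of WS-EDITED is an editing character.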