Overview of DBCS

Some languages have too many symbols for every character to be represented with a single-byte code. For example, English can be encoded within the single-byte range 0-255, or x'00' through x'FF'. Korean and other East Asian languages, including the ideographic ones, contain several thousand characters. Coded character sets for these languages need two bytes for each character.
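The arithmetic behind this can be sketched in a few lines of Python. The figures are theoretical upper bounds only; real DBCS code pages reserve some byte values, so the usable count is lower. The variable names are illustrative and are not part of Connect:Direct.

```python
# Upper bound on distinct codes for one-byte and two-byte encodings.
BITS_PER_BYTE = 8

single_byte_codes = 2 ** BITS_PER_BYTE        # values x'00' through x'FF'
double_byte_codes = 2 ** (2 * BITS_PER_BYTE)  # values x'0000' through x'FFFF'

print(f"single byte: {single_byte_codes} codes, highest x'{single_byte_codes - 1:02X}'")
print(f"double byte: {double_byte_codes} codes, highest x'{double_byte_codes - 1:04X}'")
```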

The IBM® Connect:Direct® Double-byte Character Set (DBCS) support provides a mechanism for translating ASCII and EBCDIC DBCS data. DBCS support translates Single-byte Character Set (SBCS) and DBCS data into the form that is supported on the requested platform.

DBCS character representation differs between operating systems. Specifically, a mainframe represents data in 8-bit EBCDIC code and a PC represents data in 7-bit ASCII code. In the mainframe environment, a file can contain DBCS characters exclusively or a mix of DBCS and SBCS characters. In a mixed file, special character indicators distinguish SBCS characters from DBCS characters. These indicators are shift-out (SO) and shift-in (SI), or x'0E' and x'0F' respectively on IBM mainframes. Shift-out switches from SBCS to DBCS mode, and shift-in switches from DBCS back to SBCS mode. SO/SI pairs are not required if a file contains DBCS data exclusively. On the PC, the SO/SI characters are not recognized. In this environment, a DBCS character is indicated by setting the high-order bit of the ASCII code. See the table in RULES for the correct mapping of DBCS characters by language.
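As an illustration of the SO/SI convention described above, the following is a minimal Python sketch that splits a mixed mainframe byte string into SBCS and DBCS runs. It is not how Connect:Direct implements translation; the function names `split_mixed_ebcdic` and `has_high_order_bit` are hypothetical, and the sketch assumes the data starts in SBCS mode. The high-order-bit check for the PC side is only a first approximation, because the valid byte ranges depend on the language table in RULES.

```python
from typing import List, Tuple

SO = 0x0E  # shift-out: switch from SBCS to DBCS mode
SI = 0x0F  # shift-in: switch from DBCS back to SBCS mode


def split_mixed_ebcdic(data: bytes) -> List[Tuple[str, bytes]]:
    """Split mixed EBCDIC data into (mode, bytes) runs, dropping SO/SI."""
    runs: List[Tuple[str, bytes]] = []
    mode = "SBCS"  # assumption: data starts in SBCS mode
    buf = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if mode == "SBCS":
            if b == SO:
                if buf:
                    runs.append((mode, bytes(buf)))
                    buf.clear()
                mode = "DBCS"
            else:
                buf.append(b)
            i += 1
        elif b == SI:
            if buf:
                runs.append((mode, bytes(buf)))
                buf.clear()
            mode = "SBCS"
            i += 1
        else:
            # In DBCS mode every character occupies exactly two bytes.
            if i + 1 >= len(data):
                raise ValueError("truncated DBCS character at end of data")
            buf += data[i:i + 2]
            i += 2
    if buf:
        runs.append((mode, bytes(buf)))
    return runs


def has_high_order_bit(byte: int) -> bool:
    """PC side: True if the high-order bit (x'80') of the byte is set."""
    return bool(byte & 0x80)


# Two SBCS bytes, SO, two DBCS characters (byte values are placeholders), SI, one SBCS byte.
sample = b"\xC1\xC2" + bytes([SO]) + b"\x42\xC1\x42\xC2" + bytes([SI]) + b"\xC3"
for run_mode, run_bytes in split_mixed_ebcdic(sample):
    print(run_mode, run_bytes.hex())
```

A file that contains DBCS data exclusively carries no SO/SI pairs, so a splitter like this one would not apply to it; the caller would treat the whole file as two-byte characters.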

Note: A DBCS table can be extremely large and complex. Use the sample tables in this documentation as a reference only. They do not successfully translate all characters.