IBM-943 and IBM-932

Each of the Japanese IBM® PC code sets is an encoding consisting of single-byte and multibyte coded characters. The encoding is based on the IBM PC code set and places the JIS characters in shifted positions, which is why it is referred to as Shift-JIS or SJIS.

IBM-943 is a newer code set for the Japanese locale than IBM-932. It is compatible with the code set used in the Japanese Microsoft Windows environment and is known as 1983 ordered Shift-JIS. The differences between IBM-932 and IBM-943 are as follows:

  • IBM-932 uses the previous JIS sequence (1978 ordered), while IBM-943 uses the newer JIS sequence (1983 ordered).
  • NEC selected characters are added to IBM-943.
  • NEC's IBM selected characters are added to IBM-943.

The IBM-932 code set consists of the following character sets:

Character set    Description
JISCII           JISX0201 Graphic Left character set
JISX0201.1976    Katakana/Hiragana Graphic Right character set
JISX0208.1983    Kanji level 1 and 2 character sets
IBM-udcJP        IBM user-definable characters

The IBM-943 code set consists of the following character sets:

Character set    Description
JISCII           JISX0201 Graphic Left character set
JISX0201.1976    Katakana/Hiragana Graphic Right character set
JISX0208.1990    Kanji level 1 and 2 character sets
IBM-udcJP        IBM user-definable characters, NEC's IBM selected characters, and NEC selected characters

The first byte of each character determines the number of bytes in that character. The values 0x20-0x7e and 0xa1-0xdf encode JISX0201 characters, with exceptions. The positions 0x81-0x9f and 0xe0-0xfc are reserved for use as the first byte of a multibyte character. The JISX0208 characters are mapped to the multibyte values starting at 0x8140. The second byte of a multibyte character can have any value in the ranges 0x40-0x7e and 0x80-0xfc. The following Shift-JIS table shows where these characters are located in the code set.

Character Encoding    Code Point         Description          Count
000xxxxx              00–1F              Controls             32
00100000              20                 Space                1
0xxxxxxx              21–7E              7-bit ASCII          94
01111111              7F                 Delete               1
10000000              80                 Undefined            1
100xxxxx 01xxxxxx     [81–9F] [40–7E]    Double byte          1953
100xxxxx 1xxxxxxx     [81–9F] [80–FC]    Double byte          3875
10100000              A0                 Undefined            1
1xxxxxxx              A1–DF              8-bit single byte    63
111xxxxx 01xxxxxx     [E0–FC] [40–7E]    Double byte          1827
111xxxxx 1xxxxxxx     [E0–FC] [80–FC]    Double byte          3625
11111101              FD                 Undefined            1
11111110              FE                 Undefined            1
11111111              FF                 Undefined            1
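
The first-byte rule can be sketched in Python as follows. This is a minimal illustration built only from the ranges in the Shift-JIS table, not an IBM interface: the names `char_length`, `is_trail_byte`, and `split_chars` are hypothetical, and the trail-byte check assumes the second-byte ranges 40–7E and 80–FC listed in the table.

```python
def char_length(first_byte: int) -> int:
    """Return the byte count (1 or 2) implied by the first byte.

    Raises ValueError for the values the table marks as undefined.
    """
    if 0x81 <= first_byte <= 0x9F or 0xE0 <= first_byte <= 0xFC:
        return 2  # first byte of a double-byte character
    if first_byte in (0x80, 0xA0, 0xFD, 0xFE, 0xFF):
        raise ValueError(f"undefined first byte 0x{first_byte:02X}")
    return 1  # controls, space, ASCII, delete, or an A1-DF single byte


def is_trail_byte(b: int) -> bool:
    """Second-byte ranges taken from the table: 40-7E and 80-FC."""
    return 0x40 <= b <= 0x7E or 0x80 <= b <= 0xFC


def split_chars(data: bytes) -> list[bytes]:
    """Split a byte string into one bytes object per coded character."""
    chars = []
    i = 0
    while i < len(data):
        n = char_length(data[i])
        if n == 2 and (i + 1 >= len(data) or not is_trail_byte(data[i + 1])):
            raise ValueError(f"bad or missing second byte at offset {i + 1}")
        chars.append(data[i:i + n])
        i += n
    return chars


print(split_chars(b"A\x88\x9f\xb1"))  # [b'A', b'\x88\x9f', b'\xb1']

# Count column check: first bytes 81-9F paired with second bytes 40-7E.
pairs = sum(1 for lead in range(0x81, 0xA0) for trail in range(0x40, 0x7F))
print(pairs)  # 1953
```

Each double-byte count in the table is the number of first bytes multiplied by the number of second bytes in that row; for example, 31 first bytes (81–9F) times 63 second bytes (40–7E) gives 1953.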

The following table shows the DBCS portion of IBM-943.

Code Point                                                        Description
[81–84] [40–7E] and [81–84] [80–F0]                               JIS X 0208 (Non-Kanji)
[87] [40–7E] and [87] [80–F0]                                     NEC selected characters
[89–98] [40–7E] and [88] [9F–F0], [89–97] [80–F0], [98] [80–9F]   JIS X 0208 (Level-1 Kanji)
[99–9F] [40–7E] and [98] [9F–F0], [99–9F] [80–F0]                 JIS X 0208 (Level-2 Kanji)
[E0–EA] [40–7E] and [E0–EA] [80–F0]                               JIS X 0208 (Level-2 Kanji)
[ED–EE] [40–7E] and [ED–EE] [80–F0]                               NEC IBM selected characters
[F0–F9] [40–7E] and [F0–F9] [80–F0]                               User-defined characters
[FA] [40–5C]                                                      IBM selected characters (non-Kanji)
[FA] [5C–7E], [FB–FC] [40–7E] and [FA–FC] [80–F0]                 IBM selected characters (Kanji)
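
As an illustration of how the IBM-943 DBCS table can be read, the following Python sketch transcribes its rows as (first-byte range, second-byte range) rectangles and looks up the description for a two-byte code point. The names `IBM943_DBCS` and `describe_943` are hypothetical. Where printed ranges meet at a boundary value (for example second byte 5C under first byte FA), the sketch simply returns the first matching row, which is an arbitrary choice rather than something the table specifies.

```python
# Each entry: (description, [(lead_lo, lead_hi, trail_lo, trail_hi), ...]),
# transcribed from the IBM-943 DBCS table as printed.
IBM943_DBCS = [
    ("JIS X 0208 (Non-Kanji)",
     [(0x81, 0x84, 0x40, 0x7E), (0x81, 0x84, 0x80, 0xF0)]),
    ("NEC selected characters",
     [(0x87, 0x87, 0x40, 0x7E), (0x87, 0x87, 0x80, 0xF0)]),
    ("JIS X 0208 (Level-1 Kanji)",
     [(0x89, 0x98, 0x40, 0x7E), (0x88, 0x88, 0x9F, 0xF0),
      (0x89, 0x97, 0x80, 0xF0), (0x98, 0x98, 0x80, 0x9F)]),
    ("JIS X 0208 (Level-2 Kanji)",
     [(0x99, 0x9F, 0x40, 0x7E), (0x98, 0x98, 0x9F, 0xF0),
      (0x99, 0x9F, 0x80, 0xF0), (0xE0, 0xEA, 0x40, 0x7E),
      (0xE0, 0xEA, 0x80, 0xF0)]),
    ("NEC IBM selected characters",
     [(0xED, 0xEE, 0x40, 0x7E), (0xED, 0xEE, 0x80, 0xF0)]),
    ("User-defined characters",
     [(0xF0, 0xF9, 0x40, 0x7E), (0xF0, 0xF9, 0x80, 0xF0)]),
    ("IBM selected characters (non-Kanji)",
     [(0xFA, 0xFA, 0x40, 0x5C)]),
    ("IBM selected characters (Kanji)",
     [(0xFA, 0xFA, 0x5C, 0x7E), (0xFB, 0xFC, 0x40, 0x7E),
      (0xFA, 0xFC, 0x80, 0xF0)]),
]


def describe_943(code: int):
    """Return the table description for a two-byte code point, or None."""
    lead, trail = code >> 8, code & 0xFF
    for description, rectangles in IBM943_DBCS:
        for lead_lo, lead_hi, trail_lo, trail_hi in rectangles:
            if lead_lo <= lead <= lead_hi and trail_lo <= trail <= trail_hi:
                return description  # first matching row wins
    return None  # not covered by any row of the table


print(describe_943(0x8140))  # JIS X 0208 (Non-Kanji)
print(describe_943(0x889F))  # JIS X 0208 (Level-1 Kanji)
print(describe_943(0xED40))  # NEC IBM selected characters
```

A result of `None` only means the code point falls outside the rows listed in the table; it says nothing further about how a converter treats that value.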

The following table shows the DBCS portion of IBM-932.

Code Point                                          Description
[81–98] [40–7E] and [81–97] [80–FC], [98] [80–9F]   JIS X 0208 (Level-1 Kanji)
[99–9F] [40–7E] and [98] [9F–FC], [99–9F] [80–FC]   JIS X 0208 (Level-2 Kanji)
[E0–EF] [40–7E] and [E0–EF] [80–FC]                 JIS X 0208 (Level-2 Kanji)
[F0–F9] [40–7E] and [F0–F9] [80–FC]                 User-defined characters
[FA–FC] [40–7E] and [FA–FC] [80–FC]                 IBM selected characters
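
One rough way to compare the two DBCS tables is by first byte. The sketch below, with hypothetical names `lead_bytes`, `IBM943_LEADS`, and `IBM932_LEADS`, collects the first-byte ranges that each table lists and reports those that appear only in the IBM-932 table. It reflects the tables as printed here and is not a statement about the full conversion tables.

```python
def lead_bytes(ranges):
    """Expand inclusive (low, high) first-byte ranges into a set of ints."""
    return {b for low, high in ranges for b in range(low, high + 1)}


# First-byte ranges transcribed from the IBM-943 and IBM-932 DBCS tables.
IBM943_LEADS = lead_bytes([(0x81, 0x84), (0x87, 0x87), (0x88, 0x9F),
                           (0xE0, 0xEA), (0xED, 0xEE),
                           (0xF0, 0xF9), (0xFA, 0xFC)])
IBM932_LEADS = lead_bytes([(0x81, 0x9F), (0xE0, 0xEF),
                           (0xF0, 0xF9), (0xFA, 0xFC)])

only_932 = sorted(IBM932_LEADS - IBM943_LEADS)
print([f"{b:02X}" for b in only_932])  # ['85', '86', 'EB', 'EC', 'EF']
```

The same rows also assign different descriptions to some shared first bytes: for example, ED–EE is listed under JIS X 0208 (Level-2 Kanji) for IBM-932 but under NEC IBM selected characters for IBM-943.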