IT17612: MQ-JMS: An unexpected byte-order-mark character is visible in messages decoded from CCSID 17584

APAR status

Closed as program error.

Error description

A WebSphere MQ classes for JMS V7.5.0.5 application consumes a
message from a queue, which had been generated and put to the
queue by a Siebel application.  The message is put to the queue
with the following declared character encoding configuration:

  MQMD Format:  MQSTR  (MQFMT_STRING)
  MQMD CodedCharSetId: 17584
  MQMD Encoding: 546 (0x222)

The body of the message consists of XML character data.  When
the message is consumed by the receiving application, and its
character content is passed to an XML parser, the XML parser
throws a parsing error.

Previously, the application had been using the WebSphere MQ
classes for JMS V7.0.1.11 where the problem was not seen, and
the XML parser was able to process the message successfully.

Examining the byte sequence at the start of the message body on
the queue before being consumed by the JMS application, the
bytes of the message body were of the form:


  0 1 2 3  4 5 6 7  8 9 A B  C D E F
  fffe3c00 3f007800 6d006c00 20007600 : ..<.?.x.m.l. .v.
  65007200 73006900 6f006e00 3d002200 : e.r.s.i.o.n.=.".
  31002e00 30002200 20006500 6e006300 : 1...0.". .e.n.c.
  6f006400 69006e00 67003d00 22005500 : o.d.i.n.g.=.".U.

Local fix

Configure the message producing application to generate a
message body which is encoded in an alternative character
encoding scheme, such as:

  UTF-8  (CCSID 1208)

Problem summary

****************************************************************
USERS AFFECTED:
This issue affects users of the IBM MQ classes for JMS who have
applications that are consuming messages where the message body
is declared to be of type MQSTR, with the character encoding
declaration:

  CCSID:  1200, 13488, 17584
  Encoding: 546 (0x222)

where the message body contains a byte-order-mark at the start
of the data.


Platforms affected:
MultiPlatform

****************************************************************
PROBLEM DESCRIPTION:
With the code change associated with MQ APAR IV40180:

http://www.ibm.com/support/docview.wss?uid=swg1IV40180

when an IBM MQ classes for JMS application consumes a message
which is declared to be character encoded using CCSID 1200,
13488 or 17584, and the Encoding field is declared to have
little-endian integer encoding (0x222), the IBM MQ classes for
JMS map this character encoding scheme to the Java Charset
named:

    UTF-16LE
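
The mapping described above can be sketched as follows. This is an
illustration only and not the IBM MQ classes for JMS source code:
the class and method names are hypothetical, and the two constants
simply mirror the values of the MQENC_INTEGER_MASK and
MQENC_INTEGER_REVERSED definitions. Combinations not described in
this APAR are deliberately left unmapped.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Illustrative sketch only; not the IBM MQ classes for JMS implementation. */
final class Utf16MappingSketch {
    // Local copies of the MQENC_* integer sub-field values (assumed names).
    static final int MQENC_INTEGER_MASK = 0x0000000F;
    static final int MQENC_INTEGER_REVERSED = 0x00000002;

    /** True for the three CCSIDs named in this APAR. */
    static boolean isUtf16Ccsid(int ccsid) {
        return ccsid == 1200 || ccsid == 13488 || ccsid == 17584;
    }

    /** True when the integer sub-field of Encoding is byte-reversed (little-endian). */
    static boolean isLittleEndian(int encoding) {
        return (encoding & MQENC_INTEGER_MASK) == MQENC_INTEGER_REVERSED;
    }

    /**
     * Hypothetical helper: returns the Charset this APAR says was selected
     * after IV40180, or empty for combinations the APAR does not describe.
     */
    static Optional<Charset> charsetAfterIV40180(int ccsid, int encoding) {
        if (isUtf16Ccsid(ccsid) && isLittleEndian(encoding)) {
            return Optional.of(StandardCharsets.UTF_16LE);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        int encoding = 0x222; // decimal 546
        System.out.println("little-endian: " + isLittleEndian(encoding));
        System.out.println("charset: " + charsetAfterIV40180(17584, encoding));
    }
}
```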

This differs from the currently defined IBM global standards
(external to IBM MQ), where these CCSID values are all declared
to be big-endian encoded, as per the IBM Globalization
documentation:

1200:
https://www.ibm.com/software/globalization/ccsid/ccsid1200.html
Name: "UTF-16 BE with IBM PUA"
      "Data is big endian order"

13488:
https://www.ibm.com/software/globalization/ccsid/ccsid13488.html
Name: "Unicode 2.0, UTF-16 BE with IBM PUA"
      "Data is big endian order"

17584:
https://www.ibm.com/software/globalization/ccsid/ccsid17584.html
Name: "Unicode 3.0, UTF-16 BE with IBM PUA"
      "Data is big endian order"


It was observed that when viewing the message on the queue prior
to consumption by the JMS application, the bytes of this
particular message's body also started with a byte-order-mark:

    '0xFF 0xFE'

The Java Charset 'UTF-16LE' does not treat a leading
byte-order-mark as a byte-order signature. As a result, this
message's byte-order-mark was decoded as a character (U+FEFF) at
the start of the message document, and was included in the
"java.lang.String" object returned to the application by the JMS
method call:

  javax.jms.TextMessage.getText()

This in turn resulted in the application's XML parser failing to
correctly parse the XML document.
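
The decoding behaviour can be reproduced outside of IBM MQ with the
standard Java Charset alone. The following is a minimal sketch that
decodes the first twelve bytes of the message body shown in the
error description; the class and method names are hypothetical and
no IBM MQ classes are involved.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative sketch; uses only the standard JDK UTF-16LE Charset. */
final class BomDecodeDemo {
    // First 12 bytes of the message body from the hex dump: FF FE, then "<?xml".
    static final byte[] BODY_START = {
        (byte) 0xFF, (byte) 0xFE, 0x3C, 0x00, 0x3F, 0x00,
        0x78, 0x00, 0x6D, 0x00, 0x6C, 0x00
    };

    /** Decodes the bytes as UTF-16LE, which keeps a leading FF FE as U+FEFF. */
    static String decodeUtf16Le(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        String text = decodeUtf16Le(BODY_START);
        // The first char is the byte-order-mark, not the '<' an XML parser expects.
        System.out.printf("first char: U+%04X%n", (int) text.charAt(0));
        System.out.println("length: " + text.length());
    }
}
```

The decoded string is six characters long rather than five, because
the byte-order-mark precedes the "<?xml" text.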


Prior to the MQ APAR IV40180, a message's character data
declared to be encoded using CCSID 17584 (with message Encoding
value 0x222) would be decoded using the Java Charset name:

  'UnicodeLittle'

which permitted the byte-order-mark to be present in the bytes
of the message body.  The issue with this Java Charset is that
there was no mapping present in the IBM MQ classes for Java/JMS
to map it back to an IBM CCSID, which meant that while messages
could be received and decoded using this Java Charset, those
same messages could then not be sent back to the queue manager
using the IBM MQ classes for JMS API.

The code change associated with APAR IV40180 was included in the
MQ versions:

  7.0.1.12
  7.1.0.4
  7.5.0.3

resulting in the observed change of behaviour when moving from
any IBM MQ classes for JMS version prior to the above fix pack
levels.

By mapping CCSID 17584 to "UTF-16LE" as APAR IV40180 did, a
byte-order-mark present in the message on the queue would be
decoded as a character in the Java String object, which is
incorrect.  It should be noted, however, that CCSID 17584 is
currently officially declared as always being big-endian ordered
without a byte-order-mark.


In this same scenario, when the JVM system property was defined:

  -Dcom.ibm.mq.cfg.CCSID.MapUtf16ByteOrderByCCSID=YES

then all the byte ordering was reversed, resulting in corrupted
character data as the IBM MQ classes for JMS mapped CCSID 17584
to the encoding scheme CCSID 1200, resulting in the use of the
big-endian Java Charset "UTF-16".
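
The effect of reading little-endian UTF-16 data with the opposite
byte order can be illustrated with the standard JDK alone. This
sketch does not reproduce the behaviour of the IBM MQ classes for
JMS: it explicitly uses the JDK "UTF-16BE" Charset, and the class
and method names are hypothetical. The input is the "<?x" text from
the hex dump, without the byte-order-mark.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative sketch of byte-order reversal; no IBM MQ classes involved. */
final class ReversedByteOrderDemo {
    // Little-endian UTF-16 bytes for "<?x" (no byte-order-mark).
    static final byte[] LITTLE_ENDIAN_TEXT = {0x3C, 0x00, 0x3F, 0x00, 0x78, 0x00};

    /** Decodes with the wrong (big-endian) byte order. */
    static String decodeAsBigEndian(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16BE);
    }

    /** Decodes with the matching (little-endian) byte order. */
    static String decodeAsLittleEndian(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        // Each pair of bytes is swapped, so '<' (U+003C) becomes U+3C00, and so on.
        for (char c : decodeAsBigEndian(LITTLE_ENDIAN_TEXT).toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
        System.out.println(decodeAsLittleEndian(LITTLE_ENDIAN_TEXT));
    }
}
```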

Problem conclusion

The default encoding mapping for CCSIDs:

  1200
  13488
  17584

where the message's integer "Encoding" value is defined to use
little-endian encoding (0x222), has now been mapped to the Java
Charset named:

  x-UTF16LE-BOM

In addition, due to the complexity of the use of CCSID
1200/13488/17584 with IBM MQ, a new property has been defined
which controls which Java Charset is used to decode the bytes of
the message on the queue, irrespective of the integer encoding
value specified on the message.

This property has the name:

    com.ibm.mq.cfg.CCSID.MapCcsid1200ToSpecificCharset

and can be set as a JVM argument.

For example, if the IBM MQ classes for JMS are to be configured
to interpret a CCSID 1200 message's bytes using the Java
'UnicodeLittle' encoding, you would use the command line JVM
argument:


  -Dcom.ibm.mq.cfg.CCSID.MapCcsid1200ToSpecificCharset=UnicodeLittle

Note that this property has no effect when sending a message
from the IBM MQ classes for JMS back to MQ, so care is needed
when using it.  If you use this property, and specify a Java
Charset name which your running JVM recognises but is not one
which the IBM MQ classes for JMS recognise, your application
will be able to receive the message, but not send it back to MQ,
as the IBM MQ classes for JMS will not be able to map the
message's declared "JMS_IBM_Character_Set" property back into a
CCSID value.
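
Before relying on the property, it may help to confirm that the
running JVM recognises the Charset name supplied. The following is
a minimal sketch using only the standard JDK; the class and method
names are hypothetical. It cannot tell whether the IBM MQ classes
for JMS are able to map the name back to a CCSID, so the caution
above about sending messages still applies.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

/** Illustrative check of a Charset name against the running JVM only. */
final class CharsetOverrideCheck {
    // Property name as given in this APAR.
    static final String PROPERTY =
        "com.ibm.mq.cfg.CCSID.MapCcsid1200ToSpecificCharset";

    /** Returns true if this JVM can supply a Charset for the given name. */
    static boolean jvmRecognises(String name) {
        if (name == null) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a Charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        String configured = System.getProperty(PROPERTY);
        if (configured == null) {
            System.out.println(PROPERTY + " is not set");
        } else {
            System.out.println(configured + " recognised by this JVM: "
                + jvmRecognises(configured));
        }
    }
}
```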

---------------------------------------------------------------
The fix is targeted for delivery in the following PTFs:

Version    Maintenance Level
v7.5       7.5.0.9
v8.0       8.0.0.9
v9.0 CD    9.0.5
v9.0 LTS   9.0.0.4

The latest available maintenance can be obtained from
'WebSphere MQ Recommended Fixes'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006037

If the maintenance level is not yet available information on
its planned availability can be found in 'WebSphere MQ
Planned Maintenance Release Dates'
http://www-1.ibm.com/support/docview.wss?rs=171&uid=swg27006309
---------------------------------------------------------------

Temporary fix

Comments

APAR Information

APAR number
IT17612
Reported component name
WMQ BASE MULTIP
Reported component ID
5724H7241
Reported release
750
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2016-10-23
Closed date
2018-02-13
Last modified date
2018-02-13

APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
WMQ BASE MULTIP
Fixed component ID
5724H7241

Applicable component levels

Document Information

Modified date:
31 March 2023
