APAR status
Closed as program error.
Error description
Users should make sure that the top-level OSH options suit their client character set and that NLS_LANG is set accordingly. If they set top-level OSH options but leave NLS_LANG unset, or set it inappropriately, problems are likely. NLS_LANG is all Oracle knows about: Oracle uses the client character set specified by NLS_LANG and assumes certain defaults if NLS_LANG is not specified. If the default character set assumed by Oracle does not match the one the user set in OSH, the data will be mishandled.

Currently the stage renders metadata as ustring whenever the database character set is multi byte. The fix to the stage code is to take the client character set into account when deciding the schema (see the table below). Thus, if the user has no NLS_LANG, or one with a single byte character set, the stage no longer needs to load data in Unicode. Data remains in the client character set and is handled by Oracle as appropriate.

Consider the test case where the input data is already in UTF-8 but the OSH top-level option does not say so. Without the fix, the stage tries to convert the data to Unicode for loading, and the string input may fail to convert to UTF-8, because the exporter doesn't know the data is already in UTF-8 and renders bad output. The result can be more bytes than the column length allows, and hence an error. If the data is in some string character set and the top-level OSH options are properly set, the exporter should convert it correctly and no error should occur. In the customer's case the data is corrupt, the exporter output is further garbled, and the load fails with an error.

The fix relates the client and database character sets to the metadata as follows:

Database character set  Client character set  Schema to use  Explanation
Single byte             Single byte           string         No need to use Unicode.
Single byte             Multi byte            string         Oracle gives single bytes. We can send Unicode though.
Multi byte              Single byte           string         No need to use Unicode.
Multi byte              Multi byte            ustring        Read/write of multi byte needs ustring.
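The UTF-8 test case described above can be illustrated outside the stage. The following is a minimal sketch, not the exporter's actual code: the sample value and the choice of ISO-8859-1 as the assumed single byte character set are hypothetical, picked only to show how converting already-encoded UTF-8 data a second time makes it longer.

```python
# Hypothetical sample value; any non-ASCII text shows the same effect.
original = "café"

# The input already holds UTF-8 bytes (5 bytes for this value).
utf8_bytes = original.encode("utf-8")

# With no character set declared, the bytes are read as if they were in a
# single byte character set. ISO-8859-1 is an assumption for illustration.
misread = utf8_bytes.decode("iso-8859-1")

# Converting the misread text to UTF-8 encodes each byte of "é" again,
# so the value grows and no longer matches the original bytes.
double_encoded = misread.encode("utf-8")

print(len(utf8_bytes), len(double_encoded))
```

In this sketch the value grows from 5 to 7 bytes, so it would no longer fit a column sized for the original 5 bytes.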
Of particular interest is that Unicode is not needed if the client is using a single byte character set. This is the central point of the fix. It does not change anything for users who already handle NLS_LANG correctly.
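The mapping in the error description can be written as a small lookup. This is only a sketch of the table, assuming the labels "single" and "multi" for the two kinds of character set; the names SCHEMA_BY_CHARSETS and schema_for are hypothetical and do not come from the stage code.

```python
# Keys are (database character set kind, client character set kind).
# Values are the schema type given in the table of the error description.
SCHEMA_BY_CHARSETS = {
    ("single", "single"): "string",
    ("single", "multi"): "string",
    ("multi", "single"): "string",
    ("multi", "multi"): "ustring",
}


def schema_for(db_kind, client_kind):
    """Return the schema type for a pair of character set kinds."""
    return SCHEMA_BY_CHARSETS[(db_kind, client_kind)]


# A single byte client never leads to ustring, whatever the database uses.
for db_kind in ("single", "multi"):
    print(db_kind, schema_for(db_kind, "single"))
```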
Local fix
This fix is included in 8.0.1 Fix Pack 3.
Problem summary
****************************************************************
USERS AFFECTED: All users, particularly those not needing Unicode strings.
****************************************************************
PROBLEM DESCRIPTION: There is no need to use Unicode for CHAR and VARCHAR2 columns if the client is using a single byte character set. The metadata should be decided solely by whether the client's character set is a multibyte one or not. The server's character set, whether single byte or multibyte, does not play a role here.
****************************************************************
RECOMMENDATION: Apply patch. This change is included in 8.1 Fix Pack 1.
****************************************************************
Problem conclusion
The stage now uses ustring for character schema only where characters are potentially multibyte. This means that NCHAR and NVARCHAR2 columns use the ustring schema type, while CHAR and VARCHAR2 columns use ustring only if the client character set is a multibyte one.
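The rule in this conclusion can be sketched as a function of the column type and the NLS_LANG value. This is an illustration under stated assumptions, not the stage's implementation: the function names are hypothetical, the list of multibyte character sets is a small non-exhaustive sample, and an unset NLS_LANG is treated as single byte, following the error description. NLS_LANG is assumed to have the usual LANGUAGE_TERRITORY.CHARSET form.

```python
import os

# Illustrative, non-exhaustive sample of multibyte Oracle character sets.
MULTIBYTE_CHARSETS = {"AL32UTF8", "UTF8", "JA16SJIS", "ZHS16GBK"}


def client_charset(nls_lang):
    """Return the character set part of NLS_LANG, or None if absent."""
    if not nls_lang or "." not in nls_lang:
        return None
    return nls_lang.rsplit(".", 1)[1].upper() or None


def client_is_multibyte(nls_lang):
    """Treat an unset or unrecognised character set as single byte."""
    return client_charset(nls_lang) in MULTIBYTE_CHARSETS


def schema_type(column_type, nls_lang):
    """Pick string or ustring for an Oracle character column."""
    column_type = column_type.upper()
    if column_type in ("NCHAR", "NVARCHAR2"):
        return "ustring"
    if column_type in ("CHAR", "VARCHAR2"):
        return "ustring" if client_is_multibyte(nls_lang) else "string"
    raise ValueError("not a character column type: " + column_type)


# Usage: decide the schema for a VARCHAR2 column in the current environment.
print(schema_type("VARCHAR2", os.environ.get("NLS_LANG")))
```

A real implementation would need to classify every character set that Oracle supports rather than rely on a short hard-coded list.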
Temporary fix
Comments
APAR Information
APAR number
JR32541
Reported component name
WIS DATASTAGE
Reported component ID
5724Q36DS
Reported release
753
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt
Submitted date
2009-03-30
Closed date
2009-05-18
Last modified date
2010-12-09
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Fixed component name
WIS DATASTAGE
Fixed component ID
5724Q36DS
Applicable component levels
R753 PSN
UP
Document Information
Modified date:
09 December 2010