Before upgrading from Rational® Synergy 7.0
or 7.1, install and run the Illegal Character Detection tool on your
database. This tool finds characters that might be modified incorrectly
during the conversion to UTF-8.
Before you begin
Verify that your current version of the software requires
the completion of this task.
Current Rational® Synergy version | Conversion required
7.0 | Yes
7.1 | Yes
7.1a | No
Rational Synergy 7.1a
already uses UTF-8 encoding, so no encoding conversion is performed;
users upgrading from 7.1a can skip this section.
About this task
When a
Rational Synergy database
is upgraded from 7.0 or 7.1 to Rational Synergy
7.2 or later, all text metadata (type and object names, string and
text attribute values, and similar items) in that database is converted
from the Windows CP1252 encoding
used in previous releases to the UTF-8 encoding used in
7.2 or later.
No
changes are made to the contents of files controlled in the Rational Synergy database;
only the text metadata stored in Informix® or
Oracle is re-encoded.
In releases 7.0 and 7.1, Rational Synergy expects
text data to be encoded in CP1252 (or its Latin-1 subset). However,
it is possible that some characters might have been entered in other
encodings, perhaps using the Classic clients where encoding was not
checked.
The following hexadecimal byte values are undefined
in CP1252: 0x81, 0x8D, 0x8F, 0x90, and 0x9D.
If any of these byte values are encountered in text metadata
during database upgrade, they are converted into the sequence “\x”
followed by the hexadecimal value as a string. For example, if the
byte value 0x81 is encountered during upgrade, it is converted to
the string “\x81”. Each such byte value encountered during upgrade
is noted in the upgrade log, and a list of such occurrences is stored
in the database for later retrieval.
Illegal byte value | Converted string literal
0x81 | “\x81”
0x8D | “\x8D”
0x8F | “\x8F”
0x90 | “\x90”
0x9D | “\x9D”
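The conversion rule described above can be illustrated with a minimal Python sketch. This is not the code used by the upgrade; the function name convert_metadata is hypothetical, and the sketch assumes only what the text states: each undefined CP1252 byte becomes the literal string \xNN, and every other byte is read as CP1252 and written as UTF-8.

```python
# Illustrative sketch only; not the code used by the Rational Synergy upgrade.
# Byte values that CP1252 leaves undefined (taken from the table above).
CP1252_UNDEFINED = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def convert_metadata(raw: bytes) -> bytes:
    """Re-encode CP1252 bytes as UTF-8, writing undefined bytes as \\xNN text."""
    parts = []
    for byte in raw:
        if byte in CP1252_UNDEFINED:
            # Undefined byte: emit a backslash, 'x', and two hex digits.
            parts.append("\\x%02X" % byte)
        else:
            # Defined byte: decode it with the CP1252 mapping.
            parts.append(bytes([byte]).decode("cp1252"))
    return "".join(parts).encode("utf-8")


print(convert_metadata(b"caf\xe9 \x81"))
```

In this example the byte 0xE9 (é in CP1252) becomes a two-byte UTF-8 sequence, while the undefined byte 0x81 becomes the four ASCII characters \x81.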
By default, this tool scans for and reports any
of the byte values noted in the table. If users of a database have
run the Classic clients in encodings other than Windows CP1252 or ISO Latin-1, you must identify
the code points that differ between CP1252 and your encoding. Those
code points must also be scanned for, as instructed in step
6 below. For instance, if your
database contains data in Latin-2 (ISO-8859-2), the following table
lists some code points that differ between CP1252 and ISO-8859-2.
You must include these code points in the detection scan
because such data is not converted to the correct UTF-8 values during
upgrade to 7.2 or later.
Table 1. Code points that differ between CP1252 and ISO-8859-2
Code point | CP1252 | ISO-8859-2
0xB1 | ± | ą
0xB3 | ³ | ł
0xB6 | ¶ | ś
0xC0 | À | Ŕ
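One way to find the differing code points for a given encoding is to compare how each byte decodes under both mappings. The following Python sketch does this with the standard library codecs; the helper names differing_bytes and as_hex_argument are hypothetical, and the sketch assumes that Python's cp1252 and iso8859_2 codecs match the mappings your clients used.

```python
# Illustrative sketch only; helper names are hypothetical.
def differing_bytes(other_encoding, reference="cp1252"):
    """Return byte values 0x80-0xFF that decode differently in the two encodings."""
    diffs = []
    for value in range(0x80, 0x100):
        raw = bytes([value])
        try:
            ref_char = raw.decode(reference)
        except UnicodeDecodeError:
            # Undefined in CP1252; the tool already reports these by default.
            continue
        try:
            other_char = raw.decode(other_encoding)
        except UnicodeDecodeError:
            other_char = None
        if other_char != ref_char:
            diffs.append(value)
    return diffs


def as_hex_argument(values):
    """Join byte values into one hex string with no spaces, for example B1B3B6C0."""
    return "".join("%02X" % v for v in values)


latin2_diffs = differing_bytes("iso8859_2")
print(as_hex_argument(latin2_diffs))
```

The full list is longer than the four entries in Table 1, because it contains every byte whose mapping differs, including the 0x80 to 0x9F range that ISO-8859-2 reserves for control characters. Which of these values are worth scanning for depends on the data your users actually entered.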
Procedure
Before upgrading to Rational Synergy
7.2 or later, use the Illegal Character Detection tool:
- Download the detection library db_illegal.a.
Download the library from https://www.ibm.com/support/docview.wss?uid=swg27021595.
- In Rational Synergy
7.0 or 7.1, start a Classic CLI session.
- Run the following command to load the library:
ccm load -a detection_library_location
For example:
ccm load -a /tmp/db_illegal.a
- Define the command for the Illegal Character Detection
tool:
ccm define detection db_illegal_detection cmd
- Run the Illegal Character Detection tool to begin scanning:
ccm detection html_output_file_location
For example:
ccm detection /tmp/database1.html
- Optional: You
can add your own set of illegal byte values by adding the -a or -additional option
followed by one or more hexadecimal values.
Note: Do not
include spaces between multiple hexadecimal values.
For example, to add the four Latin-2 code points from Table 1, use the command:
ccm detection -a B1B3B6C0 /tmp/database1_scan.html
This command detects the five illegal CP1252 characters and
the listed Latin-2 characters.
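Because the additional byte values must be written as one unbroken hexadecimal string, it can help to validate the value before running the scan. The Python sketch below only builds the argument list for the scan command shown in the steps above; it does not run anything, and the command still has to be issued from a Classic CLI session after the load and define steps. The function name detection_command is hypothetical, and the sketch accepts only uppercase hexadecimal digits because that is the only form shown in the example.

```python
import re


def detection_command(output_file, extra_hex=""):
    """Build the argument list for 'ccm detection' as shown in the steps above."""
    args = ["ccm", "detection"]
    if extra_hex:
        # The value must be pairs of hex digits with no spaces between them.
        if not re.fullmatch(r"(?:[0-9A-F]{2})+", extra_hex):
            raise ValueError(
                "expected hex byte pairs without spaces: %r" % extra_hex
            )
        args += ["-a", extra_hex]
    args.append(output_file)
    return args


print(" ".join(detection_command("/tmp/database1_scan.html", "B1B3B6C0")))
```

A value such as "B1 B3" is rejected with a ValueError, which catches the spacing mistake described in the note before a long scan is started.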
Results
The time taken for the scan depends on the size of your database
and speed of your system. It is not unusual for the scan to take
several hours on a large database. If possible, run the scan when
the database is less busy. The scan is read only and does not require
shutting down the server or protecting the database.
The output
of the Illegal Character Detection tool is in HTML and can be viewed
in a browser. Each object containing illegal data, as defined by
the options passed to the command, is listed in a header block. Click the header block to
view the attributes containing illegal values. The illegal characters
are highlighted in red. Undefined CP1252 characters show as their
hexadecimal value surrounded by angle brackets. For example, if the
character hexadecimal value 81 is encountered, the report shows “<81>”
in a large red font.
What to do next
After running the Illegal Character Detection tool, review
the report to see if your database contains illegal text data. If
it does, inspect the objects and their attribute values in an appropriate Rational Synergy or
Change interface. You might decide to remove or correct the data
manually, write a script to fix a repeating error, or leave the data
without making further edits.
During database upgrade to Rational Synergy 7.2 or later,
any remaining byte values that are not legal CP1252 characters are
converted to the “\xNN” sequence described earlier. All other text
data is assumed to be in the CP1252 encoding and is converted from
that encoding to UTF-8.