Converting prefixes and suffixes
When a single token can be composed of two distinct entities, a CONVERT-type action can be used to separate and standardize both parts.
An example of this is in German addresses, where the suffix STRASSE can be concatenated onto the proper name of the street, such as HESSESTRASSE.
If a list of American addresses has a significant error rate, you might need to check for occurrences of dropped spaces such as in MAINSTREET. To handle cases such as these, you can use the CONVERT_P or CONVERT_PL action to examine the token for a prefix and CONVERT_S for a suffix.
Like CONVERT_P, the CONVERT_PL action examines the token for a prefix. However, CONVERT_P takes the first prefix that matches a value in the lookup table and CONVERT_PL takes the longest prefix that matches.
For example, assume that a lookup table contains entries for NORTH and NORTHWEST. For the token NORTHWESTPOINT, the CONVERT_P action takes the prefix NORTH and the CONVERT_PL action takes the prefix NORTHWEST.
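The difference between the two selection rules can be sketched in Python. This is an illustration only, not the product's implementation: the function names `first_prefix` and `longest_prefix` are hypothetical, and the sketch assumes that "first" means the first matching entry in table order.

```python
def first_prefix(token, table):
    """Return the first table entry (in table order) that the token starts with.

    Assumption: "first" is interpreted as table order, mimicking CONVERT_P.
    """
    for entry in table:
        if token.startswith(entry):
            return entry
    return None


def longest_prefix(token, table):
    """Return the longest table entry that the token starts with, mimicking CONVERT_PL."""
    matches = [entry for entry in table if token.startswith(entry)]
    return max(matches, key=len) if matches else None


# Hypothetical lookup table holding the two entries from the example above.
TABLE = ["NORTH", "NORTHWEST"]

print(first_prefix("NORTHWESTPOINT", TABLE))    # NORTH
print(longest_prefix("NORTHWESTPOINT", TABLE))  # NORTHWEST
```

Both functions return `None` when no table entry is a prefix of the token.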
CONVERT_P, CONVERT_PL, and CONVERT_S use almost the same syntax as CONVERT, with two differences. First, you must use a lookup table with these actions. Second, these actions accept an optional fifth argument (retype2).
CONVERT_P source @table_name TKN | TEMP retype1 retype2
CONVERT_PL source @table_name TKN | TEMP retype1 retype2
CONVERT_S source @table_name TKN | TEMP retype1 retype2
Argument | Description |
---|---|
source | Can be either an operand, a dictionary field, or a user variable. |
retype1 | The token class that you want assigned to the prefix (with a CONVERT_P or CONVERT_PL action) or to the suffix (with a CONVERT_S action). This argument is optional. |
retype2 | The token class that you want assigned to the remainder of the token after the conversion. This argument applies only if the source is an operand. |
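
As a rough illustration of how the two retype arguments relate to the two parts of a split token, the following Python sketch models a suffix conversion. It is not the product's implementation. The function `split_suffix`, the table contents, the standardized values `STR` and `ST`, and the class labels `T` and `?` are all hypothetical placeholders, and choosing the longest matching suffix is an assumption.

```python
def split_suffix(token, table, retype1=None, retype2=None):
    """Split a token into (remainder, suffix) parts, loosely modeling CONVERT_S.

    table   -- hypothetical mapping of suffix -> standardized value
    retype1 -- class to attach to the suffix part
    retype2 -- class to attach to the remainder
    Returns a list of (text, token_class) pairs, or None if nothing matches.
    """
    # Only accept suffixes that leave a non-empty remainder.
    matches = [s for s in table if token.endswith(s) and len(token) > len(s)]
    if not matches:
        return None
    suffix = max(matches, key=len)  # assumption: prefer the longest match
    remainder = token[:-len(suffix)]
    return [(remainder, retype2), (table[suffix], retype1)]


# Hypothetical suffix table and class labels, for illustration only.
SUFFIXES = {"STRASSE": "STR", "STREET": "ST"}

print(split_suffix("HESSESTRASSE", SUFFIXES, retype1="T", retype2="?"))
# [('HESSE', '?'), ('STR', 'T')]
print(split_suffix("MAINSTREET", SUFFIXES, retype1="T", retype2="?"))
# [('MAIN', '?'), ('ST', 'T')]
```

A token that consists only of the suffix, such as STREET, is left unsplit in this sketch because no remainder would be left to retype.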