Data Privacy Transformation Library Functions

The data privacy transformation library functions allow you to mask personal data such as social security numbers, credit card numbers, and e-mail addresses. A data privacy license, separate from the Optim™ product license, is needed to use these functions.

You can generate transformed data that is valid and unique. When a transformation library function is called, all destination and source columns are available except those destination columns that are yet to be assigned a value by an exit or transformation library function. The parameters in a transformation library function are validated at run time.

TRANS SSN

Use the TRANS SSN function to generate a valid and unique U.S. Social Security Number (SSN). By default, TRANS SSN algorithmically generates a consistently altered destination SSN based on the source SSN. TRANS SSN can also generate a random SSN when the source data does not have an SSN value or when there is no need for transforming the source SSN in a consistent manner.

An SSN is made of 3 subfields. The first 3 digits (area) represent an area generally determined by the state in which the SSN is issued. The next 2 digits (group) define a group number corresponding to the area number. The last 4 digits (serial) are a sequential serial number. Regardless of the type of processing, default or random, TRANS SSN will generate an SSN with a group number appropriate to the area number.

The default processing method generates an SSN that includes the source area number as well as altered group and serial numbers based on the source SSN.

The random processing method generates an SSN that can include the source area number and uses a group number most recently issued by the Social Security Administration for the destination area number. Serial numbers begin with 0001 and are incremented by 1 for each additional SSN generated for the area number. When the serial number exceeds 9999, the serial number will be reset to 0001 and the group number preceding the number most recently issued for the area number will be used.

The syntax of TRANS SSN is:

TRANS SSN [‘[=flags] [sourcecol]']

flags
You can specify one or more case-insensitive processing option flags.
n
Generate a random SSN that is not based on a source value.
i
Ignore invalid source values and copy them to the destination. Invalid values include the following:
  • values less than 9 characters
  • values with an area number not used by the Social Security Administration
  • values containing non-numeric characters in CHAR or VARCHAR columns

For more information about invalid source values, see Data Types Allowed and Skipped Rows.

m
Use the maximum group of all SSN area values, including values from 773 through 899, and excluding invalid area numbers.
r
Generate a random SSN that includes the source area number.
v
Validate the source group number by comparing it with numbers used by the Social Security Administration.
-
The destination SSN should include dashes separating the fields (e.g., 123-45-6789). Requires a character-type destination column at least 11 characters long.
sourcecol
The source column name. If a source column name is not specified, the destination column name will be used. If a source column name is not specified and the destination column name does not match a column name in the source table, an error will occur during processing.

Data Types Allowed

The following source and destination data types are permitted:

BIGINT
32 bits of the value used for the source and destination.
CHAR
The length of data in the column must be from 9 to 256 characters.
DECIMAL
The precision of the column must be 9 and the scale 0.
INTEGER
No restrictions.
VARCHAR
The length of data in the column must be from 9 to 254 characters.

If a source or destination column does not adhere to these restrictions, an error will occur during processing.

Destination Processing Rules

The following rules apply to the destination SSN value, according to the destination data type or value:

CHAR
If the source value is 0, spaces, or a zero-length VARCHAR, the source value will be copied, unchanged, to the destination.

If a source value is 11 characters or more and includes embedded dashes (-), or if the ‘-' flag is specified, the destination value will include dashes if the destination column length is 11 characters or more.

DECIMAL, INTEGER
If the source value is 0, spaces, or a zero-length VARCHAR, the destinations value will be 0.
VARCHAR
If the source value is 0, spaces, or a zero-length VARCHAR, the source value will be copied, unchanged, to the destination.

If a source value is 11 characters or more and includes embedded dashes (-), or if the ‘-' flag is specified, the destination value will include dashes if the destination column length is 11 characters or more.

NULL
If the source value is NULL, the destination value will be NULL.

Skipped Rows

The following conditions may cause a source row to be skipped and not written to the destination:

  • The source value is NULL, and the destination column does not allow a NULL value.
  • The source column is CHAR or VARCHAR, and the source value is less than 9 characters, contains a non-numeric character (other than dashes between the 3 subfields), or is too large.
  • The source area number has not been used by the Social Security Administration.
  • The source group number has not been used with the area number by the Social Security Administration (only if the ‘v' flag has been specified).
  • The source serial number is 0000, or the SSN is a reserved value not issued by the Social Security Administration (e.g., 078-05-1120).
  • The source value cannot be converted to a format TRANS SSN supports.

Error Messages

The following error messages may be issued:

SSN01
Parm on Col ccccc ("ppp") is invalid

The parameter ppp, specified on destination column ccccc, is not valid. Check the processing option flags specified.

SSN02
Col ccccc not on source

The column name entered as the sourcecol parameter or the destination column name (if sourcecol was omitted) was not found on the source table.

SSN03
Source Col ccccc-aaa invalid

The format of the source column is not supported because the attribute aaa is invalid. The value of aaa will be Type, Length, Precision, or Scale.

SSN04
Dest Col ccccc-aaa invalid

The format of the destination column is not supported because the attribute aaa is invalid. The value of aaa will be Type, Length, Precision, or Scale.

SSN05
Get col ccccc data-rc=nnn

A return code of nnn was returned when TRANS SSN tried to get the value of source column ccccc.

SSN08
Put col ccccc data-rc=nnn

A return code of nnn was returned when TRANS SSN tried to set the value of destination column ccccc.

If any other errors occur, contact Optim Technical Support.

Example 1

The following example uses a source column name that matches the destination column and generates a random SSN that is not based on the source value:

TRANS SSN ‘=n'

Example 2

The following example uses a source column name (SOCIAL) that differs from the destination column and generates an SSN using the default processing method and including dashes:

TRANS SSN ‘=- SOCIAL'

TRANS CCN

Use the TRANS CCN function to generate a valid and unique credit card number (CCN). By default, TRANS CCN algorithmically generates a consistently altered CCN based on the source CCN. TRANS CCN can also generate a random value when the source data does not have a CCN value or when there is no need for transforming the source CCN in a consistent manner.

A CCN, as defined by ISO 7812, consists of a 6-digit issuer identifier followed by a variable length account number and a single check digit as the final number. The check digit verifies the accuracy of the CCN and is generated by passing the issuer identifier and account numbers through the Luhn algorithm. The maximum length of a CCN is 19 digits.

The default processing method generates a CCN by including the first 4 digits of the issuer identifier from the source CCN and altering the remaining 2 digits of the issuer identifier number and the account number based on the source CCN. A valid check digit is also assigned.

The random processing method generates a CCN that can include the first 4 digits of the source issuer identifier number or an issuer identifier number assigned to American Express®, Discover, MasterCard, or VISA. A valid check digit is also assigned. If the first four digits of a source issuer identifier number are included, the first account number based on those digits will begin with 1, and for each additional CCN that uses those digits, the account number will be incremented by 1.

The syntax of TRANS CCN is:

TRANS CCN [ (‘[=flag] [sourcecol] [ preserve=invalid ] ' ) ]

flag
Specify an option flag to generate a random CCN.
n
Generate a random CCN that is not based on a source value and includes an issuer identifier number assigned to American Express, Discover, MasterCard, or VISA.
r
Generate a random CCN that includes the first 4 digits of the source issuer identifier number.
sourcecol
The source column name. If a source column name is not specified, the destination column name is used.

If a source column name is not specified and the destination column name does not match a column name in the source table, an error will occur during processing.

preserve=invalid
If the source column contains an invalid CCN, do not replace it with a generated value. The source column value will be used in the destination column.

Data Types Allowed

The following source and destination data types are permitted:

CHAR
The column length must be from 13 to 256 characters.
VARCHAR
The column length must be from 13 to 254 characters.
PACKED DECIMAL
The precision of the column must be from 13 to 254 and the scale must be zero.

If a source or destination column does not adhere to these restrictions, an error message occurs.

Destination Processing Rules

The following rules apply to the destination CCN value, according to the destination data type or value:

CHAR
If the source value is spaces or a zero-length VARCHAR, the destination value will be set to spaces.
VARCHAR
If the source value is spaces or a zero-length VARCHAR, the destination length will be 0.
PACKED DECIMAL
If the source value is zero, the destination value will be zero.
NULL
If the source value is NULL, the destination value will be NULL.

Skipped Rows

The following conditions may cause a source row to be skipped and not written to the destination:

  • The source value is NULL, and the destination column does not allow a NULL value.
  • The source value is less than 13 characters, contains a non-numeric character, is too large, or has an incorrect check digit.
  • The source value length is not valid for the credit card issuer.
  • The source value cannot be converted to a format TRANS CCN supports.

Error Messages

The following error messages may be issued:

CCN01
Parm on Col ccccc ("ppp") is invalid

The parameter ppp, specified on destination column ccccc, is not valid. Check the processing option flags specified.

CCN02
Col ccccc not on source

The column name entered as the sourcecol parameter or the destination column name (if sourcecol was omitted) was not found on the source table.

CCN03
Source Col ccccc-aaa invalid

The format of the source column is not supported because the attribute aaa is invalid. The value of aaa will be Type or Length.

CCN04
Dest Col ccccc-aaa invalid

The format of the destination column is not supported because the attribute aaa is invalid. The value of aaa will be Type or Length.

CCN05
Get col ccccc data-rc=nnn

A return code of nnn was returned when TRANS CCN tried to get the value of source column ccccc.

CCN08
Put col ccccc data-rc=nnn

A return code of nnn was returned when TRANS CCN tried to set the value of destination column ccccc.

If any other errors occur, contact Optim Technical Support.

Example 1

The following example uses a source column name (CREDIT) that differs from the destination column and generates a random CCN not based on the source value:

TRANS CCN ‘=n CREDIT'

Example 2

The following example uses a source column name (CREDIT) that differs from the destination column and generates a CCN using the default processing method:

TRANS CCN ‘CREDIT'

TRANS EML

Use the TRANS EML function to generate an e-mail address. An e-mail address consists of two parts, a user name followed by a domain name, separated by ‘@'. For example, user@domain.com.

TRANS EML generates an e-mail address with a user name based on either destination data or a literal concatenated with a sequential number. The domain name can be based on an e-mail address in the source data, a literal, or randomly selected from a list of large e-mail service providers. The e‑mail address can also be converted to uppercase or lowercase.

TRANS EML can generate a user name based on the values in one or two destination table columns (usually containing the name(s) of a user). Processing options allow you to use only the first character of the value in the first column (e.g., the initial letter of a first name) and separate the values from both columns using either a period or an underscore.

If the user name is based on a single destination column value or a literal, the name will be concatenated with a sequential number. If a user name is based on values in two destination table columns and a separating period or underscore is not used, the values are concatenated. If a parameter is not provided for the user name, the name will be formed by the literal “email” concatenated with a sequential number. Sequential numbers for user names are suffixes that begin with 1 and are incremented by 1.

The syntax of TRANS EML is:

TRANS EML [‘[=flags] ,[{sourcecol | “domain” | , } [{name1col[name2col] | “userpfx”}] ]']

flags
You can specify one or more case-insensitive processing option flags.
n
Generate a random domain name from a list of large e-mail service providers.
.
Separate the name1col and name2col values with a period.
_
Separate the name1col and name2col values with an underscore.
i
Use only the first character of the name1col value.
l
Convert the e-mail address to lowercase.
u
Convert the e-mail address to uppercase.
sourcecol
The source column name with e-mail addresses used to provide the domain name.

If neither the ‘n' flag nor the domain parameter are defined, the domain name in the source column is used. (If sourcecol is not defined, the source column name is based on the destination column name.)

If a source column name is not specified and the destination column name does not match a column name in the source table, an error will occur during processing.

domain
A literal, up to 31 characters, that forms the domain name.
,
A comma is required if neither a sourcecol nor a domain parameter are defined and you define either a literal or column name(s) for the domain name.
name1col
A destination table column name with values used to form the first (or only) part of the user name.
name2col
A destination table column name with values used to form the second part of the user name.
userpfx
A literal, up to 31 characters, that is concatenated with a sequential number to form the user name.

Data Types Allowed

The following source and destination data types are permitted:

CHAR
The column length must be from 3 to 256 characters.
VARCHAR
The column length must be from 3 to 254 characters.

If a source or destination column does not adhere to these restrictions, an error message will be issued.

Destination Processing Rules

The following rules apply to the destination e-mail value, according to the destination data type or value:

CHAR
If the source value is spaces or a zero-length VARCHAR, the destination value will be set to spaces.
VARCHAR
If the source value is spaces or a zero-length VARCHAR, the destination length will be 0.
NULL
If the source value is NULL, the destination value will be NULL.

Skipped Rows

The following conditions may cause a source row to be skipped and not written to the destination:

  • The source value is NULL, and the destination column does not allow a NULL value.
  • The source value is a VARCHAR less than 3 characters long.
  • The source e-mail value does not contain a ‘@'.
  • The source value cannot be converted to a format TRANS EML supports.

Error Messages

The following error messages may be issued:

EML01
Parm on Col ccccc ("ppp") is invalid

The parameter ppp, specified on destination column ccccc, is not valid. Check the processing option flags specified.

EML02
Col ccccc not on source

The column name entered as the sourcecol parameter or the destination column name (if sourcecol was omitted) was not found on the source table.

EML03
Source Col ccccc-aaa invalid

The format of the source column is not supported because the attribute aaa is invalid. The value of aaa will be Availability, Type, or Length.

EML04
Dest Col ccccc-aaa invalid

The format of the destination column is not supported because the attribute aaa is invalid. The value of aaa will be Availability, Type, or Length.

EML05
Get col ccccc data-rc=nnn

A return code of nnn was returned when TRANS EML tried to get the value of source column ccccc.

EML08
Put col ccccc data-rc=nnn

A return code of nnn was returned when TRANS EML tried to set the value of destination column ccccc.

EML09
Domain literal sssss too long

The string sssss, specified as the domain name literal (domain), exceeds 31 characters.

EML10
User literal sssss too long

The string sssss, specified as the user name literal (userpfx), exceeds 31 characters.

EML11
Name1 Col ccccc not on dest

The name1col column name was not found on the destination table.

EML12
Name1 Col ccccc-aaa invalid

The format of the name1col column is not supported because the attribute aaa is invalid. The value of aaa will be Availability, Type, or Length.

EML13
Name2 Col ccccc not on dest

The name1col column name was not found on the destination table.

EML14
Name2 Col ccccc-aaa invalid

The format of the name2col column is not supported because the attribute aaa is invalid. The value of aaa will be Availability, Type, or Length.

If any other errors occur, contact Optim Technical Support.

Example 1

The following example uses a literal (optim.com) to form the domain name and two destination table columns (FIRST_NAME and LAST_NAME) to form a user name that includes an underscore:

TRANS EML ‘=_ “optim.com” FIRST_NAME LAST_NAME'

Example 2

The following example uses a domain name from the source column and a literal (OptimUser) to form a user name that will be suffixed with a sequential number:

TRANS EML ‘, “OptimUser” '