Email privacy provider

Use the email privacy provider to generate an email address. An email address consists of two parts, a user name and a domain name, which are separated by an at sign (@). For example, user@domain.com.

The email privacy provider generates an email address with a user name based on either source data or a literal that is concatenated with a sequential number. The user name can be formed with data from two columns (name1 and name2). The domain name can be based on an email address in the source data, a literal, or randomly selected from a list of large email service companies. The email address can also be converted to uppercase or lowercase.

If the USERPFX and PARTS parameters are not specified, the user name is formed by the literal "email" concatenated with a sequential number.
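The default naming can be pictured with a short Python sketch. It is an illustration only, not the provider's implementation: the function names are hypothetical, and the sketch assumes that numbering starts at 1 and that the source domain is kept, as the repeatable method does when no domain is specified. It does not model how the provider keeps results repeatable.

```python
from itertools import count


def sequential_user_names(prefix="email"):
    """Yield prefix1, prefix2, ...; numbering starts at 1 and increments by 1."""
    for number in count(1):
        yield f"{prefix}{number}"


def mask_with_sequence(source_addresses, prefix="email"):
    """Replace each user name with the next sequential name, keeping the source domain."""
    names = sequential_user_names(prefix)
    masked = []
    for address in source_addresses:
        # Everything after the last '@' is treated as the domain name.
        _, _, domain = address.rpartition("@")
        masked.append(f"{next(names)}@{domain}")
    return masked


print(mask_with_sequence(["jane@domain.com", "raj@domain.org"]))
# ['email1@domain.com', 'email2@domain.org']
```

Passing a different prefix corresponds loosely to supplying a USERPFX literal.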

The provider offers two hashing algorithms: a default algorithm and the SHA-256 (secure hashing algorithm 256 bits) algorithm. The default algorithm usually runs faster than the SHA-256 algorithm. The SHA-256 algorithm is a cryptographic-strength hashing algorithm that offers stronger masking than the default algorithm and ensures that masked values cannot be revealed by reverse engineering. The HMAC (hash-based message authentication code) algorithm, which is based on the SHA-256 algorithm, is available only with a user-supplied exit. It provides stronger protection than the SHA-256 algorithm alone because it uses a secret key that the exit provides. The provider prevents unauthorized access to the key value by not keeping the key in memory.
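The difference between a plain SHA-256 hash and an HMAC that is based on SHA-256 can be seen with Python's standard hashlib and hmac modules. This is a sketch under stated assumptions, not the provider's algorithm: the function names, the 12-character truncation, and the demonstration keys are all illustrative.

```python
import hashlib
import hmac


def sha256_token(source, length=12):
    """Hash the source value with SHA-256 and keep the first `length` hex characters."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return digest[:length]


def hmac_sha256_token(source, key, length=12):
    """Hash the source value with HMAC-SHA-256; the result also depends on the secret key."""
    digest = hmac.new(key, source.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


source = "user@domain.com"
# The same input always gives the same unkeyed token.
print(sha256_token(source) == sha256_token(source))
# With HMAC, a different key gives a different token for the same input.
print(hmac_sha256_token(source, b"demo-key-1") != hmac_sha256_token(source, b"demo-key-2"))
```

Anyone can recompute an unkeyed hash for a guessed input, whereas the keyed variant cannot be recomputed without the key, which is why the key must stay secret.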

The provider offers different methods for producing a masked email address: repeatable, hash, and random. The repeatable and hash methods produce masked email addresses that are based on the source value. The repeatable and hash methods produce repeatable masked email addresses, while the random method is non-repeatable. The repeatable and random methods allow you to specify the format and parts of the user name. The repeatable method allows you to specify a domain name. The random and hash methods allow you to use a domain name from a list of large email providers.

Examples

Repeatable

The following example uses the repeatable method and preserves invalid values from the source field. Because the user name and domain parameters are not specified for the repeatable method, the provider generates user names with the literal "email" concatenated with a sequential number and uses the source domain name for the output.

trans pro=eml, mtd=rep, wheninv=pre, flddef1=(name=emlvarchar, dt=varchar_szvarchar)

This example uses the following parameters:

MTD=REP
This parameter generates masked email addresses in a repeatable manner. The output values are based on the source values.
WHENINV=PRE
This parameter copies invalid source values to the destination field.

Hash

The following example uses the hash method and a registered domain name.

trans pro=eml, mtd=hash, hashdom=reg, flddef1=(name=emlvarchar, dt=varchar_szvarchar)

This example uses the following parameters:

MTD=HASH
This parameter hashes the source value to generate a user name that is composed of alphanumeric characters.
HASHDOM=REG
This parameter selects domain names from a list of large email service companies.

User name parts

The following example uses the repeatable method and generates a lowercase email address with a two-part user name that is separated by a period. The user name is taken from fields that are identified by the PARTS parameter.

trans pro=eml, mtd=rep, case=low, parts="(emlfirstname, emllastname)", sep=dot,  
flddef1=(name=emlnamecol, dt=varchar_szvarchar), flddef2=(name=emlfirstname, dt=varchar_sz), 
flddef3=(name=emllastname, dt=varchar_sz)

This example uses the following parameters:

MTD=REP
This parameter generates masked email addresses in a repeatable manner. The output values are based on the source values.
CASE=LOW
This parameter generates email addresses in lowercase.
PARTS="(emlfirstname,emllastname)"
This parameter forms a two-part user name from the specified fields.
SEP=DOT
This parameter separates the two-part user name with a period.

HMAC seed

The following example uses the hash method, the SHA-256 algorithm, and a seed value that is provided by a user exit. The domain name is selected from a list of large email service companies.

trans pro=eml, mtd=hash, algo=sha256, seed=hmac, hashdom=reg, flddef1=(name=emlvarchar, dt=varchar_szvarchar)

This example uses the following parameters:

MTD=HASH
This parameter hashes the source value to generate a user name that is composed of alphanumeric characters.
ALGO=SHA256
This parameter selects the SHA-256 algorithm, a cryptographic-strength hashing algorithm that ensures that masked values cannot be revealed by reverse engineering.
SEED=HMAC
This parameter uses a seed value that is provided by the user exit to influence the masking process. To produce the same masked values for a given source value, use the same seed value each time the provider is used.
HASHDOM=REG
This parameter selects domain names from a list of large email service companies.

Syntax

The email privacy provider uses the following syntax:

Masking parameters
 PROVIDER = EML  , 
	[ METHOD = { REPEATABLE | RANDOM | HASH }  ] ,
	[ ALGORITHM = { SHA256 | DEFAULT } ] ,
User name and formatting parameters
	[ CASE = { UPPER | LOWER } ] ,
	[ SEPARATOR = { DOT | . | UNDERSCORE | _ } ] ,
	[ USERPREFIX = username-prefix ] ,
	[ PARTS = "( { name1fld-index [ , name2fld-index ] | name1fld-name [ , name2fld-name ] } )" ] ,
	[ FCPART1 = { Y | N } ] ,
MTD=HASH parameters
	[ SEED = { "seed-value" | HMAC } ] ,
	[ HASHDOMAIN = { KEEP | REGISTERED | UNREGISTERED  } ] ,
MTD=REP parameters
	[ DOMAIN = domain-name ] ,
Processing parameters
	[ WHENINVALID = PRESERVE ]  ,
Data definition parameters
	FLDDEFn = ( NAME = field-name,    
		DATATYPE = datatype-value, 
		[ PRECISION = field-precision-value ], 
		[ SCALE = field-scale-value ],    
		[ LENGTH = field-length-value ],
		[ CODEPAGE = codepage-value ],
		[ CPTYPE = { DB2ZOS | DB2LUW | ORACLE | SYBASE | ODBC | INFORMIX | NETEZZA | SQLSERVER | TERADATA | ANY | NONE } ] ) ,
	[ CODEPAGE = codepage-value ]  ,
	[ CPTYPE = { DB2ZOS | DB2LUW | ORACLE | SYBASE | ODBC | INFORMIX |
		     NETEZZA | SQLSERVER | TERADATA | ANY | NONE  } ]

Masking parameters

Parameters that determine how to mask data.

PROVIDER (or PRO)
Required. Enter the provider name, EML.
Note: The PRO parameter must be first in the masking string. All other parameters can appear in any order.
METHOD (or MTD)
Required. The masking method to use.

If MTD=REP or MTD=RAN and a user name parameter (USERPFX or PARTS) is not included, the user name is formed by the literal "email" concatenated with a sequential number. The sequential numbers begin with 1 and are incremented by 1.

Enter one of the following options:

REPEATABLE (or REP)
Default. Generates an email address in a repeatable manner. You can format and specify the parts of the user name, and you can specify a domain name. The output values are based on the source values.

If MTD=REP and the DOM parameter is not included, the source domain name is used.

RANDOM (or RAN)
Generates an email address with a domain name that is randomly selected from a list of large email service companies. You can format and specify the parts of the user name.

MTD=RAN is not compatible with the parameters WHENINV=PRE, DOMAIN, HASHDOMAIN, ALGORITHM, and SEED.

HASH
Generates an email address by using a hashing algorithm. The user name is composed of alphanumeric characters. Use the HASHDOM parameter to generate an email address that includes the source domain name, a random domain name, or domain names from a list of large email service companies. The output values are based on the source values.

MTD=HASH is required for the parameters HASHDOM, ALGO, and SEED.

MTD=HASH is not compatible with the parameters PARTS, SEP, FCPART1, USERPREFIX, DOMAIN, and CASE.
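The way the hash method combines a hashed user name with one of the three domain choices can be sketched in Python. Everything here is an assumption made for illustration: the function name, the placeholder domain list (the provider's actual list is not given in this topic), the digest slicing, and the ".example" suffix for the generated domain. The provider's real algorithm is not documented here.

```python
import hashlib

# Placeholder list; stands in for the provider's list of large email service companies.
PLACEHOLDER_DOMAINS = ["example.com", "example.net", "example.org"]


def hashed_email(source, hashdom):
    """Build a masked address from a SHA-256 digest of the source value."""
    _, _, source_domain = source.rpartition("@")
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    user_name = digest[:12]  # alphanumeric user name taken from the digest
    if hashdom == "KEEP":
        domain = source_domain  # reuse the source domain name
    elif hashdom == "REG":
        # Pick a list entry from digest bits so the same source maps to the same domain.
        domain = PLACEHOLDER_DOMAINS[int(digest[12:20], 16) % len(PLACEHOLDER_DOMAINS)]
    elif hashdom == "UNREG":
        domain = f"{digest[20:30]}.example"  # generated alphanumeric domain label
    else:
        raise ValueError(f"unknown HASHDOM value: {hashdom!r}")
    return f"{user_name}@{domain}"


for option in ("KEEP", "REG", "UNREG"):
    print(option, hashed_email("user@domain.com", option))
```

Because every part of the output is derived from the source value, the same input yields the same masked address on each run of this sketch.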

ALGORITHM (or ALGO)
If MTD=HASH, specifies the type of hashing algorithm to use.

ALGO is not compatible with the parameters PARTS, SEP, FCPART1, USERPREFIX, DOMAIN, and CASE.

Enter one of the following options:

SHA256
Uses the SHA-256 (secure hashing algorithm 256 bits) algorithm or, if a user-supplied exit is available, the HMAC (hash-based message authentication code) algorithm.
DEFAULT (or DEF)
Uses the default ODPP hash algorithm.

User name and formatting parameters

Parameters that determine how to manage user names and formatting.

CASE
Indicates whether the output email address is generated in uppercase or lowercase.

CASE is not compatible with the parameters MTD=HASH, HASHDOM, ALGO, and SEED.

Enter one of the following options:

UPPER (or UP)
Convert the output email address to uppercase.
LOWER (or LOW)
Convert the output email address to lowercase.

SEPARATOR (or SEP)
The separator character between the name1 and name2 values.

SEP is not compatible with the parameters USERPREFIX, MTD=HASH, HASHDOM, ALGO, and SEED.

Enter one of the following options:

DOT (or .)
Separate the values with a period (.). For example, name1.name2@ibm.com.
UNDERSCORE (or _)
Separate the values with an underscore (_). For example, name1_name2@ibm.com.

USERPREFIX (or USERPFX)
A literal, up to 512 characters, that is concatenated with a sequential number to form the user name.

USERPFX is not compatible with the parameters MTD=HASH, PARTS, SEP, FCPART1, HASHDOM, ALGO, and SEED.

PARTS
The fields that provide the name1 and name2 values of the user name. Within enclosing double quotation marks, enter either the field names or the FLDDEFn numeric values that are assigned to the fields. Enter the name1 value first. For example, if name1 is defined in FLDDEF1 and name2 is defined in FLDDEF3, enter the following parameter: PARTS="(1,3)".

PARTS is not compatible with the parameters USERPFX, MTD=HASH, HASHDOM, ALGO, and SEED.
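As an illustration of the two ways to reference fields, the following Python sketch resolves a PARTS value against a set of field definitions. The helper names and the dictionary that maps each FLDDEFn number to a field name are hypothetical, and the sketch is more lenient than the syntax diagram because it would also accept a mix of an index and a name.

```python
from __future__ import annotations


def parse_parts(value: str) -> list[int | str]:
    """Split a PARTS value such as "(1,3)" into one or two field references."""
    inner = value.strip()
    if not (inner.startswith("(") and inner.endswith(")")):
        raise ValueError("PARTS value must be enclosed in parentheses")
    refs = [item.strip() for item in inner[1:-1].split(",")]
    if not 1 <= len(refs) <= 2 or any(not ref for ref in refs):
        raise ValueError("PARTS takes one or two field references")
    # Numeric references are FLDDEFn numbers; anything else is a field name.
    return [int(ref) if ref.isdigit() else ref for ref in refs]


def resolve_parts(value: str, flddefs: dict[int, str]) -> list[str]:
    """Return the field names for name1 and name2, in that order."""
    names = []
    for ref in parse_parts(value):
        if isinstance(ref, int):
            names.append(flddefs[ref])  # look up by FLDDEFn number
        elif ref in flddefs.values():
            names.append(ref)  # already a defined field name
        else:
            raise KeyError(f"no field definition named {ref!r}")
    return names


fields = {1: "emlfirstname", 2: "emlnamecol", 3: "emllastname"}
print(resolve_parts("(1,3)", fields))
# ['emlfirstname', 'emllastname']
```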

FCPART1
Indicates whether only the first character of the name1 field value is used. Enter one of the following options:
Y
Use only the first character of the name1 field.
N
Use all the characters of the name1 field.
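The combined effect of the PARTS, SEP, FCPART1, and CASE parameters can be sketched as follows. This is a minimal Python illustration, not the provider's code: the function name is hypothetical, the name values are passed in directly rather than read from fields, and joining the parts with no separator when SEP is omitted is an assumption.

```python
SEPARATORS = {"DOT": ".", "UNDERSCORE": "_"}


def build_email(name1, name2, domain, sep=None, case=None, fcpart1=False):
    """Assemble an address from one or two name parts and a domain name."""
    # FCPART1=Y keeps only the first character of the name1 value.
    first = name1[:1] if fcpart1 else name1
    # Assumption: without a SEP value the two parts are joined directly.
    separator = "" if sep is None else SEPARATORS[sep]
    user_name = f"{first}{separator}{name2}" if name2 else first
    address = f"{user_name}@{domain}"
    # CASE applies to the whole output address, not only to the user name.
    if case == "UPPER":
        return address.upper()
    if case == "LOWER":
        return address.lower()
    return address


print(build_email("Maria", "Lopez", "example.com", sep="DOT", case="LOWER"))
# maria.lopez@example.com
print(build_email("Maria", "Lopez", "example.com", sep="UNDERSCORE", fcpart1=True))
# M_Lopez@example.com
```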

MTD=HASH parameters

Parameters for use with MTD=HASH only.

HASHDOMAIN (or HASHDOM)
If MTD=HASH, indicates how the domain name is formed.

HASHDOM is not compatible with the parameters PARTS, SEP, FCPART1, USERPREFIX, DOMAIN, and CASE.

Enter one of the following options:

KEEP
Use the source domain name.
REGISTERED (or REG)
Use a domain name from a list of registered email service providers.
UNREGISTERED (or UNREG)
Generate an alphanumeric domain name.

SEED
If MTD=HASH, enter a value to manage the hashing process. To generate repeatable masked values for the same input, use the same SEED value.

SEED is not compatible with the parameters PARTS, SEP, FCPART1, USERPREFIX, DOMAIN, and CASE.

Enter one of the following options:

seed value
Enter a literal seed value, up to 31 characters and within enclosing double quotation marks, to use for hashing.

If ALGO=SHA256, the seed value must be a numeric value in the range 0 - 2,000,000,000.

HMAC
Uses the seed value that is provided by the user exit. A user exit is required.

SEED=HMAC is not compatible with the parameter ALGO=DEF.
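The SEED rules above can be expressed as a small pre-check. The following Python sketch is hypothetical and not part of the provider: it assumes that the parameter values have already been extracted from the masking string, that the surrounding quotation marks have been removed from a literal seed, and that the ALGO value is given in its short form (SHA256 or DEF).

```python
MAX_SEED_LENGTH = 31
MAX_SHA256_SEED = 2_000_000_000


def validate_seed(seed, algo):
    """Raise ValueError if the SEED and ALGO values break a rule stated in this topic."""
    if seed == "HMAC":
        # The seed comes from the user exit, which cannot be combined with ALGO=DEF.
        if algo == "DEF":
            raise ValueError("SEED=HMAC is not compatible with ALGO=DEF")
        return
    if len(seed) > MAX_SEED_LENGTH:
        raise ValueError("a literal seed value can be up to 31 characters")
    if algo == "SHA256":
        # With SHA-256 the literal must be a number in the range 0 - 2,000,000,000.
        if not (seed.isascii() and seed.isdigit()):
            raise ValueError("with ALGO=SHA256 the seed value must be numeric")
        if int(seed) > MAX_SHA256_SEED:
            raise ValueError("with ALGO=SHA256 the seed value must not exceed 2,000,000,000")


validate_seed("12345", "SHA256")   # accepted
validate_seed("my-seed", "DEF")    # accepted
validate_seed("HMAC", "SHA256")    # accepted
```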

MTD=REP parameters

Parameters for use with MTD=REP only.

DOMAIN (or DOM)
A literal, up to 31 characters, that forms the domain name.

If MTD=REP and this parameter is not specified, the source domain name is used for the output.

DOM is not compatible with the parameters MTD=RAN, MTD=HASH, HASHDOM, ALGO, and SEED.
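The compatibility rules that are stated for the masking, user name, MTD=HASH, and MTD=REP parameters can be gathered into one check. The following Python sketch is a hypothetical helper, not a provider API. It assumes that the masking string has already been parsed into a dictionary keyed by the short parameter names (MTD, ALGO, SEP, USERPFX, HASHDOM, DOM, WHENINV, and so on) with short option values, and that MTD defaults to REP.

```python
HASH_ONLY = ("HASHDOM", "ALGO", "SEED")
NOT_WITH_HASH = ("PARTS", "SEP", "FCPART1", "USERPFX", "DOM", "CASE")
NOT_WITH_USERPFX = ("PARTS", "SEP", "FCPART1")


def find_conflicts(params):
    """Return a list of messages, one for each rule that the parameters break."""
    p = {name.upper(): str(value).upper() for name, value in params.items()}
    method = p.get("MTD", "REP")
    conflicts = []
    if method == "HASH":
        conflicts += [f"MTD=HASH is not compatible with {n}" for n in NOT_WITH_HASH if n in p]
    else:
        conflicts += [f"{n} requires MTD=HASH" for n in HASH_ONLY if n in p]
    if method == "RAN":
        if "DOM" in p:
            conflicts.append("MTD=RAN is not compatible with DOM")
        if p.get("WHENINV") == "PRE":
            conflicts.append("MTD=RAN is not compatible with WHENINV=PRE")
    if "USERPFX" in p:
        conflicts += [f"USERPFX is not compatible with {n}" for n in NOT_WITH_USERPFX if n in p]
    if p.get("SEED") == "HMAC" and p.get("ALGO") == "DEF":
        conflicts.append("SEED=HMAC is not compatible with ALGO=DEF")
    return conflicts


print(find_conflicts({"MTD": "HASH", "ALGO": "SHA256", "SEED": "HMAC", "HASHDOM": "REG"}))
# []
print(find_conflicts({"MTD": "RAN", "WHENINV": "PRE"}))
# ['MTD=RAN is not compatible with WHENINV=PRE']
```

Returning every conflict at once, rather than stopping at the first one, makes it easier to correct a masking string in a single pass.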

Processing parameters

Parameters for managing provider processes.

WHENINVALID (or WHENINV)
Determines what to do with invalid source values. If this parameter is omitted, invalid source values are not copied to the destination field and rows that contain these values are skipped.

Enter the following option:

PRESERVE (or PRE)
Copy invalid source values to the destination field.
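The difference between omitting WHENINV and specifying WHENINV=PRE can be sketched in Python. The validity test here (a non-empty user name and domain name around a single at sign) is an assumption, because this topic does not define what the provider treats as invalid. The function names and the stand-in masking function are hypothetical.

```python
def is_valid_email(value):
    """Assumed validity rule: non-empty user and domain parts around exactly one '@'."""
    user_name, at_sign, domain = value.partition("@")
    return bool(user_name and at_sign and domain) and "@" not in domain


def mask_values(values, mask, wheninv=None):
    """Mask valid values; copy invalid ones if wheninv is "PRE", otherwise skip them."""
    output = []
    for value in values:
        if is_valid_email(value):
            output.append(mask(value))
        elif wheninv == "PRE":
            output.append(value)  # invalid value is copied unchanged
        # With no WHENINV value, the row that holds the invalid value is skipped.
    return output


def demo_mask(value):
    """Stand-in for the real masking step."""
    return "masked@" + value.rpartition("@")[2]


rows = ["user@domain.com", "not-an-address"]
print(mask_values(rows, demo_mask, wheninv="PRE"))
# ['masked@domain.com', 'not-an-address']
print(mask_values(rows, demo_mask))
# ['masked@domain.com']
```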

Data definition parameters

Parameters for defining source and target data. For further information, see Supported data types.

FLDDEFn
Required. Specifies the attributes of input values to use for processing. See Field definition parameter.
CODEPAGE (or CP)
An integer value that specifies the codepage or character-set identifier of the source fields. The default is UTF-8. The CP parameter within the FLDDEFn parameter overrides this value.
CPTYPE (or CPT)
The codepage type of the source fields. The CPT parameter within the FLDDEFn parameter overrides this value.

When the origin of the data is DBMS-specific but not tied to any one DBMS, specify the value as ANY. When the origin of the data is a non-DBMS source, specify the value as NONE. Because there are no DBMS-specific code pages for Netezza®, NONE is implied when NETEZZA is specified.

Enter one of the following values:

Value               Description
DBZ (or DB2ZOS)     DB2® for z/OS®
DB2 (or DB2LUW)     DB2 for Linux®, UNIX, and Windows
IFX (or INFORMIX)   Informix®
MSS (or SQLSERVER)  Microsoft SQL Server
NZ (or NETEZZA)     Netezza
ODBC                ODBC
ORA (or ORACLE)     Oracle
SYB (or SYBASE)     Sybase
TD (or TERADATA)    Teradata
ANY                 Any DBMS
NONE                No DBMS

Supported data types

The email privacy provider supports the following data types for source and destination fields:

DB2 data type   ODPP equivalent   Description
CHAR            CHAR              Fixed-size character data that is left-justified and space-padded.
VARCHAR         VARCHAR           Character data that starts with a short integer value that indicates the length, in bytes, of the character data that follows.
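The two layouts can be illustrated with Python's struct module. The sketch makes assumptions that this topic does not confirm: the length prefix is taken to be a 2-byte unsigned integer in little-endian byte order, and the character data is taken to be ASCII. The function names are hypothetical.

```python
import struct


def encode_char(text, size):
    """CHAR: fixed size, left-justified, padded on the right with spaces."""
    data = text.encode("ascii")
    if len(data) > size:
        raise ValueError("value is longer than the field")
    return data.ljust(size, b" ")


def encode_varchar(text):
    """VARCHAR: a short integer length prefix followed by the character bytes."""
    data = text.encode("ascii")
    # "<H" is an assumed layout: 2-byte unsigned integer, little-endian.
    return struct.pack("<H", len(data)) + data


def decode_varchar(buffer):
    """Read the length prefix, then return that many bytes as text."""
    (length,) = struct.unpack_from("<H", buffer, 0)
    return buffer[2:2 + length].decode("ascii")


print(encode_char("ab", 5))
# b'ab   '
print(decode_varchar(encode_varchar("user@domain.com")))
# user@domain.com
```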