Data elements

Each column within a table definition can have a data element assigned to it. A data element specifies the type of data a column contains, which in turn determines the transforms that can be applied in a Transformer stage.

The use of data elements is optional. You do not have to assign a data element to a column, but doing so enables you to apply stricter data typing in the design of server jobs. The extra effort of defining and applying data elements can save time later when you debug your design.

You can choose to use any of the data elements supplied with IBM® InfoSphere® DataStage®, or you can create and use data elements specific to your application. For a list of the built-in data elements, see "Built-In Data Elements".

Application-specific data elements allow you to describe the data in a particular column in more detail. The more information you supply to InfoSphere DataStage about your data, the more InfoSphere DataStage can help to define the processing needed in each Transformer stage.

For example, if you have a column containing a numeric product code, you might assign it the built-in data element Number. There is a range of built-in transforms associated with this data element. However, none of these is suitable, because you are unlikely to want to perform a calculation on a product code. In this case, you could create a new data element called PCode.

Each data element has its own specific set of transforms that relate it to other data elements. When the data elements associated with the columns of a target table are not the same as the data elements of the source data, you must ensure that you have the transforms needed to convert the data as required. For each target column, you should have either a source column with the same data element, or a source column that you can convert to the required data element.
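The coverage rule above can be illustrated outside the product with a small Python sketch. This is not InfoSphere DataStage code or its API. The function name uncovered_targets, the column names, and the representation of a transform as a (source element, target element) pair are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical model: a transform is identified only by the pair
# (source data element, target data element) that it links.
Transform = Tuple[str, str]


def uncovered_targets(
    source_cols: Dict[str, str],
    target_cols: Dict[str, str],
    transforms: Set[Transform],
) -> List[str]:
    """Return the target columns that no source column can supply.

    A target column is covered when some source column has the same
    data element, or when a transform exists from some source data
    element to the target's data element.
    """
    source_elements = set(source_cols.values())
    missing = []
    for column, element in target_cols.items():
        same = element in source_elements
        convertible = any((src, element) in transforms for src in source_elements)
        if not (same or convertible):
            missing.append(column)
    return missing


# Illustrative column-to-data-element assignments (assumed names).
source = {"Code": "Number", "Qty": "Number"}
target = {"Code": "PCode", "Qty": "Number"}

no_transforms = uncovered_targets(source, target, set())
with_transform = uncovered_targets(source, target, {("Number", "PCode")})
```

Without any transform, the Code target column is reported as uncovered. Once a transform from Number to PCode is registered, the list is empty.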

For example, suppose that the target table requires a product code using the data element PCode, but the source table holds product data using an older product numbering scheme. In this case, you could create a separate data element for old-format product codes called Old_PCode, and then create a custom transform to link the two data elements: its source data element is Old_PCode and its target data element is PCode. This transform, which you could call Convert_PCode, would convert an old product code to a new product code.
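As a rough illustration of how a transform ties a source data element to a target data element, the following Python sketch models Convert_PCode as a typed conversion. It is not how the product defines transforms. The Transform class, the apply method, and in particular the conversion rule (prefix "P" and pad the old number to six digits) are assumptions invented for this example, because the actual numbering schemes are not specified here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Transform:
    """Hypothetical model of a transform linking two data elements."""

    name: str
    source_element: str
    target_element: str
    func: Callable[[str], str]

    def apply(self, value: str, element: str) -> str:
        # Refuse input whose data element is not the declared source.
        if element != self.source_element:
            raise TypeError(
                f"{self.name} expects {self.source_element}, got {element}"
            )
        return self.func(value)


def _old_to_new(old_code: str) -> str:
    # Assumed rule, for illustration only: "P" plus the old number
    # zero-padded to six digits.
    return "P" + old_code.strip().zfill(6)


convert_pcode = Transform(
    name="Convert_PCode",
    source_element="Old_PCode",
    target_element="PCode",
    func=_old_to_new,
)

new_code = convert_pcode.apply("4711", "Old_PCode")
```

Under the assumed rule, new_code is "P004711". Passing a value tagged with any other data element, such as Number, raises a TypeError, which mirrors the stricter typing that data elements are meant to provide.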

A data element can also be used to "stamp" a column with SQL properties when you manually create a table definition or define a column definition for a link in a job.