Defining wrapped stages

You define a Wrapped stage to specify a UNIX command to be run by a DataStage stage.

About this task

You define a wrapper file that handles the arguments, inputs, and outputs of the UNIX command. DataStage® provides an interface that helps you define the wrapper. The stage is available to all jobs in the project in which it was defined. You can export the project that contains the stage, along with the other assets that are defined in the project, as a .zip file. You can add the stage to your job palette by using the palette customization features in DataStage.

When you define a Wrapped stage, you provide the following information:

  • Details of the UNIX command that the stage will run.
  • Description of the data that will be input to the stage.
  • Description of the data that will be output from the stage.
  • Definition of the environment in which the command will run.

The UNIX command that you wrap can be a built-in command, such as grep, a third-party utility, or your own UNIX application. The only limitation is that the command must be pipe safe: it must read its input sequentially, from beginning to end.
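
To make the pipe-safe requirement concrete, the following Python sketch shows a small filter that reads its input in a single forward pass and never seeks or rewinds. The function names and the sample data are illustrative assumptions, and the in-memory streams only stand in for a real pipe.

```python
import io
from typing import Iterable, Iterator, TextIO


def filter_lines(lines: Iterable[str], needle: str) -> Iterator[str]:
    """Yield the lines that contain needle, consuming the input strictly in order."""
    for line in lines:  # one forward pass: no seek, no rewind, no second read
        if needle in line:
            yield line


def run(source: TextIO, sink: TextIO, needle: str) -> None:
    """Stream matching lines from source to sink without loading the whole input."""
    for line in filter_lines(source, needle):
        sink.write(line)


# In-memory streams stand in for standard in and standard out.
source = io.StringIO("alpha\nbeta\nalphabet\n")
sink = io.StringIO()
run(source, sink, "alpha")
print(sink.getvalue(), end="")
```

A command that needs to rewind its input, or to read it more than once, would not meet this requirement.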

You need to define metadata for the data that is input to and output from the stage. You also need to define how the data will be input or output. UNIX commands can take their input from standard in or another stream, from a file, or from the output of another command by way of a pipe. Similarly, data can be output to standard out or another stream, to a file, or to a pipe that feeds another command. You specify what the command expects.
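
From the command's side, these input options might look like the following Python sketch. It chooses between a file named on the command line, a file named by an environment variable, and standard in. The variable name WRAPPED_INPUT_FILE and the order of precedence are assumptions made for illustration; they are not DataStage conventions.

```python
import io
import os
import tempfile
from typing import Mapping, Sequence, TextIO

INPUT_ENV_VAR = "WRAPPED_INPUT_FILE"  # hypothetical variable name, for illustration only


def open_input(args: Sequence[str], environ: Mapping[str, str], stdin: TextIO) -> TextIO:
    """Pick the input source: file argument, then environment variable, then standard in."""
    if args:  # the file is named in a command line argument
        return open(args[0], "r", encoding="utf-8")
    if INPUT_ENV_VAR in environ:  # the file is named by an environment variable
        return open(environ[INPUT_ENV_VAR], "r", encoding="utf-8")
    return stdin  # otherwise read from standard in


def count_lines(stream: TextIO) -> int:
    """Count lines in a single sequential pass."""
    return sum(1 for _ in stream)


# Demonstration: the same three-line file reached in two ways, plus standard in.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("a\nb\nc\n")
    path = tmp.name

try:
    with open_input([path], {}, io.StringIO("")) as stream:
        from_arg = count_lines(stream)
    with open_input([], {INPUT_ENV_VAR: path}, io.StringIO("")) as stream:
        from_env = count_lines(stream)
    from_stdin = count_lines(open_input([], {}, io.StringIO("x\ny\n")))
finally:
    os.remove(path)

print(from_arg, from_env, from_stdin)  # 3 3 2
```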

DataStage handles data that is input to the Wrapped stage and presents it in the specified form. If you specify a command that expects input on standard in, or another stream, DataStage presents the input data from the job's data flow as if it were on standard in. Similarly, it intercepts data output on standard out, or another stream, and integrates it into the job's data flow.
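
The idea of presenting rows on standard in and collecting results from standard out can be sketched with Python's subprocess module. This only illustrates the concept and does not show how DataStage is implemented. The helper name run_through_command is hypothetical, and a short Python one-liner stands in for the wrapped UNIX command so that the sketch runs anywhere.

```python
import subprocess
import sys
from typing import List, Sequence


def run_through_command(command: Sequence[str], rows: Sequence[str]) -> List[str]:
    """Feed rows to the command on standard in and return the lines it writes to standard out."""
    completed = subprocess.run(
        list(command),
        input="".join(row + "\n" for row in rows),  # rows arrive as newline-terminated text
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the exit code is not 0
    )
    return completed.stdout.splitlines()


# Stand-in for a wrapped command: copies standard in to standard out in upper case.
UPPERCASE = [
    sys.executable,
    "-c",
    "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())",
]

result = run_through_command(UPPERCASE, ["alice,1", "bob,2"])
print(result)  # ['ALICE,1', 'BOB,2']
```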

You also specify the environment in which the UNIX command will be run when you define the Wrapped stage.
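
The environment definition covers the environment variables that the command needs and, as the procedure describes for the Environment tab, the rules for interpreting exit codes. The following Python sketch illustrates both under stated assumptions. The names is_success, run_in_environment, and DEMO_EXIT_CODE are hypothetical, and treating the entered success codes as a replacement for the default of 0 is an assumption.

```python
import os
import subprocess
import sys
from typing import Collection, Mapping, Sequence


def is_success(
    exit_code: int,
    all_exit_codes_successful: bool,
    success_codes: Collection[int] = (0,),  # default: only 0 counts as success
    failure_codes: Collection[int] = (),
) -> bool:
    """Classify an exit code by using the rules described for the Environment tab."""
    if all_exit_codes_successful:
        # Every code is a success unless it is listed as a failure code.
        return exit_code not in failure_codes
    # Only the listed success codes are successes; all others are failures.
    return exit_code in success_codes


def run_in_environment(command: Sequence[str], extra_env: Mapping[str, str]) -> int:
    """Run the command with extra environment variables and return its exit code."""
    env = dict(os.environ)
    env.update(extra_env)
    return subprocess.run(list(command), env=env).returncode


# Stand-in command that exits with the code named in a (hypothetical) variable.
DEMO = [sys.executable, "-c", "import os, sys; sys.exit(int(os.environ['DEMO_EXIT_CODE']))"]

code = run_in_environment(DEMO, {"DEMO_EXIT_CODE": "3"})
print(code, is_success(code, False), is_success(code, True, failure_codes={1}))
# 3 False True
```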

Note: You cannot use a Wrapped stage as a source in a DataStage flow.

Procedure

To define a Wrapped stage:

  1. From a project, on the Assets tab, click New asset+ > DataStage component > Wrapped stage. Then, click Next.
  2. Enter a name and optional description for the stage. Then, click Create.
  3. Go to the Assets tab and open the stage that you just created.
  4. Complete the fields on the General tab as follows:
    • Wrapper Name. The name of the wrapper file that DataStage generates to call the command. By default, the wrapper name is the same as the stage type name.
    • Execution mode. Choose the default execution mode. This mode will appear in the Advanced tab on the stage editor. You can override this mode for individual instances of the stage as required, unless you select Parallel only or Sequential only.
    • Command. The name of the UNIX command to be wrapped, plus any required arguments. The arguments that you enter here are ones that do not change with different invocations of the command. Arguments that need to be specified when the Wrapped stage is included in a job are defined as properties for the stage.
    • Description. Optionally enter a description of the stage.
  5. Go to the Properties tab. You specify the arguments that the UNIX command requires as properties that appear in the stage Properties tab.

    Complete the fields as follows:

    • Property name. The name of the property that will be displayed on the Properties tab of the stage editor.
    • Data type. The data type of the property. Choose from:

      Boolean

      Float

      Integer

      String

      Pathname

      List

      Input Column

      Output Column

      If you choose Input Column or Output Column, a list offers a choice of the defined input or output columns when the stage is included in a job.

      If you choose List, open the Extended Properties dialog box from the grid menu to specify what appears in the list.

    • Prompt. The name of the property that will be displayed on the Properties tab of the stage editor.
    • Default Value. The value the option will take if no other is specified.
    • Required. Set this property to True if the property is mandatory.
    • Repeats. Set this property to True if the property repeats (that is, you can have multiple instances of it).
    • Conversion. Specifies the type of property as follows:

      -Name. The name of the property will be passed to the command as the argument value. This will normally be a hidden property, that is, not visible in the stage editor.

      -Name Value. The name of the property will be passed to the command as the argument name, and any value that is specified in the stage editor is passed as the value.

      -Value. The value for the property that is specified in the stage editor is passed to the command as the argument name. This conversion is typically used to group operator options that are mutually exclusive.

      Value only. The value for the property that is specified in the stage editor is passed as it is.

  6. Go to the Wrapped tab. You specify information about the command to be run by the stage and how it will be handled.

    The Interfaces tab is used to describe the inputs to and outputs from the stage, specifying the interfaces that the stage will need to function.

    Details about inputs to the stage are defined on the Inputs subtab:

    • Link. The link number, which is assigned for you and is read-only. When you use your stage, links are assigned in the order in which you add them. For example, the first link is taken as link 0, the second as link 1, and so on. You can reassign the links by using the stage editor's Link Ordering tab on the General page.
    • Table Name. The metadata for the link. You define the table name by loading a table definition from the Repository. Type in the name, or browse for a table definition. Alternatively, you can specify an argument to the UNIX command, which specifies a table definition. In this case, when the Wrapped stage is used in a job design, the designer will be prompted for an actual table definition to use.
    • Stream. Here you can specify whether the UNIX command expects its input on standard in, or another stream, or whether it expects it in a file. Click the browse button to open the Wrapped Stream dialog box.

      In the case of a file, you should also specify whether the file to be read is given in a command line argument, or by an environment variable.

    Details about outputs from the stage are defined on the Outputs subtab:

    • Link. The link number, which is assigned for you and is read-only. When you use your stage, links are assigned in the order in which you add them. For example, the first link is taken as link 0, the second as link 1, and so on. You can reassign the links by using the stage editor's Link Ordering tab on the General page.
    • Table Name. The metadata for the link. You define the table name by loading a table definition from the Repository. Type in the name, or browse for a table definition.
    • Stream. Here you can specify whether the UNIX command will write its output to standard out, or another stream, or whether it outputs to a file. Click the browse button to open the Wrapped Stream dialog box.

      In the case of a file, you should also specify whether the file to be written is specified in a command line argument, or by an environment variable.

    The Environment tab gives information about the environment in which the command will run.

    Set the following items on the Environment tab:

    • Environment. Specify environment variables and settings that the UNIX command requires to run.
    • All Exit Codes Successful. By default, IBM® DataStage treats an exit code of 0 as successful and all others as errors. Select this checkbox to specify that all exit codes are treated as successful, other than those specified in the Failure Codes grid.
    • Exit Codes. The use of the exit codes depends on the setting of the All Exit Codes Successful checkbox.

      If All Exit Codes Successful is not selected, enter the codes that indicate successful completion in the Success Codes grid. All other codes are taken as indicating failure.

      If All Exit Codes Successful is selected, enter the codes that indicate failure in the Failure Codes grid. All other codes are taken as indicating success.

  7. When you have completed the details on all the tabs, click Generate to generate the stage. Then, click Save.
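
As an illustration of the Conversion settings in step 5, the following Python sketch maps one property to command-line arguments for each conversion type. It is one reading of the descriptions above and does not show the wrapper that DataStage generates. The function name to_arguments is hypothetical, and the assumption that the leading hyphen in -Name, -Name Value, and -Value is emitted literally is not confirmed by the text.

```python
from typing import List, Optional


def to_arguments(name: str, value: Optional[str], conversion: str) -> List[str]:
    """Map one stage property to command-line arguments for a given conversion type."""
    if conversion == "-Name":
        return [f"-{name}"]  # the property name itself becomes the argument
    if value is None:
        raise ValueError(f"conversion {conversion!r} needs a value")
    if conversion == "-Name Value":
        return [f"-{name}", value]  # name as the argument name, followed by the value
    if conversion == "-Value":
        return [f"-{value}"]  # the chosen value becomes the argument name
    if conversion == "Value only":
        return [value]  # the value is passed as it is
    raise ValueError(f"unknown conversion: {conversion!r}")


# Build an argument list for grep from four illustrative properties.
command = (
    ["grep"]
    + to_arguments("i", None, "-Name")
    + to_arguments("e", "error", "-Name Value")
    + to_arguments("mode", "v", "-Value")
    + to_arguments("file", "input.txt", "Value only")
)
print(command)  # ['grep', '-i', '-e', 'error', '-v', 'input.txt']
```

Arguments that never change between invocations belong in the Command field on the General tab instead of in a property.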