Creating a DataStage flow
DataStage® flows are the design-time assets that contain data integration logic.
You can create an empty DataStage flow and add connectors and stages to it, or you can import an existing DataStage flow from an ISX file.
A flow consists of the following elements:
- Data sources that read data
- Stages that transform the data
- Data targets that write data
- Links that connect the sources, stages, and targets
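
To make the relationship between these elements concrete, the following minimal sketch models a flow as named nodes joined by directed links. It is a conceptual illustration only and does not use any DataStage API; the `Flow` class, its methods, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    """Toy model of a flow: named nodes joined by directed links (hypothetical)."""

    nodes: dict = field(default_factory=dict)  # node name -> kind
    links: list = field(default_factory=list)  # (from_node, to_node) pairs

    def add_node(self, name: str, kind: str) -> None:
        # Only the three node kinds listed above are accepted in this sketch.
        if kind not in {"source", "stage", "target"}:
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[name] = kind

    def add_link(self, src: str, dst: str) -> None:
        # A link can only join nodes that are already part of the flow.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both ends of a link must be existing nodes")
        self.links.append((src, dst))


# Illustrative node names, not real assets.
flow = Flow()
flow.add_node("customers_db", "source")
flow.add_node("filter_rows", "stage")
flow.add_node("report_file", "target")
flow.add_link("customers_db", "filter_rows")
flow.add_link("filter_rows", "report_file")
```

In this model, data moves along the links from the source, through the stage, to the target.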

DataStage flows and their associated objects are organized in projects. To start, open an existing project or create a new project.
Creating a DataStage flow by individually adding connectors and stages
To create a DataStage flow by individually adding connectors and stages, complete the following steps.
- Open an existing project or create a project.
- Click Add to project + and select DataStage flow, or click New DataStage flow + in the DataStage flows section of the page.
- On the New tab of the New DataStage flow page, enter a name and an optional description for the new flow.
- Drag connectors or stages from the palette onto the canvas as nodes and arrange them as you like. Connect these nodes on the canvas by clicking the arrow icon on a node and dragging it to the node you want to connect to. This action creates a link between the nodes.
Note: The connections that you add to the flow must be created already in the project that you are working in. For more information, see Adding connections to projects.
- Double-click a node to open up its Details card, where you can specify configurations and settings for the node.
- Click Run when you are done setting up the flow.
The flow is automatically saved, compiled, and run. You can view logs for both the compilation and the job run.
Editing a DataStage flow
You can use the following actions to edit a DataStage flow.
- Drag a stage or connector from the palette and drop it onto a link between two nodes that are already on the canvas. The new node is automatically linked to the nodes on either side of it, and the columns in the DataStage flow are automatically propagated. Click Run again to see the results.
- Manually detach and reattach links from nodes on the canvas by hovering your pointer over them and clicking the end points of the links.
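
The effect of dropping a node onto a link can be sketched as replacing one link with two. The following example is a conceptual illustration, not DataStage code; the `insert_on_link` function and the node names are hypothetical, and column propagation is not modeled.

```python
def insert_on_link(links, link, new_node):
    """Return a new link list with new_node spliced into the given link."""
    if link not in links:
        raise ValueError(f"link {link} is not in the flow")
    src, dst = link
    result = []
    for existing in links:
        if existing == link:
            # The original link becomes two links that pass through new_node.
            result.append((src, new_node))
            result.append((new_node, dst))
        else:
            result.append(existing)
    return result


links = [("source", "target")]
links = insert_on_link(links, ("source", "target"), "sort")
print(links)  # [('source', 'sort'), ('sort', 'target')]
```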
Writing and reading persistent data
When a stage writes data, write it to the persistent storage that is mounted at /px-storage so that all parallel processes, whether they run on the conductor pod or on compute pods, can access the data. Avoid paths that are local to an individual pod, such as /tmp.
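
If you generate output paths programmatically, a small check can catch pod-local paths before they are used in a stage. The following sketch assumes only the /px-storage mount point named above; the helper name `is_on_persistent_storage` and the example file paths are hypothetical, and the check inspects the path string only, not the actual file system.

```python
from pathlib import PurePosixPath

# Mount point named in this topic; everything else here is illustrative.
PERSISTENT_ROOT = PurePosixPath("/px-storage")


def is_on_persistent_storage(path: str) -> bool:
    """Return True if path is an absolute path at or under /px-storage."""
    candidate = PurePosixPath(path)
    # Reject ".." segments so a path cannot step back out of the mount.
    if ".." in candidate.parts:
        return False
    return candidate.is_absolute() and (
        candidate == PERSISTENT_ROOT or PERSISTENT_ROOT in candidate.parents
    )


print(is_on_persistent_storage("/px-storage/output/orders.csv"))  # True
print(is_on_persistent_storage("/tmp/orders.csv"))                # False
```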
Creating a new DataStage component
You can collect a set of stages and connectors to reuse in DataStage flows and jobs by creating a DataStage component of type Subflow.
- Open an existing project or create a project.
- Click Add to project + and select DataStage component from the available asset types.
- Select Subflow as the DataStage component type.
You can manage all your DataStage components from the Assets tab.