DataStage connectors

A DataStage® connector is a palette node that provides data connectivity and metadata integration for external data sources, such as relational databases, public cloud storage services, or messaging software.

Connectors for remote data sources
Other types of connector components

Connectors for remote data sources

Before you can read data from or load data to a remote data source in DataStage, you must create a project connection asset for the associated DataStage connector. A connection asset contains the information that is necessary to connect to the data source.

To create the connection asset or the "optimized" connection asset and use it in a flow:
From the project page, click Add to project > Connection. For more information, see Adding connections to projects.
Open DataStage and add the associated connector to the canvas. Double-click the connector node on the canvas to open its Details card.
In the Details card, open the Stage tab, go to Properties > Connection, and select the connection.
Optional: Because the connectors are listed on the DataStage palette, you can build your flow first and add the connection asset later.

The "(optimized)" version of a connection provides better performance and more features, such as before and after SQL statements, sparse lookup, and reject links. However, you cannot use an "(optimized)" connection with other tools. If you already created a connection that is available to other tools (for example, Salesforce.com) and you want to reuse it in DataStage, you can use that connection instead.

Some data sources are also available through the ODBC connection, which is optimized for DataStage. Select the ODBC connection on the Add connection page, and then select a data source on the Create connection page.

Use the Generic JDBC connection to connect to a data source that has no connector defined for Cloud Pak for Data.

Connection	Optimized version	Available in the ODBC connection
Amazon RDS for PostgreSQL		PostgreSQL data source in ODBC
Amazon Redshift
Amazon S3: In the Details card, select Use DataStage properties to access the DataStage-specific properties. The DataStage-specific properties provide more features and granular control of the flow execution, similar to DataStage "optimized" connectors.
Apache Cassandra	Apache Cassandra (optimized)*	Apache Cassandra data source in ODBC
Apache HDFS
Apache Hive: In the Details card, select Use DataStage properties to access the DataStage-specific properties. The DataStage-specific properties provide more features and granular control of the flow execution, similar to DataStage "optimized" connectors. Supports source connections only.		Apache Hive data source in ODBC
	Apache Kafka*
Databases for PostgreSQL
FTP
Generic JDBC: In the Details card, select Use DataStage properties to access the DataStage-specific properties. The DataStage-specific properties provide more features and granular control of the flow execution, similar to DataStage "optimized" connectors. Restriction: The CREATE statement is not supported for connecting to a MongoDB database.
Google BigQuery
Google Cloud Pub/Sub
Google Cloud Storage
Greenplum		Greenplum data source in ODBC
HTTP: Supports source connections only.
IBM Cloud Object Storage: You must create the Cloud Object Storage credentials with the Hash-based Message Authentication Code (HMAC) option. See Using HMAC credentials.
IBM Data Virtualization
IBM Data Virtualization Manager for z/OS
IBM Db2	Db2 (optimized)*	IBM Db2 data source in ODBC
IBM Db2 Big SQL		IBM Db2 data source in ODBC
IBM Db2 Event Store		IBM Db2 data source in ODBC
IBM Db2 for i		IBM Db2 data source in ODBC
IBM Db2 for z/OS		IBM Db2 data source in ODBC
IBM Db2 Hosted		IBM Db2 data source in ODBC
IBM Db2 on Cloud		IBM Db2 data source in ODBC
		IBM Db2 on iSeries (AS400) data source in ODBC
		IBM Db2 on Linux on System z data source in ODBC
IBM Db2 Warehouse		IBM Db2 data source in ODBC
IBM Informix		IBM Informix data source in ODBC
IBM Netezza (PureData System for Analytics)	Netezza (optimized)*	IBM Netezza data source in ODBC
		Impala data source in ODBC
Microsoft Azure Blob Storage
Microsoft Azure Data Lake Store
Microsoft Azure File Storage
Microsoft SQL Server		Microsoft SQL Server data source in ODBC
		MongoDB data source in ODBC
MySQL		MySQL data source in ODBC
ODBC: Select the ODBC connection on the Add connection page, and then select a data source on the Create connection page.
Oracle	Oracle (optimized)*	Oracle data source in ODBC
PostgreSQL		PostgreSQL data source in ODBC
Salesforce.com: Supports source connections only.	Salesforce.com (optimized)*
SAP ASE: Supports source connections only.		SAP ASE data source in ODBC: Supports source and target connections.
		SAP IQ data source in ODBC
SAP OData
Snowflake: In the Details card, select Use DataStage properties to access the DataStage-specific properties. The DataStage-specific properties provide more features and granular control of the flow execution, similar to DataStage "optimized" connectors. For example, for Snowflake, DataStage properties have explicit options for Create and Append operations.
Teradata

* Denotes a project connection that is for DataStage only.
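
As a reading aid, the choice among the three table columns can be sketched as a small lookup. This is only an illustration, not a DataStage API: the names ConnectionRow, CONNECTIONS, ODBC_ONLY, and connection_options are hypothetical, the rows are a handful copied from the table above, and the ordering of the returned options is an assumption based on the guidance in this section (prefer the optimized version when the connection does not need to be shared with other tools, and fall back to Generic JDBC when no connector is listed).

```python
from typing import Dict, List, NamedTuple, Optional


class ConnectionRow(NamedTuple):
    """One table row: the optimized version and the ODBC data source, if any."""
    optimized: Optional[str]
    odbc_source: Optional[str]


# A few rows copied from the table above (illustrative subset, not the full list).
CONNECTIONS: Dict[str, ConnectionRow] = {
    "Amazon Redshift": ConnectionRow(None, None),
    "IBM Db2": ConnectionRow("Db2 (optimized)", "IBM Db2 data source in ODBC"),
    "MySQL": ConnectionRow(None, "MySQL data source in ODBC"),
    "Salesforce.com": ConnectionRow("Salesforce.com (optimized)", None),
}

# Data sources that the table lists only under the ODBC connection.
ODBC_ONLY: Dict[str, str] = {
    "Impala": "Impala data source in ODBC",
    "MongoDB": "MongoDB data source in ODBC",
    "SAP IQ": "SAP IQ data source in ODBC",
}


def connection_options(source: str, shared_with_other_tools: bool) -> List[str]:
    """List candidate connections for a data source, most specific first."""
    options: List[str] = []
    row = CONNECTIONS.get(source)
    if row is not None:
        # Optimized connections are for DataStage only, so skip them
        # when the connection must be reusable by other tools.
        if row.optimized and not shared_with_other_tools:
            options.append(row.optimized)
        options.append(source)
        if row.odbc_source:
            options.append(row.odbc_source)
    elif source in ODBC_ONLY:
        options.append(ODBC_ONLY[source])
    # No connector listed at all: Generic JDBC is the fallback.
    if not options:
        options.append("Generic JDBC")
    return options
```

For example, connection_options("IBM Db2", shared_with_other_tools=True) leaves out the optimized entry, and a data source that appears in neither dictionary yields only "Generic JDBC".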

Other types of connector components

These entries in the palette do not require that you create a connection asset in the project.