Connecting to data sources

You can connect to your data sources in IBM® Cloud Pak for Data in several ways.

Connecting to data sources at the platform level

You can create connections that can be used by various services across the platform. Any user who has access to the platform can see these connections. However, only users with the credentials for the data source can use a connection.
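The difference between seeing a connection and using it can be pictured with a small model. The following Python sketch is illustrative only: `PlatformConnection` and its methods are hypothetical names, not part of any Cloud Pak for Data API. It assumes that credentials are stored per user, and it ignores data sources that allow anonymous access.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PlatformConnection:
    """Hypothetical model of a platform-level connection."""

    name: str
    # Assumption: each user supplies and stores their own credentials.
    credentials_by_user: dict[str, dict[str, str]] = field(default_factory=dict)

    def can_see(self, user: str) -> bool:
        # Any user who has access to the platform can see the connection.
        return True

    def can_use(self, user: str) -> bool:
        # Only users with credentials for the data source can use it.
        return user in self.credentials_by_user


conn = PlatformConnection(name="sales-db")
conn.credentials_by_user["alice"] = {"username": "alice_db", "password": "example-only"}

print(conn.can_see("bob"), conn.can_use("bob"))      # True False
print(conn.can_see("alice"), conn.can_use("alice"))  # True True
```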

These platform-level connections are available from the Platform connections page. However, this page is available only if the Cloud Pak for Data common core services are installed.

Currently, the following services can use connections from the Platform connections page:
  • Cognos® Analytics
  • Data Virtualization
  • Watson™ Knowledge Catalog
  • Watson Studio

    Many of the tools that work with Watson Studio can use data from these connections after the connection is added to a project.

Restriction: Not all services support the same types of connections. Most services support a subset of the connections that are supported by the platform. For information about which data sources a service can connect to, see Supported data sources.

The Platform connections page is a specialized view of the Platform assets catalog. (The connections that are defined on the Platform connections page are also included in the Platform assets catalog.)

The Platform connections page shows the list of connections that can be used by various services on the platform. At a minimum, all users have the Viewer role on the catalog, which means that they can see the connections that are defined. For details, see Managing collaborators on platform connections.

Required permissions
To create a platform-level connection, you must be an Editor or Administrator on the Platform connections catalog.
Tip: Work with your data source administrator to ensure that you have the correct information to connect to your data source.
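The role requirements above can be summarized as a simple check. This is a minimal Python sketch with hypothetical names (`CatalogRole`, `can_see_connections`, `can_create_connection`). It models only the rules stated in this topic and does not reflect how the platform enforces them.

```python
from enum import Enum


class CatalogRole(Enum):
    """Roles on the Platform connections catalog that this topic mentions."""

    VIEWER = "Viewer"
    EDITOR = "Editor"
    ADMINISTRATOR = "Administrator"


def can_see_connections(role: CatalogRole) -> bool:
    # Every role, including Viewer, can see the connections that are defined.
    return True


def can_create_connection(role: CatalogRole) -> bool:
    # Creating a platform-level connection requires Editor or Administrator.
    return role in {CatalogRole.EDITOR, CatalogRole.ADMINISTRATOR}


for role in CatalogRole:
    print(role.value, can_see_connections(role), can_create_connection(role))
```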

To create a platform-level connection:

  1. Log in to the Cloud Pak for Data web client.
  2. From the navigation menu, select Data > Platform connections.
  3. Click New connection.
  4. Select the type of data source that you want to connect to.

    If you want to connect to an unsupported data source by creating a Generic JDBC connection, a Cloud Pak for Data administrator must upload the JDBC drivers for that data source. For details, see Importing JDBC drivers for data sources.

    If you want to connect to an external NFS server or persistent volume claim by creating a Storage volume connection, a user with the Create service instances permission must add the volume to Cloud Pak for Data. For details, see Managing storage volumes.

  5. Enter a name and description for the connection.
  6. Enter the details for the connection. The type of connection that you are creating determines the information that you must specify. Typically, a connection requires either:
    • A host name and port number
    • A URL

    You might also need to specify the database that you want to connect to.

  7. Enter your credentials for the connection. The type of connection that you are creating determines the format of the credentials. Typically, a connection requires one or more of the following:
    • Username and password

      If the data source is a service that is deployed on the same instance of Cloud Pak for Data where you are creating the connection, you can use your Cloud Pak for Data credentials to authenticate to the data source. When you select Use your Cloud Pak for Data Credentials to authenticate to the data source, the username and password fields are disabled.

    • API key
    • Secret key

    The credentials that you supply are accessible only from your account. Other users must supply their own credentials to use the connection.

    Some data sources allow you to connect anonymously.

  8. If applicable, specify the SSL information required to connect to your data source.

    Some data sources require you to use SSL for secure communication. Other data sources support it but do not require it. Ensure that you understand what information you need to provide to communicate securely with your data source:

    • If you specified a port number that is configured to accept SSL connections, ensure that you select The port is configured to accept SSL connections.
    • If the data source uses a self-signed certificate, you must specify the contents of the certificate to enable secure communication between Cloud Pak for Data and the data source.
    • If your data source uses chained certificates, you can specify the contents of multiple certificates.
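Before you start, it can help to collect the information that steps 5 through 8 ask for. The following Python sketch is one way to record it and check it for completeness. All class and field names are hypothetical and do not correspond to a Cloud Pak for Data API. The rule that a connection needs either a host name and port number or a URL is the typical case described above, so a specific connection type might differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConnectionDefinition:
    """Hypothetical checklist of the values that the steps above ask for."""

    name: str
    description: str = ""
    host: str | None = None
    port: int | None = None
    url: str | None = None
    database: str | None = None
    port_accepts_ssl: bool = False
    # Certificate contents; more than one entry for chained certificates.
    certificates: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means nothing is missing."""
        problems = []
        if not self.name.strip():
            problems.append("A connection name is required.")
        has_host_and_port = self.host is not None and self.port is not None
        if not (has_host_and_port or self.url is not None):
            problems.append("Specify either a host name and port number, or a URL.")
        if self.port is not None and not 1 <= self.port <= 65535:
            problems.append("The port number must be between 1 and 65535.")
        return problems


complete = ConnectionDefinition(
    name="sales-db",
    host="db.example.com",
    port=50001,
    database="SALES",
    port_accepts_ssl=True,
)
incomplete = ConnectionDefinition(name="sales-db", host="db.example.com")

print(complete.validate())    # []
print(incomplete.validate())  # ['Specify either a host name and port number, or a URL.']
```

Credentials are deliberately left out of this record, because each user supplies their own when they use the connection.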

Connecting to data sources at the service level

Typically, if you create a connection at the service level, the connection is accessible only from the service where it is created.

For each of the following services, the entry describes how to connect to data sources and where to learn more.

Cognos Dashboards

    You can use CSV files, database connections, connected data assets, and Data Virtualization assets as data sources for a dashboard. All of these data sources must be added to a project before they can be used in a dashboard.

    To add data sources to a dashboard, select Add data from the analytics dashboard menu.

    For a detailed list of supported data sources, see the Data Format section in Visualizing Data with Cognos Dashboards.

DataStage®

    You can create connections from the following locations:
      • The Connections page of the data transformation project
      • The job canvas

    To use data from local files, add the file to the job canvas.

    For details, see Creating a data transformation job.

Data Virtualization

    You can create connections that can be used to virtualize data from the following locations:
      • The Platform connections page
      • The Data sources page in the Data Virtualization service

    For details, see Adding data sources (Data Virtualization).

Watson Knowledge Catalog

    You can create connections that can be used in the catalog and connections that can be used to curate data.

    Add connections that can be used in a catalog from the catalog Overview page. You can create new connections or pick from existing platform-level connections.

    For details, see Adding a connection asset to a catalog (Watson Knowledge Catalog).

    When you publish a data asset to a catalog, the connection is published along with it, unless the connection already exists in the catalog.

    For connections that can be used to curate data, you can create connections as follows:
      • From the Platform connections page. You can pick from those platform-level connections when you set up a discovery job.
        Important: If a platform-level connection is edited, only changes to the connection description are synced to the discovery connection. Any other updates, such as updated credentials or a changed database name, are not reflected.
      • When you set up a new discovery job from the Governance > Data discovery page.

    For details, see Discovering assets (Watson Knowledge Catalog).

Watson Studio

    Ideally, use data that is already in a catalog. Search for the data that you want in a catalog and add it to an analytics project.

    Alternatively, you can create connections that can be used in analytics projects from the following locations:

      • The Platform connections page
      • The Assets page of the analytics project

    You can also add data from files. To add data from files, go to the Assets page of the analytics project.

    For details, see Adding data to an analytics project.
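The Important note in the Watson Knowledge Catalog entry says that only the description of an edited platform-level connection is synced to the discovery connection. The following Python sketch illustrates that behavior with plain dictionaries. The function name and the sample values are invented for illustration and are not taken from the product.

```python
def sync_to_discovery(platform: dict, discovery: dict) -> dict:
    """Return the discovery connection as it looks after a sync.

    Only the description is copied from the platform-level connection.
    Every other field keeps the value that the discovery connection had.
    """
    synced = dict(discovery)
    synced["description"] = platform["description"]
    return synced


# Sample values only: the platform-level connection was edited twice.
platform = {"description": "Sales warehouse (EU)", "database": "SALES_V2"}
discovery = {"description": "Sales warehouse", "database": "SALES"}

print(sync_to_discovery(platform, discovery))
# {'description': 'Sales warehouse (EU)', 'database': 'SALES'}
```

In this sketch, the changed database name does not reach the discovery connection, which matches the behavior that the note describes for updates other than the description.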