January 13, 2020 By Torsten Steinbach 3 min read

A quick overview of a new secure mechanism in IBM Cloud SQL Query to pass credentials for accessing, processing, and analyzing data service securely via SQL statements. 

The gold standard

The best practice for allowing your analytic jobs to access data in the cloud is to leverage IBM Cloud Identity and Access Management (IAM) as the central identity and access policy store. The combination of SQL and Object Storage is a perfect example for this in IBM Cloud: A single IAM identity is used to submit a SQL query on data in one or multiple Object Storage buckets and Single-Sign-On (SSO) semantics are employed across the SQL service itself (as well as across all accessed Object Storage buckets).

The problem

  1. Not all cloud data services have fully integrated IAM in that way. Often, they require you to create custom credentials to be used when accessing them through the database’s APIs.
  2. Even in situations where IAM-based authentication is technically possible (such as when accessing IBM Db2 on Cloud service instances of the Enterprise plan), it is often a big process hurdle to get the administrator of that database to invite your IAM user into the account where the instance is located. That is, however, a prerequisite for granting the IAM identity any access to the database instance. See “How to Use an API Key or Access Token to Connect to IBM Db2 on Cloud” for using API keys with Db2.
  3. Data services external to IBM Cloud—such as Db2 databases deployed on VMs and managed by customers or data services on third-party clouds—do not integrate with the IAM of IBM Cloud, and the only possibility of accessing them is with some custom credentials created and handed out by the administrator of that data service.

The solution

To run analytic jobs like SQL Queries on data services without IAM support requires a mechanism to pass specific credentials to the analytic service on a per-data-service basis. However, passing plain text passwords or API keys as parameters inside an SQL statement is not a good practice, as it opens the gates for leaking of this sensitive information.

For this reason, IBM Cloud SQL Query has now introduced a secure mechanism for storing and passing custom credentials for data services to be accessed by an SQL statement. It allows you to use IBM Key Protect and IBM Hyper Protect Crypto as a trusted brokers for custom credentials.

The customer can set up their secrets—such as a password or an API key—in their own Key Protect or Hyper Protect Crypto instance and then assign access on that key to those users who should be allowed to use that credential inside SQL statements. That key access management is then again standard IAM. So this mechanism, in fact, provides a proper mapping of non-IAM credentials to IAM-managed keys. This means that the SQL statements using these keys retain the simplicity of IAM-based SSO,  even though the accessed data services themselves do not directly support IAM identities.

The following image shows that mechanism and usage flow:

The user provides a reference to the key to be used for a certain data source as part of the SQL statement. This reference is in Cloud Resource Name (CRN) format—the standard mechanism in IBM Cloud for unique resource identification.

Examples

Below you can find examples of how to set up and use custom credentials for SQL Query for a Db2 database.

Storing a password string “passw0rd” as custom key in a specific Key Protect instance:

ibmcloud kp create "mike's password" -i 78d63c14-c92b-4275-807e-a5f72c1b4445 -s -k `echo -ne "passw0rd" | base64`

Storing an API key string as a custom key in a specific Key Protect instance:

ibmcloud kp create "Our API key" -i 78d63c14-c92b-4275-807e-a5f72c1b4445 -s -k `echo -ne "YNiuREBMTfQsQzKrVYhQJElNXnUFEgpQ7qVVTkDK3_Ze" | base64`

Listing the stored keys in Key Protect:

ibmcloud kp list -c -i 78d63c14-c92b-4275-807e-a5f72c1b4445


Retrieving keys...


SUCCESS


Key ID                                 Key Name            CRN   
0ed2e19c-e86e-450e-bbb1-b60cc4b2e321   mike's password     crn:v1:bluemix:public:kms:us-south:a/d86af7367f70fba4f306d3c19c938f2f:78d63c14-c92b-4275-807e-a5f72c1b4445:key:0ed2e19c-e86e-450e-bbb1-b60cc4b2e321
f83884a4-c019-4ba7-87bf-8f87c454dfa3   Our API key      crn:v1:bluemix:public:kms:us-south:a/d86af7367f70fba4f306d3c19c938f2f:78d63c14-c92b-4275-807e-a5f72c1b4445:key:f83884a4-c019-4ba7-87bf-8f87c454dfa3

Writing a data set on Cloud Object Storage to a Db2 database with user name mike and his securely provided password:

SELECT * FROM cos://us-geo/sql/customers.csv STORED AS CSV
INTO db2://db2w-vqplkwx.us-south.db2w.cloud.ibm.com/CUSTOMERS
USER mike
PASSWORD crn:v1:bluemix:public:kms:us-south:a/d86af7367f70fba4f306d3c19c938f2f:78d63c14-c92b-4275-807e-a5f72c1b4445:key:0ed2e19c-e86e-450e-bbb1-b60cc4b2e321

Writing a data set on Cloud Object Storage to a Db2 database with custom API key:

SELECT * FROM cos://us-geo/sql/customers.csv STORED AS CSV
INTO db2://db2w-vqplkwx.us-south.db2w.cloud.ibm.com/CUSTOMERS
APIKEY crn:v1:bluemix:public:kms:us-south:a/d86af7367f70fba4f306d3c19c938f2f:78d63c14-c92b-4275-807e-a5f72c1b4445:key:f83884a4-c019-4ba7-87bf-8f87c454dfa3

Conclusion

In this article you have learned about a secure way to pass sensitive credential data for accessing your data sources. By relying on IBM Key Protect and Hyper Protect Crypto, you can be sure that your credentials cannot be leaked to any undesired person. It can only be used for the desired purpose of running the SQL statement on your data source. 

Go and try it out yourself today. If you do not have provisioned IBM Cloud SQL Query instance yet, go and get yourself our Lite plan today. It just takes a few seconds to provision Object Storage, SQL Query, and Key Protect instance. Enjoy!

More from Analytics

IBM acquires StreamSets, a leading real-time data integration company

3 min read - We are thrilled to announce that IBM has acquired StreamSets, a real-time data integration company specializing in streaming structured, unstructured and semistructured data across hybrid multicloud environments. Acquired from Software AG along with webMethods, this strategic acquisition expands IBM's already robust data integration capabilities, helping to solidify our position as a leader in the data integration market and enhancing IBM Data Fabric’s delivery of secure, high-quality data for artificial intelligence (AI).  According to a Forrester study conducted on behalf of…

Fine-tune your data lineage tracking with descriptive lineage

4 min read - Data lineage is the discipline of understanding how data flows through your organization: where it comes from, where it goes, and what happens to it along the way. Often used in support of regulatory compliance, data governance and technical impact analysis, data lineage answers these questions and more.  Whenever anyone talks about data lineage and how to achieve it, the spotlight tends to shine on automation. This is expected, as automating the process of calculating and establishing lineage is crucial to…

Reimagine data sharing with IBM Data Product Hub

3 min read - We are excited to announce the launch of IBM® Data Product Hub, a modern data sharing solution designed to accelerate data-driven outcomes across your organization. Today, we're making this product generally available to our clients across the world, following its announcement at the IBM Think conference in May 2024. Data sharing has become the lifeblood of modern organizations, fueling growth and driving innovation. But traditional approaches to data sharing can often be a bottleneck constricting the seamless sharing of data.…

IBM Newsletters

Get our newsletters and topic updates that deliver the latest thought leadership and insights on emerging trends.
Subscribe now More newsletters