CREATE EXTERNAL DATASOURCE command

You can use the CREATE EXTERNAL DATASOURCE command to create external data sources.

External data sources are used to create connections from Netezza Performance Server to remote storage locations, like AWS S3, Azure Blob/ADLSv2, IBM Cloud Object Storage, Minio, and other compatible sources. They are an intermediate step to creating external tables, and are used to store connection details, including remote location and credentials. See CREATE EXTERNAL TABLE command for more details.

External data sources are used to establish connection between storage accounts, and support the first use of data virtualization and data loading.

To create an external data source, you must set ENABLE_EXTERNAL_DATASOURCE to 1.
SET ENABLE_EXTERNAL_DATASOURCE = 1;

Syntax

To create a data source, run:
CREATE EXTERNAL DATASOURCE external data source name
ON source
USING (
   external data source options
);
Select a source from one of the following:
  • AWSS3 - supports all S3-compatible storage, including IBM Cloud Object Storage and Minio.
  • AZUREBLOB - supports Blob and ADLSv2.
Note: For using AZUREBLOB with parquet external tables, enable it by running the following command.
SET ENABLE_AZURE_DATALAKE_SUPPORT TRUE;
To create an external data source, run the following command:
CREATE EXTERNAL DATASOURCE DATASOURCE
ON AWSS3 
USING (
   ACCESSKEYID ACCESSKEYID 
   SECRETACCESSKEY SECRETACCESSKEY 
   BUCKET BUCKET 
   REGION REGION
);
Tip:
  1. After you create an external datasource definition, you can use ALTER statements to modify the external data source columns or SHOW statements to view the external data source column values.
  2. You can use the verbose option with the SHOW command. When you run the command, the column values are combined to form the cloud connection string for that external data source.
    SHOW EXTERNAL DATASOURCE NYCTAXIS3 VERBOSE;
    CONN      | REMOTESOURCE 
    ------------------------
    ACCESSKEYID=ACCESS KEY ID:BUCKETURL=URL:DEFAULTREGION=REGION:SECRETACCESSKEY=SECRET ACCESS KEY | S3
    (1 row)

External data source options

For S3-compatible sources (AWS S3)
Option Valid formats Data type Distribution Description
region Text String Mandatory Bucket location
bucket Text String Mandatory Bucket name
accesskeyid Text String Optional AWS/IBM COS key
secretaccesskey Text String Optional AWS/IBM COS secret access key
multipartsizemb Text, fixed String Optional Multipart size; 8 MB - 5 GB
endpoint Text String Optional Connect to S3-compatible endpoint
Note: If any of the following parameters are provided, ensure that the other one is also provided.
  • accesskeyid
  • secretaccesskey
Endpoint can be used to connect to S3-compatible remote storage location other than AWS S3.
For Azure Blob/ADLSv2 (AZUREBLOB)
Option Valid formats Data type Distribution Description
account Text String Mandatory Storage account name
container Text String Mandatory Storage container name
blobtype Text String Optional One of BLOCK, PAGE, mandatory for purposes other than parquet external tables
key Text String Optional Account access key
sastoken Text String Optional Azure SAS token
tenantid Text String Optional Azure tenant ID
clientid Text String Optional Azure client ID
clientsecret Text String Optional Azure client secret
Note: The following authentication methods are supported for parquet external tables:
  • Storage account access key (key).
  • SAS token (sastoken).
  • Azure Entra ID OAuth (tenantid, clientes, clientsecret).

Anonymous access is also supported.

Examples

See External data source examples.