Google Cloud Storage connection

To access your data in Google Cloud Storage, create a connection asset for it.

Google Cloud Storage is an online file storage web service for storing and accessing data on Google Cloud Platform infrastructure.

Create a connection to Google Cloud Storage

Common connectivity

To create the connection asset for common connectivity, you need the following connection details:

  • Project ID: The ID of the Google project.
  • Server proxy (optional)

Select Server proxy to access the Google Cloud Storage data source through a proxy server. Depending on its setup, a proxy server can provide load balancing, increased security, and privacy. The proxy server settings are independent of the authentication credentials and the personal or shared credentials selection. An SSL certificate can be provided for added security. The proxy server settings cannot be stored in a vault.

  • Proxy host: The hostname or IP address of the proxy server. For example, proxy.example.com or 192.0.2.0.
  • Proxy port: The port number to connect to the proxy server. For example, 8080 or 8443.
  • Proxy username and Proxy password: The credentials to authenticate with the proxy server.
  • Proxy protocol: The proxy server protocol. Select either HTTP or HTTPS.
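As an illustration of how the proxy fields above relate to each other, the following minimal Python sketch combines them into a single proxy URL. The function name build_proxy_url and the sample credentials are hypothetical and are not part of the product; the connection form takes the fields separately.

```python
from urllib.parse import quote, urlsplit


def build_proxy_url(protocol, host, port, username=None, password=None):
    """Combine the proxy fields into one URL (hypothetical helper)."""
    scheme = protocol.lower()
    if scheme not in ("http", "https"):
        raise ValueError("Proxy protocol must be HTTP or HTTPS")
    if not 1 <= int(port) <= 65535:
        raise ValueError("Proxy port must be between 1 and 65535")
    credentials = ""
    if username:
        # Percent-encode so characters such as '@' or ':' cannot break the URL.
        credentials = quote(username, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"{scheme}://{credentials}{host}:{int(port)}"


# Sample values only; replace them with your own proxy details.
proxy_url = build_proxy_url("HTTPS", "proxy.example.com", 8443, "svc-user", "p@ss:word")
print(proxy_url)
print(urlsplit(proxy_url).hostname)
```

Percent-encoding the username and password keeps special characters in the credentials from being read as URL delimiters.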

Credentials

You have specific authentication methods based on your deployment:

Common connectivity

Authentication methods for common connectivity:

Account key (full JSON snippet)

  • Credentials: The contents of the Google service account key JSON file.
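Before you paste the key into the Credentials field, it can help to confirm that the JSON is complete. The following sketch checks a few fields that a Google service account key file normally contains. The helper name check_account_key and the choice of fields to check are assumptions for illustration; the connection itself might validate the key differently.

```python
import json

# Fields that a Google service account key file normally contains.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_account_key(raw_json):
    """Return a list of problems found in the key; empty if none (hypothetical helper)."""
    try:
        key = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value must be an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not key.get(name)]
    if key.get("type") and key["type"] != "service_account":
        problems.append("type must be 'service_account'")
    return problems


# Placeholder values only; a real key file has more fields and a real private key.
sample = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "private_key": "-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n",
    "client_email": "reader@my-project.iam.gserviceaccount.com",
})
print(check_account_key(sample))
print(check_account_key('{"type": "authorized_user"}'))
```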

Client ID, Client secret, Access token, and Refresh token

  • Client ID: The OAuth client ID.
  • Client secret: The OAuth client secret.
  • Access token: An access token that can be used to connect to Google Cloud Storage.
  • Refresh token: A refresh token that is used to obtain a new access token when the current one expires.
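To show how these four values work together, the following sketch builds (but does not send) a standard OAuth 2.0 refresh request for Google's token endpoint. The helper name build_refresh_request and the sample values are hypothetical. How the connection refreshes the token internally is not described here, so treat this as a sketch of the general OAuth flow only.

```python
from urllib.parse import parse_qs, urlencode

# Google's OAuth 2.0 token endpoint.
GOOGLE_TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Return (url, headers, body) for a refresh_token grant (hypothetical helper)."""
    body = urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    })
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return GOOGLE_TOKEN_ENDPOINT, headers, body


# Placeholder credentials; nothing is sent over the network.
url, headers, body = build_refresh_request("my-client-id", "my-client-secret", "my-refresh-token")
print(url)
print(parse_qs(body)["grant_type"])
```

The response to such a request contains a new access token, which is why the access token alone is not enough for a long-lived connection.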

With workload identity federation
You use an external identity provider (IdP) for authentication. An external identity provider uses Identity and Access Management (IAM) instead of service account keys. IAM provides increased security and centralized management. You can use workload identity federation authentication with an access token or with a token URL.

You can configure a Google Cloud Storage connection for workload identity federation with any identity provider that complies with the OpenID Connect (OIDC) specification and that satisfies the Google Cloud requirements that are described in Prepare your external IdP. The requirements include:

  • The identity provider must support OpenID Connect 1.0.
  • The identity provider's OIDC metadata and JWKS endpoints must be publicly accessible over the internet. Google Cloud uses these endpoints to download your identity provider's key set and uses that key set to validate tokens.
  • The identity provider is configured so that your workload can obtain ID tokens that meet these criteria:
    • Tokens are signed with the RS256 or ES256 algorithm.
    • Tokens contain an aud claim.
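The two token criteria above can be checked locally by decoding the token header and payload. The following sketch does only that; it does not verify the signature, which Google Cloud does by using the key set of your identity provider. The function names and the sample tokens are hypothetical.

```python
import base64
import json

ALLOWED_ALGORITHMS = {"RS256", "ES256"}


def _decode_segment(segment):
    # JWT segments are base64url-encoded without padding; restore it first.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def check_id_token(token):
    """Return a list of problems with the token; empty if none (no signature check)."""
    parts = token.split(".")
    if len(parts) != 3:
        return ["token is not a three-part JWT"]
    header = _decode_segment(parts[0])
    claims = _decode_segment(parts[1])
    problems = []
    if header.get("alg") not in ALLOWED_ALGORITHMS:
        problems.append(f"unsupported signing algorithm: {header.get('alg')}")
    if "aud" not in claims:
        problems.append("missing aud claim")
    return problems


def _encode_segment(data):
    # Helper used only to build unsigned sample tokens for this sketch.
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


good = ".".join([_encode_segment({"alg": "RS256"}),
                 _encode_segment({"aud": "example-audience"}), "signature"])
bad = ".".join([_encode_segment({"alg": "HS256"}),
                _encode_segment({"sub": "workload"}), "signature"])
print(check_id_token(good))
print(check_id_token(bad))
```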

For examples of the workload identity federation configuration steps for Amazon Web Services (AWS) and Microsoft Azure, see

Workload Identity Federation with access token

  • Access token: An access token from the identity provider to connect to Google Cloud Storage.

  • Security Token Service audience: The security token service audience that contains the project number, pool ID, and provider ID. Use this format:

    //iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID
    

    For more information, see Authenticate a workload by using the REST API.

  • Service account email: The email address of the Google service account to be impersonated. For more information, see Create a service account for the external workload.

  • Service account token lifetime (optional): The lifetime in seconds of the service account access token. The default lifetime of a service account access token is one hour. For more information, see URL-sourced credentials.

  • Token format: Text or JSON. If you select JSON, also specify the Token field name.

  • Token field name: The name of the field in the JSON response that contains the token. This field appears only when the Token format is JSON.

  • Token type: AWS Signature Version 4 request, Google OAuth 2.0 access token, ID token, JSON Web Token (JWT), or SAML 2.0.
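The following sketch shows how the audience string is assembled and how the fields above could map onto a token exchange request to the Google Security Token Service. The function names, the sample values, and the mapping from the Token type options to subject token type identifiers are assumptions for illustration; the request is built but not sent, and the connection might construct it differently.

```python
from urllib.parse import urlencode

STS_TOKEN_ENDPOINT = "https://sts.googleapis.com/v1/token"

# Assumed mapping from the Token type options to STS subject token types.
SUBJECT_TOKEN_TYPES = {
    "AWS Signature Version 4 request": "urn:ietf:params:aws:token-type:aws4_request",
    "Google OAuth 2.0 access token": "urn:ietf:params:oauth:token-type:access_token",
    "ID token": "urn:ietf:params:oauth:token-type:id_token",
    "JSON Web Token (JWT)": "urn:ietf:params:oauth:token-type:jwt",
    "SAML 2.0": "urn:ietf:params:oauth:token-type:saml2",
}


def build_sts_audience(project_number, pool_id, provider_id):
    """Build the Security Token Service audience in the documented format."""
    return (f"//iam.googleapis.com/projects/{project_number}/locations/global/"
            f"workloadIdentityPools/{pool_id}/providers/{provider_id}")


def build_token_exchange_body(audience, access_token, token_type):
    """Return the form-encoded body of an STS token exchange (hypothetical helper)."""
    return urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token": access_token,
        "subject_token_type": SUBJECT_TOKEN_TYPES[token_type],
    })


# Placeholder identifiers; substitute your own project number, pool, and provider.
audience = build_sts_audience("123456789012", "my-pool", "my-provider")
print(audience)
print(build_token_exchange_body(audience, "token-from-idp", "JSON Web Token (JWT)"))
```

The federated token that such an exchange returns is then used to impersonate the service account that is named in the Service account email field.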

Workload Identity Federation with token URL

  • Security Token Service audience: The security token service audience that contains the project number, pool ID, and provider ID. Use this format:

    //iam.googleapis.com/projects/PROJECT_NUMBER/locations/global/workloadIdentityPools/POOL_ID/providers/PROVIDER_ID
    

    For more information, see Authenticate a workload using the REST API.

  • Service account email: The email address of the Google service account to be impersonated. For more information, see Create a service account for the external workload.

  • Service account token lifetime (optional): The lifetime in seconds of the service account access token. The default lifetime of a service account access token is one hour. For more information, see URL-sourced credentials.

  • Token URL: The URL to retrieve a token.

  • HTTP method: The HTTP method to use for the token URL request: GET, POST, or PUT.

  • Request body (for POST or PUT methods): The body of the HTTP request to retrieve a token.

  • HTTP headers: The HTTP headers for the token URL request, specified as key-value pairs. Use this format: "Key1"="Value1","Key2"="Value2".

  • Token format: Text or JSON. If you select JSON, also specify the Token field name.

  • Token field name: The name of the field in the JSON response that contains the token. This field appears only when the Token format is JSON.

  • Token type: AWS Signature Version 4 request, Google OAuth 2.0 access token, ID token, JSON Web Token (JWT), or SAML 2.0.
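To make the HTTP headers format and the Token format options more concrete, the following sketch parses a header string in the documented "Key"="Value" format and extracts a token from a text or JSON response. The function names and sample responses are hypothetical, and the parsing rules are an assumption about how such a string could be interpreted, not a description of the product's own parser.

```python
import json
import re

# Matches one "Key"="Value" pair in the documented header format.
HEADER_PAIR = re.compile(r'"([^"]*)"\s*=\s*"([^"]*)"')


def parse_headers(text):
    """Turn '"Key1"="Value1","Key2"="Value2"' into a dictionary."""
    return dict(HEADER_PAIR.findall(text))


def extract_token(response_text, token_format, token_field_name=None):
    """Pull the token out of a token URL response (hypothetical helper)."""
    fmt = token_format.lower()
    if fmt == "text":
        # The whole response body is the token.
        return response_text.strip()
    if fmt == "json":
        if not token_field_name:
            raise ValueError("Token field name is required when the format is JSON")
        return json.loads(response_text)[token_field_name]
    raise ValueError("Token format must be Text or JSON")


# Sample inputs only; no request is sent.
print(parse_headers('"Accept"="application/json","X-Request-Source"="connection"'))
print(extract_token("abc123\n", "Text"))
print(extract_token('{"id_token": "abc123", "expires_in": 3600}', "JSON", "id_token"))
```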

For Credentials, you can use secrets if a vault is configured for the platform and the service supports vaults. For information, see Using secrets from vaults in connections.

Federal Information Processing Standards (FIPS) compliance

This connection is FIPS-compliant and can be used on a FIPS-enabled cluster.

Supported file types

The Google Cloud Storage connection supports these file types: Avro, CSV, Delimited text, Excel, JSON, ORC, Parquet, SAS, SAV, SHP, and XML.

Table formats

The Google Cloud Storage connection supports these Data Lake table formats: Delta Lake and Iceberg.

Learn more