Applying cluster HTTP proxy settings to IBM Cloud Pak for Data

If you use an HTTP proxy to manage inbound traffic to and outbound traffic from your Red Hat® OpenShift® Container Platform cluster, you can use the cpd-cli to apply the same HTTP proxy settings to an instance of IBM Cloud Pak for Data.

Who needs to complete this task?

To complete this task, you must have one of the following roles:

  • Cluster administrator
  • Instance administrator

When do you need to complete this task?

Complete this task if you want to apply your cluster-level HTTP proxy settings to an instance of Cloud Pak for Data.

Repeat as needed: Repeat this task in the following situations:
  • You have multiple instances of Cloud Pak for Data on the cluster where you want to apply the cluster-level HTTP proxy settings.
  • You install a new service on an instance of Cloud Pak for Data that is configured to use the cluster-level HTTP proxy settings.
  • You create a new service instance on an instance of Cloud Pak for Data that is configured to use the cluster-level HTTP proxy settings.

Before you begin

Important: The cpd-cli does not set up a proxy server for use with IBM Cloud Pak for Data. You must have an existing proxy server for the HTTP proxy configuration to take effect.

A cluster administrator must install and enable the resource specification injection (RSI) webhook before you can apply the cluster-level HTTP proxy settings to Cloud Pak for Data.

To check whether the RSI webhook is installed, run the following command:

oc get mutatingwebhookconfiguration -n ${PROJECT_CPD_INST_OPERANDS} | grep rsi-webhook-cfg

Best practice: You can run many of the commands in this task exactly as written if you set up environment variables for your installation. For instructions, see Setting up installation environment variables.

Ensure that you source the environment variables before you run the commands in this task.
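
The two checks in this section can be combined into a small preflight sketch. The function names check_required_vars and rsi_webhook_installed are hypothetical helpers, not part of the cpd-cli. The variable names are the ones that the commands in this task use, and which of them you need depends on your proxy server.

```shell
# Hypothetical helper: report every named variable that is unset or empty.
# Returns 1 if at least one variable is missing, 0 otherwise.
check_required_vars() {
  missing=0
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Hypothetical helper: succeed only if the RSI webhook configuration is listed.
# Uses the same oc and grep commands as the check in this section.
rsi_webhook_installed() {
  oc get mutatingwebhookconfiguration -n "${PROJECT_CPD_INST_OPERANDS}" \
    | grep -q rsi-webhook-cfg
}

# Example usage (assumes a proxy server that needs a host name only):
#   check_required_vars PROJECT_CPD_INST_OPERANDS PROXY_HOST \
#     && rsi_webhook_installed \
#     && echo "Ready to apply proxy settings"
```

Add PROXY_PORT, PROXY_USER, and PROXY_PASSWORD to the list of names if your proxy server requires them.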

About this task

If you have a cluster-wide proxy configuration, you can use the create-proxy-config and enable-proxy commands to apply your HTTP proxy configuration to an instance of Cloud Pak for Data.

The create-proxy-config command generates a configuration that supports the following types of connections:

  • httpProxy
  • httpsProxy
  • noProxy
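
For comparison, you can read the same three fields from the OpenShift Proxy object named cluster, which holds the cluster-wide proxy settings. The following is a minimal sketch. The function name show_cluster_proxy is a hypothetical helper, and whether create-proxy-config reads this object itself is not stated in this task, so treat the output as a reference only.

```shell
# Hypothetical helper: print the cluster-wide proxy fields, one per line.
# Reads spec.httpProxy, spec.httpsProxy, and spec.noProxy from proxy/cluster.
show_cluster_proxy() {
  for field in httpProxy httpsProxy noProxy; do
    value=$(oc get proxy/cluster -o jsonpath="{.spec.${field}}")
    echo "${field}=${value}"
  done
}

# Example usage (requires an active oc login):
#   show_cluster_proxy
```

A field that is not set on the cluster is printed with an empty value.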

The following services support the cluster-wide HTTP proxy configuration:

Service | Supports HTTP proxy configuration | Details
AI Factsheets | |
  • 5.0.0 Not supported
Anaconda Repository for IBM Cloud Pak for Data | Not applicable. |
Analytics Engine powered by Apache Spark | |
Cognos Analytics | |
Cognos Dashboards | |
Data Gate | No. |
Data Privacy | |
Data Product Hub | |
Data Refinery | No. |
Data Replication | No. |
DataStage | |
Data Virtualization | |
Db2 | Not applicable. | This service does not have outbound connections.
Db2 Big SQL | |
Db2 Data Management Console | |
Db2 Warehouse | Not applicable. | This service does not have outbound connections.
Decision Optimization | |
EDB Postgres | No. |
Execution Engine for Apache Hadoop | No. |
IBM Knowledge Catalog | No. |
IBM Knowledge Catalog Premium | No. |
IBM Knowledge Catalog Standard | No. |
IBM Match 360 with Watson | |
Informix | Not applicable. | This service does not have outbound connections.
MANTA Automated Data Lineage | No. |
MongoDB | No. |
OpenPages | |
Orchestration Pipelines | |
Planning Analytics | No. |
Product Master | |
RStudio® Server Runtimes | |
SPSS Modeler | |
Synthetic Data Generator | |
Voice Gateway | No. |
Watson Discovery | |
Watson Machine Learning | |
Watson Machine Learning Accelerator | |
Watson OpenScale | |
Watson Speech services | No. |
Watson Studio | |
Watson Studio Runtimes | |
watsonx Assistant | No. |
watsonx.ai | |
watsonx Code Assistant for Red Hat Ansible® Lightspeed | Not applicable. | This service does not have outbound connections.
watsonx Code Assistant for Z | Not applicable. | This service does not have outbound connections.
watsonx.data | |
watsonx.governance | |
watsonx Orchestrate | No. |

Procedure

  1. Log the cpd-cli in to the Red Hat OpenShift Container Platform cluster:
    ${CPDM_OC_LOGIN}
    Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.
  2. Create the proxy configuration resources:

    Run the appropriate command based on the information that you need to specify to connect to your proxy server:

    Proxy servers with no authentication

    Hostname only
    cpd-cli manage create-proxy-config \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --proxy_host=${PROXY_HOST}

    Hostname and port number
    cpd-cli manage create-proxy-config \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --proxy_host=${PROXY_HOST} \
    --proxy_port=${PROXY_PORT}

    Proxy servers that require authentication

    Hostname only
    cpd-cli manage create-proxy-config \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --proxy_host=${PROXY_HOST} \
    --proxy_user=${PROXY_USER} \
    --proxy_password=${PROXY_PASSWORD}

    Hostname and port number
    cpd-cli manage create-proxy-config \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --proxy_host=${PROXY_HOST} \
    --proxy_port=${PROXY_PORT} \
    --proxy_user=${PROXY_USER} \
    --proxy_password=${PROXY_PASSWORD}

  3. Apply the proxy configuration to the instance of Cloud Pak for Data.
    Important: When you run this command, StatefulSets, ReplicaSets, ReplicationControllers, Jobs, and CronJobs are restarted.

    If necessary, run this command during a maintenance window to prevent unexpected downtime.

    cpd-cli manage enable-proxy \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS}
  4. Wait for all of the pods to restart:
    oc get deployment,replicaset,job,cronjob,statefulset,replicationcontroller \
    -n=${PROJECT_CPD_INST_OPERANDS} \
    -o=json \
    | jq '.items[].metadata|select(.annotations."resourcespecinjector.ibm.com/injection_status"=="patch in progress")|.name' \
    | wc -l

    Wait for the command to return 0. Do not complete any tasks in Cloud Pak for Data before the pods are restarted.

    If any pods are stuck, delete the pods to restart them.

  5. Confirm that no pods are in the Error state:
    oc get pod -n ${PROJECT_CPD_INST_OPERANDS} | \
    grep -Ev '0/0|1/1|2/2|3/3|4/4|5/5|6/6|7/7|8/8|9/9|Complete'
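
Steps 2 and 4 of the procedure can be scripted along the following lines. This is a sketch under stated assumptions, not an official wrapper. The function names create_proxy_config, count_in_progress, and wait_for_injection are hypothetical, as is the CPD_CLI override variable, which exists only so that the command can be swapped out for a dry run. The cpd-cli options and the oc and jq pipeline are taken from the steps above.

```shell
# Hypothetical helper for step 2: choose the create-proxy-config options
# from whichever of PROXY_PORT and PROXY_USER are set, then run the command.
create_proxy_config() {
  set -- --cpd_instance_ns="${PROJECT_CPD_INST_OPERANDS}" \
         --proxy_host="${PROXY_HOST}"
  if [ -n "${PROXY_PORT:-}" ]; then
    set -- "$@" --proxy_port="${PROXY_PORT}"
  fi
  if [ -n "${PROXY_USER:-}" ]; then
    set -- "$@" --proxy_user="${PROXY_USER}" \
                --proxy_password="${PROXY_PASSWORD}"
  fi
  # CPD_CLI is an assumption of this sketch; it defaults to cpd-cli.
  ${CPD_CLI:-cpd-cli} manage create-proxy-config "$@"
}

# Hypothetical helper for step 4: count the resources that are still
# marked "patch in progress" (same pipeline as in step 4).
count_in_progress() {
  oc get deployment,replicaset,job,cronjob,statefulset,replicationcontroller \
    -n="${PROJECT_CPD_INST_OPERANDS}" \
    -o=json \
    | jq '.items[].metadata|select(.annotations."resourcespecinjector.ibm.com/injection_status"=="patch in progress")|.name' \
    | wc -l \
    | tr -d '[:space:]'
}

# Hypothetical helper: poll until the count reaches 0.
# $1 = seconds between checks (default 30), $2 = maximum checks (default 60).
# Returns 1 if resources are still in progress after the last check.
wait_for_injection() {
  interval="${1:-30}"
  tries="${2:-60}"
  while [ "$tries" -gt 0 ]; do
    [ "$(count_in_progress)" -eq 0 ] && return 0
    tries=$((tries - 1))
    sleep "$interval"
  done
  return 1
}

# Example usage:
#   create_proxy_config
#   cpd-cli manage enable-proxy --cpd_instance_ns="${PROJECT_CPD_INST_OPERANDS}"
#   wait_for_injection 30 60 || echo "Some resources are still being patched"
```

Setting CPD_CLI=echo prints the create-proxy-config command instead of running it, which is a quick way to confirm which variant was selected. Be aware that the printed command includes the proxy password if PROXY_USER is set.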

What to do next

watsonx.data users only: If your Cloud Pak for Data deployment is in a restricted network, you must add the URL of the embedded MinIO object store to the allowlist for your proxy server.

This configuration enables Hive Metastore (HMS) and Presto to communicate.

For more information about obtaining the MinIO object store URL, see Accessing the MinIO console in the watsonx.data product documentation.