Connecting to a Presto (Java) server

Presto (Java) CLI provides a terminal-based interactive shell to run queries. You can connect to Presto (Java) server either through Presto (Java) CLI installed as part of the ibm-lh-client package or through Presto (Java) CLI installed separately.

watsonx.data on Red Hat® OpenShift®

To connect to a Presto (Java) Server from a client program or CLI, the following items are required:

  • Hostname and port for the Presto (Java) server or workstation where the IBM® watsonx.data developer or stand-alone is installed.
  • Certificates served by the Presto (Java) server to establish trust.
  • Authorized user credentials to access the Presto (Java) server.
Note: It is important to connect to the Presto (Java) server that uses its hostname and not its IP address. Because the TLS certificate that is served by the Presto (Java) Server is associated with a fully qualified host and domain-name (FQDN). Client programs, typically, cannot establish trust by using the IP address. With OpenShift Ingress in particular, DNS entries, based on hostnames, play an important role in routing to the intended Kubernetes Service.
Tip: To confirm there is network access from your client workstation that needs to connect a Presto (Java) server you can test the access by using one of the following commands:
 curl -ki https://<presto-hostname>:<presto-portnumber>
nc -v <presto-hostname> <presto-portnumber>
  • The Presto (Java) server exposes a HTTPs (TCP) port. Therefore, you can use any convenient HTTP or TCP-based utility to ensure that the Presto (Java) server can be accessed from your network.
  • When you run the curl command, the server may return HTTP/1.1 401 Unauthorized response. This is expected as the server is secured by authentication.
  • When you run the nc command, the server returns a success message.

In watsonx.data, you can connect to the Presto (Java) server in multiple ways based on the platform and utilities you are using. See the following sections for more details:

Important: You must specify the location when creating schema using CLI. For example,
location = s3a://<bucket-name>/

Using built-in presto-cli in the developer edition

To connect to the built-in presto-cli in the developer edition:

  1. Open a command prompt in the workstation where IBM watsonx.data developer is installed.
  2. Go to the /bin folder of the installation directory and run the presto-cli.sh command.
  3. Inspect available catalogs in the watsonx.data.
    presto> show catalogs;
        Catalog    
    ---------------
     iceberg_minio 
     jmx           
     system        
     tpcds         
     tpch          
    (5 rows)
    
    Query 20230308_195425_00000_hy4cs, FINISHED, 1 node
    Splits: 19 total, 19 done (100.00%)
    0:02 [0 rows, 0B] [0 rows/s, 0B/s]
    

Using presto-cli and presto-run in the ibm-lh-client package

For details and instructions related to using presto-cli and presto-run in the ibm-lh-client package, see Installing ibm-lh-client .

Using presto-cli executable (remote) – Developer edition

This section provides instructions to connect Presto (Java) server in watsonx.data developer edition from a remote presto-cli.

Note:

You can download the presto-cli executable from https://prestodb.io/docs/current/installation/cli.html.

  1. Get the hostname and port of the server or workstation where the watsonx.datadeveloper package is installed.
  2. Import the certificate of the Presto (Java) server to establish trust. You can import the certificate either by using the cert-mgmt utility available in a watsonx.data client installation or manually. Based on your setup, complete one of the following steps:

    1. From your client machine, point or copy the truststore or certificate file present in the ibm-lh-client installation directory. The certificates are at ibm-lh-client/localstorage/volumes/infra/tls/. In this location, you can find:
      • truststore.jks - can be used for Java/JDBC based applications.
      • cabundle.crt - can be used for non-Java applications.
      • aliases/ - is a subdirectory with individual certificates. For example, aliases/<presto server hostname>: 8443.crt refers to the certificate for a specific Presto (Java) server only.

      For example, bin/cert-mgmt --op=add --host=<hostname of the Presto (Java) server> --port=8443

    2. Manually import certificates into the truststore. For instructions, see Importing self-signed certificates from a Presto (Java) server to a Java truststore.

  3. Authenticate with the Presto (Java) server.

    ./presto --server <https://hostname:port> --keystore-path /<directory>/<certificate file>.jks --keystore-password <keystore password> --truststore-path /<directory>/lh-ssl-ks.jks --truststore-password <truststore password> --catalog iceberg_data --user <username> --password <password>
    Note: The default credential for watsonx.data developer edition username - ibmlhadmin with password - password. However, if a different password was used while installing the developer edition or later changed by using the ibm-lh-dev/bin/user-mgmt change-password command, use the appropriate username and password.

    The ibm-lh-dev/bin/user-mgmt utility can be used to add additional users and ibm-lh-dev/bin/role-mgmt can be used to grant such users either an Admin or User Role.

    Provide the credentials values when connecting to Presto (Java) server in the developer installation from a client. No API Keys or tokens are required or available.

Using presto-cli executable (remote)- stand-alone

This section provides instructions to connect Presto (Java) server in watsonx.data stand-alone edition from a remote presto-cli.

Before you get started, ensure the following:
  • the Red Hat OpenShift Project namespace administrator or cluster administrator has exposed a route to the Presto (Java) server. The administrators should refer to Exposing secure route to Presto (Java) server for instructions.

  • you have received the hostname and port (by default 443) associated with the specific Red Hat OpenShift route from the administrator.

  1. Import the certificate of the Presto (Java) server to establish trust. You can import the certificate either by using the cert-mgmt utility or manually. Based on your setup, complete one of the following steps:

    Note: Presto (Java) servers in watsonx.data use TLS for securing access from clients. Sometimes, the Presto (Java) server may be associated with a Self-signed certificate instead of Certificate Authority (CA) signed certificate. In such cases, the client programs that connect via the HTTPs (TLS) protocol must be configured to trust the self-signed certificate configured in the Presto (Java) Server.
    1. If you are connecting from the watsonx.data utilities, go to bin/cert-mgmt and run the following command:
      bin/cert-mgmt --op=add --host=<hostname of the Presto (Java) server> --port=443
      You see the following output:
      /mnt/infra/tls/truststore.jks updated
      	/mnt/infra/tls/cabundle.crt updated
      Note: The ibm-lh-client installation includes utilities and packages to connect to Presto (Java) engines in watsonx.data software deployments. The client installation includes the bin/cert-mgmt executable that simplifies fetching the certificate that is used with a Presto (Java) server. It stores such certificates in a Java (JKS) format truststore for Java based clients and also in PEM format for use with nonjava clients.

      The presto-cli and presto-run watsonx.data (ibm-lh-client) utilities are Java-based and automatically use this JKS truststore. JDBC programs (such as those run via the bin/dev-sandbox utility) must be configured to explicitly use this JKS truststore. The cabundle.crt must be used by non-Java programs, such as Python scripts.

    2. If you are connecting from programs or utilities that run outside the watsonx.data client installation, do one of the following:

      i. From your client machine, point or copy the truststore or certificate file present in the ibm-lh-client installation directory. The certificates are at ibm-lh-client/localstorage/volumes/infra/tls/. In this location, you can find:

      truststore.jks- can be used for Java/JDBC based applications.

      cabundle.crt - can be used for non-Java applications.

      aliases/ - is a subdirectory with individual certificates. For example, aliases/<Presto (Java) server hostname>:443.crt refers to the certificate for a specific Presto (Java) server only.

      ii. Manually import certificates into the truststore. For instructions, see Importing self-signed certificates from a Presto (Java) server to a Java truststore.

  2. Authenticate with the Presto (Java) server.

    ./presto --server https://<route_name> --truststore-path /<directory>/<certfilename/truststore>.jks --truststore-password changeit --catalog iceberg_data --user admin --password
    Note:
    • The default out-of-the-box user for watsonx.data stand-alone is ‘admin’ with a generated password that is accessible by the ibm-lakehouse-manage get-cpd-instance-details command after installation. See Installing the stand-alone watsonx.data for more details.

    • -—truststore-path and -—truststore-password arguments are optional if the certificates were imported into the default JDK/JRE trust store already

    The Administrator of the watsonx.data instance can configure different Identity Providers to authenticate users or grant them appropriate Admin or User Roles. See the “Managing Users” topic for more details.

    In addition to directly using their credentials as username and password properties to connect to the Presto (Java) server, users can choose an API Key as well. See Generating API keys for authentication for details on how to generate such API Keys after signing in to watsonx.data from a browser. See, Generating API keys for authentication for details for how to use API Keys or Tokens from client applications instead of credentials.

  3. When connecting to the watsonx.data Presto (Java) servers, the credential properties to set would then be:

    username: username, password: <password>

    Or

    username: ibmlhapikey, password:<base64 of username:apikey>

    Or

    username: ibmlhtoken, password:<platform token> or < instance token >

Using Java/JDBC – Developer edition

This section provides instructions to connect to a Presto (Java) server from a Java JDBC application.

Before you get started, get the hostname and port of the server or workstation where the watsonx.data developer package is installed.

Note: The default Presto (Java) port number for the developer edition is 8443. However, if a different port number was picked during the installation, use the appropriate port number.

To connect with watsonx.data Presto (Java) server from a Java JDBC appliation, complete the following steps:

  1. If required, complete the instructions in Using presto-cli executable (remote) – Developer to get the hostname, port, server certificate, authentication, and test the connectivity.

  2. Download and install the presto-jdbc-0.286.jar (or later) on the client machine.

  3. Add presto-jdbc-0.286.jar (or later) to the class path of your Java application.

  4. Create a Java application by using a JDBC interface. Following is a sample JDBC code.

    Note:
    • presto_url - Identifies the jdbc URL to the Presto (Java) server

    • SSLTrustStorePath and SSLTrustStorePassword properties are set to identify the location of the truststore and its password (if necessary).

    • Values for user and password properties are the watsonx.data developer edition username and password.

    Example: Java snippet

    import java.sql.*;
    import java.util.Properties;
    
    public class PrestoJdbcSample
    {
    
        public static void main(String[] args) throws Exception
        {
    
            /**
             * example of fetching the location and credentials needed to connect, from environment variables
             **/
            String username = System.getenv("ENG_USERNAME");
            String password = System.getenv("ENG_PASSWORD");
            String hostname = System.getenv("ENG_HOST");
            String portnumber = System.getenv("ENG_PORT");
    
            String presto_url = "jdbc:presto://" + hostname + ":" + portnumber;
    
            String ts_location =  System.getenv("TRUSTSTORE");
                // example: “ibm-lh-client/localstorage/volumes/infra/tls/truststore.jks";
    
            String ts_password =  System.getenv("TRUSTSTORE_PASSWORD");
            // example: "changeit";
    
            Connection connection = null;
            Statement statement = null;
            ResultSet resultSet = null;
    
            try
            {
                /** load the Presto (Java) JDBC Driver class **/
                String driverClass = "com.facebook.presto.jdbc.PrestoDriver";
                Class.forName(driverClass);
    
                /** Set the connection properties**/
                Properties properties = new Properties();
                properties.setProperty("user", username);
                properties.setProperty("password", password);
                properties.setProperty("SSL", "true");
    
                /**
                 * identify where the java truststore is located.
                 * skip if the default JDK truststore already has the Presto (Java) Server's certificate
                 * or if the Presto (Java) server has signed certificates
                 **/
                properties.setProperty("SSLTrustStorePath", ts_location);
                properties.setProperty("SSLTrustStorePassword", ts_password);
    
                /** Connect **/
                connection = DriverManager.getConnection(presto_url, properties);
                /** Issue a Query **/
                String query = "SELECT * FROM tpch.tiny.customer LIMIT 10";
                statement = connection.createStatement();
                resultSet = statement.executeQuery(query);
    
                /** iterate through the results **/
                while (resultSet.next())
                {
                    String phone = resultSet.getString("phone");
                    String name = resultSet.getString("name");
                    System.out.println("phone = " + phone + ", name = " + name);
                }
            }
            catch (Exception e)
            {
                e.printStackTrace();
            }
            finally
            {
                /** clean up at the end always **/
                if (resultSet != null)
                {
                    resultSet.close();
                }
                if (statement != null)
                {
                    statement.close();
                }
                if (connection != null)
                {
                    connection.close();
                }
            }
    
        }
    }
    

Using Java/JDBC – stand-alone

This section provides instructions to connect to a Presto (Java) server from a Java JDBC application.

Before you get started, ensure the following:
  • the Red Hat OpenShift Project namespace administrator or cluster administrator has exposed the route to the Presto (Java) server. The administrators should refer to Exposing secure route to Presto (Java) server for instructions.

  • you have the received the hostname and port (by default 443) associated with that specific Red Hat OpenShift route from the administrator.

To connect with watsonx.data Presto (Java) server from a Java JDBC application, complete the following steps:

  1. If required, complete the instructions in Using presto-cli executable (remote) - stand-alone to get the hostname, port, server certificate, authentication, and test the connectivity.

  2. Download and install the presto-jdbc-0.286.jar (or later) on the client machine.

  3. Add presto-jdbc-0.286.jar (or later) to the class path of your Java application.

  4. Create a Java application by using JDBC interface. Following is a sample JDBC code.

    Note:
    • presto_url - Identifies the jdbc URL to the Presto (Java) server

    • SSLTrustStorePath and SSLTrustStorePassword properties are set to identify the location of the truststore and its password (if necessary).

    • Values for user and password properties are the watsonx.data stand-alone edition username and password.

    import java.sql.*;
    import java.util.Properties;
    
    public class PrestoJdbcSample
    {
    
        public static void main(String[] args) throws Exception
        {
    
            /**
             * example of fetching the location and credentials needed to connect, from environment variables
             **/
            String username = System.getenv("ENG_USERNAME");
            String password = System.getenv("ENG_PASSWORD");
            String hostname = System.getenv("ENG_HOST");
            String portnumber = System.getenv("ENG_PORT");
    
            String presto_url = "jdbc:presto://" + hostname + ":" + portnumber;
    
            String ts_location =  System.getenv("TRUSTSTORE");
                // example: “ibm-lh-client/localstorage/volumes/infra/tls/truststore.jks";
    
            String ts_password =  System.getenv("TRUSTSTORE_PASSWORD");
            // example: "changeit";
    
            Connection connection = null;
            Statement statement = null;
            ResultSet resultSet = null;
    
            try
            {
                /** load the Presto (Java) JDBC Driver class **/
                String driverClass = "com.facebook.presto.jdbc.PrestoDriver";
                Class.forName(driverClass);
    
                /** Set the connection properties**/
                Properties properties = new Properties();
                properties.setProperty("user", username);
                properties.setProperty("password", password);
                properties.setProperty("SSL", "true");
    
                /**
                 * identify where the java truststore is located.
                 * skip if the default JDK truststore already has the Presto (Java) Server's certificate
                 * or if the Presto (Java) server has signed certificates
                 **/
                properties.setProperty("SSLTrustStorePath", ts_location);
                properties.setProperty("SSLTrustStorePassword", ts_password);
    
                /** Connect **/
                connection = DriverManager.getConnection(presto_url, properties);
                /** Issue a Query **/
                String query = "SELECT * FROM tpch.tiny.customer LIMIT 10";
                statement = connection.createStatement();
                resultSet = statement.executeQuery(query);
    
                /** iterate through the results **/
                while (resultSet.next())
                {
                    String phone = resultSet.getString("phone");
                    String name = resultSet.getString("name");
                    System.out.println("phone = " + phone + ", name = " + name);
                }
            }
            catch (Exception e)
            {
                e.printStackTrace();
            }
            finally
            {
                /** clean up at the end always **/
                if (resultSet != null)
                {
                    resultSet.close();
                }
                if (statement != null)
                {
                    statement.close();
                }
                if (connection != null)
                {
                    connection.close();
                }
            }
    
        }
    }
    

Using Python scripts – Developer edition

This section provides instructions to connect to a Presto (Java) server from a Python client.

Before you get started, get the hostname and port of the server or workstation where the watsonx.data developer package is installed.

Note: The default Presto (Java) port number for the developer edition is 8443. However, if a different port number was picked during the installation, use the appropriate port number.

To connect with watsonx.data Presto (Java) server from a Python client, complete the following steps:

  1. Install python 3.x (3.10 or later recommended) & pip3 on your client workstation.

  2. Install common python modules to connect to Presto (Java).

    pip3 install SQLAlchemy 'pyhive[presto]' presto-python-client
  3. If required, complete the instructions in Using presto-cli executable (remote) – Developer edition to get the hostname, port, server certificate, authentication, and test the connectivity.

  4. Start the sandbox container for the registered Presto (Java) engine.

    ibm-lh-client/bin/dev-sandbox --engine=demo-b
  5. In the bash prompt, install the prestodb module.

    export HOME=/tmp
    pip3 install SQLAlchemy 'pyhive[presto]' presto-python-client
    Important: Steps 4 and 5 are required only if you are not using the Prestodb module in ibm-lh-client. Also, currently due to an issue with the Prestodb module in ibm-lh-client, you must complete steps 4 and 5.
  6. Use the DBAPI interface to query the Presto (Java) server. Following is a sample Python script:

    Note:
    • See how the "ENG_" environment variables are examples of how the relevant Presto (Java) server locations can be used with the dbapi.connect function call.

    • An example of how to point to the location of the Presto (Java) server’s certificate file is represented by the cert_location variable.

    • The conn._http_session.verify = cert _location property is used to point to the certificate location (for self-signed cases only).

    import os
    import prestodb
    
    username=os.environ["ENG_USERNAME"]
    password=os.environ["ENG_PASSWORD"]
    hostname=os.environ["ENG_HOST"]
    portnumber=os.environ["ENG_PORT"]
    cert_location='./ibm-lh-client/localstorage/volumes/infra/tls/aliases/' + hostname + ':' + portnumber + '.crt'
    
    with prestodb.dbapi.connect(
    host=hostname,
    port=portnumber,
    user=username,
    catalog='tpch',
    schema='tiny',
    http_scheme='https',
    auth=prestodb.auth.BasicAuthentication(username,password)
    ) as conn:
      conn._http_session.verify = cert_location
      cur = conn.cursor()
      cur.execute('select * from tpch.tiny.customer limit 10')
      rows = cur.fetchall()
      print(rows)

Using Python scripts– stand-alone

This section provides instructions to connect to a Presto (Java) server from a Python client.

Before you get started, ensure the following:
  • the Red Hat OpenShift Project namespace administrator or cluster administrator has exposed the route to the Presto (Java) Server. The administrators should refer to Exposing secure route to Presto (Java) server for instructions.

  • you have the received the hostname and port (by default 443) associated with that specific Red Hat OpenShift route from the administrator.

To connect with watsonx.data Presto (Java) server from a Python client, complete the following steps:

  1. Install python 3.x (3.10 or later recommended) and pip3 on your client workstation.

  2. Install common python modules to connect to Presto (Java).

    pip3 install SQLAlchemy 'pyhive[presto]' presto-python-client
  3. If required, complete the instructions in Using presto-cli executable (remote)- stand-alone to get the hostname, port, server certificate, authentication, and test the connectivity.

  4. Start the sandbox container for the registered Presto (Java) engine.

    ibm-lh-client/bin/dev-sandbox --engine=demo-b
  5. In the bash prompt, install the prestodb module.

    export HOME=/tmp
    pip3 install SQLAlchemy 'pyhive[presto]' presto-python-client
    Important: Steps 4 and 5 are required only if you are not using the Prestodb module in ibm-lh-client. Also, currently due to an issue with the Prestodb module in ibm-lh-client, you must complete steps 4 and 5.
  6. Use the DBAPI interface to query the Presto (Java) server. Following is a sample Python script:

    Note:
    • See how the "ENG_" environment variables are examples of how the relevant Presto (Java) server locations can be used with the dbapi.connect function call.

    • An example of how to point to the location of the Presto (Java) server’s certificate file is represented by the cert_location variable.

    • The conn._http_session.verify = cert _location property is used to point to the certificate location (for self-signed cases only).

    import os
    import prestodb
    
    username=os.environ["ENG_USERNAME"]
    password=os.environ["ENG_PASSWORD"]
    hostname=os.environ["ENG_HOST"]
    portnumber=os.environ["ENG_PORT"]
    cert_location='./ibm-lh-client/localstorage/volumes/infra/tls/aliases/' + hostname + ':' + portnumber + '.crt'
    
    with prestodb.dbapi.connect(
    host=hostname,
    port=portnumber,
    user=username,
    catalog='tpch',
    schema='tiny',
    http_scheme='https',
    auth=prestodb.auth.BasicAuthentication(username,password)
    ) as conn:
      conn._http_session.verify = cert_location
      cur = conn.cursor()
      cur.execute('select * from tpch.tiny.customer limit 10')
      rows = cur.fetchall()
      print(rows)