To extend the power of the Hortonworks Data Platform (HDP), install and deploy
Big SQL, which is the IBM SQL interface to the Hadoop-based platform. Big SQL provides support for
HDP 2.5.x, 2.6.2, and 2.6.3.
Before you begin
Make sure that you complete the prerequisite steps listed in step 5 of Preparing to install IBM Big SQL, and enable the Big SQL extension with the
EnableBigSQLExtension.py script, as described in Enabling the Big SQL extension.
Remember, you must install Big SQL with at least two nodes in the cluster.
Tip: The Big SQL service binaries are installed from a yum repository. To check the URL of the
repository used to install Big SQL, follow these steps:
- From the Ambari UI, in the menu bar, click .
- From the Clusters panel, click .
- If needed, change the repository URL, and click Save.
Restriction:
If you use YARN to manage your Big SQL service:
- You must install the Big SQL service with at least two nodes in the cluster: one where the Big SQL
head resides, and another where a Big SQL worker resides. The Big SQL worker node must have a
NodeManager component of YARN installed.
- A Big SQL worker node must be installed on all hosts that have a NodeManager component.
- When installing Big SQL, ensure that the NodeManager component is not installed on either the
head node or the secondary head node, because this is not a supported configuration. It is fine to
install the ResourceManager on the same node.
Procedure
- Open a browser and access the Ambari server dashboard.
The following is the default URL:
http://<server-name>:<server-port>
The default user name is admin, and the default password is admin. Use HTTPS if your
Ambari server has SSL enabled for the Ambari Web UI.
- In the Ambari web interface, click .
- In the Add Service Wizard, select the Big SQL service and click Next.
- In the Assign Masters page, decide which nodes of your cluster you
want to run the specified components on, or accept the default nodes. Click
Next.
- In the Assign Slaves and Clients page, make specific assignments
for your nodes. Do not select the Big SQL Head Node as a worker node. You must select at least one
node as the worker node.
- In the Customize Services page, accept the default configurations for
the Big SQL service, or customize the configuration by expanding the configuration files and
modifying the values. In the bigsql-env configuration group, you can enter a
comma-separated list of mounted database directories, or enter /hadoop/bigsql
if you plan to use only one disk. In the bigsql-users-env configuration group,
enter the current Ambari administrator username and password, and enter the desired password for
the Big SQL user account. If you want to change the default username for this Big SQL service
account, select the Misc tab in the Customize Services page.
- You can review your selections in the Review page
before accepting them. If you want to modify any values, click the Back button.
If you are satisfied with your setup, click Deploy.
- In the Install, Start and Test page, the Big SQL service is installed
and verified. If you have multiple nodes, you can see the progress on each node. When the
installation is complete, you can view any errors or warnings by clicking the link. If there are no
installation errors, click Next in the Ambari web UI to see a
summary and the new service added to the list of services.
If the Big SQL service fails to install, review the errors, correct the
problems, and click Retry in the Ambari UI. You can find additional
installation log files in /tmp/<bigsql_user>/logs.
If the install still fails, you must execute the fullBigSqlCleanup
script:
/var/lib/ambari-server/resources/extensions/IBM-Big_SQL/5.0.1.0/services/BIGSQL/package/scripts/fullBigSqlCleanup.sh -u <ambari_user> -p <ambari_pass> -x <ambari_port>
The following code is an example of the command:
/var/lib/ambari-server/resources/extensions/IBM-Big_SQL/5.0.1.0/services/BIGSQL/package/scripts/fullBigSqlCleanup.sh -u admin -p admin -x 8080
- The following properties in the hdfs-site.xml,
core-site.xml, and yarn-site.xml sections of the
configuration are updated for you by the installation of Big SQL. You can verify that these
properties are configured.
- In the Ambari web interface, click the HDFS service.
- Click the Configs tab and then the Advanced tab.
- Expand the section to see the following properties:
- Key: hadoop.proxyuser.bigsql.groups
- Value: *, or preferably set to one or more groups of users that the
bigsql user is allowed to impersonate.
- Key: hadoop.proxyuser.bigsql.hosts
- Value: Substitute with the fully qualified names of the hosts on which the Big SQL service is
installed. If Big SQL HA is enabled, this comma-delimited list must also contain the hostnames of
both Big SQL head nodes.
- Expand the section to see the following property:
- Key: dfs.namenode.acls.enabled
- Value: true
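As a reference, these entries correspond to configuration fragments like the following; the hostnames shown are placeholders for your own Big SQL hosts:

```xml
<!-- core-site.xml: allow the bigsql service user to impersonate end users -->
<property>
  <name>hadoop.proxyuser.bigsql.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.bigsql.hosts</name>
  <!-- placeholder hostnames; list every host where Big SQL is installed,
       including both head nodes if Big SQL HA is enabled -->
  <value>head.example.com,worker1.example.com,worker2.example.com</value>
</property>

<!-- hdfs-site.xml: HDFS ACLs must be enabled for Big SQL -->
<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>
```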
- In the Ambari web interface, click the YARN service.
- Click the Configs tab and then the Advanced tab.
Expand the Resource Manager section.
- Find the yarn.admin.acl property.
- In the Value text field for the yarn.admin.acl
property, find the bigsql user. It might look like the following value:
yarn,bigsql.
- Restart all services for which the Ambari web UI indicates a restart is required.
- For each service that requires a restart, select the service.
- Click Service Actions.
- Click Restart All.
- A web application interface for Big SQL monitoring and editing is available to your end users
to work with Big SQL. You access this monitoring utility from the Big SQL service.
- Restart the Knox Service. Also start
the Knox Demo LDAP service if you have not configured your own LDAP.
- If you want to run SQL statements from the Big SQL monitoring and editing tool, install DSM and use the quick links button on the
DSM service to access the console:
https://<knox_host>:<knox_port>/<knox_gateway_path>/default/dsm
Where:
- knox_host
- The host where Knox is installed and running
- knox_port
- The port where Knox is listening (by default this is 8443)
- knox_gateway_path
- The value entered in the gateway.path field in the Knox configuration (by
default this is 'gateway')
For example, the URL might look like the following address:
https://myhost.company.com:8443/gateway/default/dsm
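The assembly of this URL can be sketched in shell, using the example values above (the values are illustrative; substitute your own Knox settings):

```shell
# Build the DSM console URL from the Knox settings described above.
knox_host="myhost.company.com"   # host where Knox is installed and running
knox_port=8443                   # default Knox listening port
knox_gateway_path="gateway"      # default value of the gateway.path field
dsm_url="https://${knox_host}:${knox_port}/${knox_gateway_path}/default/dsm"
echo "$dsm_url"
```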
If you use the Knox Demo LDAP service, the default credential
is:
userid = guest
password = guest-password
To invalidate your session, click the Menu icon on the Big
SQL page and select Sign Out. You will need to re-authenticate
before being able to display the Big SQL page again.
Your end users can also use the JSqsh client,
which is a component of the Big SQL service.
- For HBase, do the following post-installation steps:
- For all nodes where HBase is installed, check that the
symlinks to hive-serde.jar and hive-common.jar in the hbase/lib directory
are valid.
- After installing the Big SQL service and fixing the symlinks, restart the HBase service from
the Ambari web interface.
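The symlink check described above can be sketched as follows; the HBase lib path shown is an assumption and may differ in your HDP layout:

```shell
# List broken symlinks (links whose targets no longer exist) in the HBase
# lib directory. /usr/hdp/current/hbase-client/lib is an assumed default.
HBASE_LIB="${HBASE_LIB:-/usr/hdp/current/hbase-client/lib}"

broken_symlinks() {
  # -xtype l matches symbolic links whose target does not resolve
  find "$1" -maxdepth 1 -xtype l 2>/dev/null
}

broken_symlinks "$HBASE_LIB"
```

Any paths printed are symlinks that need to be repointed before restarting HBase.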
- If you want to enable the automatic synching of catalog changes from Hive, open the Ambari web
interface, click and select Enable Metadata Sync.
This enables the automatic processing of Hive catalog changes in Big SQL.
- If you are planning to use the HA feature with Big SQL, ensure that the local user
"bigsql" exists in /etc/passwd on all nodes in the cluster.
- If you want to use the LOAD command to load data into Hadoop tables from files that are
compressed by using the lzo codec, do the following steps:
- Update the HDFS configuration.
- Click the HDFS service and open the tabs.
- Expand the Advanced core-site section.
- Edit the io.compression.codecs field and append the following
value to it:
com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec
- Restart the HDFS and Big SQL services.
- Make sure the following packages are installed on all nodes in the cluster:
- lzo
- lzop
- hadoop-lzo
- hadoop-lzo-native
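A sketch of checking these packages on an RPM-based node (typical for HDP on RHEL/CentOS) follows; the helper function name is illustrative:

```shell
# Report which of the required lzo packages are installed, using rpm queries.
check_packages() {
  local status=0
  local pkg
  for pkg in "$@"; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
      echo "ok: $pkg"
    else
      echo "missing: $pkg"
      status=1
    fi
  done
  return $status
}

# Report only; install any packages listed as missing before using lzo LOAD.
check_packages lzo lzop hadoop-lzo hadoop-lzo-native || true
```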
What to do next
After you add Big SQL worker nodes, make sure that you stop and then restart the
Hive service.
If the Big SQL check script shows that Big SQL cannot create an HBase table due to insufficient
privileges, then run the following command in the HBase shell:
grant 'bigsql', 'RWCA'
For post-installation tasks that you can do, see Post-installation configuration. For information about using Big SQL,
see Analyzing and manipulating big data with Big SQL.
If Big SQL did not install successfully and you do not want to try installing it again, you can
run the fullBigSQLCleanup.sh clean-up utility to restore your environment to its
previous state. For more information, see Clean-up utility.
Note: When you navigate to the Big SQL service page in the Ambari server UI, the
last action listed under Service Actions is
Delete Service. Do NOT click this option to delete Big SQL. Use of the
Delete Service option can leave the Big SQL service in an undefined state, which means that
subsequent attempts to reinstall the service might fail. If you want to remove the installed Big
SQL service, including all its artifacts, follow the instructions in
Removing IBM Big SQL.