To extend the power of the Hortonworks Data Platform (HDP), install and deploy
Big SQL, which is the IBM SQL interface to the Hadoop-based platform. Big SQL provides support for
HDP 2.5.x, 2.6.2, and 2.6.3.
Before you begin
Make sure that you complete the prerequisite steps listed in step 5 of Preparing to install IBM Big SQL, and enable the Big SQL extension with the
EnableBigSQLExtension.py script, as described in Enabling the Big SQL extension.
Remember, you must install Big SQL with at least two nodes in the cluster.
Tip: The Big SQL service binaries are installed from a yum repository. To check the URL of the
repository used to install Big SQL, follow these steps:
- From the Ambari UI, in the menu bar, click .
- From the Clusters panel, click .
- If needed, change the repository URL, and click Save.
Restriction:
If you use YARN to manage your Big SQL service:
- You must install the Big SQL service with at least two nodes in the cluster: one where the Big SQL
head resides, and another where a Big SQL worker resides. The Big SQL worker node must have a
NodeManager component of YARN installed.
- A Big SQL worker node must be installed on all hosts that have a NodeManager component.
- When installing Big SQL, ensure that the NodeManager component is not installed on either the
head node or the secondary head node, because this is not a supported configuration. It is fine to
install the ResourceManager on the same node.
Procedure
- Open a browser and access the Ambari server dashboard.
The following is the default URL:
http://<server-name>:<server-port>
The default user name is admin, and the default password is admin. Use HTTPS if your
Ambari server has SSL enabled for the Ambari Web UI.
- In the Ambari web interface, click .
- In the Add Service Wizard, select the Big SQL service and click Next.
- In the Assign Masters page, decide which nodes of your cluster you
want to run the specified components on, or accept the default nodes. Click
Next.
- In the Assign Slaves and Clients page, make specific assignments
for your nodes. Do not select the Big SQL Head Node as a worker node. You must select at least one
node as the worker node.
- In the Customize Services page, accept the default configurations for
the Big SQL service, or customize the configuration by expanding the configuration files and
modifying the values. In the bigsql-env configuration group, you can enter a
comma-separated list of mounted database directories, or enter /hadoop/bigsql
if you plan to use only one disk. In the bigsql-users-env configuration group,
enter the current Ambari administrator username and password, and enter the desired password for
the Big SQL user account. If you want to change the default username for this Big SQL service
account, select the Misc tab in the Customize Services page.
- You can review your selections in the Review page
before accepting them. If you want to modify any values, click the Back button.
If you are satisfied with your setup, click Deploy.
- In the Install, Start and Test page, the Big SQL service is installed
and verified. If you have multiple nodes, you can see the progress on each node. When the
installation is complete, you can view any errors or warnings by clicking the link. If there are no
installation errors, click Next in the Ambari web UI to see a
summary and the new service added to the list of services.
If the Big SQL service fails to install, review the errors, correct the
problems, and click Retry in the Ambari UI. You can find additional
installation log files in /tmp/<bigsql_user>/logs.
If the install still fails, you must execute the fullBigSqlCleanup
script:
/var/lib/ambari-server/resources/extensions/IBM-Big_SQL/5.0.1.0/services/BIGSQL/package/scripts/fullBigSqlCleanup.sh -u <ambari_user> -p <ambari_pass> -x <ambari_port>
The following code is an example of the command:
/var/lib/ambari-server/resources/extensions/IBM-Big_SQL/5.0.1.0/services/BIGSQL/package/scripts/fullBigSqlCleanup.sh -u admin -p admin -x 8080
- The following properties in the hdfs-site.xml,
core-site.xml, and yarn-site.xml sections of the
configuration are updated for you by the installation of Big SQL. You can verify that these
properties are configured.
- In the Ambari web interface, click the HDFS service.
- Click the Configs tab and then the Advanced tab.
- Expand the section to see the following properties:
- Key: hadoop.proxyuser.bigsql.groups
- Value: *, or preferably set to one or more groups of users that the
bigsql user is allowed to impersonate.
- Key: hadoop.proxyuser.bigsql.hosts
- Value: Substitute with the fully qualified names of the hosts on which the Big SQL service is
installed. If Big SQL HA is enabled, this comma-delimited list must also contain the hostnames of
both Big SQL head nodes.
- Expand the section to see the following property:
- Key: dfs.namenode.acls.enabled
- Value: true
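As a reference, these entries correspond to configuration fragments like the following; the hostnames shown are placeholders for your own Big SQL hosts:

```xml
<!-- core-site.xml: allow the bigsql service user to impersonate end users -->
<property>
  <name>hadoop.proxyuser.bigsql.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.bigsql.hosts</name>
  <!-- placeholder hostnames; list every host where Big SQL is installed,
       including both head nodes if Big SQL HA is enabled -->
  <value>head.example.com,worker1.example.com,worker2.example.com</value>
</property>

<!-- hdfs-site.xml: HDFS ACLs must be enabled for Big SQL -->
<property>
  <name>dfs.namenode.acls.enabled</name>
  <value>true</value>
</property>
```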
- In the Ambari web interface, click the YARN service.
- Click the Configs tab and then the Advanced tab.
Expand the Resource Manager section.
- Find the yarn.admin.acl property.
- In the Value text field for the yarn.admin.acl
property, find the bigsql user. It might look like the following value:
yarn,bigsql.
- Restart all services for which the Ambari web UI indicates a restart is required.
- For each service that requires a restart, select the service.
- Click Service Actions.
- Click Restart All.
- A web application interface for Big SQL monitoring and editing is available to your end users
to work with Big SQL. You access this monitoring utility from the Big SQL service.
- Restart the Knox Service. Also start
the Knox Demo LDAP service if you have not configured your own LDAP.
- If you want to run SQL statements from the Big SQL monitoring and editing tool, install DSM and use the quick links button on the
DSM service to access the console:
https://<knox_host>:<knox_port>/<knox_gateway_path>/default/dsm
Where:
- knox_host
- The host where Knox is installed and running
- knox_port
- The port where Knox is listening (by default this is 8443)
- knox_gateway_path
- The value entered in the gateway.path field in the Knox configuration (by
default this is 'gateway')
For example, the URL might look like the following address:
https://myhost.company.com:8443/gateway/default/dsm
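The assembly of this URL can be sketched in shell, using the example values above (the values are illustrative; substitute your own Knox settings):

```shell
# Build the DSM console URL from the Knox settings described above.
knox_host="myhost.company.com"   # host where Knox is installed and running
knox_port=8443                   # default Knox listening port
knox_gateway_path="gateway"      # default value of the gateway.path field
dsm_url="https://${knox_host}:${knox_port}/${knox_gateway_path}/default/dsm"
echo "$dsm_url"
```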
If you use the Knox Demo LDAP service, the default credential
is:
userid = guest
password = guest-password
To invalidate your session, click the Menu icon on the Big
SQL page and select Sign Out. You will need to re-authenticate
before being able to display the Big SQL page again.
Your end users can also use the JSqsh client,
which is a component of the Big SQL service.
- For HBase, do the following post-installation steps:
- For all nodes where HBase is installed, check that the
symlinks to hive-serde.jar and hive-common.jar in the hbase/lib directory
are valid.
- After installing the Big SQL service and fixing the symlinks, restart the HBase service from
the Ambari web interface.
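The symlink check described above can be sketched as follows; the HBase lib path shown is an assumption and may differ in your HDP layout:

```shell
# List broken symlinks (links whose targets no longer exist) in the HBase
# lib directory. /usr/hdp/current/hbase-client/lib is an assumed default.
HBASE_LIB="${HBASE_LIB:-/usr/hdp/current/hbase-client/lib}"

broken_symlinks() {
  # -xtype l matches symbolic links whose target does not resolve
  find "$1" -maxdepth 1 -xtype l 2>/dev/null
}

broken_symlinks "$HBASE_LIB"
```

Any paths printed are symlinks that need to be repointed before restarting HBase.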
- If you want to enable the automatic synching of catalog changes from Hive, open the Ambari web
interface, click and select Enable Metadata Sync.
This enables the automatic processing of Hive catalog changes in Big SQL.
- If you are planning to use the HA feature with Big SQL, ensure that the local user
"bigsql" exists in /etc/passwd on all nodes in the cluster.
- If you want to use the LOAD command to load data into Hadoop tables from files that are
compressed by using the lzo codec, do the following steps:
- Update the HDFS configuration.
- Click the HDFS service and open the tabs.
- Expand the Advanced core-site section.
- Edit the io.compression.codecs field and append the following
value to it:
com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec
- Restart the HDFS and Big SQL services.
- Make sure the following packages are installed on all nodes in the cluster:
- lzo
- lzop
- hadoop-lzo
- hadoop-lzo-native
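A sketch of checking these packages on an RPM-based node (typical for HDP on RHEL/CentOS) follows; the helper function name is illustrative:

```shell
# Report which of the required lzo packages are installed, using rpm queries.
check_packages() {
  local status=0
  local pkg
  for pkg in "$@"; do
    if rpm -q "$pkg" >/dev/null 2>&1; then
      echo "ok: $pkg"
    else
      echo "missing: $pkg"
      status=1
    fi
  done
  return $status
}

# Report only; install any packages listed as missing before using lzo LOAD.
check_packages lzo lzop hadoop-lzo hadoop-lzo-native || true
```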
What to do next
After you add Big SQL worker nodes, make sure that you stop and then restart the
Hive service.
If the Big SQL check script shows that Big SQL cannot create an HBase table due to insufficient
privileges, then run the following command in the HBase shell:
grant 'bigsql', 'RWCA'
For post-installation tasks that you can do, see Post-installation configuration. For information about using Big SQL,
see Analyzing and manipulating big data with Big SQL.
If Big SQL did not install successfully and you do not want to try installing it again, you can
run the fullBigSQLCleanup.sh clean-up utility to restore your environment to its
previous state. For more information, see Clean-up utility.
Note: When you navigate to the Big SQL service page in the Ambari server UI, the
last action listed under Service Actions is
Delete Service. Do NOT click this option to delete Big SQL. Use of the
Delete Service option can leave the Big SQL service in an undefined state, which means that
subsequent attempts to reinstall the service might fail. If you want to remove the installed Big
SQL service, including all its artifacts, follow the instructions in
Removing IBM Big SQL.