Enabling Hive metastore high availability

To provide high availability for the Hive metastore, you can start multiple Hive metastore services to which clients can connect. Because the metastore services store their state in a database (for example, the MySQL database in the default configuration), make the database highly available as well, for example through replication. To configure high availability for your database, refer to the documentation provided by the database vendor.

About this task

In the Ambari web interface, configure all the nodes on which you want to run a Hive metastore service. It is recommended that you run each instance of the Hive metastore on a different cluster node.

Procedure

  1. From the Ambari web interface, click Hive and stop the service.
  2. While in the Hive service, click Service Actions > Add Hive Metastore.
  3. Select a host and then confirm.
  4. Start the Hive service.

    Metastore clients find the URIs of the metastore services from the configuration parameter hive.metastore.uris. The value of hive.metastore.uris is a comma-separated list of the URIs on which a metastore service is running.


    After restarting the Hive services, the value for hive.metastore.uris shows the new Hive metastore service, along with the already configured values:
    thrift://abc1209.abc.com:9083,thrift://abc1210.abc.com:9083 
    When a client connects to a metastore, it starts with the first URI in the list of metastore URIs. If the metastore at the first URI does not respond, the client randomly picks another URI from the list until it is able to connect. If the client is not able to connect to any metastore, the connection fails.
  5. To configure the number of times Hive metastore clients try to connect to the metastore, modify the following parameters:
    Table 1. Properties that control retry attempts for connections

    Property                                    Value   Description
    hive.metastore.client.connect.retry.delay   5s      The number of seconds for the client to wait between consecutive connection attempts.
    hive.metastore.connect.retries              24      The number of retries while opening a connection to the metastore.
    With the default values, a metastore client tries to connect to each of the metastore URIs. If the connection to all of them fails, the client waits 5 seconds and then tries again. If the connection fails 24 times, a connection exception is thrown.
  6. In a secure cluster, also configure the following property:
    
    Name: hive.cluster.delegation.token.store.class 
    Value: org.apache.hadoop.hive.thrift.DBTokenStore 
  7. Restart all Hive services from the Ambari web interface after updating the Hive configurations.
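
The client-side failover and retry behavior described in steps 4 and 5 can be sketched roughly as follows. This is a simplified Python model for illustration only, not the Hive client's actual implementation. The function names parse_metastore_uris and connect_with_retries, the try_connect callback, and the injectable sleep parameter are all hypothetical; the defaults mirror the values in Table 1.

```python
import random
import time
from typing import Callable, List


def parse_metastore_uris(value: str) -> List[str]:
    """Split a comma-separated hive.metastore.uris value into a list of URIs."""
    return [uri.strip() for uri in value.split(",") if uri.strip()]


def connect_with_retries(
    uris: List[str],
    try_connect: Callable[[str], bool],   # hypothetical stand-in for opening a Thrift connection
    retries: int = 24,                    # models hive.metastore.connect.retries
    retry_delay_seconds: float = 5.0,     # models hive.metastore.client.connect.retry.delay
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the URI of the first metastore that accepts a connection."""
    if not uris:
        raise ValueError("hive.metastore.uris is empty")
    for attempt in range(retries):
        # Start with the first URI, then try the remaining ones in random order.
        rest = uris[1:]
        random.shuffle(rest)
        for uri in [uris[0]] + rest:
            if try_connect(uri):
                return uri
        # Every URI failed in this round: wait before the next attempt.
        if attempt < retries - 1:
            sleep(retry_delay_seconds)
    raise ConnectionError(
        f"could not connect to any metastore after {retries} attempts"
    )


# Usage with the example value from step 4. The lambda pretends that only
# the second metastore is reachable; sleeping is disabled for the demo.
uris = parse_metastore_uris(
    "thrift://abc1209.abc.com:9083,thrift://abc1210.abc.com:9083"
)
chosen = connect_with_retries(
    uris, lambda uri: "abc1210" in uri, sleep=lambda seconds: None
)
print(chosen)
```

Passing the connection check and the sleep function as parameters keeps the sketch runnable without a cluster and makes the retry count and delay easy to observe in isolation.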