Hive Metastore overview

Hive Metastore (HMS) is a service that stores metadata for Presto and other services in a backend Relational Database Management System (RDBMS) or Hadoop Distributed File System (HDFS).

IBM watsonx.data stand-alone

When you create a table, schema information such as column names and data types is stored in the metastore relational database. The metastore lets users see the data files in HDFS or object storage as if they were stored in tables.

The metastore acts as a bridge between the schema of a table and the data files that are stored in object storage. HMS holds the definitions, schema, and other metadata for each table, and maps the data files and directories to the table representation that the user sees. HMS therefore serves as the storage location for schema and table definitions. HMS is a metastore server that connects to the object storage where the data is stored and keeps the related metadata in PostgreSQL.
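The mapping described above can be pictured with a small in-memory model. This is only an illustrative sketch, not the HMS implementation or its API: the class names (ToyMetastore, TableEntry, Column), the example schema, and the bucket path are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Column:
    name: str
    data_type: str


@dataclass
class TableEntry:
    """Hypothetical, simplified stand-in for one table record in a metastore."""
    columns: List[Column]
    location: str  # directory in object storage that holds the data files


class ToyMetastore:
    """Toy model: maps (schema, table) to column definitions and a data location."""

    def __init__(self) -> None:
        self._tables: Dict[Tuple[str, str], TableEntry] = {}

    def create_table(self, schema: str, table: str,
                     columns: List[Column], location: str) -> None:
        # Only metadata is recorded here; the data files stay in object storage.
        self._tables[(schema, table)] = TableEntry(columns, location)

    def describe(self, schema: str, table: str) -> List[Tuple[str, str]]:
        # Return the table view of the files: column names and data types.
        entry = self._tables[(schema, table)]
        return [(c.name, c.data_type) for c in entry.columns]

    def location(self, schema: str, table: str) -> str:
        # Return where a query engine would look for the data files.
        return self._tables[(schema, table)].location


# Hypothetical example values.
store = ToyMetastore()
store.create_table(
    "sales", "orders",
    [Column("order_id", "bigint"), Column("amount", "double")],
    "s3a://example-bucket/sales/orders/",
)
```

In this model, a query engine would call describe() to learn the columns of a table and location() to find the files to read, which mirrors the bridge role that the metastore plays between schema and data files.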

Any database with a JDBC driver can be used as the metastore database. Presto sends requests to HMS over the Thrift protocol, and the Presto instance reads data from and writes data to HMS. HMS supports the following five backend databases. watsonx.data uses PostgreSQL.
  • Derby
  • MySQL
  • MS SQL Server
  • Oracle
  • PostgreSQL
Currently, HMS in watsonx.data supports the Iceberg table format.
HMS supports the following three deployment modes. watsonx.data uses the remote mode.
  • Embedded Metastore - Derby with a single session.
  • Local Metastore - MySQL with multiple sessions, accessible locally.
  • Remote Metastore - The metastore runs in its own separate JVM and is accessible through Thrift network APIs.
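In remote mode, a client such as Presto reaches the metastore through a Thrift URI. The following sketch shows how such a URI and a minimal catalog properties file could be assembled. It rests on assumptions: the host name is hypothetical, 9083 is the conventional default HMS port, and connector.name and hive.metastore.uri are property names from the open source Presto connectors. Whether a watsonx.data deployment exposes these settings directly is not stated here.

```python
def metastore_uri(host: str, port: int = 9083) -> str:
    """Build a Thrift URI for a remote metastore (9083 is the conventional default port)."""
    if not host:
        raise ValueError("host must not be empty")
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    return f"thrift://{host}:{port}"


def catalog_properties(host: str, port: int = 9083) -> str:
    """Render a minimal catalog properties file that points at a remote metastore.

    Assumption: property names follow the open source Presto Iceberg connector.
    """
    lines = [
        "connector.name=iceberg",
        f"hive.metastore.uri={metastore_uri(host, port)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical host name, for illustration only.
print(catalog_properties("hms.example.internal"))
```

Keeping the URI construction in its own function makes the host and port checks reusable if other clients besides Presto need the same endpoint.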