Running a TPC-DS test

This topic lists the steps to run a TPC-DS test.

  1. Prepare Hive-testbench by running the tpcds-build.sh script to build TPC-DS and the data generator. Then run the tpcds-setup.sh script to set up the testbench database and load the data into the created tables.
    cd ~/hive-testbench-hive14/
    
    ./tpcds-build.sh 
    
    ./tpcds-setup.sh 2
    
    Note: A MapReduce job runs to create the data and load it into Hive. This takes some
    time to complete. The last line of the script output is: Data loaded into
    database tpcds_bin_partitioned_orc_2.
    
  2. Create a new remote Low Latency Analytical Processing (LLAP) database on the remote HDFS Transparency cluster.
    hive> DROP database if exists llap CASCADE;
    hive> CREATE database if not exists llap LOCATION 'hdfs://c16f1n03.gpfs.net:8020/user/hive/llap.db';
    
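  3. Create the 24 tables in the llap database and load each one with data from the corresponding table in the tpcds_text_2 database. The following example shows the call_center table; repeat the statements for each of the remaining tables.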
    hive> DROP table if exists llap.call_center; 
    hive> CREATE table llap.call_center stored as orc as select * from tpcds_text_2.call_center;
    
  4. Run the benchmark queries on the tables that you created on the remote LLAP database.
    hive> use llap;
    hive> source query52.sql; 
    hive> source query55.sql; 
    hive> source query91.sql;
    hive> source query42.sql; 
    hive> source query12.sql; 
    hive> source query73.sql; 
    hive> source query20.sql; 
    hive> source query3.sql; 
    hive> source query89.sql; 
    hive> source query48.sql;
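
The queries in step 4 can also be run one at a time from the shell so that each run is timed separately. The following is a minimal sketch, not part of hive-testbench: the HIVE_CMD and DB variables and the run_queries function are hypothetical names introduced here, and the sketch assumes that the Hive CLI accepts the source command inside a -e string and that the query files are in the current directory.

```shell
#!/usr/bin/env bash
# Sketch: run the step 4 queries one by one and report wall-clock time.
# HIVE_CMD, DB, and run_queries are illustrative names (assumptions).
HIVE_CMD="${HIVE_CMD:-hive}"
DB="${DB:-llap}"
# Query numbers in the same order as step 4.
QUERIES="52 55 91 42 12 73 20 3 89 48"

run_queries() {
  for q in $QUERIES; do
    start=$(date +%s)
    # -e runs the quoted statements and then exits the CLI.
    $HIVE_CMD -e "use $DB; source query$q.sql;" || echo "query$q failed" >&2
    end=$(date +%s)
    # Elapsed time includes CLI startup, so treat it as a rough figure.
    echo "query$q: $((end - start))s"
  done
}
```

To use it, change to the directory that contains the query files and call run_queries. Because each invocation starts a new CLI session, the reported times are higher than those from a single interactive session.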
    

    For more information, see the Apache Hive SQL documentation.
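
Step 3 shows the statements for the call_center table only. The following sketch generates the same pair of statements for all 24 standard TPC-DS tables and writes them to a file. The SRC_DB and DST_DB variables, the gen_llap_ddl function, and the create_llap_tables.sql file name are hypothetical names introduced here. The sketch assumes that the tpcds_text_2 database contains every table in the list.

```shell
#!/usr/bin/env bash
# Sketch: emit the step 3 DROP/CREATE statements for every TPC-DS table.
# SRC_DB, DST_DB, and gen_llap_ddl are illustrative names (assumptions);
# the defaults match the database names used in the steps above.
SRC_DB="${SRC_DB:-tpcds_text_2}"
DST_DB="${DST_DB:-llap}"

# The 24 tables defined by the TPC-DS schema.
TABLES="call_center catalog_page catalog_returns catalog_sales customer
customer_address customer_demographics date_dim household_demographics
income_band inventory item promotion reason ship_mode store store_returns
store_sales time_dim warehouse web_page web_returns web_sales web_site"

gen_llap_ddl() {
  for t in $TABLES; do
    printf 'DROP table if exists %s.%s;\n' "$DST_DB" "$t"
    printf 'CREATE table %s.%s stored as orc as select * from %s.%s;\n' \
      "$DST_DB" "$t" "$SRC_DB" "$t"
  done
}

# Write the statements to a file that Hive can execute.
gen_llap_ddl > create_llap_tables.sql
```

The generated file can then be run with hive -f create_llap_tables.sql, or with source create_llap_tables.sql; from the hive> prompt.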