Overriding the HBase put timestamp with Hive exports

With Hive HBase-backed external tables, you can override the put timestamp by setting the hbase.put.timestamp property.

However, at this time (last tested with Hive 0.8.1), you need to drop and re-create the external table every time you want to update the put timestamp, for two reasons:

  1. hbase.put.timestamp is a serde property set when the external table is defined, so it is fixed at creation time
  2. in general, you can modify serde properties via ALTER TABLE, but this doesn’t work for external tables.

As an example of #2, if you try to update the put timestamp (suppose we’re changing it to 10000005), you would run:

alter table stock SET SERDEPROPERTIES ("hbase.put.timestamp" = "10000005", "hbase.columns.mapping" = ":key,f1:name,f1:lv");

But because this is an external table, you get this error:

FAILED: Error in metadata: Cannot use ALTER TABLE on a non-native table
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

Given the above issues, for now the only way to make this work is to drop and re-create the external table with the updated timestamp. Clearly you don’t want to do this for every put, but Hive-to-HBase inserts are a batch scenario anyway. The best fit is something like hourly batch inserts to HBase, in which case you’d assign the current datetime in milliseconds, rounded down to the hour, as sketched below.
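A minimal way to compute that value in the shell (assuming a standard date command that supports +%s):

# current epoch time, floored to the hour, expressed in milliseconds
put_timestamp=$(( ($(date +%s) / 3600) * 3600 * 1000 ))
echo $put_timestamp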

Note that this put timestamp value can be parameterized within the script that creates the external table, which is useful for recurring loads.

So if we have a script called create_external_table.hql that creates the external table, we’d modify its SERDEPROPERTIES to add:

WITH SERDEPROPERTIES ("hbase.put.timestamp" = "${hiveconf:put_timestamp}" ...
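Put together, the full script might look roughly like this (a sketch only; the column names and types are assumptions based on the column mapping used earlier, and the HBase table name may differ in your setup):

CREATE EXTERNAL TABLE stock (key STRING, name STRING, lv STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,f1:name,f1:lv",
  "hbase.put.timestamp" = "${hiveconf:put_timestamp}"
)
TBLPROPERTIES ("hbase.table.name" = "stock");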

Then, for each load to HBase, you’d do the following (an end-to-end sketch follows the command below):

  1. drop the external HBase table (note this doesn’t drop the HBase data, since the table is external)
  2. set the put_timestamp environment variable
  3. call create_external_table.hql with the slight modification shown below (passing the timestamp via -hiveconf)
  4. load the data as usual

hive -f create_external_table.hql -hiveconf put_timestamp=$put_timestamp
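Wrapped up in a shell script, one hourly load cycle might look like this (a sketch; the staging table used in the final INSERT is hypothetical and stands in for whatever Hive source you load from):

# 1. drop the external table (the underlying HBase data is untouched)
hive -e 'DROP TABLE IF EXISTS stock;'

# 2. current epoch time, floored to the hour, in milliseconds
put_timestamp=$(( ($(date +%s) / 3600) * 3600 * 1000 ))

# 3. re-create the external table with the new put timestamp
hive -f create_external_table.hql -hiveconf put_timestamp=$put_timestamp

# 4. load the data as usual (stock_staging is a hypothetical source table)
hive -e 'INSERT OVERWRITE TABLE stock SELECT * FROM stock_staging;'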

Loading in this manner, I verified that multiple versions are successfully created and maintained, up to the maximum number of versions specified at table creation time.
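For reference, the version bound and the stored versions can be checked from the HBase shell; a sketch, assuming the table and column family used above and an illustrative row key:

# create the HBase table keeping up to 5 versions per cell (run once)
echo "create 'stock', {NAME => 'f1', VERSIONS => 5}" | hbase shell

# after a few loads, fetch several versions of one cell ('AAPL' is a hypothetical row key)
echo "get 'stock', 'AAPL', {COLUMN => 'f1:name', VERSIONS => 5}" | hbase shell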

Hadoop HA impact on HBase

Overview

High availability was a much anticipated feature of Hadoop 2.0.0, because it meant the namenode was no longer a single point of failure. Being skeptical, I wanted to test it out. Specifically, I wanted to see the impact to HBase clients while a failover was occurring. To make it interesting, I ran the terasort MapReduce job as well as a YCSB workload targeting HBase.

The goal was to determine the impact to the cluster during a namenode failover, looking at both performance and correctness.

Test #1: Performance and general impact under stress

Test Setup: Stress the 40-node cluster by running these two jobs (example commands are sketched after the list):

  • Terasort: read-heavy MapReduce
  • HBase reads/writes: a mixed workload, YCSB workload A, which is 50% reads and 50% writes
    • Chose this workload because the biggest impact to HBase would occur with write-heavy workloads
    • Ran at a target throughput of 5K operations per second
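The commands were roughly along these lines (a sketch; the YCSB binding name, jar locations, and data sizes are assumptions that vary by version and environment):

# read-heavy MapReduce load: teragen, then terasort
hadoop jar hadoop-examples.jar teragen 1000000000 /benchmarks/tera-in
hadoop jar hadoop-examples.jar terasort /benchmarks/tera-in /benchmarks/tera-out

# YCSB workload A (50/50 reads and writes) against HBase, throttled to ~5K ops/sec
./bin/ycsb load hbase -P workloads/workloada -p columnfamily=f1 -threads 20
./bin/ycsb run hbase -P workloads/workloada -p columnfamily=f1 -threads 20 -target 5000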

Performed the failover by stopping the active namenode in Cloudera Manager while keeping the above jobs running.

Event timeline

master1 and master2 are the namenodes. This timeline shows the active/standby transitions as the current active was stopped.

time           master1                 master2
Before 1:13    standby                 active
1:13           now active              STOPPED
1:17                                   restarted as standby
1:18                                   healthy as standby
1:19           STOPPED                 now active
1:23           restarted as standby
1:24           healthy as standby

Failover completed within the finest granularity of the remaining log data (1 minute).

[Screenshot: Screen Shot 2014-03-03 at 9.28.25 PM]

HBase Impact

The failover was mostly unnoticeable to HBase clients. During the first failover, throughput dipped but was close to recovery within 5 minutes. At that point, we triggered a second failover by stopping the currently active namenode. This caused more performance impact due to HDFS read latency.

Note that this second failover was just mean of us, but we were curious.

time   event
1:24   HBase critical event: “The health test result for Health check: REGION_SERVER_READ_LATENCY has become bad: Average HDFS read latency is 109 ms. Critical threshold: 100 ms.”
1:26   HBase health reports start improving, but the performance impact is still noticeable: throughput drops to ~800 ops per second and timeout exceptions reach client code
1:32   HBase healthy again

After the second failover, timeout exceptions appeared and throughput dropped to its lowest level around 1:29. HBase clients would need to catch and handle these exceptions, but that is commonly done anyway.

[Screenshot: Screen Shot 2014-03-03 at 9.30.46 PM]

Drill-down into the HDFS view of one data node

[Screenshot: Screen Shot 2014-03-03 at 9.31.54 PM]

Test #2: Correctness

For this test, I modified the HBase PerformanceEvaluation class to write predictable values. I ran it in write mode during the failover, and in read mode afterward to verify that all keys and values were present. This test passed.
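For context, PerformanceEvaluation is driven from the command line; the modified class is invoked the same way as the stock tool, roughly as follows (client counts are illustrative):

# write pass, run while the failover is in progress (10 clients)
hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialWrite 10

# read pass afterward; the modified class checks the predictable values
hbase org.apache.hadoop.hbase.PerformanceEvaluation sequentialRead 10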