This example demonstrates how to implement a group-by aggregation using an HBase coprocessor and an Algebird monoid. It targets HBase 0.94.18, the same version that is available on AWS EMR.
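The idea can be sketched in plain Scala without any HBase or Algebird dependency. This is an illustration under stated assumptions, not the code in this repository: the `Monoid` trait below is a hypothetical stand-in for Algebird's monoid abstraction (a `zero` plus an associative `plus`), and the `regionAggregate` and `mergeRegions` helpers and the sample rows are made up for illustration. Each region would compute a partial group-by sum over its own rows, and the client would merge the partial maps; associativity is what makes the merge order irrelevant.

```scala
// Hypothetical minimal monoid, standing in for the Algebird abstraction.
trait Monoid[T] {
  def zero: T
  def plus(a: T, b: T): T
}

object GroupBySketch {
  // Sum monoid over Long values.
  val longSum: Monoid[Long] = new Monoid[Long] {
    def zero: Long = 0L
    def plus(a: Long, b: Long): Long = a + b
  }

  // Monoid over maps: merge key by key using the value monoid.
  def mapMonoid[K, V](vm: Monoid[V]): Monoid[Map[K, V]] = new Monoid[Map[K, V]] {
    def zero: Map[K, V] = Map.empty[K, V]
    def plus(a: Map[K, V], b: Map[K, V]): Map[K, V] =
      b.foldLeft(a) { case (acc, (k, v)) =>
        acc.updated(k, vm.plus(acc.getOrElse(k, vm.zero), v))
      }
  }

  private val grouped = mapMonoid[String, Long](longSum)

  // Assumed region-side step: group the region's rows by key and sum them.
  def regionAggregate(rows: Seq[(String, Long)]): Map[String, Long] =
    rows.foldLeft(grouped.zero) { case (acc, (k, v)) => grouped.plus(acc, Map(k -> v)) }

  // Assumed client-side step: merge the partial results from all regions.
  def mergeRegions(partials: Seq[Map[String, Long]]): Map[String, Long] =
    partials.foldLeft(grouped.zero)(grouped.plus)

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, split across two imaginary regions.
    val region1 = Seq("ios" -> 3L, "android" -> 5L, "ios" -> 2L)
    val region2 = Seq("android" -> 1L, "windows" -> 4L)
    println(mergeRegions(Seq(regionAggregate(region1), regionAggregate(region2))))
  }
}
```

Because only `zero` and `plus` are required, swapping the sum for another monoid (max, set union, and so on) would leave the group-by and merge logic unchanged.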
- Download and unzip HBase 0.94.18
- Start HBase
bin/start-hbase.sh
- In the HBase shell, create a table named mobile-device (the TTL of 7776000 seconds is 90 days)
create 'mobile-device', { NAME => 'stats', VERSIONS => 1, TTL => 7776000 }
- Build a fat jar and copy it onto the HBase classpath
$ sbt assembly
$ cp $PWD/target/scala-2.10/hbase-coprocessor-assembly-1.0.jar $HBASE_HOME/lib/
- Edit $HBASE_HOME/conf/hbase-site.xml to register the coprocessor class
<property>
<name>hbase.coprocessor.region.classes</name>
<value>GroupByMonoidSumCoprocessorEndpoint</value>
</property>
- Restart HBase
$ $HBASE_HOME/bin/stop-hbase.sh
$ $HBASE_HOME/bin/start-hbase.sh
- Run the tests
$ sbt test