Tuesday 27 September 2016

Running a Maven-built HBase application on MapR


A step-by-step guide to building a simple HBase application with Maven and running it on a MapR cluster.

Assumption: A MapR cluster is already up and running with HBase and the other required components.

Perform the following steps:

      Go into the following folder and create a directory for maven
cd /home/mapr
mkdir maven

        Download Maven (apache-maven-3.3.9-bin.tar.gz, from the Apache Maven download page) into this directory and extract it
cd maven
tar -xvf apache-maven-3.3.9-bin.tar.gz

        Set maven home
export M2_HOME=/home/mapr/maven/apache-maven-3.3.9
export PATH=${M2_HOME}/bin:${PATH}

        Create a folder for the sample HBase application
mkdir -p java-jobs/hbase
cd java-jobs/hbase

        Generate a sample maven project
mvn archetype:generate -DgroupId=com.ajames.hbase.examples -DartifactId=hbaseapp -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
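
The coordinates passed to the archetype determine where the sources live and what the built jar is called, which is why the later steps use src/main/java/com/ajames/hbase/examples/ and target/hbaseapp-1.0-SNAPSHOT.jar. A minimal sketch of that mapping, assuming Maven's standard layout and the archetype's default version of 1.0-SNAPSHOT (the class and method names here are hypothetical, for illustration only):

```java
public class MavenLayout {

    // Maven's standard layout keeps main sources under src/main/java,
    // with one directory per package segment. The quickstart archetype
    // uses the groupId as the package name unless told otherwise.
    static String sourceDir(String groupId) {
        return "src/main/java/" + groupId.replace('.', '/');
    }

    // "mvn package" writes <artifactId>-<version>.jar under target/ by default.
    static String jarPath(String artifactId, String version) {
        return "target/" + artifactId + "-" + version + ".jar";
    }

    public static void main(String[] args) {
        System.out.println(sourceDir("com.ajames.hbase.examples"));
        System.out.println(jarPath("hbaseapp", "1.0-SNAPSHOT"));
    }
}
```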

        Modify the pom.xml file to add the MapR repository details
cd hbaseapp/
vi pom.xml

Add the following block inside the <project> element:

  <repositories>
    <repository>
      <id>mapr-maven</id>
      <url>http://repository.mapr.com/maven</url>
      <releases><enabled>true</enabled></releases>
      <snapshots><enabled>false</enabled></snapshots>
    </repository>
  </repositories>

        Modify the pom.xml file to add the HBase dependency (inside the existing <dependencies> element)

  <dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.1.1-mapr-1602</version>
  </dependency>

        Create the sample application
cd src/main/java/com/ajames/hbase/examples/
vi HBaseDemo.java

Sample code (HBaseDemo.java):
package com.ajames.hbase.examples;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;

/* Create the table schema first, using the following command:
 * echo "create 'details','a','b'" | hbase shell
 */

public class HBaseDemo {

    public static void main(String[] args) throws IOException {
        // Loads the HBase settings found on the classpath
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "details");
        try {
            byte[] a = Bytes.toBytes("a"); // column family a
            byte[] b = Bytes.toBytes("b"); // column family b

            // One row, "details1", with a c1 cell in each column family
            Put p1 = new Put(Bytes.toBytes("details1"));
            p1.addColumn(a, Bytes.toBytes("c1"), Bytes.toBytes("First"));
            p1.addColumn(b, Bytes.toBytes("c1"), Bytes.toBytes("Second"));

            table.put(p1);
        } finally {
            // Release the table even if the put fails
            table.close();
        }
    }
}
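
HBase treats row keys, column names, and values as raw byte arrays, so the charset used to turn strings into bytes decides what ends up in the table. The following standalone sketch (no HBase classes involved; the class and helper names are hypothetical) shows an explicit UTF-8 round trip for the row key used above:

```java
import java.nio.charset.StandardCharsets;

public class RowKeyBytes {

    // Encode with an explicit charset so the bytes do not depend on the
    // platform default of the machine running the client.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same charset that was used for encoding.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] rowKey = encode("details1");
        System.out.println(rowKey.length);   // one byte per ASCII character
        System.out.println(decode(rowKey));
    }
}
```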

        Building the application
cd /home/mapr/java-jobs/hbase/hbaseapp
mvn clean package

        Running the application (the HBase jars are added to the Hadoop classpath first)
export HADOOP_CLASSPATH=`hbase classpath`
[mapr@m52-d18-1 hbaseapp]$ hadoop jar target/hbaseapp-1.0-SNAPSHOT.jar com.ajames.hbase.examples.HBaseDemo
16/09/26 21:08:01 INFO client.ConnectionFactory: mapr.hbase.default.db unsetDB is neither MapRDB or HBase, set HBASE_MAPR mode since mapr client is installed.
16/09/26 21:08:01 INFO client.ConnectionFactory: ConnectionFactory receives mapr.hbase.default.db(unsetDB), set clusterType(HBASE_MAPR), hbase_admin_connect_at_construction(false)
16/09/26 21:08:01 INFO client.ConnectionFactory: ConnectionFactory creates a hbase connection!
16/09/26 21:08:02 INFO client.HTable: BufferedMutator Use HBase ThreadPool
16/09/26 21:08:02 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x66dd1d3 connecting to ZooKeeper ensemble=10.10.72.147:5181,10.10.72.148:5181
16/09/26 21:08:02 INFO client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15767d869180047


      Verify the HBase table
hbase shell
hbase(main):001:0> scan 'details'
ROW                                              COLUMN+CELL
 details1                                        column=a:c1, timestamp=1474938482964, value=First
 details1                                        column=b:c1, timestamp=1474938482964, value=Second

1 row(s) in 0.5730 seconds
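
The timestamp column in the scan output is the cell's write time in milliseconds since the Unix epoch; the sample application does not set one explicitly, so it is assigned when the put is written. A small sketch for turning that value into a readable UTC time (the class and method names are hypothetical):

```java
import java.time.Instant;

public class CellTimestamp {

    // HBase cell timestamps are epoch milliseconds.
    static Instant toInstant(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    public static void main(String[] args) {
        // Timestamp taken from the scan output above
        System.out.println(toInstant(1474938482964L)); // prints 2016-09-27T01:08:02.964Z
    }
}
```

That UTC time lines up with the 21:08:02 entries in the run log if the cluster clock is four hours behind UTC, which is an inference from the log rather than something the output states.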