Tuesday, May 12, 2015

Trouble Connecting Apache Spark with HBase due to missing classes

When you try to connect HBase with Apache Spark, in most cases the job fails at run time with a class-not-found error, for example for ImmutableBytesWritableToStringConverter or for Google's utility classes (the Guava jar), along with various other errors of the same kind.

Almost all of them belong to the same family: classes missing from the driver classpath.


The fix is straightforward:


Open spark-defaults.conf and add the required libraries to spark.driver.extraClassPath. Each time another class turns up missing, append the jar that contains it.

For example, for the missing ImmutableBytesWritableToStringConverter, add spark-examples-1.3.1-hadoop2.4.0.jar.
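If it is not obvious which jar provides a missing class, a small script can search for it. The following is only a sketch, not part of Spark or HBase: the function name jars_containing and the directory paths in the usage part are made up for illustration, and it assumes the jars sit directly inside the given lib directories.

```python
import glob
import os
import zipfile


def jars_containing(class_name, lib_dirs):
    """Return the jars in lib_dirs that contain a class with this simple name.

    class_name is the simple name without package, for example
    "ImmutableBytesWritableToStringConverter".
    """
    top_level = class_name + ".class"
    suffix = "/" + top_level
    hits = []
    for lib_dir in lib_dirs:
        for jar in sorted(glob.glob(os.path.join(lib_dir, "*.jar"))):
            try:
                # A jar is a zip archive, so its entries can be listed directly.
                with zipfile.ZipFile(jar) as zf:
                    names = zf.namelist()
            except zipfile.BadZipFile:
                continue  # skip files that are not readable archives
            if any(n == top_level or n.endswith(suffix) for n in names):
                hits.append(jar)
    return hits


if __name__ == "__main__":
    # Hypothetical directories; replace them with your own Spark and HBase lib folders.
    dirs = ["/path/to/spark/lib", "/path/to/hbase/lib"]
    for jar in jars_containing("ImmutableBytesWritableToStringConverter", dirs):
        print(jar)
```

Every jar it prints is a candidate to append to spark.driver.extraClassPath.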


spark.driver.extraClassPath /Users/abhishekchoudhary/anaconda/anaconda/lib/python2.7/site-packages/graphlab/graphlab-create-spark-integration.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/hbase-server-0.98.6-cdh5.2.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/hbase-protocol-0.98.6-cdh5.2.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/hbase-hadoop2-compat-0.98.6-cdh5.2.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/hbase-client-0.98.6-cdh5.2.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/hbase-common-0.98.6-cdh5.2.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/htrace-core-2.04.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/lib/spark-examples-1.3.1-hadoop2.4.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/spark-1.3.1/lib/spark-assembly-1.3.1-hadoop2.4.0.jar:/Users/abhishekchoudhary/bigdata/cdh5.2.0/hbase/lib/guava-12.0.1.jar
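A line that long is easy to get wrong by hand, so it can help to generate it from a list of jars. This is a minimal sketch under my own assumptions: the helper names are invented, the jar paths in the usage part are placeholders, and the separator is ":" as in the entry above (which is what a Mac or Linux classpath uses).

```python
def build_extra_classpath(jars, sep=":"):
    """Join jar paths into one classpath string, skipping blanks and duplicates."""
    seen = []
    for jar in jars:
        jar = jar.strip()
        if jar and jar not in seen:
            seen.append(jar)  # keep first occurrence, preserve order
    return sep.join(seen)


def spark_defaults_line(jars, key="spark.driver.extraClassPath"):
    """Format the line as it would appear in spark-defaults.conf."""
    return "{} {}".format(key, build_extra_classpath(jars))


if __name__ == "__main__":
    # Placeholder paths; list your own jars here.
    jars = [
        "/path/to/hbase/lib/hbase-client-0.98.6-cdh5.2.0.jar",
        "/path/to/hbase/lib/hbase-common-0.98.6-cdh5.2.0.jar",
        "/path/to/spark/lib/spark-examples-1.3.1-hadoop2.4.0.jar",
    ]
    print(spark_defaults_line(jars))
```

Paste the printed line into spark-defaults.conf, and add one more path to the list whenever a new missing class shows up.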




And one more thing: accessing HBase from Spark is actually very fast, so real-time updates become practical.

