How to install Snappy with HBase 0.96.x (and Hadoop 2.2.0)

Almost a year ago I published some lines about Snappy installation on HBase 0.94.x. Since both Hadoop 2.2.0 and HBase 0.96.0 are now out, I have decided to install a new cluster with those 2 versions.

Installation was quite simple, but when I tried to move the data to this new cluster, things did not work well since I was missing Snappy, again.

Since installing it was not that straightforward, the goal of this post is to provide some feedback on my experience.

Again, first things first: the command line to confirm whether Snappy is working is the following:

bin/hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/test.txt snappy

The goal is to get this final output:

2013-12-25 18:08:02,820 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2013-12-25 18:08:03,903 INFO  [main] util.ChecksumType: Checksum using org.apache.hadoop.util.PureJavaCrc32
2013-12-25 18:08:03,905 INFO  [main] util.ChecksumType: Checksum can use org.apache.hadoop.util.PureJavaCrc32C
2013-12-25 18:08:04,143 INFO  [main] compress.CodecPool: Got brand-new compressor [.snappy]
2013-12-25 18:08:04,149 INFO  [main] compress.CodecPool: Got brand-new compressor [.snappy]
2013-12-25 18:08:04,685 INFO  [main] compress.CodecPool: Got brand-new decompressor [.snappy]
SUCCESS

And here are the steps to get it. The main difficulty in getting the Snappy libs is getting the right versions of the right tools. To get the Snappy libs for your infrastructure, you need to compile both Snappy and Hadoop (since Hadoop doesn't ship with 64-bit native libs).

To compile the Snappy lib and get the related .so files, you can refer to the previous post; a minimal sketch also follows below. Getting the .so files for Hadoop is a bit more complicated. Hadoop 2.2.0 depends on Maven 3.0.5 and on Protobuf 2.5, and Debian Wheezy (stable) doesn't include those versions of those tools; only Jessie (testing) contains them. So you have 2 options: either you switch your distribution to the testing version, or you compile in a VM and then deploy the result. I chose the 2nd option.
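
As a reminder, here is a minimal sketch of the Snappy build from a source tarball (the exact steps can differ slightly depending on the Snappy release you download):

# from the extracted Snappy source directory
./configure
make
# the compiled shared objects end up under .libs/
ls .libs/libsnappy.so*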

Maven and Protobuf require a recent version of libc6. You will need to install this version on all the servers you will deploy the libs to.
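
To confirm which libc6 version a server currently has, you can run something like:

dpkg -s libc6 | grep '^Version'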

First, those packages are required:
- subversion
- maven
- build-essential
- cmake
- zlib1g-dev
- libsnappy-dev
- pkg-config
- libssl-dev

To install them, just run, as root:
apt-get install subversion maven build-essential cmake zlib1g-dev libsnappy-dev pkg-config libssl-dev


Make sure your JAVA_HOME is set up correctly. I tried with Sun JDK 1.7.0u45.
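
For example (the JDK path below is an assumption; point it at wherever your JDK actually lives):

export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_45
$JAVA_HOME/bin/java -version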

When everything is installed correctly, you can start with the core of the operations. First, you will need to check out the Hadoop source code. Make sure you check out the right tag; adjust it based on the Hadoop version you will install those libraries on.

svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.2.0/

If you are using 2.2.0, some files will fail to compile. They are related to the tests. We don't need them, so simply delete them (see the command after the list):

release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java
release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestPseudoAuthenticator.java
release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestKerberosAuthenticator.java
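
One way to delete them, from the directory where you ran the checkout:

rm release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/AuthenticatorTestCase.java \
   release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestPseudoAuthenticator.java \
   release-2.2.0/hadoop-common-project/hadoop-auth/src/test/java/org/apache/hadoop/security/authentication/client/TestKerberosAuthenticator.java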

Now the tricky part. Hadoop 2.2.0 depends on Protobuf 2.5. However, this version is only available in Debian experimental! So you will need to update your apt sources.list file to add the experimental repository.
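
The experimental entry in /etc/apt/sources.list looks something like this (the mirror URL is only an example; use your usual Debian mirror):

deb http://ftp.debian.org/debian experimental main

After an apt-get update, install the compiler from experimental: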

apt-get -t experimental install protobuf-compiler

This will also upgrade the following packages to their experimental versions:

libc-dev-bin libc6 libc6-dev locales

So you will have to install those same versions on the servers you are going to deploy to.

Now move into the release-2.2.0 folder and build using the following command:

mvn package -Drequire.snappy -Pdist,native,src -DskipTests -Dtar

If everything goes as expected, you should find your .so files under hadoop-dist/target/hadoop-2.2.0/lib/native/. Look for both libhdfs.so and libhadoop.so.

You should now have 3 files:

- libsnappy.so
- libhdfs.so
- libhadoop.so
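
Before deploying them, you can quickly confirm the libraries were built for the right architecture (-L follows the symlinks the build leaves behind):

# should report 64-bit ELF shared objects on an amd64 build
file -L libsnappy.so libhdfs.so libhadoop.so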

Restore your sources.list to its initial version (remove the testing and experimental lines), and copy those 3 files under hbase/lib/native/Linux-amd64-64 (or under hbase/lib/native/Linux-i386-32 if you are not running 64 bits).
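
For example, assuming the three .so files are in the current directory and HBASE_HOME points to your HBase installation (both are assumptions to adjust):

# create the native lib folder and copy the three libraries into it
mkdir -p $HBASE_HOME/lib/native/Linux-amd64-64
cp libsnappy.so libhdfs.so libhadoop.so $HBASE_HOME/lib/native/Linux-amd64-64/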

With your 3 files in place, you can re-run the initial command and Snappy should be working fine. Don't forget to copy those files to all your region servers; a quick sketch of that follows.
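
Something like this can push the libraries to every region server (the host names are hypothetical; replace them with your own):

# assuming HBase lives at the same path on every region server
for rs in regionserver1 regionserver2 regionserver3; do
  scp libsnappy.so libhdfs.so libhadoop.so $rs:$HBASE_HOME/lib/native/Linux-amd64-64/
done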


Reference:
http://www.spaggiari.org/index.php/hbase/how-to-install-snappy-with-1
http://www.spaggiari.org/index.php/hbase/how-to-install-snappy-with
