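I won't go into an introduction to snappy here; it is enough to know that it is a compression library widely used in Hadoop. For details, see http://www.infoq.com/cn/news/2011/04/Snappy/
Environment: a 4-node Hadoop test cluster; see the previous article for details.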
1. Install build dependencies
yum install gcc gcc-c++ autoconf automake libtool
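Before building anything, it can help to confirm that the build tools actually ended up on the PATH. The following is a minimal sketch, not part of the original procedure: `missing_tools` is a hypothetical helper name, and the list of command names is an assumption about what the packages above provide.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not part of any package.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
    return 0
}

# Command names assumed to be provided by the packages above;
# make is not in the package list but is needed in step 3.
missing_tools gcc g++ autoconf automake libtool make
```

An empty output means every listed tool was found; any name printed still needs to be installed.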
2. Install Maven 3
cd /usr/local
wget http://mirror.bjtu.edu.cn/apache//maven/binaries/apache-maven-3.0.5-bin.tar.gz
tar xzvf apache-maven-3.0.5-bin.tar.gz
ln -s apache-maven-3.0.5 maven
**2.1 vi /etc/profile**
export JAVA_HOME=/usr/local/jdk1.6.0_45
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export JAVA_HOME PATH CLASSPATH
export MAVEN_HOME=/usr/local/maven
PATH=$PATH:$HOME/bin:$MAVEN_HOME/bin
source /etc/profile
**2.2 Verify**
mvn -version
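If `mvn` is not found, the usual cause is that the Maven bin directory never made it onto the PATH. The sketch below checks for that; `path_contains` is a hypothetical helper, and the fallback `/usr/local/maven` is taken from the symlink created earlier.

```shell
# Return 0 if directory $1 is one of the colon-separated entries of PATH.
# path_contains is a hypothetical helper name.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Falls back to /usr/local/maven if MAVEN_HOME is not set in this shell.
if path_contains "${MAVEN_HOME:-/usr/local/maven}/bin"; then
    echo "maven bin directory is on PATH"
else
    echo "maven bin directory is NOT on PATH; re-check /etc/profile"
fi
```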
3. Download and install snappy
http://code.google.com/p/snappy/downloads/list (latest version: 1.1.3)
tar xzvf snappy-1.1.3.tar.gz
cd snappy-1.1.3
./configure
make
make install
By default the library is installed into /usr/local/lib.
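A quick way to confirm the install is to look for the shared library in that directory. This is only a sketch: `has_lib` is a hypothetical helper, and it checks for files named `libsnappy.so*`, which is the naming the later chmod commands in this post rely on.

```shell
# Return 0 if directory $1 contains at least one file named $2.so*
# has_lib is a hypothetical helper name.
has_lib() {
    for f in "$1"/"$2".so*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

if has_lib /usr/local/lib libsnappy; then
    echo "libsnappy found in /usr/local/lib"
else
    echo "libsnappy not found in /usr/local/lib; check the make install output"
fi
```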
4. Download and build hadoop-snappy
https://github.com/electrum/hadoop-snappy
cd /usr/local
unzip hadoop-snappy-master.zip
cd hadoop-snappy-master
mvn clean package
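When the build finishes, the package hadoop-snappy-0.0.1-SNAPSHOT.tar.gz is in the target directory.
Extract it, then copy the native libraries into Hadoop's native library directory: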
cp hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/
cd $HADOOP_HOME/lib/native
chown -R hadoop:hadoop libsnappy*
chown -R hadoop:hadoop libhadoopsnappy*
chmod 755 libsnappy.so*
chmod 755 libhadoopsnappy.so*
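5. Copy the files to the other machines
snappy is mainly used while MR jobs run, and those jobs execute on the datanodes. So copy the snappy files under $HADOOP_HOME/lib/native, together with the snappy files in /usr/local/lib, to the same locations on each of those machines.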
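The copy can be scripted. The sketch below only prints the scp commands so they can be reviewed first. The host names are hypothetical placeholders, `copy_cmds` is a hypothetical helper, and the fallback Hadoop path is an assumption used only when HADOOP_HOME is unset.

```shell
# Fallback path is an assumption, used only if HADOOP_HOME is unset.
HADOOP_DIR="${HADOOP_HOME:-/usr/local/hadoop}"
# Hypothetical host names; replace with the real datanodes.
NODES="datanode1 datanode2 datanode3"

# Print (do not run) the scp commands for each host given as an argument.
copy_cmds() {
    for host in "$@"; do
        printf 'scp %s/lib/native/libsnappy* %s/lib/native/libhadoopsnappy* %s:%s/lib/native/\n' \
            "$HADOOP_DIR" "$HADOOP_DIR" "$host" "$HADOOP_DIR"
        printf 'scp /usr/local/lib/libsnappy* %s:/usr/local/lib/\n' "$host"
    done
}

# Review the output first; append "| sh" to actually run the copies.
copy_cmds $NODES
```

Files copied this way belong to whichever user scp logs in as on the remote side, so the chown and chmod commands from step 4 may need to be repeated on each machine.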
Re-run the wordcount MR job; it now completes successfully.