CentOS 6.5 + Hadoop 2.7.1: Detailed Installation, Configuration, Testing, and Compilation Tutorial

Hadoop 2.7.0 was a test release; 2.7.1 is the stable release of the 2.7 line.

The hadoop-2.7.1.tar.gz downloaded from the web does not run properly on 64-bit systems (the bundled native libraries are 32-bit), so I downloaded the source and compiled it myself; the build process is documented at the end. Because of upload-size limits I cannot attach the 286 MB build. If you need it, email 448930674at qq.com, or compile it yourself following the steps at the end.

1. Planning:
Prepare four machines, all running CentOS 6.5 [2.6.32-431.el6.x86_64], with roles assigned as follows (under Hadoop 2.x/YARN, the resourcemanager and nodemanager take over the MR1 jobtracker and tasktracker roles):
ip     hostname     role
192.168.81.151     hdp01     namenode, secondarynamenode, resourcemanager
192.168.81.152     hdp02     datanode, nodemanager
192.168.81.153     hdp03     datanode, nodemanager
192.168.81.154     hdp04     datanode, nodemanager

2. System environment preparation:
SSH into each remote machine and do the following.
1) Set the hostname
#vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=hdp01/02/03/04    <- set to the matching hostname on each server
NOZEROCONF=yes
2) Set up name resolution
#vi /etc/hosts
127.0.0.1    localhost
192.168.81.151    hdp01
192.168.81.152    hdp02
192.168.81.153    hdp03
192.168.81.154    hdp04
3) Add the hadoop user and group
sudo groupadd hadoop
sudo useradd -g hadoop hadoop
sudo passwd hadoop    # set the password (e.g. 123456) when prompted
4) Create the required directories on every machine
$ sudo mkdir -p /usr/local/bg
$ sudo chmod -R 777 /usr/local/bg
$ mkdir -p /usr/local/bg/storage/hadoop/temp
$ mkdir -p /usr/local/bg/storage/hadoop/data
$ mkdir -p ~/.ssh
$ chmod 755 ~/.ssh

3. Install Java on all machines:
$ scp jdk-8u40-linux-x64.tar.gz hadoop@192.168.81.151:/usr/local/bg
$ cd /usr/local/bg && tar -zxvf jdk-8u40-linux-x64.tar.gz
Set up the Java environment:
#vim /etc/profile
export JAVA_HOME=/usr/local/bg/jdk1.8.0_40
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$CLASSPATH:.:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
Apply it immediately: source /etc/profile
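A quick sanity check after sourcing the profile (the exact version string depends on your JDK build):
$ echo $JAVA_HOME
$ java -version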

4. Install Hadoop on all machines
As noted above, the stock hadoop-2.7.1.tar.gz does not run properly on 64-bit systems (native hadoop library, 32- vs 64-bit issue), so I compiled it from source; see the build notes at the end. The 286 MB build is too large to attach here.
$ scp hadoop-2.7.1.LiBin.CentOS6.5.[2.6.32-431.el6.x86_64].tar.gz hadoop@192.168.81.151:/usr/local/bg
$ cd /usr/local/bg && tar -zxvf hadoop-2.7.1.LiBin.CentOS6.5.[2.6.32-431.el6.x86_64].tar.gz

Set the environment variables:
# vim /etc/profile
# set hadoop environment
export HADOOP_HOME=/usr/local/bg/hadoop-2.7.1
export HADOOP_HOME_WARN_SUPPRESS=1
export PATH=$PATH:$HADOOP_HOME/bin
source /etc/profile
[hadoop@hdp01 hadoop]# pwd
/usr/local/bg/hadoop-2.7.1/etc/hadoop

Edit hadoop-env.sh and yarn-env.sh under /usr/local/bg/hadoop-2.7.1/etc/hadoop/ and set JAVA_HOME explicitly, otherwise startup will report an error:
export JAVA_HOME=/usr/local/bg/jdk1.8.0_40
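A minimal sketch to set it in both files in one pass (assuming each file already contains an uncommented "export JAVA_HOME=..." line; otherwise just append the export by hand):
$ cd /usr/local/bg/hadoop-2.7.1/etc/hadoop
$ sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/bg/jdk1.8.0_40|' hadoop-env.sh yarn-env.sh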

Edit etc/hadoop/core-site.xml (after changing any of the XML files, sync them to all four servers, otherwise the nodes will not start correctly):
<configuration>
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hdp01:9000</value>
  <final>true</final>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/bg/storage/hadoop/temp</value>
</property>
<property>
  <name>hadoop.native.lib</name>
  <value>true</value>
  <description>Should native hadoop libraries, if present, be used.</description>
</property>
</configuration>
 
Edit etc/hadoop/hdfs-site.xml:
<configuration>
<property>
  <name>dfs.data.dir</name>
  <value>/usr/local/bg/storage/hadoop/data</value>
  <final>true</final>
  <description>
    Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices.
  </description>
</property>
<property>
  <name>dfs.namenode.secondary.http-address</name>
  <value>hdp01:9001</value>
  <final>true</final>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <final>true</final>
  <description>
    Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.
  </description>
</property>
</configuration>

Edit etc/hadoop/mapred-site.xml:
<configuration>
 <property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
 </property>
 <property>
  <name>mapreduce.jobhistory.address</name>
  <value>hdp01:10020</value>
 </property>
 <property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>hdp01:19888</value>
 </property>
  <property>  
    <name>mapred.job.tracker</name>  
    <value>hdp01:19001</value>    
    <description>  
      The host and port that the MapReduce job tracker runs  at.  If "local", then jobs are run in-process as a single map and reduce task.  
    </description>  
</property>
</configuration>
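Note: the stock 2.7.x tarball usually ships this file only as mapred-site.xml.template under etc/hadoop, so you may need to create it first:
$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
(The mapred.job.tracker property above is an MR1-era setting; with mapreduce.framework.name set to yarn it is not used, but it does no harm.)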
 
Edit yarn-site.xml and add:
<configuration>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>hdp01:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>hdp01:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>hdp01:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>hdp01:8033</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address</name>
  <value>hdp01:8088</value>
</property>
</configuration>

Edit etc/hadoop/slaves so that it contains:
hdp02
hdp03
hdp04
This file determines which hosts run the datanode and nodemanager (the MR1 tasktracker role) daemons. As noted above, all of these configuration files must be identical on every node; see the sketch below.
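A sketch for pushing the whole configuration directory from hdp01 to the slaves (run it after the passwordless-SSH setup in the next step, or type the password for each host; it assumes the same install path on every node):
$ for h in hdp02 hdp03 hdp04; do scp /usr/local/bg/hadoop-2.7.1/etc/hadoop/* hadoop@$h:/usr/local/bg/hadoop-2.7.1/etc/hadoop/; done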

Passwordless SSH setup:
151:
 $cd ~/.ssh
 $ssh-keygen -t rsa    # press Enter at every prompt; the key pair is saved as ~/.ssh/id_rsa and id_rsa.pub
 $cp id_rsa.pub authorized_keys
 $scp authorized_keys hadoop@192.168.81.152:/home/hadoop/.ssh
152:
 $cd ~/.ssh
 $ssh-keygen -t rsa
 $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
 $scp authorized_keys hadoop@192.168.81.153:/home/hadoop/.ssh
153:
 $cd ~/.ssh
 $ssh-keygen -t rsa
 $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
 $scp authorized_keys hadoop@192.168.81.154:/home/hadoop/.ssh
154:
 $cd ~/.ssh
 $ssh-keygen -t rsa
 $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
 $scp authorized_keys hadoop@192.168.81.151:/home/hadoop/.ssh
 $scp authorized_keys hadoop@192.168.81.152:/home/hadoop/.ssh
 $scp authorized_keys hadoop@192.168.81.153:/home/hadoop/.ssh
Then, from each of 151/152/153/154, run ssh hdp01, ssh hdp02, ssh hdp03 and ssh hdp04 once; after the first connection (and accepting the host key) passwordless login works.
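An equivalent, less error-prone alternative on each host is ssh-copy-id (shipped with openssh-clients on CentOS 6); a sketch:
 $ssh-keygen -t rsa
 $for h in hdp01 hdp02 hdp03 hdp04; do ssh-copy-id hadoop@$h; done
It appends the local public key to ~/.ssh/authorized_keys on every target and sets sane permissions.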

If ssh still asks for a password, stop iptables and disable SELinux, then ssh between the hosts once more:
service iptables stop
chkconfig iptables off
vi /etc/selinux/config
SELINUX=disabled
If a non-root user is still prompted for a password, check /var/log/secure. It showed:
Authentication refused: bad ownership or modes for directory /home/hadoop/.ssh/
So it really was a permissions problem on the user's directories: /home/hadoop/.ssh/ had been set to 777 earlier; changing it to 755 fixed it (sshd refuses keys in group- or world-writable directories).

Format the HDFS namespace
bin/hdfs namenode -format
Format only once: reformatting wipes all of your data and causes a string of problems, such as datanodes refusing to start.

5. Start Hadoop
On the master node (151):
Start: sbin/start-all.sh
Stop:  sbin/stop-all.sh
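start-all.sh/stop-all.sh still work in 2.7.1 but are marked deprecated; the scripts themselves suggest starting HDFS and YARN separately:
sbin/start-dfs.sh
sbin/start-yarn.sh
(and stop-yarn.sh / stop-dfs.sh to shut down)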

6. Testing:
On the master node (151), run jps (it lives in $JAVA_HOME/bin, not in sbin).
You should see the ResourceManager, SecondaryNameNode and NameNode processes.
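On each slave (hdp02-hdp04), jps should show DataNode and NodeManager. A quick check from the master, assuming the passwordless SSH set up above (sketch):
$ for h in hdp02 hdp03 hdp04; do echo $h; ssh $h /usr/local/bg/jdk1.8.0_40/bin/jps; done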
Add name resolution on the machine you browse from:
#vi /etc/hosts
127.0.0.1    localhost
192.168.81.151    hdp01
192.168.81.152    hdp02
192.168.81.153    hdp03
192.168.81.154    hdp04
http://hdp01:8088 shows the YARN cluster information.
http://hdp01:50070 is the HDFS web UI, where you can inspect and manage the nodes.
[hadoop@hdp01 ~]# mkdir test
[hadoop@hdp01 ~]# cd test
[hadoop@hdp01 test]# echo "hello world" > t1.txt
[hadoop@hdp01 test]# echo "hello hadoop" > t2.txt
[hadoop@hdp01 test]# ll
total 8
-rw-r--r-- 1 root root 12 Sep 15 01:42 t1.txt
-rw-r--r-- 1 root root 13 Sep 15 01:43 t2.txt
Create two test directories on the distributed file system:
bin/hdfs dfs -mkdir /in
bin/hdfs dfs -mkdir /out
[hadoop@hdp01 test]# bin/hdfs dfs -put ./ /in
[hadoop@hdp01 test]# bin/hdfs dfs -ls /in/test
Found 2 items
-rw-r--r--   3 root supergroup         12 2015-09-15 01:43 /in/test/t1.txt
-rw-r--r--   3 root supergroup         13 2015-09-15 01:43 /in/test/t2.txt
[hadoop@hdp01 test]# bin/hdfs dfs -ls ./in
Found 2 items
-rw-r--r--   3 root supergroup         12 2015-09-15 01:43 /in/test/t1.txt
-rw-r--r--   3 root supergroup         13 2015-09-15 01:43 /in/test/t2.txt
[hadoop@hdp01 test]# hadoop dfs -ls
Found 1 items
drwxr-xr-x   - root supergroup          0 2015-09-15 01:43 /in
[hadoop@hdp01 ~]# hadoop dfs -get ./in/* ./
[hadoop@hdp01 ~]# ls
anaconda-ks.cfg  install.log  install.log.syslog  test  test1.txt  test2.txt

Submit a wordcount job to Hadoop:
bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /in/test/t1.txt /out/o1
The first run took 11 min 10 s: Hadoop kept reporting NoRouteToHostException: No route to host, because iptables on 153 was blocking the ports, producing a flood of errors and making the job extremely slow. After stopping iptables, continue:
bin/hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /in/test/t2.txt /out/o2
28 s (with iptables stopped it ran smoothly).
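To look at the result (the reducer output is normally written as part-r-00000 inside the output directory):
bin/hdfs dfs -cat /out/o1/part-r-00000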
Check the overall state of the HDFS cluster:
hdfs dfsadmin -report
Enter and leave Hadoop's safe mode:
hdfs dfsadmin -safemode enter
hdfs dfsadmin -safemode leave


------------------------------------------------------------
Cloned virtual machines may fail to bring up their network interface:
 1) cat /etc/udev/rules.d/70-persistent-net.rules
Note the MAC address of the newly assigned NIC, e.g. eth1 with 00:0c:29:d6:df:bb, then
# vi /etc/sysconfig/network-scripts/ifcfg-eth0
change DEVICE="eth0" to the new device name (DEVICE="eth1" here),
and change HWADDR="00:0c:29:03:01:46" to the MAC recorded above: HWADDR="00:0c:29:d6:df:bb"  (hdp01, eth1)
  2)"00:0c:29:25:2b:63" hdp02 eth3
  3)"00:0c:29:21:9e:00" hdp03 eth3
  4)"00:0c:29:19:e1:27" hdp04 eth2
service network restart
OK

------------------------------------------------------------
Test with Hadoop 1.2.1 (for comparison):
[hadoop@hdp01 hadoop-1.2.1]# hadoop jar ./hadoop-examples-1.2.1.jar wordcount in out
15/09/15 01:49:40 INFO input.FileInputFormat: Total input paths to process : 2
15/09/15 01:49:40 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/09/15 01:49:40 WARN snappy.LoadSnappy: Snappy native library not loaded
15/09/15 01:49:41 INFO mapred.JobClient: Running job: job_201509150117_0001
15/09/15 01:49:42 INFO mapred.JobClient:  map 0% reduce 0%
15/09/15 01:49:51 INFO mapred.JobClient:  map 50% reduce 0%
15/09/15 01:49:52 INFO mapred.JobClient:  map 100% reduce 0%
15/09/15 01:50:00 INFO mapred.JobClient:  map 100% reduce 33%
15/09/15 01:50:02 INFO mapred.JobClient:  map 100% reduce 100%
15/09/15 01:50:03 INFO mapred.JobClient: Job complete: job_201509150117_0001
15/09/15 01:50:03 INFO mapred.JobClient: Counters: 29
15/09/15 01:50:03 INFO mapred.JobClient:   Map-Reduce Framework
15/09/15 01:50:03 INFO mapred.JobClient:     Spilled Records=8
15/09/15 01:50:03 INFO mapred.JobClient:     Map output materialized bytes=61
15/09/15 01:50:03 INFO mapred.JobClient:     Reduce input records=4
15/09/15 01:50:03 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=5798174720
15/09/15 01:50:03 INFO mapred.JobClient:     Map input records=2
15/09/15 01:50:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=210
15/09/15 01:50:03 INFO mapred.JobClient:     Map output bytes=41
15/09/15 01:50:03 INFO mapred.JobClient:     Reduce shuffle bytes=61
15/09/15 01:50:03 INFO mapred.JobClient:     Physical memory (bytes) snapshot=420593664
15/09/15 01:50:03 INFO mapred.JobClient:     Reduce input groups=3
15/09/15 01:50:03 INFO mapred.JobClient:     Combine output records=4
15/09/15 01:50:03 INFO mapred.JobClient:     Reduce output records=3
15/09/15 01:50:03 INFO mapred.JobClient:     Map output records=4
15/09/15 01:50:03 INFO mapred.JobClient:     Combine input records=4
15/09/15 01:50:03 INFO mapred.JobClient:     CPU time spent (ms)=2180
15/09/15 01:50:03 INFO mapred.JobClient:     Total committed heap usage (bytes)=337780736
15/09/15 01:50:03 INFO mapred.JobClient:   File Input Format Counters
15/09/15 01:50:03 INFO mapred.JobClient:     Bytes Read=25
15/09/15 01:50:03 INFO mapred.JobClient:   FileSystemCounters
15/09/15 01:50:03 INFO mapred.JobClient:     HDFS_BYTES_READ=235
15/09/15 01:50:03 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=178147
15/09/15 01:50:03 INFO mapred.JobClient:     FILE_BYTES_READ=55
15/09/15 01:50:03 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
15/09/15 01:50:03 INFO mapred.JobClient:   Job Counters
15/09/15 01:50:03 INFO mapred.JobClient:     Launched map tasks=2
15/09/15 01:50:03 INFO mapred.JobClient:     Launched reduce tasks=1
15/09/15 01:50:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10296
15/09/15 01:50:03 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
15/09/15 01:50:03 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=13515
15/09/15 01:50:03 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
15/09/15 01:50:03 INFO mapred.JobClient:     Data-local map tasks=2
15/09/15 01:50:03 INFO mapred.JobClient:   File Output Format Counters
15/09/15 01:50:03 INFO mapred.JobClient:     Bytes Written=25
$ls /usr/local/bg/storage/data
In HDFS the data lives on the datanodes, so look on the slave nodes. Seen from the local Linux filesystem, an HDFS file is just metadata plus block files, and only the two together make up a complete file; looking at the block contents directly on a datanode therefore shows what appears to be meaningless garbage.
[hadoop@hdp01 bg]# hadoop dfsadmin -report
Configured Capacity: 12707119104 (11.83 GB)
Present Capacity: 4093890560 (3.81 GB)
DFS Remaining: 4093448192 (3.81 GB)
DFS Used: 442368 (432 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)
Name: 192.168.81.153:50010
Decommission Status : Normal
Configured Capacity: 4235706368 (3.94 GB)
DFS Used: 147456 (144 KB)
Non DFS Used: 2871005184 (2.67 GB)
DFS Remaining: 1364553728(1.27 GB)
DFS Used%: 0%
DFS Remaining%: 32.22%
Last contact: Tue Sep 15 02:00:59 CST 2015

Name: 192.168.81.154:50010
Decommission Status : Normal
Configured Capacity: 4235706368 (3.94 GB)
DFS Used: 147456 (144 KB)
Non DFS Used: 2871156736 (2.67 GB)
DFS Remaining: 1364402176(1.27 GB)
DFS Used%: 0%
DFS Remaining%: 32.21%
Last contact: Tue Sep 15 02:00:58 CST 2015

Name: 192.168.81.152:50010
Decommission Status : Normal
Configured Capacity: 4235706368 (3.94 GB)
DFS Used: 147456 (144 KB)
Non DFS Used: 2871066624 (2.67 GB)
DFS Remaining: 1364492288(1.27 GB)
DFS Used%: 0%
DFS Remaining%: 32.21%
Last contact: Tue Sep 15 02:00:59 CST 2015
Enter and leave Hadoop's safe mode:
[hadoop@hdp01 bg]#  hadoop dfsadmin -safemode enter
[hadoop@hdp01 bg]#  hadoop dfsadmin -safemode leave



-------------------------------------------------------------------
Below are my notes from compiling Hadoop 2.7.1, kept for reference.

To fix the native-hadoop 32/64-bit problem, recompile Hadoop:

Download Maven (official mirror below); you could build it from source, but the prebuilt binary is enough here:
wget http://mirror.bit.edu.cn/apache/maven/maven-3/3.1.1/binaries/apache-maven-3.1.1-bin.zip
After unpacking, add the environment variables to /etc/profile as well:
export MAVEN_HOME=/usr/local/bg/apache-maven-3.1.1
export PATH=$PATH:$MAVEN_HOME/bin
Verify the setup: mvn -version
    Apache Maven 3.1.1 (0728685237757ffbf44136acec0402957f723d9a; 2013-09-17 23:22:22+0800)  
    Maven home: /opt/maven3.1.1  
    Java version: 1.7.0_45, vendor: Oracle Corporation  
    Java home: /opt/jdk1.7/jre  
    Default locale: en_US, platform encoding: UTF-8  
    OS name: "linux", version: "2.6.32-358.el6.x86_64", arch: "amd64", family: "unix"  
Because the Maven central servers may be slow or unreachable from China, configure a domestic mirror first: in conf/settings.xml under the Maven directory, add the following inside <mirrors></mirrors>, leaving the existing entries untouched:
   <mirror>
        <id>nexus-osc</id>
         <mirrorOf>*</mirrorOf>
     <name>Nexusosc</name>
     <url>http://maven.oschina.net/content/groups/public/</url>
   </mirror>
Likewise, add the following inside <profiles></profiles>:
<profile>
       <id>jdk-1.7</id>
       <activation>
         <jdk>1.7</jdk>
       </activation>
       <repositories>
         <repository>
           <id>nexus</id>
           <name>local private nexus</name>
           <url>http://maven.oschina.net/content/groups/public/</url>
           <releases>
             <enabled>true</enabled>
           </releases>
           <snapshots>
             <enabled>false</enabled>
           </snapshots>
         </repository>
       </repositories>
       <pluginRepositories>
         <pluginRepository>
           <id>nexus</id>
          <name>local private nexus</name>
           <url>http://maven.oschina.net/content/groups/public/</url>
           <releases>
             <enabled>true</enabled>
           </releases>
           <snapshots>
             <enabled>false</enabled>
           </snapshots>
         </pluginRepository>
       </pluginRepositories>
     </profile>
Install the basic build tools:
yum -y install svn ncurses-devel gcc*
yum -y install lzo-devel zlib-devel autoconf automake libtool cmake openssl-devel
(On Debian/Ubuntu the equivalents would be: apt-get install build-essential autoconf automake libtool cmake zlib1g-dev pkg-config libssl-dev)
Install protobuf (without it the build cannot finish)
Hadoop uses Protocol Buffers for its RPC layer and needs protobuf-2.5.0.tar.gz. Since the official site no longer offers the download, you can get it from this Baidu Pan link: http://pan.baidu.com/s/1pJlZubT
Build and install protobuf:
①  cd protobuf-2.5.0
②  ./configure
③  make
④  make install
Verify: protoc --version
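If protoc then complains that libprotoc.so.8 cannot be found (see the errors section below), one possible fix on CentOS, when protobuf was installed with the default /usr/local prefix, is to tell the dynamic linker about /usr/local/lib (the file name here is just an example):
# echo "/usr/local/lib" > /etc/ld.so.conf.d/protobuf.conf
# ldconfig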
Run: mvn package -Pdist,native,docs -DskipTests -Dtar
Attempt 1 failed: protobuf was not installed.
Attempt 2 failed at hadoop-common-project/hadoop-common/target/antrun/build-main.xml; install the missing packages and retry:
yum -y install zlib-devel
yum -y install ncurses-devel
yum install ant
mvn clean package -Pdist,native,docs -DskipTests -Dtar
Attempt 3 failed: hadoop-common-project/hadoop-common/${env.FINDBUGS_HOME}/src/xsl/default.xsl doesn't exist.
Download findbugs-3.0.1.tar.gz, unpack it under bg/, and add to /etc/profile:
export FINDBUGS_HOME=/usr/local/bg/findbugs-3.0.1
mvn clean package -Pdist,native,docs -DskipTests -Dtar
Attempt 4 failed: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...<exec dir="/home/xxl/hadoop-2.5.2-src/hadoop-tools/hadoop-pipes/target/native" executable="cmake" failonerror="true">... @ 5:120 in /home/xxl/hadoop-2.5.2-src/hadoop-tools/hadoop-pipes/target/antrun/build-main.xml
[ERROR] -> [Help 1]
Debian's libssl-dev is openssl-devel on CentOS 6:
yum install openssl-devel
mvn clean package -Pdist,native,docs -DskipTests -Dtar
Attempt 5 failed: 4 GB of disk space was not enough. Added a 10 GB disk (the build ended up using about 6.8 GB):
# fdisk /dev/sdb    (commands: n, p, 1, then w)
# pvcreate /dev/sdb1
# vgextend vg_test /dev/sdb1
# lvextend -L 18.5G /dev/vg_test/lv_root
# resize2fs /dev/vg_test/lv_root
mvn clean package -Pdist,native,docs -DskipTests -Dtar
Build succeeded; total time 39 min 59 s.
rm -rf src
resize2fs /dev/vg_test/lv_root 8.5G    <- failed and crashed the system (a mounted ext4 root filesystem cannot be shrunk online)
lvreduce -L 8.5G /dev/vg_test/lv_root


This step needs outbound Internet access from the build host and takes a while, roughly an hour depending on the machine and the network. Some artifacts download slowly; if downloads stall and the build dies, be sure to run mvn clean before retrying.
After a successful build, hadoop-dist/target/hadoop-2.7.1.tar.gz is the file we need.
-------------------------------------------------
If the native library fails to load, the output looks like:
INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Whether the native library is used can be controlled in core-site.xml:
<property>
  <name>hadoop.native.lib</name>
  <value>true</value>
  <description>Should native hadoop libraries, if present, be used.</description>
</property>
Hadoop enables the native library by default.
You can also point to its location through an environment variable:
export JAVA_LIBRARY_PATH=/usr/local/bg/hadoop-2.7.1/lib/native
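A quick way to confirm which native libraries are actually picked up is Hadoop 2.x's checknative command:
hadoop checknative -a
It reports whether native support for hadoop, zlib, snappy, lz4, bzip2 and openssl was loaded.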
---------------------------------
Download protobuf from https://code.google.com/p/protobuf/downloads/list
Get version 2.5.0 (protobuf-2.5.0.tar.gz), unpack it and install.
Unpack: tar xvf protobuf-2.5.0.tar.gz
Install: (1) ./configure (2) make (3) make check (4) make install
Note: after installing, add its bin and lib directories to PATH and LD_LIBRARY_PATH so protoc can be invoked directly.
It is usually suggested to install under /usr/local; pass --prefix=/usr/local/protobuf to configure.
To set the environment variables, append to /etc/profile:
export PATH=$PATH:/usr/local/protobuf/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/protobuf/lib
-------------------------------------
hadoop-common-project/hadoop-common/target/antrun/build-main.xml
Fix:
yum -y install zlib-devel
yum -y install ncurses-devel
yum -y install ant
Debian: apt-get install zlib1g-dev
Debian: apt-get install libncurses5-dev
------------------------------------------
hadoop-common-project/hadoop-common/${env.FINDBUGS_HOME}/src/xsl/default.xsl doesn't exist. -> [Help 1]
Download findbugs-3.0.1.tar.gz from http://findbugs.sourceforge.net/downloads.html
tar -zxvf findbugs-3.0.1.tar.gz under bg/
Add to /etc/profile:
export FINDBUGS_HOME=/usr/local/bg/findbugs-3.0.1
--------------------------------------------
Handling errors that came up while compiling Hadoop:
http://my.oschina.net/laigous/blog/356552
Error 1:
[ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.2.0:protoc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecutionException: protoc version is 'libprotoc 2.4.1', expected version is '2.5.0' -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-common
Install protoc:
wget https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
(or download from https://code.google.com/p/protobuf/downloads/list)
Unpack, cd into the source root, and run sudo ./configure --prefix=/usr
If the build fails with:
cpp: error trying to exec 'cc1plus': execvp: No such file or directory
install g++:
sudo apt-get install g++
sudo make
sudo make check
sudo make install
protoc --version
If you hit "protoc: error while loading shared libraries: libprotoc.so.8: cannot open shared object file: No such file or directory", e.g. on Ubuntu: protobuf installs under /usr/local/lib by default, so you need to re-run sudo ./configure --prefix=/usr (the --prefix argument is required) and then rebuild and reinstall.
Error 2:
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-
    plugin:1.6:run (make) on project hadoop-common: An Ant BuildException has
    occured: Execute failed: java.io.IOException: Cannot run program "cmake" (in
    directory "/home/wyf/hadoop-2.0.2-alpha-src/hadoop-common-project/hadoop-
    common/target/native"): java.io.IOException: error=2, No such file or directory
    -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
    switch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please
    read the following articles:
    [ERROR] [Help 1]
    http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
Install cmake:
sudo apt-get install cmake
Error 3:
ERROR] Failed to execute goal org.codehaus.mojo.jspc:jspc-maven-plugin:2.0-
alpha-3:compile (hdfs) on project hadoop-hdfs: Execution hdfs of goal
org.codehaus.mojo.jspc:jspc-maven-plugin:2.0-alpha-3:compile failed: Plugin
org.codehaus.mojo.jspc:jspc-maven-plugin:2.0-alpha-3 or one of its dependencies
could not be resolved: Could not transfer artifact ant:ant:jar:1.6.5 from/to
central (http://repo.maven.apache.org/maven2): GET request of:
ant/ant/1.6.5/ant-1.6.5.jar from central failed: Read timed out -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please
read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/PluginResolutionException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-hdfs
Install ant
1. Download ant:
Baidu Pan: apache-ant-1.9.4-bin.tar.gz
http://pan.baidu.com/s/1c0vjhBy
or from the usual Apache mirrors: apache-ant-1.9.4-bin.tar.gz
2. Unpack:
    tar zxvf apache-ant-1.9.4-bin.tar.gz
3. Configure the environment variables:
vim ~/.bashrc
export ANT_HOME=/home/xxl/apache-ant-1.9.4
export PATH=$ANT_HOME/bin:$PATH
source ~/.bashrc
Error 4:
    [ERROR] Failed to execute goal org.apache.hadoop:hadoop-maven-plugins:2.4.0:prot
    oc (compile-protoc) on project hadoop-common: org.apache.maven.plugin.MojoExecut
    ionException: 'protoc --version' did not return a version -> [Help 1]
    [ERROR]
    [ERROR] To see the full stack trace of the errors, re-run Maven with the -e swit
    ch.
    [ERROR] Re-run Maven using the -X switch to enable full debug logging.
    [ERROR]
    [ERROR] For more information about the errors and possible solutions, please rea
    d the following articles:
    [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionE
    xception
    [ERROR]
    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR]   mvn <goals> -rf :hadoop-common
The protobuf version is too old; install version 2.5.0.
Error 5:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (compile) on project hadoop-snappy: An Ant BuildException has occured: The following error occurred while executing this line:
[ERROR] /home/ngc/Char/snap/hadoop-snappy/hadoop-snappy-read-only/maven/build-compilenative.xml:75: exec returned: 2
This one is nasty: hadoop-snappy is picky about the gcc version. I was on Ubuntu 12.04 (December 2012), which already ships gcc 4.6, but people on the Google Code issue tracker reported that downgrading from gcc 4.6 to gcc 4.4 made the error go away, and it did for me as well.
gcc --version    # check the gcc version

gcc (Ubuntu/Linaro 4.4.7-1ubuntu2) 4.6.3
Copyright © 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
How to downgrade:
1. apt-get install gcc-4.4
2. rm /usr/bin/gcc
3. ln -s /usr/bin/gcc-4.4 /usr/bin/gcc
Afterwards gcc --version reports 4.4.7.
Error 6:
.[exec] /bin/bash ./libtool --tag=CC   --mode=link gcc -g -Wall -fPIC -O2 -m64 -g -O2 -version-info 0:1:0 -L/usr/local//lib -o libhadoopsnappy.la -rpath /usr/local/lib src/org/apache/hadoop/io/compress/snappy/SnappyCompressor.lo src/org/apache/hadoop/io/compress/snappy/SnappyDecompressor.lo  -ljvm -ldl
     [exec] /usr/bin/ld: cannot find -ljvm
     [exec] collect2: ld returned 1 exit status
     [exec] make: *** [libhadoopsnappy.la] Error 1
     [exec] libtool: link: gcc -shared  -fPIC -DPIC  src/org/apache/hadoop/io/compress/snappy/.libs/SnappyCompressor.o src/org/apache/hadoop/io/compress/snappy/.libs/SnappyDecompressor.o   -L/usr/local//lib -ljvm -ldl  -O2 -m64 -O2   -Wl,-soname -Wl,libhadoopsnappy.so.0 -o .libs/libhadoopsnappy.so.0.0.1
There are plenty of blog posts about "/usr/bin/ld: cannot find -lxxx", but none of them apply here: nothing is missing and no version is wrong. The problem is that the JVM's libjvm.so has not been symlinked into /usr/local/lib. On an amd64 system, find libjvm.so under something like /root/bin/jdk1.6.0_37/jre/lib/amd64/server/ and link it:
ln -s /root/bin/jdk1.6.0_37/jre/lib/amd64/server/libjvm.so /usr/local/lib/
That resolves the problem.
Error 7:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.6:run (make) on project hadoop-common: An Ant BuildException has occured: exec returned: 1 -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
Install zlib-devel
On Ubuntu:
sudo apt-get install zlib1g-dev
Error 8:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-pipes: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...<exec dir="/home/xxl/hadoop-2.5.2-src/hadoop-tools/hadoop-pipes/target/native" executable="cmake" failonerror="true">... @ 5:120 in /home/xxl/hadoop-2.5.2-src/hadoop-tools/hadoop-pipes/target/antrun/build-main.xml
[ERROR] -> [Help 1]
Install: sudo apt-get install libssl-dev (on CentOS 6: yum install openssl-devel)
Error 9:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (tar) on project hadoop-dist: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...<exec dir="/home/xxl/hadoop-2.5.2-src/hadoop-dist/target" executable="sh" failonerror="true">... @ 21:96 in /home/xxl/hadoop-2.5.2-src/hadoop-dist/target/antrun/build-main.xml
Install: sudo apt-get install build-essential
sudo apt-get install libglib2.0-dev


---------------------------------------------------------------------------------------------------------------------------------

Appendix:

1. Jobs failing to run because of clock skew:
Application err:
...current time is 1444483068327 found 1444480155352
Note: System times on machines may be out of sync. Check system time and time zones. ...
Fix:
Run the NTP service on the master (151).
Edit the NTP configuration file /etc/ntp.conf and add:
restrict 192.168.81.0 mask 255.255.255.0 nomodify notrap
server  127.127.1.0
fudge   127.127.1.0 stratum 10
On the other servers, sync against the master: ntpdate 192.168.81.151
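To keep the slaves in sync without manual intervention, one possible approach is a root cron entry on each slave (the interval here is just an example), or simply running ntpd on the slaves pointed at 192.168.81.151:
*/30 * * * * /usr/sbin/ntpdate 192.168.81.151 >/dev/null 2>&1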


