http://www.loongson.cn/news/company/468.html
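I. Introduction to Hadoop
Hadoop is a distributed-systems infrastructure developed by the Apache Software Foundation. It lets users write distributed programs without understanding the low-level details of distribution, and harness the power of a cluster for high-speed computation and storage.
Running Hadoop on the Loongson 3A2000
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data and suits applications with large data sets. HDFS relaxes some POSIX requirements so that file system data can be accessed as a stream (streaming access).
The core of the Hadoop framework is HDFS and MapReduce: HDFS provides storage for massive data sets, and MapReduce provides computation over them.
Hadoop is a software framework for distributed processing of large volumes of data in a reliable, efficient, and scalable way. It maintains multiple copies of working data so that processing can be redistributed when a node fails, and it works in parallel to speed up processing, supporting data at petabyte scale.
Hadoop is a distributed computing platform that is easy to set up and use. Users can readily develop and run applications that process massive data sets on it. Its main advantages are: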
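High reliability: Hadoop's bit-level storage and processing of data is dependable.
High scalability: Hadoop distributes data and computation across clusters of available machines, and these clusters can easily be extended to thousands of nodes.
High efficiency: Hadoop moves data dynamically between nodes and keeps the nodes balanced, so processing is fast.
High fault tolerance: Hadoop automatically keeps multiple copies of data and automatically reassigns failed tasks.
Low cost: compared with all-in-one appliances, commercial data warehouses, and data marts such as QlikView and Yonghong Z-Suite, Hadoop is open source, which greatly reduces a project's software cost.
This article covers building Hadoop from source and deploying and using Hadoop in a distributed computing and cloud storage system. It also records, as an FAQ, the problems encountered during setup and their solutions.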
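A Hadoop cluster supports the following three operating modes:
1. Local/Standalone Mode
After download, Hadoop is configured in Standalone mode by default and runs as a single Java process.
2. Pseudo Distributed Mode
In this mode every Hadoop daemon (HDFS, YARN, MapReduce, and so on) runs on a single machine, each as a separate Java process, simulating a distributed deployment. This mode is useful for development.
3. Fully Distributed Mode
A fully distributed deployment requires at least two machines working as a cluster; it is described in detail later.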
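II. Porting Environment
The software and hardware used on the build machine are listed first.
Software environment:
(1) loongnix 1.0 (2016.8.10 release). Download: www.loongnix.org
(2) Kernel version: 3.10.84-all
(3) JDK version: 1.8.0_25-rc16-b17 or later
(4) Maven: 3.2.2 or later
Hardware environment:
(1) Development board: Loongson-3B-780E-2w-V0.2-demo
(2) Firmware version: loongson-PMON-V3.3.0
This example uses Hadoop 2.7.2; for the source download address, see the "hadoop downloads" link in the appendix. Building Hadoop depends on the findbugs and cmake packages, which are best installed with yum before the build, as follows: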
[hadoop@localhost log]$ sudo yum -y install java-1.8.0-openjdk-devel java-1.8.0-openjdk-headless \
    java-1.8.0-openjdk findbugs cmake protobuf-compiler
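After installation, set the following environment variables. It is recommended to append the lines below to /etc/profile and apply them with the source command.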
export FINDBUGS_HOME=/usr/share/findbugs
export MAVEN_HOME=/usr/share/maven
export MAVEN_OPTS="-Xms256m -Xmx512m"
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-5.rc16.fc21.loongson.m
PATH=/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/h
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin
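Before starting the build, it can help to confirm that the tools installed above are actually reachable through PATH. The following is a minimal sketch, not part of the original procedure: check_tools is a hypothetical helper name, and the list of commands in the example call is an assumption based on the packages installed earlier.

```shell
# Report which of the given commands can be found on PATH.
# check_tools is a hypothetical helper, not part of Hadoop or Maven.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Return non-zero if at least one command was not found.
    return "$missing"
}
# Example (assumed tool list): check_tools java javac mvn cmake protoc
```

If any line reads "missing", re-check the exports in /etc/profile and run source /etc/profile again before building.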
Build from scratch: first unpack the source into a directory of your choice (this example uses /usr/local), then build with the command mvn clean package -Pdist,native,src -DskipTests -Dtar.
tar xvf hadoop-2.7.2.src.gz -C /usr/local/
cd /usr/local/hadoop-2.7.2
mvn clean package -Pdist,native,src -DskipTests -Dtar
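III. Notes
(1) This example uses /usr/local as the working directory, which requires root privileges.
(2) If the build reports an error, consult the corresponding FAQ entry; once the problem is fixed, restart the build with mvn package -Pdist,native,src -DskipTests -Dtar.
(3) Each FAQ entry is labeled with a sequence number (starting from 001) and a module name, separated by a colon. Module names are taken from the modules listed in the Maven Reactor.
IV. FAQ
001: Apache Hadoop Common
Terminal error: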
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000000ffe18f46fc, pid=5300, tid=1099154321904
#
# JRE version: OpenJDK Runtime Environment (8.0_25-b17) (build 1.8.0_25-rc16-b17)
# Java VM: OpenJDK 64-Bit Server VM (25.25-b02 mixed mode linux- compressed oops)
# Problematic frame:
# J 62748 C2 scala.tools.asm.ClassWriter.get(Lscala/tools/asm/Item;)Lscala/tools/asm/Item; (49 bytes) @ 0x000000ffe18f46fc [0x000000ffe18f46a0+0x5c]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
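Solution:
This problem is related to the JDK's parallel GC and has been seen when building both Hadoop and Spark. The current workaround is to change the MAVEN_OPTS environment variable in /etc/profile to the following: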
export MAVEN_OPTS="-Xms3560m -Xmx3560m -XX:-UseParallelGC -XX:-UseParallelOldGC"
002: any-modules
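Symptom: during the Maven build, artifacts (xxx.jar and xxx.pom) cannot be downloaded.
Solution: enable the proxy settings in the Maven configuration file and reinstall ca-certificates.
# Configure a proxy for Maven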
<proxies>
<!-- proxy
| Specification for one proxy, to be used in connecting to the network.
|-->
<proxy>
<id>proxy01</id>
<active>true</active>
<protocol>http</protocol>
<host>ip_address</host>
<port>port</port>
<nonProxyHosts>localhost</nonProxyHosts>
</proxy>
<proxy>
<id>proxy02</id>
<active>true</active>
<protocol>https</protocol>
<host>ip_address</host>
<port>port</port>
<nonProxyHosts>localhost</nonProxyHosts>
</proxy>
</proxies>
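# Reinstall ca-certificates
sudo yum -y install ca-certificates
Note: whenever artifacts fail to download during a Maven build, this FAQ entry can be adapted to the situation.
V. Build Results
After the Maven build succeeds, the terminal shows the Hadoop Maven Reactor (all Maven modules in this build) and the build times. The timings below are for reference only; they will differ across software and hardware platforms.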
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Apache hadoop Main ................................. SUCCESS [ 10.769 s]
[INFO] Apache hadoop Project POM .......................... SUCCESS [ 8.793 s]
[INFO] Apache hadoop Annotations .......................... SUCCESS [ 18.834 s]
[INFO] Apache hadoop Assemblies ........................... SUCCESS [ 2.414 s]
[INFO] Apache hadoop Project Dist POM ..................... SUCCESS [ 9.653 s]
[INFO] Apache hadoop Maven Plugins ........................ SUCCESS [ 25.215 s]
[INFO] Apache hadoop MiniKDC .............................. SUCCESS [ 20.682 s]
[INFO] Apache hadoop Auth ................................. SUCCESS [ 26.240 s]
[INFO] Apache hadoop Auth Examples ........................ SUCCESS [ 23.112 s]
[INFO] Apache hadoop Common ............................... SUCCESS [45:23 min]
[INFO] Apache hadoop NFS .................................. SUCCESS [ 45.079 s]
[INFO] Apache hadoop KMS .................................. SUCCESS [01:27 min]
[INFO] Apache hadoop Common Project ....................... SUCCESS [ 1.104 s]
[INFO] Apache hadoop HDFS ................................. SUCCESS [21:45 min]
[INFO] Apache hadoop HttpFS ............................... SUCCESS [02:13 min]
[INFO] Apache hadoop HDFS BookKeeper Journal .............. SUCCESS [ 47.832 s]
[INFO] Apache hadoop HDFS-NFS ............................. SUCCESS [ 34.029 s]
[INFO] Apache hadoop HDFS Project ......................... SUCCESS [ 1.075 s]
[INFO] hadoop-yarn ........................................ SUCCESS [ 1.354 s]
[INFO] hadoop-yarn-api .................................... SUCCESS [07:20 min]
[INFO] hadoop-yarn-common ................................. SUCCESS [35:51 min]
[INFO] hadoop-yarn-server ................................. SUCCESS [ 1.020 s]
[INFO] hadoop-yarn-server-common .......................... SUCCESS [01:42 min]
[INFO] hadoop-yarn-server-nodemanager ..................... SUCCESS [01:58 min]
[INFO] hadoop-yarn-server-web-proxy ....................... SUCCESS [ 25.288 s]
[INFO] hadoop-yarn-server-applicationhistoryservice ....... SUCCESS [01:05 min]
[INFO] hadoop-yarn-server-resourcemanager ................. SUCCESS [02:52 min]
[INFO] hadoop-yarn-server-tests ........................... SUCCESS [ 40.356 s]
[INFO] hadoop-yarn-client ................................. SUCCESS [ 54.780 s]
[INFO] hadoop-yarn-server-sharedcachemanager .............. SUCCESS [ 24.110 s]
[INFO] hadoop-yarn-applications ........................... SUCCESS [ 1.017 s]
[INFO] hadoop-yarn-applications-distributedshell .......... SUCCESS [ 21.223 s]
[INFO] hadoop-yarn-applications-unmanaged-am-launcher ..... SUCCESS [ 17.608 s]
[INFO] hadoop-yarn-site ................................... SUCCESS [ 1.145 s]
[INFO] hadoop-yarn-registry ............................... SUCCESS [ 42.659 s]
[INFO] hadoop-yarn-project ................................ SUCCESS [ 34.614 s]
[INFO] hadoop-mapreduce-client ............................ SUCCESS [ 1.905 s]
[INFO] hadoop-mapreduce-client-core ....................... SUCCESS [33:18 min]
[INFO] hadoop-mapreduce-client-common ..................... SUCCESS [32:57 min]
[INFO] hadoop-mapreduce-client-shuffle .................... SUCCESS [ 28.868 s]
[INFO] hadoop-mapreduce-client-app ........................ SUCCESS [01:00 min]
[INFO] hadoop-mapreduce-client-hs ......................... SUCCESS [ 46.223 s]
[INFO] hadoop-mapreduce-client-jobclient .................. SUCCESS [ 29.643 s]
[INFO] hadoop-mapreduce-client-hs-plugins ................. SUCCESS [ 15.580 s]
[INFO] Apache hadoop MapReduce Examples ................... SUCCESS [ 40.229 s]
[INFO] hadoop-mapreduce ................................... SUCCESS [ 24.719 s]
[INFO] Apache hadoop MapReduce Streaming .................. SUCCESS [ 33.669 s]
[INFO] Apache hadoop Distributed Copy ..................... SUCCESS [ 59.792 s]
[INFO] Apache hadoop Archives ............................. SUCCESS [ 19.986 s]
[INFO] Apache hadoop Rumen ................................ SUCCESS [ 47.303 s]
[INFO] Apache hadoop Gridmix .............................. SUCCESS [ 30.258 s]
[INFO] Apache hadoop Data Join ............................ SUCCESS [ 22.306 s]
[INFO] Apache hadoop Ant Tasks ............................ SUCCESS [ 19.212 s]
[INFO] Apache hadoop Extras ............................... SUCCESS [ 27.362 s]
[INFO] Apache hadoop Pipes ................................ SUCCESS [ 6.723 s]
[INFO] Apache hadoop OpenStack support .................... SUCCESS [ 34.857 s]
[INFO] Apache hadoop Amazon Web Services support .......... SUCCESS [ 37.631 s]
[INFO] Apache hadoop Azure support ........................ SUCCESS [ 30.848 s]
[INFO] Apache hadoop Client ............................... SUCCESS [01:02 min]
[INFO] Apache hadoop Mini-Cluster ......................... SUCCESS [ 3.409 s]
[INFO] Apache hadoop Scheduler Load Simulator ............. SUCCESS [ 33.821 s]
[INFO] Apache hadoop Tools Dist ........................... SUCCESS [ 55.501 s]
[INFO] Apache hadoop Tools ................................ SUCCESS [ 0.768 s]
[INFO] Apache hadoop Distribution ......................... SUCCESS [03:44 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:33 h
[INFO] Finished at: 2016-08-01T14:22:17+08:00
[INFO] Final Memory: 125M/3096M
[INFO] ------------------------------------------------------------------------
In this example the build output is in /usr/local/hadoop-2.7.2/hadoop-dist/target/; the source and binary packages are hadoop-2.7.2-src.tar.gz and hadoop-2.7.2.tar.gz respectively. This completes the Hadoop build.
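A quick way to confirm that both packages were produced is sketched below. This is an illustration only: check_dist is a hypothetical helper name, and the directory and version in the example call are simply the ones used in this article.

```shell
# Verify that the binary and source tarballs exist and are non-empty.
# check_dist is a hypothetical helper; $1 = target directory, $2 = version.
check_dist() {
    status=0
    for f in "$1/hadoop-$2.tar.gz" "$1/hadoop-$2-src.tar.gz"; do
        if [ -s "$f" ]; then
            echo "ok:      $f"
        else
            echo "missing: $f"
            status=1
        fi
    done
    # Return non-zero if either tarball is absent or empty.
    return "$status"
}
# Example: check_dist /usr/local/hadoop-2.7.2/hadoop-dist/target 2.7.2
```

After unpacking the binary package, bin/hadoop checknative -a can additionally be used to list which native libraries Hadoop is able to load.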
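VI. Hadoop Cluster Setup and Test
This section uses Hadoop's "Fully Distributed Mode" to deploy a three-node Hadoop cluster on machines with the IP addresses 10.20.42.22 (slave1), 10.20.42.10 (slave2), and 10.20.42.199 (master).
1. Set up passwordless SSH login
For passwordless SSH login, assuming the root user, generate a key pair on every server and then merge the public keys into authorized_keys, as follows:
(1) Fedora 21 does not enable passwordless SSH login by default. Edit /etc/ssh/sshd_config and uncomment the following two lines by removing the leading #. (Do this on every machine.)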
#RSAAuthentication yes
#PubkeyAuthentication yes
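(2) On every machine in the cluster, open a shell and run ssh-keygen -t rsa to generate a key. Do not enter a passphrase; just press Enter at each prompt. This creates the .ssh directory under /root, which is normally hidden. (Do this on every server.)
(3) Merge the slave nodes' public keys into the authorized_keys file. On the master server, change to the /root/.ssh directory and run the following commands: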
cat id_rsa.pub>> authorized_keys
ssh root@10.20.42.22 cat ~/.ssh/id_rsa.pub>> authorized_keys
ssh root@10.20.42.10 cat ~/.ssh/id_rsa.pub>> authorized_keys
(4) Copy the master server's authorized_keys and known_hosts files to the /root/.ssh directory on each slave server.
(5) Run ssh root@10.20.42.22 and ssh root@10.20.42.10 in a terminal to verify that passwordless login works.
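The manual check in step (5) can also be scripted. The sketch below is one possible way to do it, not the author's method: check_passwordless and the SSH_CMD override are hypothetical names introduced here, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
# Check passwordless SSH login to each host given as an argument.
# check_passwordless and SSH_CMD are hypothetical names for this sketch;
# SSH_CMD defaults to the real ssh client.
check_passwordless() {
    ssh_cmd="${SSH_CMD:-ssh}"
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if "$ssh_cmd" -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Return non-zero if any host could not be reached without a password.
    return "$failed"
}
# Example: check_passwordless 10.20.42.22 10.20.42.10
```

A host reported as FAIL would have prompted for a password or could not be reached, so steps (1) to (4) should be repeated for it.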
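2. Build the three-node Hadoop cluster
Approach: prepare one master server and two slave servers, with passwordless SSH login from the master to the slaves. The Hadoop package is the hadoop-2.7.2.tar.gz built in the previous section. The three servers are: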
Master 10.20.42.199
Slave1 10.20.42.22
Slave2 10.20.42.10
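Prerequisite: each server needs a JDK installed and JAVA_HOME and related environment variables set, for example:
# Edit /etc/profile and set JAVA_HOME and related environment variables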
vi /etc/profile
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-6.b17.rc16.fc21.loongson.mips64el
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
# Apply the environment variables and verify that the JDK works
source /etc/profile && java -version
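Setup
Unpack the hadoop-2.7.2.tar.gz package. The author's working directory is /home/loongson/; unless stated otherwise, the configuration files below are from the master server.
(1) Unpack the Hadoop package: tar -xvf hadoop-2.7.2.tar.gz -C /home/loongson
(2) Under /home/loongson/hadoop-2.7.2, manually create the tmp, hdfs, hdfs/data, and hdfs/name directories. Note that the configuration files below refer to paths under /home/loongson/hadoop (tmp, dfs/name, dfs/data); make sure the directories you create match the configured paths.
(3) Configure core-site.xml in /home/loongson/hadoop-2.7.2/etc/hadoop (set the IP to the master's address)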
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://10.20.42.199:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/loongson/hadoop/tmp</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131702</value>
</property>
</configuration>
(4) Configure hdfs-site.xml in /home/loongson/hadoop-2.7.2/etc/hadoop (set the IP to the master's address)
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/loongson/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/loongson/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>10.20.42.199:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
(5) In /home/loongson/hadoop-2.7.2/etc/hadoop, copy mapred-site.xml.template to mapred-site.xml and configure it (set the IP to the master's address)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>10.20.42.199:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>10.20.42.199:19888</value>
</property>
</configuration>
(6) Configure yarn-site.xml in /home/loongson/hadoop-2.7.2/etc/hadoop (set the IP to the master's address)
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>10.20.42.199:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>10.20.42.199:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>10.20.42.199:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>10.20.42.199:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>10.20.42.199:8088</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>768</value>
</property>
</configuration>
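Since all four configuration files must point at the master's address, it can be useful to read individual values back after editing. The sketch below is a simple text-based lookup and not a full XML parser; get_prop is a hypothetical helper name, and it assumes each value sits on one line after its name element, as in the files above.

```shell
# Print the value of one property from a Hadoop *-site.xml file.
# get_prop is a hypothetical helper; $1 = config file, $2 = property name.
# Assumes <name> comes before <value> and each value fits on one line.
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1 }
        found && match($0, /<value>.*<\/value>/) {
            # Strip the <value> (7 chars) and </value> (8 chars) tags.
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }
    ' "$1"
}
# Example: get_prop etc/hadoop/core-site.xml fs.defaultFS
```

Once Hadoop is unpacked, bin/hdfs getconf -confKey fs.defaultFS offers a similar check that goes through Hadoop's own configuration loader.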
(7) In hadoop-env.sh and yarn-env.sh under /home/loongson/hadoop-2.7.2/etc/hadoop, set JAVA_HOME and related environment variables.
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-6.b17.rc16.fc21.loongson.mips64el
(8) Edit the slaves file in /home/loongson/hadoop-2.7.2/etc/hadoop and add the two slave nodes:
10.20.42.10
10.20.42.22
(9) Use scp to copy the configured hadoop-2.7.2 directory from the master to the same location on each slave node:
scp -r /home/loongson/hadoop-2.7.2 10.20.42.10:/home/loongson
scp -r /home/loongson/hadoop-2.7.2 10.20.42.22:/home/loongson
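With more than a couple of slaves, the scp commands above could instead be driven by the slaves file from step (8). The following is a sketch under that assumption; copy_to_slaves and the DRY_RUN switch are hypothetical names introduced here, not Hadoop features.

```shell
# Copy a directory to every host listed in a slaves file.
# copy_to_slaves is a hypothetical helper:
#   $1 = slaves file, $2 = directory to copy, $3 = destination parent directory
# Set DRY_RUN=1 to print the scp commands instead of running them.
copy_to_slaves() {
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case "$host" in ''|'#'*) continue ;; esac
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "scp -r $2 $host:$3"
        else
            # Redirect stdin so scp cannot consume the rest of the slaves file.
            scp -r "$2" "$host:$3" </dev/null
        fi
    done < "$1"
}
# Example:
#   DRY_RUN=1 copy_to_slaves etc/hadoop/slaves /home/loongson/hadoop-2.7.2 /home/loongson
```

Running once with DRY_RUN=1 shows exactly which commands would be executed before anything is copied.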
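(10) Start Hadoop on the master server; the slave nodes start automatically. Change to the /home/loongson/hadoop-2.7.2 directory.
(1) Stop the firewall: service iptables stop (on the master and the slaves)
(2) Format the NameNode: bin/hdfs namenode -format
(3) Start all nodes: sbin/start-all.sh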
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
16/09/02 08:49:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop-master-001]
hadoop-master-001: starting namenode, logging to /home/loongson/hadoop-2.7.2/logs/hadoop-root-namenode-localhost.localdomain.out
10.20.42.22: starting datanode, logging to /home/loongson/hadoop-2.7.2/logs/hadoop-root-datanode-localhost.localdomain.out
10.20.42.22: /home/loongson/hadoop-2.7.2/bin/hdfs: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-6.b17.rc16.fc21.loongson.mips64el
10.20.42.22: /home/loongson/hadoop-2.7.2/bin/hdfs: line 304: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-6.b17.rc16.fc21.loongson.mips64el/bin/java: Success
10.20.42.10: starting datanode, logging to /home/loongson/hadoop-2.7.2/logs/hadoop-root-datanode-localhost.localdomain.out
Starting secondary namenodes [hadoop-master-001]
hadoop-master-001: secondarynamenode running as process 18418. Stop it first.
16/09/02 08:50:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
resourcemanager running as process 16937. Stop it first.
10.20.42.10: starting nodemanager, logging to /home/loongson/hadoop-2.7.2/logs/yarn-root-nodemanager-localhost.localdomain.out
10.20.42.22: starting nodemanager, logging to /home/loongson/hadoop-2.7.2/logs/yarn-root-nodemanager-localhost.localdomain.out
10.20.42.22: /home/loongson/hadoop-2.7.2/bin/yarn: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-6.b17.rc16.fc21.loongson.mips64el
10.20.42.22: /home/loongson/hadoop-2.7.2/bin/yarn: line 333: /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.25-6.b17.rc16.fc21.loongson.mips64el/bin/java: Success
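(4) Stop all nodes: sbin/stop-all.sh
(5) Run the jps command. Output similar to the following on the master and slave nodes indicates that the cluster was set up successfully: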
master:
32497 OServerMain
3506 SecondaryNameNode
3364 DataNode
5654 Jps
2582 OGremlinConsole
16937 ResourceManager
3263 NameNode
slaves:
21580 Jps
20622 DataNode
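Reading the jps listings by eye works for three nodes; for repeated checks, the comparison can be scripted. The sketch below is an illustration only: missing_daemons is a hypothetical helper name, and which daemons to expect on each node is an assumption you should adjust to your own setup.

```shell
# Read `jps` output on stdin and print every expected daemon that is absent.
# missing_daemons is a hypothetical helper; the expected names are its arguments.
missing_daemons() {
    jps_out=$(cat)
    for name in "$@"; do
        # jps prints "<pid> <class name>"; compare the second field exactly.
        printf '%s\n' "$jps_out" \
            | awk -v n="$name" '$2 == n { ok = 1 } END { exit !ok }' \
            || echo "$name"
    done
}
# Example on the master:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
```

No output means every expected daemon was found; any name printed is a daemon whose log under the logs directory is worth inspecting.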
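(11) Open http://10.20.42.199:8088/ or http://10.20.42.199:50070/ in a browser to check how Hadoop is running. Screenshots of the Hadoop status pages as seen in the browser follow:
[Figure: Hadoop overview and summary information]
[Figure: Hadoop running status]
VII. Download the Prebuilt Package
If the porting process above seems too involved, the author has prepared a ported binary that can be downloaded and run directly: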
http://www.loongnix.org/index.php/Apache_hadoop-2.7.2
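VIII. Summary
hadoop-2.7.2 was built from source and tested on a small cluster on loongnix 1.0. The process can serve as a worked example for developers porting Hadoop and testing a cluster.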