Step-by-step guide to building a Hadoop 2.6 + YARN + Spark 1.6 cluster (three nodes; run every step on each node):
Configure /etc/hosts (replacing the existing entries):
192.168.3.61 namenode1
192.168.3.62 datanode2
192.168.3.63 datanode3
Because machines are limited here, the datanode and namenode roles share nodes; in a real production environment they should be separated.
The hostname must not be localhost (127.0.0.1); set it to the machine's own hostname. The hostname is a node's unique identity in the cluster, and /etc/hosts must stay consistent with /etc/sysconfig/network.
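The host entries above can be appended with a short, idempotent sketch. HOSTS_FILE is a parameter introduced here so the script can be rehearsed against a scratch copy first; set it to /etc/hosts (as root) to apply for real:

```shell
#!/bin/sh
# Idempotently append the cluster's host entries.
# HOSTS_FILE is a rehearsal parameter; set HOSTS_FILE=/etc/hosts (as root)
# to apply the entries on a real node.
HOSTS_FILE="${HOSTS_FILE:-./hosts.cluster}"
touch "$HOSTS_FILE"

add_host() {
    # Append "IP name" only if the hostname is not already mapped.
    grep -qw "$2" "$HOSTS_FILE" || echo "$1 $2" >> "$HOSTS_FILE"
}

add_host 192.168.3.61 namenode1
add_host 192.168.3.62 datanode2
add_host 192.168.3.63 datanode3
add_host 192.168.3.62 datanode2   # second call is a no-op: entry already present

cat "$HOSTS_FILE"
```

Running it twice leaves the file unchanged, which matters because broken or duplicated /etc/hosts entries are a common source of cluster startup failures.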
Set up passwordless SSH login
Preparation:
1. Check the local sshd configuration file (requires root):
$ vim /etc/ssh/sshd_config
Find the following lines and uncomment them (remove the leading "#"):
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
2. If the configuration was changed, restart the sshd service (requires root):
$ /sbin/service sshd restart
Passwordless SSH login takes three steps:
1. Generate the key pair
2. Import the public keys into the authorized-keys file and fix its permissions
3. Test the login
ssh-keygen -t rsa
scp id_rsa.pub 192.168.3.61:/root/.ssh/t62.pub   # copy each slave node's public key to the master
cat /home/id_rsa.pub >> ~/.ssh/authorized_keys   # import every node's public key into the authorized-keys file
scp authorized_keys root@192.168.3.63:/root/.ssh/authorized_keys   # push the authorized-keys file out to every node
Shortcut for mutual trust across machines: generate one SSH key on the master, ssh to the master itself once, then copy the whole .ssh directory to the remaining nodes
Tip (batch hostname replacement): sed -i "s/datanode3/host3/g" `grep datanode3 -rl poa`
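To see exactly what that grep/sed one-liner does before running it against a real config tree, the same trick can be rehearsed in a throwaway directory (DEMO_DIR and the .conf files below are made up for the demonstration):

```shell
#!/bin/sh
# Rehearse the batch-rename trick in a throwaway directory.
DEMO_DIR=$(mktemp -d)
echo "master=namenode1"  > "$DEMO_DIR/a.conf"
echo "worker=datanode3"  > "$DEMO_DIR/b.conf"
echo "worker=datanode3"  > "$DEMO_DIR/c.conf"

# grep -rl lists every file containing the old name;
# sed -i then rewrites each of those files in place.
sed -i "s/datanode3/host3/g" $(grep -rl datanode3 "$DEMO_DIR")

grep -r "" "$DEMO_DIR"
```

Files that never mentioned the old name (a.conf here) are not touched, because grep -rl only hands sed the files that matched.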
Create the working directory:
mkdir -p /work/poa
Set up the Java, Maven and Scala environments:
tar zxvf /soft/jdk-8u65-linux-x64.tar.gz -C /work/poa/
export JAVA_HOME=/work/poa/jdk1.8.0_65
export CLASS_PATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin
tar zxvf /soft/apache-maven-3.3.9-bin.tar.gz -C /work/poa/
export MAVEN_HOME=/work/poa/apache-maven-3.3.9
export PATH=$PATH:$MAVEN_HOME/bin
tar zxvf /soft/scala-2.10.6.tgz -C /work/poa/
export SCALA_HOME=/work/poa/scala-2.10.6
export PATH=$PATH:$SCALA_HOME/bin
Hadoop uses Protocol Buffers for RPC, so protobuf-2.5.0.tar.gz must be downloaded and installed.
Install protobuf:
Building it needs a C and a C++ compiler, so first run: yum install gcc and yum install gcc-c++
https://protobuf.googlecode.com/files/protobuf-2.5.0.tar.gz
# tar zxvf /soft/protobuf-2.5.0.tar.gz -C /work/poa/
In the protobuf root directory:
# ./configure --prefix=/work/poa/protobuf-2.5.0
# make && make install
# vim /etc/profile
export PROTO_HOME=/work/poa/protobuf-2.5.0
export PATH=$PATH:$PROTO_HOME/bin
# source /etc/profile
# vim /etc/ld.so.conf
/work/poa/protobuf-2.5.0
# /sbin/ldconfig
Set up Hadoop:
tar zxvf /soft/hadoop-2.6.0.tar.gz -C /work/poa/
export HADOOP_HOME=/work/poa/hadoop-2.6.0
export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Create the Hadoop data directories:
rm -rf /data/hadoop/
mkdir -p /data/hadoop/{pids,storage}
mkdir -p /data/hadoop/storage/{hdfs,tmp}
mkdir -p /data/hadoop/storage/hdfs/{name,data}
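The directory commands above can be bundled into one rehearsable script. DATA_ROOT is a parameter introduced here, defaulting to a scratch path; on the real nodes set DATA_ROOT=/data/hadoop (note the rm -rf makes the real run destructive, so double-check the value first):

```shell
#!/bin/sh
# Recreate the Hadoop data layout under a configurable root.
# DATA_ROOT defaults to a scratch path for rehearsal; on a real node set
# DATA_ROOT=/data/hadoop (the rm -rf wipes it, so verify the path first).
DATA_ROOT="${DATA_ROOT:-$(mktemp -d)/hadoop}"

rm -rf "$DATA_ROOT"
for d in pids storage/tmp storage/hdfs/name storage/hdfs/data; do
    mkdir -p "$DATA_ROOT/$d"
done

ls -R "$DATA_ROOT"
```

Every path here is one that the core-site.xml and hdfs-site.xml files below reference, so a missing directory shows up now rather than at daemon startup.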
vim /work/poa/hadoop-2.6.0/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://namenode1:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/data/hadoop/storage/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>true</value>
</property>
</configuration>
vim /work/poa/hadoop-2.6.0/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>datanode2:9000</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/storage/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/storage/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
vim /work/poa/hadoop-2.6.0/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>namenode1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>namenode1:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.joblist.cache.size</name>
<value>30</value>
<description>Size of the job list cache (default 20000)</description>
</property>
</configuration>
vim /work/poa/hadoop-2.6.0/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>namenode1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>namenode1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>namenode1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>namenode1:8033</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<description>Where to aggregate logs to.</description>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>259200</value>
</property>
<property>
<name>yarn.log-aggregation.retain-check-interval-seconds</name>
<value>3600</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://namenode1:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>YARN-HA</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>namenode1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>datanode2</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>namenode1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>datanode2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>namenode1:2181,datanode2:2181,datanode3:2181</value>
</property>
<!-- add cfg 96G RAM start -->
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>90112</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>90112</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>23</value>
</property>
<!--add cfg end -->
</configuration>
vim /work/poa/hadoop-2.6.0/etc/hadoop/hadoop-env.sh and add:
export JAVA_HOME=/work/poa/jdk1.8.0_65
In log4j.properties, set:
hadoop.root.logger=WARN,DRFA
and change each appender's output level to ERROR.
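One way to make those log4j edits repeatable across three nodes is a small sed script. The property lines written below are illustrative stand-ins, not the full stock log4j.properties; LOG4J defaults to a scratch copy, and pointing it at the real /work/poa/hadoop-2.6.0/etc/hadoop/log4j.properties applies the edits:

```shell
#!/bin/sh
# Rehearse the log4j edits on a scratch copy. The properties written here
# are illustrative stand-ins; set LOG4J to the real log4j.properties path
# to apply the same substitutions for real.
LOG4J="${LOG4J:-$(mktemp)}"
cat > "$LOG4J" <<'EOF'
hadoop.root.logger=INFO,console
log4j.appender.console.threshold=INFO
log4j.appender.DRFA.Threshold=INFO
EOF

# Root logger down to WARN, appender thresholds down to ERROR.
sed -i -e 's/^hadoop.root.logger=.*/hadoop.root.logger=WARN,DRFA/' \
       -e 's/\([Tt]hreshold\)=INFO/\1=ERROR/' "$LOG4J"

cat "$LOG4J"
```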
Data node list:
vim /work/poa/hadoop-2.6.0/etc/hadoop/slaves
namenode1
datanode2
datanode3
Set up YARN HA (ResourceManager high availability)
Install a three-node ZooKeeper ensemble:
cd /work/poa/zookeeper-3.4.6/conf
sed -i "s/INFO/WARN/g" log4j.properties
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg (change only the options below; leave the other defaults untouched)
dataDir=/data/zookeeper
server.1=namenode1:2888:3888
server.2=datanode2:2888:3888
server.3=datanode3:2888:3888
In the directory pointed to by dataDir, create a myid file containing a single number that identifies the current host: whatever X appears in this host's server.X entry in conf/zoo.cfg is the number to put in myid.
Start the zookeeper service on each machine in turn: /work/poa/zookeeper-3.4.6/bin/zkServer.sh start
Check zookeeper status: /work/poa/zookeeper-3.4.6/bin/zkServer.sh status (one node should be the leader, the others followers)
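Writing each host's myid by hand is error-prone; it can instead be derived from the server.N lines. In this sketch ZK_DATA, ZK_CONF and HOST are parameters introduced for rehearsal, with a stand-in zoo.cfg matching the ensemble above; on a real node use ZK_DATA=/data/zookeeper and HOST=$(hostname):

```shell
#!/bin/sh
# Derive this host's myid from the server.N lines in zoo.cfg.
# ZK_DATA, ZK_CONF and HOST are rehearsal parameters; on a real node use
# ZK_DATA=/data/zookeeper and HOST=$(hostname).
ZK_DATA="${ZK_DATA:-$(mktemp -d)}"
ZK_CONF="${ZK_CONF:-$ZK_DATA/zoo.cfg}"
HOST="${HOST:-datanode2}"

# Stand-in zoo.cfg matching the ensemble configured above.
[ -f "$ZK_CONF" ] || cat > "$ZK_CONF" <<'EOF'
server.1=namenode1:2888:3888
server.2=datanode2:2888:3888
server.3=datanode3:2888:3888
EOF

# Extract the N from "server.N=<this host>:..." and write it to myid.
myid=$(sed -n "s/^server\.\([0-9]*\)=$HOST:.*/\1/p" "$ZK_CONF")
echo "$myid" > "$ZK_DATA/myid"
echo "myid for $HOST is $myid"   # → myid for datanode2 is 2
```

Because the id comes straight from zoo.cfg, the two can never disagree, which is the usual cause of a node refusing to join the ensemble.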
A quick Hadoop test
cd /work/poa/hadoop-2.6.0
When starting the cluster for the first time, do the following [run on the primary namenode]:
hdfs namenode -format
start-dfs.sh
Create the hadoop log directory:
hdfs dfs -mkdir -p /jobhistory/logs
start-yarn.sh
Then start the standby RM on the datanode2 node: start-yarn.sh
Check yarn state: yarn rmadmin -getServiceState rm1
mr-jobhistory-daemon.sh start historyserver
hdfs dfs -put /test.txt /
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar wordcount /test.txt /out
hdfs dfs -ls /out
hdfs dfs -cat /out/part-r-00000
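To preview the format the wordcount job writes into /out/part-r-00000 (one "word<TAB>count" line per word, sorted by key), the same count can be imitated locally with tr, awk and sort; the input file here is a made-up sample:

```shell
#!/bin/sh
# Local imitation of the wordcount example: split on whitespace, count,
# and print "word<TAB>count" sorted by key, the same shape the MapReduce
# job leaves in /out/part-r-00000.
printf 'hello hadoop\nhello spark\n' > /tmp/wc_input.txt

tr -s ' \t' '\n\n' < /tmp/wc_input.txt |
    awk 'NF { c[$1]++ } END { for (w in c) printf "%s\t%d\n", w, c[w] }' |
    sort > /tmp/wc_output.txt

cat /tmp/wc_output.txt   # → hadoop 1 / hello 2 / spark 1
```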
Configure Spark:
export SPARK_HOME=/work/poa/spark-1.6.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
Spark log output configuration:
vim spark-defaults.conf
spark.eventLog.enabled true
spark.eventLog.dir hdfs://namenode1:9000/jobhistory/logs
spark.history.fs.logDirectory hdfs://namenode1:9000/jobhistory/logs
spark.broadcast.compress true
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.driver.memory 1g
spark.default.parallelism 10
spark.yarn.executor.memoryOverhead 1024
spark.yarn.driver.memoryOverhead 512
From SPARK_HOME, run /work/poa/spark-1.6.0-bin-hadoop2.6/sbin/start-history-server.sh to keep a record of Spark job history.
# vim spark-env.sh (on every node)
export JAVA_HOME=/work/poa/jdk1.8.0_65
export HADOOP_HOME=/work/poa/hadoop-2.6.0
export SCALA_HOME=/work/poa/scala-2.10.6
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR="${YARN_CONF_DIR:-$HADOOP_YARN_HOME/conf}"
SPARK_MASTER_IP=namenode1
SPARK_LOCAL_DIRS=/work/poa/spark-1.6.0-bin-hadoop2.6
## list of worker hostnames
# vim slaves
namenode1
datanode2
datanode3
Spark jobs are submitted with ./bin/spark-submit
Test examples:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 1g --executor-cores 1 lib/spark-examples-1.6.0-hadoop2.6.0.jar 10
./bin/spark-submit --class com.yundun.datapal.ETL --master yarn --deploy-mode cluster --num-executors 3 --driver-memory 4g --executor-memory 1g --executor-cores 1 /soft/datapal-0.0.1-SNAPSHOT-jar-with-dependencies.jar
View job logs (or kill an application):
yarn logs -applicationId application_1460456359574_0002
yarn application -kill application_1461121409163_0008
Appendix (complete set of additions to /etc/profile):
export JAVA_HOME=/work/poa/jdk1.8.0_65
export CLASS_PATH=$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=$PATH:$JAVA_HOME/bin
export MAVEN_HOME=/work/poa/apache-maven-3.3.9
export PATH=$PATH:$MAVEN_HOME/bin
export SCALA_HOME=/work/poa/scala-2.10.6
export PATH=$PATH:$SCALA_HOME/bin
export HADOOP_HOME=/work/poa/hadoop-2.6.0
export HADOOP_PID_DIR=/data/hadoop/pids
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_SSH_OPTS="-p 22 -o ConnectTimeout=1 -o SendEnv=HADOOP_CONF_DIR"
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export PROTO_HOME=/work/poa/protobuf-2.5.0
export PATH=$PATH:$PROTO_HOME/bin
export SPARK_HOME=/work/poa/spark-1.6.0-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
Installing the cluster performance-monitoring tool ganglia:
ganglia depends on PHP, so install PHP first
On the master node:
Install all the components ganglia needs:
yum install rrdtool ganglia ganglia-gmetad ganglia-gmond ganglia-web httpd
vim /etc/ganglia/gmond.conf (change only the fields below; leave everything not listed untouched)
cluster {
name = "yarn"
owner = "ganglia"
}
udp_send_channel {
host = <master-node-IP>
}
vim /etc/ganglia/gmetad.conf
data_source "yarn" <master-node-IP>
gridname "yarn"
Then run:
chown -R ganglia:ganglia /var/lib/ganglia/rrds
chkconfig --levels 235 gmond on
service gmond start
chkconfig --levels 235 gmetad on
service gmetad start
chkconfig --levels 235 httpd on
service httpd start
On the slave nodes:
Install the ganglia client gmond:
yum install ganglia-gmond
Copy the master node's /etc/ganglia/gmond.conf to every slave node
chkconfig --levels 235 gmond on
service gmond start
Web UI: http://<master-node-IP>/ganglia
To feed Hadoop metrics into ganglia, edit the hadoop-metrics2.properties file on both master and slave nodes.
Problems encountered and their fixes:
1. The ganglia page shows a Permission denied error
This is an httpd setting: access has not been opened to some IPs. Edit httpd's ganglia configuration:
vi /etc/httpd/conf.d/ganglia.conf
Order deny,allow
#Deny from all
Allow from all (this is the line to change; it opens access to everyone)
Allow from 127.0.0.1
Allow from ::1
# Allow from .example.com
2. Each node's table in the ganglia web page shows "ganglia no matching metrics detected".
Cause: the per-node directories under /var/lib/ganglia/rrds are lower-case, so when a node's hostname contains upper-case letters the data cannot be found.
Fix: edit gmetad.conf and set case_sensitive_hostnames to 1.
3. To restore gmond.conf to its defaults: gmond -t > /etc/ganglia/gmond.conf