Preparation
Configure a static IP
Edit /etc/sysconfig/network-scripts/ifcfg-eth0 and add the following:
TYPE=Ethernet
HWADDR=0C:DA:41:1D:7B:BC
BOOTPROTO=static
NM_CONTROLLED=yes
NAME=eth0
UUID=c5f5e2c0-300c-4731-9312-d3a34db10a4b
DEVICE=eth0
ONBOOT=yes
IPADDR=10.1.31.222
NETMASK=255.255.255.0
GATEWAY=xxx.xxx.xxx.xxx
PREFIX=24
Restart the network service:
service network restart
Configure hosts
Edit /etc/hosts and add each server's IP and its hostname:
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.1.31.222 hadoop1
10.1.31.223 hadoop2
10.1.31.224 hadoop3
10.1.31.225 hadoop4
10.1.31.226 hadoop5
10.1.31.227 hadoop6
10.1.31.228 hadoop7
10.1.31.229 hadoop8
10.1.31.230 hadoop9
10.1.31.220 hadoop10
10.1.31.221 hadoop11
Disable the firewall
firewall-cmd --state # check the firewall status
systemctl stop firewalld.service # stop the firewall
systemctl disable firewalld.service # prevent the firewall from starting at boot
Add the hadoop user
Create a hadoop user on every server:
useradd hadoop
passwd hadoop    # enter the password (e.g. hadoop123) when prompted
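On CentOS 7 the password can also be set non-interactively (a sketch, assuming the hadoop123 password above):
echo 'hadoop123' | passwd --stdin hadoop   # --stdin reads the new password from standard input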
Configure passwordless SSH login
Before configuring Hadoop, the master must be able to SSH to every slave without a password.
-
Confirm that SSH is installed on each machine:
rpm -qa | grep ssh
-
On the master, as the hadoop user, generate a key pair in /home/hadoop; press Enter at every prompt:
ssh-keygen -t rsa
-
Append id_rsa.pub to the authorized keys:
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
-
Set permissions:
chmod 600 ~/.ssh/authorized_keys
-
Switch to root and open /etc/ssh/sshd_config (e.g. with vim); make sure the following lines exist and are uncommented (remove any leading #), then save:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
If anything was changed, restart the SSH service afterwards for it to take effect:
service sshd restart
-
On the master, copy the public key to every slave machine (run as the hadoop user):
scp ~/.ssh/id_rsa.pub hadoop@<IP|HOSTNAME>:~/
-
As the hadoop user on each slave, create the .ssh directory and set its permissions:
cd /home/hadoop
mkdir -p ~/.ssh
chmod 700 ~/.ssh
-
Append the key to authorized_keys and set its permissions:
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
-
Repeat the steps above on every slave, then verify: as the hadoop user on the master, SSH into each slave machine (this step is required); if you can log in without being asked for a password, the configuration succeeded.
-
For mutual passwordless SSH among several machines, append the contents of /home/hadoop/.ssh/id_rsa.pub from each machine to the authorized_keys file on every other machine, as sketched below.
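A minimal sketch of distributing the keys, assuming ssh-copy-id is available and the /etc/hosts entries above are in place; run it once on every node as the hadoop user:
for host in hadoop1 hadoop2 hadoop3 hadoop4 hadoop5 hadoop6 hadoop7 hadoop8 hadoop9 hadoop10 hadoop11; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@$host   # appends the local public key to that host's authorized_keys
done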
Configure JDK 8
Check whether a bundled OpenJDK is present:
rpm -qa | grep openjdk
If so, uninstall it.
Download jdk-8u261-linux-x64.tar.gz from the Oracle site, copy it to the server, and extract it:
tar -zxvf jdk-8u261-linux-x64.tar.gz
Create a directory for the JDK and move the extracted JDK directory into it. Here the target directory is /usr/lib/jvm (root privileges required):
mkdir -p /usr/lib/jvm
mv jdk1.8.0_261 /usr/lib/jvm
Edit /etc/profile and add the following lines:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_261
export PATH=$JAVA_HOME/bin:$PATH
Test whether it succeeded:
source /etc/profile
java -version
python3
Install the following packages:
yum search xxx # search for a package
yum -y install xxx # install the package
bzip2-devel-1.0.6-13.el7.x86_64.rpm
libffi-devel-3.0.13-19.el7.x86_64.rpm
ncurses-devel-5.9-14.20130511.el7_4.x86_64.rpm
openssl-devel-1.0.2k-19.el7.x86_64.rpm
readline-devel-6.2-11.el7.x86_64.rpm
sqlite-devel-3.7.17-8.el7_7.1.x86_64.rpm
tk-devel-8.5.13-6.el7.x86_64.rpm
zlib-devel-1.2.7-18.el7.x86_64.rpm
make-3.82-24.el7.x86_64.rpm
gcc-4.8.5-39.el7.x86_64.rpm
Extract the Python 3 package:
tar -zxvf Python-3.7.6.tgz -C /home/hadoop/distributed/
Go to /usr/local and create a python3 directory:
mkdir python3
Compile and install:
cd /home/hadoop/distributed/Python-3.7.6
./configure --prefix=/usr/local/python3/
make && make install
A python3 directory will now exist under /usr/local/.
Back up the original link and add symlinks for Python 3. The symlink must be /usr/bin/python3; otherwise PySpark will not find python3 later.
mv /usr/bin/python /usr/bin/python.bak
ln -s /usr/local/python3/bin/python3.7 /usr/bin/python3
ln -s /usr/local/python3/bin/pip3 /usr/bin/pip3
Update the environment configuration:
vim /etc/profile
# then add at the end of the file
export PATH=$PATH:/usr/local/python3/bin
source /etc/profile # reload the configuration after editing
Test whether it succeeded:
python3 -V
Update the yum configuration, since yum needs Python 2 to run:
vi /usr/bin/yum
Change #! /usr/bin/python to #! /usr/bin/python2
vi /usr/libexec/urlgrabber-ext-down
Change #! /usr/bin/python to #! /usr/bin/python2
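The same shebang change can be scripted (a sketch; a .bak backup of each file is kept):
sed -i.bak '1s|^#! */usr/bin/python$|#!/usr/bin/python2|' /usr/bin/yum /usr/libexec/urlgrabber-ext-down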
zookeeper
Installation
mkdir -p /home/hadoop/distributed
Extract the package:
tar -zxvf zookeeper-3.4.14.tar.gz -C /home/hadoop/distributed/
Edit the profile:
vi /etc/profile
Add the following two lines:
export ZOOKEEPER_HOME=/home/hadoop/distributed/zookeeper-3.4.14
export PATH=$PATH:$ZOOKEEPER_HOME/bin
Reload the profile:
source /etc/profile
Configuration
Create zoo.cfg in the conf directory with the following content; the key settings are dataDir, dataLogDir, and the server entries.
tickTime=5000
initLimit=10
syncLimit=5
dataDir=/home/hadoop/distributed/tmp/zookeeper/data
dataLogDir=/home/hadoop/distributed/tmp/zookeeper/logs
clientPort=2181
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
server.4=hadoop4:2888:3888
server.5=hadoop5:2888:3888
server.6=hadoop6:2888:3888
server.7=hadoop7:2888:3888
server.8=hadoop8:2888:3888
server.9=hadoop9:2888:3888
server.10=hadoop10:2888:3888
server.11=hadoop11:2888:3888
Create a myid file under dataDir (/home/hadoop/distributed/tmp/zookeeper/data) on each server; its content must match the corresponding server.N entry above, e.g. on hadoop11 the myid file contains 11, as in the sketch below.
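For example, on hadoop11 (a sketch, assuming the dataDir above):
mkdir -p /home/hadoop/distributed/tmp/zookeeper/data
echo 11 > /home/hadoop/distributed/tmp/zookeeper/data/myid   # must match server.11 in zoo.cfg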
Startup
Start each node:
zkServer.sh start
Use jps to check that it started successfully.
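zkServer.sh can also report whether each node has joined the quorum (a quick check):
zkServer.sh status   # shows Mode: leader or Mode: follower when the ensemble is healthy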
Hadoop
Installation
Extract the Hadoop package:
tar -zxvf hadoop-3.1.3.tar.gz -C /home/hadoop/distributed/
Edit the profile:
vi /etc/profile
Add the following two lines:
export HADOOP_HOME=/home/hadoop/distributed/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Verify:
source /etc/profile
hadoop version
Configuration files
Configure the files under /home/hadoop/distributed/hadoop-3.1.3/etc/hadoop/. The links below are the official default configurations, with a description of each setting and its default value (where one exists).
https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-common/core-default.xml
https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
https://hadoop.apache.org/docs/r3.1.3/hadoop-yarn/hadoop-yarn-common/yarn-default.xml
core-site.xml
The main settings are the tmp directory path and the default filesystem name (a URI that determines the filesystem's host, port, and so on).
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/distributed/tmp/hadoop</value>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181,hadoop6:2181,hadoop7:2181,hadoop8:2181,hadoop9:2181,hadoop10:2181,hadoop11:2181</value>
</property>
</configuration>
hdfs-site.xml
Configure it following the official documentation below, using QJM (Quorum Journal Manager) to build a highly available HDFS cluster:
https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/hadoop/distributed/hdfs/namenode1,file:///home/hadoop/distributed/hdfs/namenode2</value>
</property>
<property>
<name>dfs.namenode.name.dir.restore</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>
file:///home/hadoop/hdfs/disk1/data,
file:///home/hadoop/hdfs/disk2/data,
file:///home/hadoop/hdfs/disk3/data,
file:///home/hadoop/hdfs/disk4/data,
file:///home/hadoop/hdfs/disk5/data,
file:///home/hadoop/hdfs/disk6/data,
file:///home/hadoop/hdfs/disk7/data,
file:///home/hadoop/hdfs/disk8/data
</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>40960</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop1:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop2:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/distributed/hdfs/journal</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181,hadoop6:2181,hadoop7:2181,hadoop8:2181,hadoop9:2181,hadoop10:2181,hadoop11:2181</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>mycluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>nn1,nn2</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.nn1</name>
<value>hadoop1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.nn2</name>
<value>hadoop2</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address.nn1</name>
<value>hadoop1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.nn1</name>
<value>hadoop1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.nn1</name>
<value>hadoop1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.nn1</name>
<value>hadoop1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.nn1</name>
<value>hadoop1:8088</value>
</property>
<property>
<name>yarn.resourcemanager.address.nn2</name>
<value>hadoop2:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address.nn2</name>
<value>hadoop2:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address.nn2</name>
<value>hadoop2:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address.nn2</name>
<value>hadoop2:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.nn2</name>
<value>hadoop2:8088</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>20480</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1536</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>10240</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>8</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181,hadoop6:2181,hadoop7:2181,hadoop8:2181,hadoop9:2181,hadoop10:2181,hadoop11:2181</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
<property>
<name>mapreduce.jobtracker.http.address.nn1</name>
<value>hadoop1:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address.nn1</name>
<value>hadoop1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address.nn1</name>
<value>hadoop1:19888</value>
</property>
<property>
<name>mapred.job.tracker.nn1</name>
<value>http://hadoop1:9001</value>
</property>
<property>
<name>mapreduce.jobtracker.http.address.nn2</name>
<value>hadoop2:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address.nn2</name>
<value>hadoop2:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address.nn2</name>
<value>hadoop2:19888</value>
</property>
<property>
<name>mapred.job.tracker.nn2</name>
<value>http://hadoop2:9001</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/home/hadoop/distributed/hadoop-3.1.3</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/home/hadoop/distributed/hadoop-3.1.3</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/home/hadoop/distributed/hadoop-3.1.3</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1537M</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx3072M</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>800</value>
</property>
</configuration>
hadoop-env.sh
Add this line, matching the JDK path:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_261
workers
List the configured DataNode hosts:
hadoop1
hadoop2
hadoop3
hadoop4
hadoop5
hadoop6
hadoop7
hadoop8
hadoop9
hadoop10
hadoop11
After the above is done, sync the Hadoop directory to the other nodes, for example:
scp -r /home/hadoop/distributed/hadoop-3.1.3 hadoop@hadoop2:/home/hadoop/distributed/
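A sketch that copies the directory to every other node in one loop (assuming the hostnames above and the hadoop user):
for host in hadoop2 hadoop3 hadoop4 hadoop5 hadoop6 hadoop7 hadoop8 hadoop9 hadoop10 hadoop11; do
    scp -r /home/hadoop/distributed/hadoop-3.1.3 hadoop@$host:/home/hadoop/distributed/
done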
Startup
Start the components in the following order; otherwise startup may fail.
Start ZooKeeper
On each machine, go to the ZooKeeper installation directory:
./zkServer.sh start
Start the JournalNodes
Go to the Hadoop installation directory, then into sbin, and start the JournalNode (either of the following commands works):
./hadoop-daemon.sh start journalnode
hdfs --daemon start journalnode
Start the NameNode
Format it, then start the NameNode on this machine:
hdfs namenode -format
hdfs --daemon start namenode
Now the other NameNode must be synchronized; run the following command on the other NameNode node:
hdfs namenode -bootstrapStandby
Format ZKFC
hdfs zkfc -formatZK
Stop the JournalNodes (either of the following commands works):
./hadoop-daemon.sh stop journalnode
hdfs --daemon stop journalnode
Start the Hadoop cluster
Run the following in Hadoop's sbin directory to start everything:
./start-all.sh
Errors and solutions
-
After the NameNode is formatted more than once, the following error appears: java.io.IOException: Incompatible clusterIDs
java.io.IOException: Incompatible clusterIDs in /home/hadoop/hdfs/disk1/data: namenode clusterID = CID-1b1a99a2-42e1-4387-b62c-0ab499d8c92f; datanode clusterID = CID-25281d17-c391-46f4-879a-2ca8c3b44c1e
Cause: each namenode format creates a new namenode clusterID, while the DataNode's data directory still holds the clusterID from the previous format, so the two no longer match.
Solution: stop the cluster, delete everything under the DataNode's dfs.datanode.data.dir directories, and reformat the NameNode.
-
namenode1 starts, but namenode2 does not; the log shows the NameNode is not formatted, and formatting it does not help.
2020-08-12 14:32:18,689 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.io.IOException: NameNode is not formatted.
Cause: in an HA cluster, the standby NameNode must be bootstrapped manually on first startup.
Solution: run the following command
hdfs namenode -bootstrapStandby
The official HDFS manual describes the command as follows: it bootstraps the standby NameNode's storage directories by copying the latest namespace snapshot from the active NameNode, and is used when first configuring an HA cluster.
Allows the standby NameNode’s storage directories to be bootstrapped by copying the latest namespace snapshot from the active NameNode. This is used when first configuring an HA cluster.
-
If you hit the following problem:
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
add the following at the top of start-dfs.sh and stop-dfs.sh (both under sbin):
HDFS_DATANODE_USER=root
HADOOP_SECURE_DN_USER=hdfs
HDFS_NAMENODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
and add the following at the top of start-yarn.sh and stop-yarn.sh (both under sbin):
YARN_RESOURCEMANAGER_USER=root
HADOOP_SECURE_DN_USER=yarn
YARN_NODEMANAGER_USER=root
Hbase
Installation
Extract the HBase package:
tar -zxvf hbase-2.2.5-bin.tar.gz -C /home/hadoop/distributed/
Edit the profile:
vi /etc/profile
Add the following two lines:
export HBASE_HOME=/home/hadoop/distributed/hbase-2.2.5
export PATH=$PATH:$HBASE_HOME/bin
Verify:
source /etc/profile
Configuration files
There are four configuration files to modify under /home/hadoop/distributed/hbase-2.2.5/conf/.
hbase-env.sh
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_261
export HBASE_CLASSPATH=/home/hadoop/distributed/hbase-2.2.5/conf
export HBASE_MANAGES_ZK=false # false means an external ZooKeeper is used
hbase-site.xml
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.unsafe.stream.capability.enforce</name>
<value>false</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop1,hadoop2,hadoop3,hadoop4,hadoop5,hadoop6,hadoop7,hadoop8,hadoop9,hadoop10,hadoop11</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/distributed/tmp/zookeeper/data</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://mycluster/hbase</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>240000</value>
</property>
<property>
<name>hbase.regionserver.restart.on.zk.expire</name>
<value>true</value>
</property>
<!-- Number of RPC handler instances; more handlers give higher throughput. Start tuning at 2x the number of CPU cores. -->
<property>
<name>hbase.regionserver.handler.count</name>
<value>32</value>
</property>
<!-- Memory used by the client write buffer; lower it if it is too large. -->
<property>
<name>hbase.client.write.buffer</name>
<value>3145728</value>
</property>
<!-- If reads far outnumber writes, memstore.size can be raised to 0.65. -->
<property>
<name>hbase.regionserver.global.memstore.size</name>
<value>0.45</value>
</property>
<property>
<name>hbase.regionserver.global.memstore.size.lower.limit</name>
<value>0.9</value>
</property>
<!-- If writes grow too fast, the blocking mechanism is easily triggered; increase the following two settings. -->
<property>
<name>hbase.hregion.memstore.flush.size</name>
<value>134217728</value>
</property>
<property>
<name>hbase.hregion.memstore.block.multiplier</name>
<value>4</value>
</property>
<!-- Interval between automatic memstore flushes; set to 0 to disable automatic flushing. -->
<property>
<name>hbase.regionserver.optionalcacheflushinterval</name>
<value>3600000</value>
</property>
<property>
<name>hfile.block.cache.size</name>
<value>0.3</value>
</property>
<property>
<name>hbase.hregion.max.filesize</name>
<value>10737418240</value>
</property>
<property>
<name>hbase.coprocessor.region.classes</name>
<value>
com.chpi.hbase.endpoint.server.TrendDataEndPoint,
com.chpi.hbase.endpoint.server.InterpDataEndPoint,
com.chpi.hbase.endpoint.server.TimeSlotDataEndPoint
</value>
</property>
</configuration>
regionservers
This file originally contains localhost; replace it with the configured DataNode hosts:
hadoop1
hadoop2
hadoop3
hadoop4
hadoop5
hadoop6
hadoop7
hadoop8
hadoop9
hadoop10
hadoop11
backup-masters
Create a new file named backup-masters in the conf directory and add the hostname of the node to be used as the backup master:
hadoop2
Copy configuration files
Copy hdfs-site.xml and core-site.xml from Hadoop's etc/hadoop directory into HBase's conf directory, for example:
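A sketch, assuming the installation paths used above:
cp /home/hadoop/distributed/hadoop-3.1.3/etc/hadoop/core-site.xml /home/hadoop/distributed/hbase-2.2.5/conf/
cp /home/hadoop/distributed/hadoop-3.1.3/etc/hadoop/hdfs-site.xml /home/hadoop/distributed/hbase-2.2.5/conf/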
Startup
After the above is done, sync the HBase directory to the other nodes.
Start ZooKeeper and Hadoop first, then start HBase:
start-hbase.sh
Open the HBase web UI on the HMaster's port 16010 to verify that startup succeeded.
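The cluster state can also be checked from the HBase shell (a quick, non-interactive sketch):
echo "status" | hbase shell   # prints the active/backup masters and the number of live region servers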
Errors and solutions
-
log4j conflict
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/distributed/hadoop-3.1.3/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/distributed/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Cause: the two slf4j-log4j12 jars conflict.
Solution: remove one of them, e.g. rename HBase's copy:
mv /home/hadoop/distributed/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar /home/hadoop/distributed/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar.bak
-
The RegionServer fails to be constructed because the nameservice mycluster cannot be resolved.
2020-08-15 09:28:12,240 ERROR [main] regionserver.HRegionServer: Failed construction RegionServer java.lang.IllegalArgumentException: java.net.UnknownHostException: mycluster ... Caused by: java.net.UnknownHostException: mycluster
Cause: HBase cannot resolve the HDFS nameservice mycluster.
Solution: symlink or copy the Hadoop configuration files into HBase.
Zookeeper + Hadoop + Hbase + ThriftServer startup
Zookeeper startup
On each machine:
zkServer.sh start
Hadoop startup
On each machine:
hdfs --daemon start datanode
hdfs --daemon start namenode
On the master node:
./start-all.sh
Hbase startup
./start-hbase.sh
redis
Installation
First set up the build environment by installing the following packages. They depend on each other, so some can only be installed after others are in place.
yum search
yum -y install
mpfr-3.1.1-4.el7.x86_64.rpm
libmpc-1.0.1-3.el7.x86_64.rpm
libstdc++-devel-4.8.5-39.el7.x86_64.rpm
glibc-devel-2.17-307.el7.1.x86_64.rpm
glibc-headers-2.17-307.el7.1.x86_64.rpm
kernel-headers-3.10.0-1127.el7.x86_64.rpm
libgomp-4.8.5-39.el7.x86_64.rpm
cpp-4.8.5-39.el7.x86_64.rpm
gcc-4.8.5-39.el7.x86_64.rpm
gcc-c++-4.8.5-39.el7.x86_64.rpm
Extract the Redis package:
tar -zxvf redis-5.0.8.tar.gz -C /home/hadoop/distributed/
Compile and install:
cd distributed/redis-5.0.8/
make
make install PREFIX=/home/hadoop/distributed/redis
Edit the profile:
vi /etc/profile
Add:
export PATH=$PATH:/home/hadoop/distributed/redis/bin
Reload the profile:
source /etc/profile
Configuration
Three directories are involved:
- /home/hadoop/distributed/redis-5.0.8/ is the source directory
- /home/hadoop/distributed/redis is the Redis root (install) directory
- /home/hadoop/distributed/redis-clusters holds the cluster configuration; create the folders 6379, 6380, and 6381 inside it
Copy redis.conf from the Redis source directory into the 6379 folder:
cp redis.conf /home/hadoop/distributed/redis-clusters/6379
Change the following settings in the configuration file. Note: the # comments below are explanatory and must not appear in the actual file.
bind 10.1.31.222 # bind to the network interface so other hosts can reach this machine
port 6379 # node port
daemonize yes # run Redis in the background
pidfile /var/run/redis_6379.pid # pid file location
dir /home/hadoop/distributed/redis-clusters/6379 # data file location
logfile /home/hadoop/distributed/redis-clusters/6379/redis.log # log file location
appendonly yes # AOF persistence mode
cluster-enabled yes # enable cluster mode
cluster-config-file nodes_800*.conf # cluster node configuration file
cluster-node-timeout 5000 # node request timeout
protected-mode no # disable protected mode
A password can be added later:
requirepass xxx # Redis access password
masterauth xxx # password for access between cluster nodes; must match the one above
Copy the configuration file into the 6380 and 6381 folders created earlier under /home/hadoop/distributed/redis-clusters, and change every occurrence of 6379 in each copy to 6380 and 6381 respectively, as sketched below.
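A sketch of the copy-and-rewrite step, assuming the directory layout above:
for p in 6380 6381; do
    cp /home/hadoop/distributed/redis-clusters/6379/redis.conf /home/hadoop/distributed/redis-clusters/$p/
    sed -i "s/6379/$p/g" /home/hadoop/distributed/redis-clusters/$p/redis.conf   # rewrites port, pidfile, dir, and logfile
done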
Copy Redis to the other servers.
Startup
Start a single node:
./redis-server /home/hadoop/distributed/redis-clusters/6379/redis.conf
Start the three nodes:
./redis-server /home/hadoop/distributed/redis-clusters/6379/redis.conf
./redis-server /home/hadoop/distributed/redis-clusters/6380/redis.conf
./redis-server /home/hadoop/distributed/redis-clusters/6381/redis.conf
Create the cluster with all the nodes; --cluster-replicas is the number of replicas per master:
redis-cli --cluster create 10.1.31.222:6379 10.1.31.222:6380 10.1.31.222:6381 10.1.31.223:6379 10.1.31.223:6380 10.1.31.223:6381 10.1.31.224:6379 10.1.31.224:6380 10.1.31.224:6381 10.1.31.225:6379 10.1.31.225:6380 10.1.31.225:6381 10.1.31.226:6379 10.1.31.226:6380 10.1.31.226:6381 10.1.31.227:6379 10.1.31.227:6380 10.1.31.227:6381 10.1.31.228:6379 10.1.31.228:6380 10.1.31.228:6381 10.1.31.229:6379 10.1.31.229:6380 10.1.31.229:6381 10.1.31.230:6379 10.1.31.230:6380 10.1.31.230:6381 10.1.31.220:6379 10.1.31.220:6380 10.1.31.220:6381 10.1.31.221:6379 10.1.31.221:6380 10.1.31.221:6381 --cluster-replicas 1
./redis-cli --cluster create 192.168.190.11:6379 192.168.190.11:6380 192.168.190.11:6381 192.168.190.12:6379 192.168.190.12:6380 192.168.190.12:6381 192.168.190.13:6379 192.168.190.13:6380 192.168.190.13:6381 --cluster-replicas 1
Check the cluster status; -c enables cluster mode, -h sets the IP address, -p the port, and -a the server password:
redis-cli -c -h 10.1.31.222 -p 6379 cluster info
redis-cli -c -h 10.1.31.222 -p 6379 cluster nodes
The cluster must be shut down node by node:
redis-cli -c -h 10.1.31.222 -p 6379 shutdown
To shut everything down, you can create a shell script shutdown.sh, e.g. for one machine:
redis-cli -c -h 10.1.31.222 -p 6379 shutdown
redis-cli -c -h 10.1.31.222 -p 6380 shutdown
redis-cli -c -h 10.1.31.222 -p 6381 shutdown
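A sketch of a shutdown.sh that covers every node in the cluster, assuming the hosts and ports above and no password set:
for host in 10.1.31.222 10.1.31.223 10.1.31.224 10.1.31.225 10.1.31.226 10.1.31.227 10.1.31.228 10.1.31.229 10.1.31.230 10.1.31.220 10.1.31.221; do
    for port in 6379 6380 6381; do
        redis-cli -c -h $host -p $port shutdown   # add -a <password> if requirepass is set
    done
done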
Errors and solutions
-
Building a cluster from only the three nodes on the first server fails.
Cause: a Redis cluster requires at least 3 master nodes, and with --cluster-replicas set to 1 each master also needs a slave, so at least six nodes are required.
Solution: add enough nodes to satisfy the minimum of 3 masters plus the configured replica ratio.
-
When creating the cluster with redis-cli --cluster create, it hangs at "Waiting for the cluster to join".
Cause: the nodes cannot communicate, so the cluster cannot form. One possibility is an occupied port: if a node is configured on port 6379, port 16379 must also be free for the cluster bus. Since our firewall is already disabled, it is enough to ensure each Redis node can start on its bound port. Another possibility is that bind is set to 0.0.0.0 instead of a fixed IP.
Solution: change bind in the configuration file to the corresponding static IP.
kafka
Installation
Extract the Kafka package:
tar -zxvf kafka_2.12-2.5.0.tgz -C /home/hadoop/distributed/
Configuration
Create the logs directory /home/hadoop/distributed/tmp/kafka-logs; it must match log.dirs in the configuration file.
server.properties
Edit the /kafka_2.12-2.5.0/config/server.properties file.
############################# Server Basics #############################
# Unique broker id within the Kafka cluster; increment it by 1 for each additional broker
broker.id=0
############################# Socket Server Settings #############################
# Address the socket server listens on. If not configured (i.e. listeners=PLAINTEXT://:9092), it uses the value returned by java.net.InetAddress.getCanonicalHostName()
listeners=PLAINTEXT://10.1.31.222:9095
port=9095
# Number of threads the server uses to receive requests from and send responses to the network
num.network.threads=3
# Number of threads the server uses to process requests, which may include disk I/O
num.io.threads=8
# Send buffer, default 100 KB
socket.send.buffer.bytes=102400
# Receive buffer, default 100 KB
socket.receive.buffer.bytes=102400
# Maximum request size, default 100 MB
socket.request.max.bytes=104857600
############################# Log Basics #############################
# Log directories
log.dirs=/home/hadoop/distributed/tmp/kafka-logs
# Default number of partitions per topic. The partition count is usually given when a topic is created; this default applies otherwise
num.partitions=3
# Number of threads per data directory used for log recovery at startup and flushing at shutdown, default 1
num.recovery.threads.per.data.dir=1
# Replication factor for the offsets topic; a higher value is recommended for better availability
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=1
transaction.state.log.min.isr=1
############################# Log Retention Policy #############################
# Log retention time, default 7 days
log.retention.hours=168
message.max.byte=5242880
default.replication.factor=1
replica.fetch.max.bytes=5242880
# Log segment size. The log of each topic partition is split into segments; when a segment reaches this size a new log file is created. Default 1 GB
log.segment.bytes=1073741824
# Interval for checking log expiration, default 5 minutes
log.retention.check.interval.ms=300000
log.cleaner.enable=false
############################# Zookeeper #############################
zookeeper.connect=hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181,hadoop6:2181,hadoop7:2181,hadoop8:2181,hadoop9:2181,hadoop10:2181,hadoop11:2181
# ZooKeeper connection timeout, set here to 18000 ms (18 s)
zookeeper.connection.timeout.ms=18000
############################# Group Coordinator Settings #############################
# Set to 3000 in production
group.initial.rebalance.delay.ms=0
Copy the /home/hadoop/distributed/kafka_2.12-2.5.0 directory to the other servers and change the following in server.properties on each of them (see the sketch after these lines):
broker.id=1
listeners=PLAINTEXT://10.1.31.223:9095
port=9095
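A sketch of the per-broker adjustment, assuming broker ids 0-10 map to hadoop1-hadoop11 and the IPs listed in /etc/hosts; on hadoop2 (broker id 1, IP 10.1.31.223), for example:
sed -i 's/^broker.id=.*/broker.id=1/' /home/hadoop/distributed/kafka_2.12-2.5.0/config/server.properties
sed -i 's|^listeners=.*|listeners=PLAINTEXT://10.1.31.223:9095|' /home/hadoop/distributed/kafka_2.12-2.5.0/config/server.properties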
Startup
Start Kafka on every server:
# start in the foreground
./bin/kafka-server-start.sh ./config/server.properties
# start in the background
./bin/kafka-server-start.sh -daemon ./config/server.properties &
# stop
./kafka-server-stop.sh
Verification
Use jps to check that a Kafka process is running.
Create a topic:
kafka-topics.sh --create --zookeeper hadoop1:2181,hadoop2:2181,hadoop3:2181 --replication-factor 3 --partitions 3 --topic TestTopic
Create a consumer:
kafka-console-consumer.sh --bootstrap-server hadoop1:9095 --topic TestTopic --from-beginning
Create a producer:
kafka-console-producer.sh --broker-list hadoop1:9095 --topic TestTopic
If messages sent by the producer show up at the consumer, the Kafka configuration is verified.
spark
Installation
Extract the Spark package:
tar -zxvf spark-2.4.6-bin-without-hadoop.tgz -C /home/hadoop/distributed/
Edit the profile:
vi /etc/profile
Add the following two lines:
export SPARK_HOME=/home/hadoop/distributed/spark-2.4.6-bin-without-hadoop
export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
Reload the profile:
source /etc/profile
Configuration
Copy conf/spark-env.sh.template to spark-env.sh.
spark-env.sh
Add the following settings:
export SPARK_DIST_CLASSPATH=$(/home/hadoop/distributed/hadoop-3.1.3/bin/hadoop classpath)
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_261
export SCALA_HOME=/home/hadoop/distributed/scala-2.12.12
export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
export PYSPARK_PYTHON=python3
export HADOOP_HOME=/home/hadoop/distributed/hadoop-3.1.3
export HADOOP_CONF_DIR=/home/hadoop/distributed/hadoop-3.1.3/etc/hadoop
export SPARK_HOME=/home/hadoop/distributed/spark-2.4.6-bin-without-hadoop
export SPARK_MASTER_IP=hadoop1
export SPARK_MASTER_HOST=hadoop1
export SPARK_LOCAL_IP=10.1.31.222
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=1
export SPARK_DAEMON_JAVA_OPTS="
-Dspark.deploy.recoveryMode=ZOOKEEPER
-Dspark.deploy.zookeeper.url=hadoop1:2181,hadoop2:2181,hadoop3:2181,hadoop4:2181,hadoop5:2181,hadoop6:2181,hadoop7:2181,hadoop8:2181,hadoop9:2181,hadoop10:2181,hadoop11:2181
-Dspark.deploy.zookeeper.dir=/spark"
export SPARK_LOG_DIR=/home/hadoop/distributed/spark-2.4.6-bin-without-hadoop/logs
slaves
Create slaves under conf and add the following:
hadoop2
hadoop3
hadoop4
hadoop5
hadoop6
hadoop7
hadoop8
hadoop9
hadoop10
hadoop11
Startup
After the above is done, sync the Spark directory to the other nodes.
Start ZooKeeper and Hadoop first, then start Spark:
sbin/start-all.sh
Check the web UI on the master's port 8080.
Stop Spark:
sbin/stop-all.sh
Launch Spark on YARN:
spark-shell --master yarn --deploy-mode client
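A quick end-to-end check on YARN (a sketch; the examples jar ships with the Spark distribution, and the exact jar name depends on the build):
spark-submit --master yarn --deploy-mode client --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.6.jar 100   # computes an approximation of pi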