Installing hadoop+zk+flume+kafka+mysql+hive+redis+es+Rmq+hbase+spark+storm+azkaban on a Linux system
- Install ZooKeeper
(1) Unpack the ZooKeeper tarball (in /opt)
#tar -zxvf zookeeper-3.4.7.tar.gz
(2) Move the extracted directory to /usr/house/zookeeper
mv zookeeper-3.4.7 /usr/house/zookeeper
(3) Create the data directories under the install directory (this is where ZooKeeper stores its data)
#mkdir -p /usr/house/zookeeper/zoo1/data [same on the other two machines]
#mkdir -p /usr/house/zookeeper/zoo2/data
#mkdir -p /usr/house/zookeeper/zoo3/data
(4) Create the myid marker files
echo '1' > /usr/house/zookeeper/zoo1/data/myid [same on the other two machines]
echo '2' > /usr/house/zookeeper/zoo2/data/myid
echo '3' > /usr/house/zookeeper/zoo3/data/myid
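Steps (3) and (4) can be collapsed into a single loop. A minimal sketch: BASE here is a throwaway directory so the loop can be tried safely, whereas on the real nodes it would be /usr/house/zookeeper.

```shell
# Create the per-instance data directories and myid marker files in one pass.
# BASE is a scratch directory for demonstration only; on the cluster nodes it
# would be /usr/house/zookeeper.
BASE="$(mktemp -d)"
for i in 1 2 3; do
  mkdir -p "$BASE/zoo$i/data"
  printf '%s\n' "$i" > "$BASE/zoo$i/data/myid"
done
cat "$BASE/zoo2/data/myid"   # prints 2
```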
(5) Create and configure zoo.cfg (under conf/ in the install directory)
vi zoo.cfg
Path: /usr/house/zookeeper/zookeeper-3.4.7/conf
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/usr/house/zookeeper/zoo1/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.127.21:2888:3888
server.2=192.168.127.22:2888:3888
server.3=192.168.127.23:2888:3888
tickTime=2000        heartbeat interval, in milliseconds
initLimit=10         number of ticks a peer may take to connect and complete its initial sync with the cluster; beyond this the server is considered to have failed to start
syncLimit=5          request/response timeout between peers and the leader, as a multiple of tickTime
dataDir=/usr/house/zookeeper/zoo1/data   where ZooKeeper stores its data
clientPort=2181      client connection port
server.1=192.168.127.21:2888:3888        each server's IP address plus its quorum and leader-election ports
server.2=192.168.127.22:2888:3888
server.3=192.168.127.23:2888:3888
(6) Copy the zookeeper directory from machine 1 to machines 2 and 3
scp -r /usr/house/zookeeper/zookeeper-3.4.7/ hadoop03.icccuat:/usr/house/zookeeper
(7) Edit the cfg file on machines 2 and 3 (change dataDir to each machine's own absolute path)
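Step (7) is a one-line edit per machine, so it can be scripted with sed. A sketch, demonstrated on a scratch copy of the file; on a real node you would run the sed command against /usr/house/zookeeper/zookeeper-3.4.7/conf/zoo.cfg, and the node number 2 below is just an example.

```shell
# Rewrite dataDir to point at this node's own data directory.
# CFG is a scratch stand-in for conf/zoo.cfg.
N=2                       # this node's number (example value)
CFG="$(mktemp)"
echo 'dataDir=/usr/house/zookeeper/zoo1/data' > "$CFG"
sed -i "s|^dataDir=.*|dataDir=/usr/house/zookeeper/zoo$N/data|" "$CFG"
cat "$CFG"   # prints dataDir=/usr/house/zookeeper/zoo2/data
```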
(8) Configure the ZooKeeper environment (use your own install path)
vi /etc/profile
source /etc/profile
export ZOOKEEPER_INSTALL=/usr/house/zookeeper/zookeeper-3.4.7
export PATH=$PATH:$ZOOKEEPER_INSTALL/bin
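The two exports can be appended and applied in one go. A sketch against a scratch file standing in for /etc/profile, so it is safe to try anywhere:

```shell
# Append the ZooKeeper environment variables and apply them to the current
# shell. PROFILE is a scratch file here; on a real machine use /etc/profile.
PROFILE="$(mktemp)"
cat >> "$PROFILE" <<'EOF'
export ZOOKEEPER_INSTALL=/usr/house/zookeeper/zookeeper-3.4.7
export PATH=$PATH:$ZOOKEEPER_INSTALL/bin
EOF
. "$PROFILE"
echo "$ZOOKEEPER_INSTALL"   # prints /usr/house/zookeeper/zookeeper-3.4.7
```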
(9) Verify the installation (start ZooKeeper)
zkServer.sh start
Run # zkServer.sh start on each of the three machines
Check each machine's ZooKeeper status: # zkServer.sh status
Expected result: 1 leader and 2 followers
ZooKeeper log location: under zoo1/2/3 in the install directory
(10) Change where ZooKeeper writes its log output
- Edit zkEnv.sh under the install directory's bin/: ZOO_LOG_DIR sets the output path, and ZOO_LOG4J_PROP selects the INFO,ROLLINGFILE log appender
- Edit $ZOOKEEPER_HOME/conf/log4j.properties: set zookeeper.root.logger to the same value as ZOO_LOG4J_PROP in the previous file. This configuration rotates the log by file size; to rotate daily instead, change the appender to DailyRollingFileAppender
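The zkEnv.sh edit in step (10) can also be scripted. A sketch run against a scratch file containing just the two relevant lines; on a real node you would apply the sed command to $ZOOKEEPER_INSTALL/bin/zkEnv.sh, and /var/log/zookeeper is only an example target directory.

```shell
# Point ZOO_LOG_DIR at a persistent location and switch the log4j appender
# to ROLLINGFILE. F is a scratch stand-in for bin/zkEnv.sh.
F="$(mktemp)"
cat > "$F" <<'EOF'
ZOO_LOG_DIR="."
ZOO_LOG4J_PROP="INFO,CONSOLE"
EOF
sed -i \
  -e 's|^ZOO_LOG_DIR=.*|ZOO_LOG_DIR="/var/log/zookeeper"|' \
  -e 's|^ZOO_LOG4J_PROP=.*|ZOO_LOG4J_PROP="INFO,ROLLINGFILE"|' "$F"
cat "$F"
```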
2. Install Hadoop HA
(1) Unpack the Hadoop tarball (in /opt)
tar -zxvf hadoop-2.8.2.tar.gz
(2) Move the extracted directory to /usr/house/hadoop
mv hadoop-2.8.2 /usr/house/hadoop
(3) Edit the Hadoop configuration files
Path: /usr/house/hadoop/hadoop-2.8.2/etc/hadoop
a. core-site.xml (common properties)
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://192.168.127.21:8020</value>
</property>
<property>
<name>fs.trash.checkpoint.interval</name>
<value>0</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop01.icccuat:2181,hadoop02.icccuat:2181,hadoop03.icccuat:2181</value>
</property>
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>2000</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.GzipCodec,
org.apache.hadoop.io.compress.DefaultCodec,
org.apache.hadoop.io.compress.BZip2Codec,
org.apache.hadoop.io.compress.SnappyCodec
</value>
</property>
</configuration>
<property>
<!-- The HDFS URI that clients use. This can be a single host:port, or the logical name of an HA nameservice that groups multiple namenodes -->
<name>fs.defaultFS</name>
<value>hdfs://192.168.127.21:8020</value>
</property>
<property>
<!-- Base directory for Hadoop's local temporary data -->
<name>hadoop.tmp.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/tmp</value>
</property>
<property>
<!-- ZooKeeper quorum used for HA -->
<name>ha.zookeeper.quorum</name>
<value>hadoop01.icccuat:2181,hadoop02.icccuat:2181,hadoop03.icccuat:2181</value>
</property>
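Before moving on, it is worth a quick sanity check that core-site.xml contains the properties this guide sets. A sketch using grep, demonstrated on a minimal inline sample so it can be run anywhere; on a real node, point F at etc/hadoop/core-site.xml instead.

```shell
# Check that the expected property names appear in core-site.xml.
# F is a minimal inline sample here; use the real file on a cluster node.
F="$(mktemp)"
cat > "$F" <<'EOF'
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://192.168.127.21:8020</value></property>
  <property><name>hadoop.tmp.dir</name><value>/usr/house/hadoop/hadoop-2.8.2/tmp</value></property>
  <property><name>ha.zookeeper.quorum</name><value>hadoop01.icccuat:2181,hadoop02.icccuat:2181,hadoop03.icccuat:2181</value></property>
</configuration>
EOF
for p in fs.defaultFS hadoop.tmp.dir ha.zookeeper.quorum; do
  grep -q "<name>$p</name>" "$F" && echo "$p present"
done
```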
b. Configure hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.permissions.superusergroup</name>
<value>root</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/data/dfs/name</value>
</property>
<property>
<name>dfs.namenode.edits.dir</name>
<value>${dfs.namenode.name.dir}</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/data/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop01.icccuat:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop02.icccuat:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop01.icccuat:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop02.icccuat:50070</value>
</property>
<property>
<name>dfs.journalnode.http-address</name>
<value>0.0.0.0:8480</value>
</property>
<property>
<name>dfs.journalnode.rpc-address</name>
<value>0.0.0.0:8485</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop01.icccuat:8485;hadoop02.icccuat:8485;hadoop03.icccuat:8485/mycluster</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/data/dfs/jd</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.hosts</name>
<value>/usr/house/hadoop/hadoop-2.8.2/etc/hadoop/slaves</value>
</property>
</configuration>
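dfs.blocksize above is given in bytes; a quick check that 268435456 is the usual 256 MB:

```shell
# block size in bytes divided down to megabytes
echo $((268435456 / 1024 / 1024))   # prints 256
```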
<property>
<!-- Where the namenode stores HDFS metadata -->
<name>dfs.namenode.name.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/data/dfs/name</value>
</property>
<property>
<name>dfs.namenode.edits.dir</name>
<value>${dfs.namenode.name.dir}</value>
</property>
<property>
<!-- Local path where datanodes store block data -->
<name>dfs.datanode.data.dir</name>
<value>/usr/house/hadoop/hadoop-2.8.2/data/dfs/data</value>
</property>
<property>
<!-- HDFS replication factor -->
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<!-- HDFS block size, in bytes -->
<name>dfs.blocksize</name>
<value>268435456</value>
</property>
<property>
<!-- Logical name of the HA nameservice -->
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>hadoop01.icccuat:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>hadoop02.icccuat:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>hadoop01.icccuat:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>hadoop02.icccuat:50070</value>
</property>
<property>
<name>dfs.journalnode.http-address</name>
<