Hadoop HA (High Availability) Big Data Beginner Tutorial

Table of Contents

I. Prerequisite Environment Setup

1. Configure the network

1-1: In the Virtual Network Editor, set the subnet (mine uses 200, pick whatever you like) and make sure the host virtual adapter is connected to this network.

1-2: vi /etc/sysconfig/network-scripts/ifcfg-ens33 (on all three nodes)

2: Change the hostname with hostnamectl set-hostname master1-1, then run bash to refresh (on all three nodes)

3: Edit the host mappings: vi /etc/hosts (on all three nodes)

4: Disable the firewall (on all three nodes)

5: Passwordless SSH login

6: Extract the installation archives: tar -zxvf <archive> -C /usr/local/src

7: vi /etc/profile to set the environment variables (on all three nodes)

II. ZooKeeper Configuration

1. Configure myid

1-1. Create a data directory under zk, then vi myid inside it and write 1

2. Edit the ZooKeeper configuration file

3. Copy the files to the other nodes; remember to change myid after copying

4. Start ZooKeeper and check its status (on all three nodes)

HADOOP Configuration

1. Edit the configuration files

1-1: vi slaves

1-2: vi hadoop-env.sh

1-3: vi core-site.xml

1-4: vi hdfs-site.xml

1-5: vi yarn-site.xml

1-6-2: vi mapred-site.xml

2: Copy the configuration to the other two nodes

3. Start the JournalNodes, register with ZooKeeper, and format the NameNode

3-1: Start the JournalNodes

4. Start the cluster and the standby node

4-1: Start the cluster from the master node

4-2: Start the ResourceManager and NameNode separately on the standby node

4-3: Check the process counts

HBASE Configuration

1. Edit the configuration files

1-1: vi regionservers

1-2: vi hbase-env.sh

1-3: vi hbase-site.xml

2. Copy the configuration to the other nodes

3. Start the HBase cluster and check the processes

4. Verify the processes and the web UI

HIVE Configuration (master node only)

1. Install the 4 rpm packages (in order)

1-1: rpm -ivh --nodeps mysql-community-common-5.7.18-1.el7.x86_64.rpm will fail with an error

2. Restart the mysql service, look up the initial mysql password, and log in to mysql

3. Change the mysql password, grant privileges, and flush privileges

4. Copy the MySQL JDBC driver into Hive's lib directory

5. Edit Hive's configuration files

5-1: vi hive-env.sh

5-2: vi hive-site.xml (a few search tricks help here)

5-2-1: Search for "URL" to locate the first property

5-2-2: Copy the javax.jdo.option. prefix from the previous property and search for it to find the remaining 3

5-2-3: Same approach

5-2-4: Same approach

5-2-5: This one you need to remember: search for querylog.location to find the first of the path properties

5-2-6: Same approach

5-2-7: Same approach

5-2-8: Same approach

6. Initialize the Hive metastore: schematool -initSchema -dbType mysql

KAFKA Configuration

1. vi /usr/local/src/kafka/config/server.properties

2. Start the Kafka cluster (on all three nodes)

3. Check the processes (a Kafka process on each node means it is correct)

SPARK Configuration (including Scala)

1. Edit the configuration files

1-2: vi slaves (delete localhost)

1-3: vi spark-env.sh

2. Copy the configuration files

3. Start the Spark cluster

4. Verify the processes and the web UI

5. Scala only needs one environment variable and it is ready to use


I. Prerequisite Environment Setup

1. Configure the network

1-1: In the Virtual Network Editor, set the subnet (mine uses 200, pick whatever you like) and make sure the host virtual adapter is connected to this network.

1-2: vi /etc/sysconfig/network-scripts/ifcfg-ens33 (on all three nodes)

Set the following values, and make sure to change BOOTPROTO=static:

IPADDR=192.168.200.10

GATEWAY=192.168.200.2

NETMASK=255.255.255.0          

DNS1=8.8.8.8

systemctl restart network

Restart the network for the changes to take effect.
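A quick way to confirm the static IP took effect (a minimal check; the addresses are the master1-1 values from the example above):

ip addr show ens33          # should show inet 192.168.200.10/24
ping -c 3 192.168.200.2     # the gateway should reply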

2: Change the hostname with hostnamectl set-hostname master1-1 (each node gets its own name), then run bash to refresh the prompt (on all three nodes)

3: Edit the host mappings: vi /etc/hosts (on all three nodes)

192.168.200.10 master1-1

192.168.200.20 slave1-1

192.168.200.30 slave1-2

4: Disable the firewall (on all three nodes)

systemctl stop firewalld

systemctl disable firewalld

5: Passwordless SSH login

ssh-keygen -t rsa

Press Enter three times to accept the defaults.

ssh-copy-id master1-1

ssh-copy-id slave1-1

ssh-copy-id slave1-2
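To confirm that passwordless login works, each of the following should print the remote hostname without asking for a password (run from master1-1; repeat from the other nodes if they also need key-based access):

ssh master1-1 hostname
ssh slave1-1 hostname
ssh slave1-2 hostname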

6: Extract the installation archives: tar -zxvf <archive-to-extract> -C /usr/local/src

After extracting, rename each directory to a short name so the environment variables are easier to write.
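For example (a sketch only; the extracted directory names below are hypothetical and depend on the exact archive versions you use):

cd /usr/local/src
mv jdk1.8.0_*   java       # hypothetical extracted name
mv hadoop-2.7.* hadoop     # hypothetical extracted name
mv zookeeper-*  zk         # hypothetical extracted name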

7: vi /etc/profile to set the environment variables (on all three nodes)

export JAVA_HOME=/usr/local/src/java

export HADOOP_HOME=/usr/local/src/hadoop

export ZOOKEEPER_HOME=/usr/local/src/zk

export HBASE_HOME=/usr/local/src/hbase

export KAFKA_HOME=/usr/local/src/kafka

export HIVE_HOME=/usr/local/src/hive

export STORM_HOME=/usr/local/src/storm

export SCALA_HOME=/usr/local/src/scala

export SPARK_HOME=/usr/local/src/spark



export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin:$HBASE_HOME/bin:$KAFKA_HOME/bin:$HIVE_HOME/bin:$STORM_HOME/bin:$SCALA_HOME/bin:$SPARK_HOME/bin:$SPARK_HOME/sbin

source /etc/profile

It is worth running java -version on all three nodes; if the version number appears, the setup is correct. If it does not, check whether the files under /usr/local/src were actually copied to that node.
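If a node is missing the software or the profile, a sketch of copying them over from master1-1 (assuming everything lives under /usr/local/src as in the exports above):

scp -r /usr/local/src/java slave1-1:/usr/local/src/
scp -r /usr/local/src/java slave1-2:/usr/local/src/
scp /etc/profile slave1-1:/etc/profile
scp /etc/profile slave1-2:/etc/profile
# then run source /etc/profile on each node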

II. ZooKeeper Configuration

1. Configure myid

1-1. Create a data directory under zk, then vi myid inside it and write 1
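On master1-1 this amounts to (assuming ZooKeeper was renamed to /usr/local/src/zk as in the environment variables above):

mkdir -p /usr/local/src/zk/data
echo 1 > /usr/local/src/zk/data/myid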

2. Edit the ZooKeeper configuration file

2-1: Go into ZooKeeper's conf directory

2-2:    cp zoo_sample.cfg zoo.cfg

2-3: vi zoo.cfg. The dataDir line in it is not commented out; you can delete it or edit it in place (I prefer deleting it), then set:

dataDir=/usr/local/src/zk/data

server.1=master1-1:2888:3888

server.2=slave1-1:2888:3888

server.3=slave1-2:2888:3888

3. Copy the files to the other nodes; remember to change myid on each node afterwards

scp -r /usr/local/src/zk/ slave1-1:/usr/local/src/

scp -r /usr/local/src/zk/ slave1-2:/usr/local/src/

On slave1-1: vi /usr/local/src/zk/data/myid and change the value to 2

On slave1-2: vi /usr/local/src/zk/data/myid and change the value to 3

4. Start ZooKeeper and check its status (on all three nodes)

zkServer.sh start

zkServer.sh status

The correct result is 2 followers and 1 leader; the order in which they appear does not matter, as long as all three roles are present.

If something goes wrong, the firewall and the myid files are the first things to check.

HADOOP Configuration

1. Edit the configuration files

1-1: vi slaves

master1-1

slave1-1

slave1-2

1-2: vi hadoop-env.sh

export JAVA_HOME=/usr/local/src/java

1-3: vi core-site.xml

<property>

<name>dfs.nameservices</name>

<value>mycluster</value>

</property>



<property>

<name>fs.defaultFS</name>

<value>hdfs://mycluster</value>

</property>



<property>

<name>hadoop.tmp.dir</name>

<value>/usr/local/src/hadoop/tmp</value>

</property>



<property>

<name>ha.zookeeper.quorum</name>

<value>master1-1:2181,slave1-1:2181,slave1-2:2181</value>

</property>

1-4: vi hdfs-site.xml

<property>

<name>dfs.nameservices</name>

<value>mycluster</value>

</property>



<property>

<name>dfs.ha.namenodes.mycluster</name>

<value>master1-1,slave1-1</value>

</property>





<property>

<name>dfs.namenode.name.dir</name>

<value>/usr/local/src/hadoop/tmp/name</value>

</property>



<property>

<name>dfs.datanode.data.dir</name>

<value>/usr/local/src/hadoop/tmp/data</value>

</property>



<property>

<name>dfs.journalnode.edits.dir</name>

<value>/usr/local/src/hadoop/tmp/journal</value>

</property>



<property>

<name>dfs.namenode.shared.edits.dir</name>

<value>qjournal://master1-1:8485;slave1-1:8485;slave1-2:8485/mycluster</value>

</property>



<property>

<name>dfs.namenode.rpc-address.mycluster.master1-1</name>

<value>master1-1:9000</value>

</property>



<property>

<name>dfs.namenode.rpc-address.mycluster.slave1-1</name>

<value>slave1-1:9000</value>

</property>



<property>

<name>dfs.namenode.http-address.mycluster.master1-1</name>

<value>master1-1:50070</value>

</property>



<property>

<name>dfs.namenode.http-address.mycluster.slave1-1</name>

<value>slave1-1:50070</value>

</property>



<property>

<name>dfs.ha.fencing.methods</name>

<value>sshfence</value>

</property>



<property>

<name>dfs.ha.fencing.ssh.private-key-files</name>

<value>/root/.ssh/id_rsa</value>

</property>



<property>

<name>dfs.ha.automatic-failover.enabled</name>

<value>true</value>

</property>



<property>

<name>dfs.client.failover.proxy.provider.mycluster</name>

<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>



1-5: vi yarn-site.xml

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>



<property>

<name>yarn.resourcemanager.ha.enabled</name>

<value>true</value>

</property>



<property>

<name>yarn.resourcemanager.cluster-id</name>

<value>yrc</value>

</property>



<property>

<name>yarn.resourcemanager.ha.rm-ids</name>

<value>rm1,rm2</value>

</property>



<property>

<name>yarn.resourcemanager.hostname.rm1</name>

<value>master1-1</value>

</property>



<property>

<name>yarn.resourcemanager.hostname.rm2</name>

<value>slave1-1</value>

</property>



<property>

<name>yarn.resourcemanager.zk-address</name>

<value>master1-1:2181,slave1-1:2181,slave1-2:2181</value>

</property>



<property>

<name>yarn.resourcemanager.store.class</name>

<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>

</property>

1-6-1:

 cp mapred-site.xml.template mapred-site.xml

1-6-2:vi mapred-site.xml

<property>

<name>mapreduce.framework.name</name>

<value>yarn</value>

</property>

2: Copy the configuration to the other two nodes

scp -r /usr/local/src/hadoop/etc/hadoop/ slave1-1:/usr/local/src/hadoop/etc/

scp -r /usr/local/src/hadoop/etc/hadoop/ slave1-2:/usr/local/src/hadoop/etc/

3. Start the JournalNodes, register with ZooKeeper, and format the NameNode

3-1: Start the JournalNodes

hadoop-daemons.sh start journalnode

3-2: Register with ZooKeeper

hdfs zkfc -formatZK

3-3: Format the NameNode

hdfs namenode -format

4. Start the cluster and the standby node

4-1: Start the cluster from the master node

master1-1: start-all.sh
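One step the guide does not spell out: before step 4-2, the standby NameNode on slave1-1 needs a copy of the metadata that was just formatted on master1-1, or it will not start cleanly. A hedged sketch of two common ways to do that (an assumption about the intended workflow, not something stated above):

# Option A: run on slave1-1 while the active NameNode on master1-1 is up
hdfs namenode -bootstrapStandby

# Option B: copy the formatted name directory over by hand
ssh slave1-1 mkdir -p /usr/local/src/hadoop/tmp
scp -r /usr/local/src/hadoop/tmp/name slave1-1:/usr/local/src/hadoop/tmp/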

4-2: Start the ResourceManager and NameNode separately on the standby node

slave1-1: yarn-daemon.sh start resourcemanager

slave1-1: hadoop-daemon.sh start namenode

4-3: Check the process counts

[root@master1-1 ~]# jps                    (8)

3200 NameNode

3602 DFSZKFailoverController

3810 NodeManager

2150 QuorumPeerMain

3705 ResourceManager

4139 Jps

3325 DataNode

2831 JournalNode

[root@slave1-1 ~]# jps                       (8)

2336 DFSZKFailoverController

1891 QuorumPeerMain

2692 Jps

2405 NodeManager

2613 NameNode

2055 JournalNode

2223 DataNode

2543 ResourceManager

[root@slave1-2 ~]# jps                             (5)

1888 QuorumPeerMain

2167 DataNode

2393 Jps

2061 JournalNode

2285 NodeManager

4-4: Verify the web UIs
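The pages to check (the 50070 addresses come from hdfs-site.xml above; 8088 is YARN's default ResourceManager web port and is an assumption, since this guide does not set it explicitly):

http://master1-1:50070    # HDFS NameNode UI, one of the two should report "active"
http://slave1-1:50070     # HDFS NameNode UI, the other should report "standby"
http://master1-1:8088     # YARN ResourceManager UI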

HBASE Configuration

1. Edit the configuration files

1-1: vi regionservers

slave1-1

slave1-2

1-2: vi hbase-env.sh

Append the following at the end of the file:

export JAVA_HOME=/usr/local/src/java

export HADOOP_HOME=/usr/local/src/hadoop

1-3: vi hbase-site.xml

<property>

<name>hbase.cluster.distributed</name>

<value>true</value>

</property>



<property>

<name>hbase.tmp.dir</name>

<value>/usr/local/src/hbase/tmp</value>

</property>



<property>

<name>hbase.rootdir</name>

<value>hdfs://mycluster/hbase</value>

</property>



<property>

<name>hbase.master.info.port</name>

<value>60010</value>

</property>



<property>

<name>hbase.zookeeper.quorum</name>

<value>master1-1,slave1-1,slave1-2</value>

</property>



<property>

<name>hbase.zookeeper.property.clientPort</name>

<value>2181</value>

</property>



<property>

<name>zookeeper.session.timeout</name>

<value>120000</value>

</property>

2. Copy the configuration to the other nodes

scp -r /usr/local/src/hbase/conf/ slave1-1:/usr/local/src/hbase/

scp -r /usr/local/src/hbase/conf/ slave1-2:/usr/local/src/hbase/

3. Start the HBase cluster and check the processes

start-hbase.sh

4. Verify the processes and the web UI

[root@master1-1 ~]# jps             (9)

3200 NameNode

3602 DFSZKFailoverController

3810 NodeManager

2150 QuorumPeerMain

3705 ResourceManager

4490 HMaster

4795 Jps

3325 DataNode

2831 JournalNode

[root@slave1-1 ~]# jps                       (9)

2336 DFSZKFailoverController

3186 Jps

1891 QuorumPeerMain

2405 NodeManager

2613 NameNode

2055 JournalNode

2985 HRegionServer

2223 DataNode

2543 ResourceManager

[root@slave1-2 ~]# jps                  (6)

1888 QuorumPeerMain

2578 HRegionServer

2788 Jps

2167 DataNode

2061 JournalNode

2285 NodeManager
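Besides jps, the HBase master web UI should be reachable at http://master1-1:60010 (the port set in hbase-site.xml above), and a minimal shell check looks like this:

hbase shell
status    # run inside the shell; it should report 2 servers (the two regionservers) and 0 dead
exit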

HIVE Configuration (master node only)

1. Install the 4 rpm packages (in order)

mysql-community-common-5.7.18-1.el7.x86_64.rpm

mysql-community-libs-5.7.18-1.el7.x86_64.rpm

mysql-community-client-5.7.18-1.el7.x86_64.rpm

mysql-community-server-5.7.18-1.el7.x86_64.rpm

1-1: rpm -ivh --nodeps mysql-community-common-5.7.18-1.el7.x86_64.rpm will fail with an error

Then run the removal command that the error message points to:

rpm -e --nodeps <package name>, where the package name is shown in the error, i.e.:

rpm -e --nodeps mariadb-libs-1:5.5.56-2.el7.x86_64

After that, install the packages in order:

rpm -ivh --nodeps mysql-community-common-5.7.18-1.el7.x86_64.rpm

rpm -ivh --nodeps mysql-community-libs-5.7.18-1.el7.x86_64.rpm

rpm -ivh --nodeps mysql-community-client-5.7.18-1.el7.x86_64.rpm

rpm -ivh --nodeps mysql-community-server-5.7.18-1.el7.x86_64.rpm

2. Restart the mysql service, look up the initial mysql password, and log in to mysql

2-1: Restart the mysql service: systemctl restart mysqld

2-2: Look up the initial mysql password: grep password /var/log/mysqld.log

2-3: Log in to mysql: mysql -uroot -p, then enter the initial password when prompted

3. Change the mysql password, grant privileges, and flush privileges

mysql> set password=password("Password123$");

mysql> grant all privileges on *.* to "root"@"%" identified by "Password123$";

mysql> flush privileges;

4. Copy the MySQL JDBC driver into Hive's lib directory

cp -r /root/h3cu/mysql-connector-java-5.1.46.jar /usr/local/src/hive/lib/

5. Edit Hive's configuration files

cp hive-env.sh.template hive-env.sh

cp hive-default.xml.template hive-site.xml

5-1:vi hive-env.sh

HADOOP_HOME=/usr/local/src/hadoop

5-2: vi hive-site.xml (a few search tricks help here)

5-2-1: Search for "URL" to locate the first property

<property>

<name>javax.jdo.option.ConnectionURL</name>  

<value>jdbc:mysql://master1-1:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>

<description>JDBC connect string for a JDBC metastore</description>

</property>

5-2-2: Copy the javax.jdo.option. prefix from the previous property and search for it to find the remaining 3

  <property>

    <name>javax.jdo.option.ConnectionDriverName</name>

    <value>com.mysql.jdbc.Driver</value>

    <description>Driver class name for a JDBC metastore</description>

  </property>

5-2-3: Same approach

<property>

    <name>javax.jdo.option.ConnectionPassword</name>

    <value>Password123$</value>

    <description>password to use against metastore database</description>

  </property>

5-2-4: Same approach

  <property>

    <name>javax.jdo.option.ConnectionUserName</name>

    <value>root</value>

    <description>Username to use against metastore database</description>

  </property>

5-2-5: This one you need to remember: search for querylog.location to find the first of the path properties

Copy the value ${system:java.io.tmpdir}/${system:user.name} and replace every remaining occurrence of it in the file with /usr/local/src/hive/tmp

  <property>

    <name>hive.querylog.location</name>

    <value>/usr/local/src/hive/tmp</value>

    <description>Location of Hive run time structured log file</description>

  </property>

5-2-6: Same approach

 <property>

 <name>hive.server2.logging.operation.log.location</name>

 <value>/usr/local/src/hive/tmp/operation_logs</value>

 <description>Top level directory where operation logs are stored if logging functionality is enabled</description>

 </property>

5-2-7: Same approach

  <property>

    <name>hive.exec.local.scratchdir</name>

    <value>/usr/local/src/hive/tmp</value>

    <description>Local scratch space for Hive jobs</description>

  </property>

5-2-8: Same approach

  <property>

    <name>hive.downloaded.resources.dir</name>

    <value>/usr/local/src/hive/tmp/resources</value>

    <description>Temporary local directory for added resources in the remote file system.</description>

  </property>

6. Initialize the Hive metastore schema

schematool -initSchema -dbType mysql

If the output ends with "schemaTool completed", the initialization succeeded.
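A quick sanity check that Hive can reach the metastore after initialization (a minimal sketch):

hive
show databases;    # run inside the Hive CLI; it should list at least the "default" database
quit;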

KAFKA Configuration

1. vi /usr/local/src/kafka/config/server.properties

master1-1:

broker.id=0

zookeeper.connect=master1-1,slave1-1,slave1-2



slave1-1:

broker.id=1

zookeeper.connect=master1-1,slave1-1,slave1-2



slave1-2:

broker.id=2

zookeeper.connect=master1-1,slave1-1,slave1-2

2. Start the Kafka cluster (on all three nodes)

kafka-server-start.sh -daemon /usr/local/src/kafka/config/server.properties
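A hedged smoke test for the brokers, assuming a Kafka version that still manages topics through ZooKeeper (which the zookeeper.connect setting above suggests):

kafka-topics.sh --create --zookeeper master1-1:2181 --replication-factor 3 --partitions 1 --topic test
kafka-topics.sh --list --zookeeper master1-1:2181    # should print: test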

3. Check the processes (as long as a Kafka process appears on each node, it is correct)

[root@master1-1 ~]# jps

3200 NameNode

3602 DFSZKFailoverController

3810 NodeManager

2150 QuorumPeerMain

3705 ResourceManager

4490 HMaster

7227 Kafka

3325 DataNode

7341 Jps

2831 JournalNode

[root@slave1-1 ~]# jps

2336 DFSZKFailoverController

4064 Kafka

1891 QuorumPeerMain

2405 NodeManager

2613 NameNode

2055 JournalNode

2985 HRegionServer

4156 Jps

2223 DataNode

2543 ResourceManager

[root@slave1-2 ~]# jps

1888 QuorumPeerMain

2578 HRegionServer

3203 Kafka

3286 Jps

2167 DataNode

2061 JournalNode

2285 NodeManager

SPARK Configuration (including Scala)

1. Edit the configuration files

cp slaves.template slaves

cp spark-env.sh.template spark-env.sh

1-2: vi slaves (delete localhost)

master1-1

slave1-1

slave1-2

1-3:vi spark-env.sh

export JAVA_HOME=/usr/local/src/java

export HADOOP_HOME=/usr/local/src/hadoop

export SCALA_HOME=/usr/local/src/scala

export SPARK_HOME=/usr/local/src/spark

export SPARK_MASTER_IP=master1-1

export HADOOP_CONF_DIR=/usr/local/src/hadoop/etc/hadoop

export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=master1-1,slave1-1,slave1-2 -Dspark.deploy.zookeeper.dir=/spark"

Without the SPARK_DAEMON_JAVA_OPTS line above, Spark will not be highly available, i.e. slave1-1 will not come up as a STANDBY master.

2. Copy the configuration files

scp -r /usr/local/src/spark/conf/ slave1-1:/usr/local/src/spark/

scp -r /usr/local/src/spark/conf/ slave1-2:/usr/local/src/spark/

3. Start the Spark cluster

[root@master1-1 sbin]# ./start-all.sh

[root@slave1-1 sbin]# ./start-master.sh

Because Spark's environment variables are on the PATH and its cluster start script has the same name as Hadoop's, you must start the Spark cluster from Spark's sbin directory with ./start-all.sh; otherwise you would start the Hadoop cluster instead.
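To confirm the cluster actually accepts applications, a hedged example submitting the bundled SparkPi job; the examples jar path below is an assumption, since it varies between Spark versions:

spark-submit --master spark://master1-1:7077,slave1-1:7077 \
  --class org.apache.spark.examples.SparkPi \
  /usr/local/src/spark/examples/jars/spark-examples_*.jar 100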

4. Verify the processes and the web UI

[root@master1-1 sbin]# jps

3200 NameNode

3602 DFSZKFailoverController

3810 NodeManager

2150 QuorumPeerMain

3705 ResourceManager

4490 HMaster

7227 Kafka

7404 Master

3325 DataNode

7485 Worker

2831 JournalNode

7535 Jps

[root@slave1-1 ~]# jps

2336 DFSZKFailoverController

4064 Kafka

1891 QuorumPeerMain

2405 NodeManager

2613 NameNode

2055 JournalNode

2985 HRegionServer

4364 Jps

4270 Worker

2223 DataNode

2543 ResourceManager

7411 Master

[root@slave1-2 ~]# jps

1888 QuorumPeerMain

2578 HRegionServer

3203 Kafka

3334 Worker

2167 DataNode

3404 Jps

2061 JournalNode

2285 NodeManager

The screenshot here is borrowed; in this setup the URL should be spark://slave1-1:7077. Note that the Status shows STANDBY.

5. Scala only needs the environment variable already set in /etc/profile and it is ready to use.
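Since SCALA_HOME and its bin directory were already added to /etc/profile in step I-7, a quick check is enough:

scala -version    # should print the Scala version if the PATH is set correctly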


This guide is incomplete; more will be added later.
