5.5 (Proper) Fully Distributed Cluster
When working in virtual machines, you can clone the VM as soon as the master is configured; changing each slave clone's IP is then all it takes to build the cluster.
5.5.1 Cluster planning
5.5.2 Create two virtual machines and configure the network on each
1.1 worker00: change the IP to .200
vi /etc/sysconfig/network-scripts/ifcfg-ens33
Change BOOTPROTO=dhcp to BOOTPROTO=static
Change ONBOOT=no to ONBOOT=yes
Append the following at the end of the file
IPADDR=192.168.XX.200 #use your own subnet
NETMASK=255.255.255.0
GATEWAY=192.168.XX.2 #use your own gateway
DNS1=114.114.114.114
#reload the network settings
service network restart
1.2 worker01: change the IP to .201
vi /etc/sysconfig/network-scripts/ifcfg-ens33
Change BOOTPROTO=dhcp to BOOTPROTO=static
Change ONBOOT=no to ONBOOT=yes
Append the following at the end of the file
IPADDR=192.168.XX.201 #use your own subnet
NETMASK=255.255.255.0
GATEWAY=192.168.XX.2 #use your own gateway
DNS1=114.114.114.114
#reload the network settings
service network restart
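A quick sanity check on each node to confirm the settings took effect (adjust the interface name and IPs to your own setup):
ip addr show ens33 #the IPADDR you configured should appear here
ping -c 3 114.114.114.114 #confirms the gateway and DNS server are reachable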
5.5.3 Set the hostname on each machine
hostnamectl set-hostname worker00 #on worker00
bash #start a new shell so the change takes effect
hostnamectl set-hostname worker01 #on worker01
bash #start a new shell so the change takes effect
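To verify, the prompt in the new shell should show the new name, or check explicitly:
hostname #should print worker00 or worker01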
5.5.4 Set up passwordless SSH between the servers
Generate a key pair on worker00
ssh-keygen
#copy the key to itself
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.xx.200
#copy the key to worker01
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.xx.201
Generate a key pair on worker01
ssh-keygen
#copy the key to itself
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.xx.201
#copy the key to worker00
ssh-copy-id -i ~/.ssh/id_rsa.pub 192.168.xx.200
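To confirm passwordless login works in both directions (the IPs below are the same placeholders as above):
ssh 192.168.xx.201 hostname #from worker00; should print worker01 with no password prompt
ssh 192.168.xx.200 hostname #from worker01; should print worker00 with no password prompt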
5.5.5 Edit the IP-to-hostname mapping
Configure the following on worker00
vi /etc/hosts
#append at the end of the file
192.168.xx.200 worker00
192.168.xx.201 worker01
Send the finished hosts file to worker01
scp /etc/hosts worker01:/etc/hosts
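To confirm the mapping works, the hostnames should now resolve on both machines:
ping -c 2 worker01 #from worker00
ping -c 2 worker00 #from worker01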
5.5.6 Unpack the JDK and Hadoop #run on worker00
ls /opt #optional; confirm the archives are present
tar -zxf /opt/jdk-8u221-linux-x64.tar.gz -C /usr/local/
tar -zxf /opt/hadoop-3.2.4.tar.gz -C /usr/local/
Configure the environment variables
vi /etc/profile #append at the end of the file
export JAVA_HOME=/usr/local/jdk1.8.0_221
export HADOOP_HOME=/usr/local/hadoop-3.2.4
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
Run source /etc/profile to apply the changes
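If the variables are set correctly, both tools should now report their versions:
java -version #should report 1.8.0_221
hadoop version #should report 3.2.4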
5.5.7 Configure Hadoop #run on worker00
cd /usr/local/hadoop-3.2.4/etc/hadoop #the configuration files live here
vi hadoop-env.sh #add the following near the top of the file
HDFS_NAMENODE_USER=root
HDFS_DATANODE_USER=root
HDFS_SECONDARYNAMENODE_USER=root
YARN_RESOURCEMANAGER_USER=root
YARN_NODEMANAGER_USER=root
export JAVA_HOME=/usr/local/jdk1.8.0_221
Save and quit
vi core-site.xml #specify the HDFS endpoint (NameNode address and port)
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://worker00:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop</value>
</property>
</configuration>
vi hdfs-site.xml #set block replication to 2, one copy on each DataNode
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
Save and quit
cd $HADOOP_HOME/etc/hadoop #enter the Hadoop configuration directory
Configure mapred-site.xml
vi mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.2.4</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.2.4</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.2.4</value>
</property>
</configuration>
vi yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
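Note: this minimal yarn-site.xml leaves the ResourceManager address at its default. If the NodeManager on worker01 later fails to register, you may also need to pin the ResourceManager host (a standard YARN property; adjust the value to your own layout):
<property>
<name>yarn.resourcemanager.hostname</name>
<value>worker00</value>
</property>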
vi workers #list the DataNode hosts to be managed
Delete the existing contents and add the two lines below; both nodes will store data
worker00
worker01
5.5.8 Send everything to worker01
#run on worker00
scp /etc/profile worker01:/etc/profile
#then, on worker01, run source /etc/profile to apply the change
#run on worker00
#jdk
scp -r /usr/local/jdk1.8.0_221 worker01:/usr/local/jdk1.8.0_221
#hadoop
scp -r /usr/local/hadoop-3.2.4/ worker01:/usr/local/hadoop-3.2.4/
5.5.9 Format the cluster
#run on worker00
hdfs namenode -format
5.5.10 Start the cluster
#run on worker00
cd $HADOOP_HOME
sbin/start-dfs.sh #start the HDFS daemons
sbin/stop-dfs.sh #stop the HDFS daemons
sbin/start-yarn.sh #start the YARN daemons
sbin/stop-yarn.sh #stop the YARN daemons
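After starting, you can verify the daemons with jps on each node and through the web UIs (default ports for Hadoop 3.x):
jps #worker00 should show NameNode, SecondaryNameNode, DataNode, ResourceManager, NodeManager
jps #worker01 should show DataNode and NodeManager
#web UIs: http://192.168.xx.200:9870 (HDFS) and http://192.168.xx.200:8088 (YARN)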
5.5.11 Disable the firewall
systemctl stop firewalld #stop the firewall
systemctl disable firewalld #do not start the firewall on boot
systemctl status firewalld #check the firewall status
6.6 Hive Setup (requires a working pseudo- or fully-distributed deployment: HDFS + YARN up and configured)
6.6.1 Download Hive
yum -y install wget
cd /opt
wget https://mirrors.aliyun.com/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
Start the cluster first (pseudo-distributed is enough)
#run on worker00
cd $HADOOP_HOME
sbin/start-dfs.sh #start the HDFS daemons
sbin/start-yarn.sh #start the YARN daemons
6.6.2 Derby-mode configuration
1.1 Unpack
tar -zxf /opt/apache-hive-3.1.2-bin.tar.gz -C /usr/local/
1.2 Set the HIVE_HOME environment variable
vi /etc/profile
Append the following at the end of the file
export HIVE_HOME=/usr/local/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin
Run source /etc/profile to apply the changes
Running hive directly at this point will fail; continue with the steps below first
1.3 Replace the outdated jar shipped with Hive
cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/
rm -f $HIVE_HOME/lib/guava-19.0.jar
1.4 Start Hive #HDFS and YARN must be running
Switch to the /root directory and initialize Hive
cd ~ #switch to root's home directory
rm -fr * #remove everything under /root (e.g., an old metastore_db) so it cannot interfere with initialization
schematool -initSchema -dbType derby #initialize the Derby database
1.5 Quick test
create database xx;
use xx;
create table t1(id int,name string);
insert into t1 values(1,'faker'),(2,'bob');
select * from t1;
select count(*) from t1 where id>1;
6.6.3 MySQL-mode configuration
1.1 Install MySQL
rpm -qa|grep mariadb #check for conflicting MariaDB packages
rpm -e mariadb-libs-5.5.68-1.el7.x86_64 --nodeps #recommended; use the exact version printed by the previous command
cd /opt
tar -xf mysql-5.7.40-1.el7.x86_64.rpm-bundle.tar #unpack
#install the packages in order
rpm -ivh mysql-community-common-5.7.40-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.40-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.40-1.el7.x86_64.rpm
yum install -y net-tools
yum install -y perl
rpm -ivh mysql-community-server-5.7.40-1.el7.x86_64.rpm
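To confirm all four packages installed cleanly:
rpm -qa | grep mysql-community #should list common, libs, client, and server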
1.2 Start MySQL
systemctl start mysqld #if startup fails, run rm -fr /var/lib/mysql/* to remove leftover state, then start again
1.3 Look up the temporary password
cat /var/log/mysqld.log | grep password
#log in to mysql
mysql -u root -p
Enter the temporary password when prompted
1.4 Change the password
#note: MySQL 5.7 may reject these statements until the expired temporary password is reset (ERROR 1820); if that happens, first run ALTER USER with a strong interim password, then continue
set global validate_password_policy=LOW;
set global validate_password_length=6;
ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
1.5 Create the database Hive will use
create database hivedb CHARACTER SET utf8;
1.6 Unpack
tar -zxf /opt/apache-hive-3.1.2-bin.tar.gz -C /usr/local/
1.7 Set the HIVE_HOME environment variable
vi /etc/profile
Append the following at the end of the file
export HIVE_HOME=/usr/local/apache-hive-3.1.2-bin
export PATH=$PATH:$HIVE_HOME/bin
Run source /etc/profile to apply the changes
Running hive directly at this point will fail; continue with the steps below first
1.8 Replace the outdated jar shipped with Hive
cp $HADOOP_HOME/share/hadoop/common/lib/guava-27.0-jre.jar $HIVE_HOME/lib/
rm -f $HIVE_HOME/lib/guava-19.0.jar
1.9 Configure hive-site.xml
cd $HIVE_HOME/conf #enter the configuration directory
vi hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hivedb?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8&amp;useSSL=false&amp;serverTimezone=GMT</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!--change to your own mysql username-->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!--change to your own mysql password-->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<!--skip the Hive metastore schema version check; otherwise you would have to upgrade the schema inside mysql-->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
</configuration>
2.0 Put the MySQL JDBC driver jar into Hive's lib directory
Download the driver jar (a mysql-connector-java 5.x build matches the com.mysql.jdbc.Driver class configured above) and place it in:
/usr/local/apache-hive-3.1.2-bin/lib
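For example, assuming you downloaded mysql-connector-java-5.1.49.jar into /opt (adjust the filename to whichever 5.x build you have):
cp /opt/mysql-connector-java-5.1.49.jar /usr/local/apache-hive-3.1.2-bin/lib/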
2.1 Initialize the Hive metastore in MySQL
schematool -initSchema -dbType mysql #only needed once after installation; not required for later startups
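If initialization succeeded, the metastore tables now exist in MySQL; you can check from the mysql client:
use hivedb;
show tables; #should list metastore tables such as DBS, TBLS, COLUMNS_V2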
2.2 Start Hive
hive
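A quick smoke test, mirroring the Derby section:
show databases; #should at least list default
create database testdb;
show databases; #testdb should now appear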