Deployment Plan
ip | hostname | installed software | processes |
---|---|---|---|
10.62.84.37 | master | hadoop,zookeeper | namenode,ResourceManager,ZKFC |
10.62.84.38 | master2 | hadoop,zookeeper,mysql,hive,spark,hue | namenode,ResourceManager,mrHistoryserver,ZKFC,mysql,metastore,hiveserver2,Master,sparkHistoryserver,hue |
10.62.84.39 | worker1 | hadoop,zookeeper,spark | datanode,nodeManager,zookeeper,Journalnode,Worker |
10.62.84.40 | worker2 | hadoop,zookeeper,spark | datanode,nodeManager,zookeeper,Journalnode,Worker |
10.62.84.41 | worker3 | hadoop,zookeeper,spark | datanode,nodeManager,zookeeper,Journalnode,Worker |
10.62.84.42 | worker4 | hadoop,spark | datanode,nodeManager,Worker |
Installing MySQL as the Metadata Database
Directory Layout
/data
|__ mysql
|__ data
|__ logs
|__ pids
|__ client
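These directories are not created automatically; they should exist and be owned by the user that will later run mysqld (the hadoop user in this guide). A minimal sketch, assuming hadoop is also the group name:
sudo mkdir -p /data/mysql/{data,logs,pids,client}
sudo chown -R hadoop:hadoop /data/mysql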
Database and User Plan
DBName | userName | password |
---|---|---|
hive | hive | hive |
hue | hue | hue |
First remove MariaDB, which ships with CentOS 7 by default; otherwise MySQL cannot be installed.
Switch to the root user and run the following command:
rpm -qa |grep mariadb |xargs yum remove -y
Install MySQL
Switch to the hadoop user.
1. Install the required dependency packages
sudo yum install -y openssl openssl-devel
sudo yum install -y perl-Module-Install.noarch
2. Remove the Linux limits on the number of open file handles and maximum user processes; the essence of the change is sketched below.
For details, see: https://blog.csdn.net/wangkai_123456/article/details/78481698
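The linked post boils down to raising the open-file and user-process limits in /etc/security/limits.conf. The entries below are an illustrative sketch (the values are not from the original) and take effect after logging in again:
# /etc/security/limits.conf (edit as root)
* soft nofile 65536
* hard nofile 65536
* soft nproc 65536
* hard nproc 65536
# on CentOS 7, also check /etc/security/limits.d/20-nproc.conf for the nproc soft limit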
3. Download the MySQL installation package
On https://dev.mysql.com/downloads/mysql/, choose Red Hat Enterprise Linux 7 / Oracle Linux 7 as the operating system and Red Hat Enterprise Linux 7 / Oracle Linux 7 (x86, 64-bit) as the OS version, then download mysql-8.0.16-el7-x86_64.tar.gz.
4. Put the downloaded mysql-8.0.16-el7-x86_64.tar.gz into the /usr/local/hadoop/mysql directory and extract it:
tar zxvf mysql-8.0.16-el7-x86_64.tar.gz
mv mysql-8.0.16-el7-x86_64 mysql
5. Create a new my.cnf configuration file in the mysql directory
cd mysql
vi my.cnf
The file contents are as follows:
[client]
socket=/data/mysql/client/mysql.sock
[mysqld]
basedir=/usr/local/hadoop/mysql
datadir=/data/mysql/data
socket=/data/mysql/mysql.sock
log-error=/data/mysql/logs/mysqld.log
pid-file=/data/mysql/pids/mysqld.pid
6. Configure environment variables
vi ~/.bashrc
After opening the file, add the following configuration at the very bottom:
# MySql
export MySQL_HOME=/usr/local/hadoop/mysql
export PATH=$MySQL_HOME/bin:$PATH
After making the change, save and exit, then run the following command so it takes effect:
source ~/.bashrc
7. Initialize MySQL
mysqld --defaults-file=/usr/local/hadoop/mysql/my.cnf --initialize --user=hadoop --basedir=/usr/local/hadoop/mysql --datadir=/data/mysql/data
8. Start MySQL
mysqld_safe --defaults-file=/usr/local/hadoop/mysql/my.cnf --user=hadoop &
9. Configure MySQL to start automatically on boot
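The original does not show the commands for this step. One simple approach on CentOS 7 is to append the mysqld_safe command from step 8 to /etc/rc.d/rc.local (a sketch, run as root):
chmod +x /etc/rc.d/rc.local
echo 'su - hadoop -c "/usr/local/hadoop/mysql/bin/mysqld_safe --defaults-file=/usr/local/hadoop/mysql/my.cnf --user=hadoop &"' >> /etc/rc.d/rc.local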
10. Get the root user's temporary password with cat /data/mysql/logs/mysqld.log | grep password:
2019-06-01T06:31:42.780544Z 5 [Note] [MY-010454] [Server] A temporary password is generated for root@localhost: g_UgRigTD5=t
Test Whether MySQL Was Installed Successfully
1. Log in to MySQL as root by running:
mysql -u root -p -S /data/mysql/mysql.sock
Press Enter after the command above and a password prompt appears.
2. Enter the password you just looked up to log in to the database; pasting it works, and as usual MySQL does not echo the password.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8
Server version: 8.0.16
Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>
3. Change the password with an ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '<new password>'; statement, for example:
ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'xhw888';
4. Exit MySQL with exit; and then log in again with the new password.
At this point the MySQL 8.0.16 database installation is complete.
Configuring MySQL Remote Access Privileges
1. Log in to MySQL as root
mysql -u root -p -S /data/mysql/mysql.sock
2. Allow the root user (other users, such as hadoop, can be configured the same way) to connect from any host and operate on all databases:
create user 'root'@'%' identified with mysql_native_password by 'root';
grant all privileges on *.* to 'root'@'%' with grant option;
flush privileges;
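To verify that remote access works, connect over TCP from another node (for example master) instead of the local socket; 3306 is MySQL's default port:
mysql -h master2 -P 3306 -u root -p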
3. Create a database named hive and a new user hive, and let the hive user operate on all tables in the hive database.
Run the following on master2:
# log in to MySQL
mysql -u root -p -S /data/mysql/mysql.sock
# create the hive database
create database hive;
# create the user
create user 'hive'@'%' identified by 'hive';
# grant privileges
grant all privileges on hive.* to 'hive'@'%';
flush privileges;
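As an optional check, confirm that the new account works before configuring Hive:
# connect as the hive user (any host is allowed because of '%')
mysql -h master2 -u hive -p hive
# or, from the root session, list its grants
show grants for 'hive'@'%';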
Note:
Create a new user: CREATE USER 'username'@'host' IDENTIFIED WITH mysql_native_password BY 'password';
Grant privileges to the new user: GRANT ALL PRIVILEGES ON *.* TO 'username'@'host';
Refresh privileges: FLUSH PRIVILEGES;
Installing Hive
Install Hive
1. Download the Hive package from: http://mirror.bit.edu.cn/apache/hive/hive-3.1.1/
2. Put the downloaded Hive package into the /usr/local directory and extract it:
tar zxvf apache-hive-3.1.1-bin.tar.gz
mv apache-hive-3.1.1-bin hive  # rename the directory
3. Configure the Hive environment variables
vi ~/.bashrc
After opening the file, add the following configuration at the very bottom:
export HIVE_HOME=/usr/local/hive
export PATH=$HIVE_HOME/bin:$PATH
Save and exit with :wq!, then run the following command so the change takes effect:
source ~/.bashrc
4. Verify that Hive was installed successfully
Run hive --version
If the Hive version is displayed, the installation succeeded.
Configure Hive
Change to the $HIVE_HOME/conf/ directory
cd hive/conf/  # enter the hive configuration directory
1. Configure the Hive runtime environment
# copy from the configuration template
cp hive-env.sh.template hive-env.sh
vi hive-env.sh
Append the following to the end of hive-env.sh (this is my configuration; adjust it to your own):
HADOOP_HOME=/usr/local/hadoop/hadoop
Change this to your own Hadoop installation directory.
2. Add the hive-site.xml configuration file
# create a new configuration file hive-site.xml
vi hive-site.xml
Set its contents to:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- mysql config Start -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master2:3306/hive?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>hive</value>
<description>password to use against metastore database</description>
</property>
<!-- mysql config End -->
<!-- hive config Start -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>hive.exec.scratchdir</name>
<value>/user/hive/tmp</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<property>
<name>hive.exec.reducers.bytes.per.reducer</name>
<value>67108864</value>
</property>
<property>
<name>hive.default.fileformat</name>
<value>orc</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://master2:9083</value>
</property>
<!-- hive config End -->
<!-- Support Update and Delete Operation Start -->
<property>
<name>hive.support.concurrency</name>
<value>true</value>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name>
<value>nonstrict</value>
</property>
<property>
<name>hive.txn.manager</name>
<value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
<name>hive.compactor.initiator.on</name>
<value>true</value>
</property>
<property>
<name>hive.compactor.worker.threads</name>
<value>1</value>
</property>
<!-- Support Update and Delete Operation End -->
</configuration>
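The directories referenced by hive.metastore.warehouse.dir and hive.exec.scratchdir must exist on HDFS and be writable by Hive users. A minimal sketch (the permission values follow common practice and are not from the original):
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -mkdir -p /user/hive/tmp
hdfs dfs -chmod g+w /user/hive/warehouse
hdfs dfs -chmod 733 /user/hive/tmp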
Integrating MySQL as the Hive Metastore Database
To use MySQL as its metastore database, Hive needs the MySQL JDBC driver (mysql-connector-java-8.0.16.jar); the downloaded driver goes into the lib directory of the Hive installation. The driver can be downloaded from the MySQL website.
Download from: https://dev.mysql.com/downloads/connector/j/
Select Operating System: Platform Independent
Download: mysql-connector-java-8.0.16.tar.gz
# extract
tar zxvf mysql-connector-java-8.0.16.tar.gz
# copy mysql-connector-java-8.0.16.jar from the extracted mysql-connector-java-8.0.16 directory into the lib directory of the Hive installation
cp mysql-connector-java-8.0.16.jar /usr/local/hive/lib/
# delete or rename the log4j-slf4j-impl-2.10.0.jar file under /usr/local/hive/lib/
Otherwise, once Hadoop and Hive are integrated, starting Hive reports the following error:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
mv log4j-slf4j-impl-2.10.0.jar log4j-slf4j-impl-2.10.0.jar.backup
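In addition, Hive 3 requires the metastore schema to be created in MySQL before the services are started. The original does not show this step; with the hive-site.xml above in place it would look like:
# run once on master2; uses the JDBC settings from hive-site.xml
schematool -dbType mysql -initSchema
# then start the metastore and hiveserver2 processes listed in the deployment plan
nohup hive --service metastore > metastore.log 2>&1 &
nohup hive --service hiveserver2 > hiveserver2.log 2>&1 &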
Migrating Hive Across Clusters
This mainly involves:
- data migration
- metastore migration
- Hive version upgrade (not covered this time)
Migrate the HDFS data from the source cluster to the new cluster
Run on the new cluster:
hadoop distcp hdfs://10.62.124.41:8020/user/hive/* /user/hive
# incremental copy of differences only
hadoop distcp -update hdfs://10.62.124.41:8020/user/hive/* /user/hive
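To sanity-check the copy, compare the totals on the source and target paths (a quick check):
# size of the source directory, addressed from the new cluster
hdfs dfs -du -s -h hdfs://10.62.124.41:8020/user/hive
# size of the copied data on the new cluster
hdfs dfs -du -s -h /user/hive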
Back up and export the metastore data from the source cluster
mysqldump -u root -p --databases hive > mysql_hive_data.sql
Import the metastore data into the new cluster
mysql -u hive -p -S /data/mysql/mysql.sock --default-character-set=utf8 hive < mysql_hive_data.sql
Update the cluster information in the metastore database
Because the name of the HDFS filesystem accessed by the new cluster has changed, the DBS and SDS tables in the hive database need to be updated.
Make the following changes:
update DBS set DB_LOCATION_URI = replace(DB_LOCATION_URI, 'hdfs://hadoopSvr1:8020', 'hdfs://ns1');
update SDS set LOCATION = replace(LOCATION, 'hdfs://hadoopSvr1:8020', 'hdfs://ns1');
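To confirm the rewrite took effect, inspect the updated locations from the same MySQL session (a quick check):
select DB_LOCATION_URI from DBS;
select LOCATION from SDS limit 10;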