Ubuntu 22 / Hadoop 3.3: a complete record of configuring Hive

Contents

Install MySQL

Install and configure Hive

Pre-run configuration

Run Hive


Install MySQL

sudo apt-get update
sudo apt-get install mysql-server
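
You can optionally confirm the service came up (Ubuntu registers it as mysql):

sudo systemctl status mysql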

Check the initial password, as shown below:

sudo cat /etc/mysql/debian.cnf

[client]
host     = localhost
user     = debian-sys-maint
password = <password value>
socket   = /var/run/mysqld/mysqld.sock
[mysql_upgrade]
host     = localhost
user     = debian-sys-maint
password = <password value>
socket   = /var/run/mysqld/mysqld.sock

Log in to MySQL with the initial password (the default password from the previous step):

sudo mysql -u debian-sys-maint -p

Change the root password:

mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> Alter user 'root'@'localhost' IDENTIFIED BY 'root';
Query OK, 0 rows affected (0.02 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

After typing exit to quit, log back in with the new password:

sudo mysql -u root -p

Create a new user hive and grant it privileges:

mysql> create user 'hive'@'localhost' identified by 'hive';
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)

mysql> grant all privileges on *.* to 'hive'@'localhost' with grant option;
Query OK, 0 rows affected (0.01 sec)

mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Exit the database, log back in as the hive user, and create the hive database:

mysql> create database hive default character set utf8 collate utf8_general_ci;

Check the hive database:

mysql> show databases like 'hive';
+------------------+
| Database (hive)  |
+------------------+
| hive             |
+------------------+
1 row in set (0.00 sec)

Install and configure Hive

Extract the Hive tarball and adjust permissions (the target directory is up to you); run the commands one at a time:

sudo tar -zxvf apache-hive-3.1.3-bin.tar.gz -C /usr/local
cd /usr/local
sudo mv apache-hive-3.1.3-bin/ hive/
sudo chgrp -R root ./hive
sudo chown -R root ./hive
sudo chmod -R 755 ./hive

Configure the Hive environment variables:

vim ~/.bashrc

# hive
export HIVE_HOME=/usr/local/hive
export PATH=${PATH}:${HIVE_HOME}/bin
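
Reload the shell configuration so the new variables take effect, and verify:

source ~/.bashrc
echo $HIVE_HOME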

Configure the conf files

Create working copies of the templates and one new empty file:

cd /usr/local/hive/conf
sudo cp ./hive-default.xml.template ./hive-default.xml
sudo cp ./hive-env.sh.template ./hive-env.sh
sudo cp ./hive-log4j2.properties.template ./hive-log4j2.properties
sudo touch ./hive-site.xml

Edit hive-env.sh

sudo vim ./hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
# HADOOP_HOME=${bin}/../../hadoop
# hadoop path
export HADOOP_HOME=/usr/local/hadoop

# Hive Configuration Directory can be controlled by:
# export HIVE_CONF_DIR=
export HIVE_CONF_DIR=/usr/local/hive/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
# export HIVE_AUX_JARS_PATH=
export HIVE_AUX_JARS_PATH=/usr/local/hive/lib

Edit hive-log4j2.properties to set Hive's log directory:

sudo vim ./hive-log4j2.properties

Set the following parameter:

property.hive.log.dir=/usr/local/hive/logs
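
The directory named here must exist and be writable by the user running Hive; otherwise schematool fails with the "Could not create directory /usr/local/hive/logs" error described in Pitfall 1 at the end of this post. Creating it up front (handing it to the current user via $USER) sidesteps the problem:

# Pre-create the log directory and give it to the current user
sudo mkdir -p /usr/local/hive/logs
sudo chown -R $USER /usr/local/hive/logs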

Under /usr/local/hive, create a tmp directory with a hive subdirectory inside it:

cd /usr/local/hive
sudo mkdir tmp
sudo mkdir tmp/hive

Edit hive-site.xml. Drawing on other blog posts, the following configuration is sufficient for a first-time setup:

<configuration>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/tmp/hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hive</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hive</value>
    </property>
</configuration>
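
One detail worth noting: the schematool output later in this post shows characterEncoding=UTF-8 appended to the connection URL. If you add it to hive-site.xml yourself, the & separator must be escaped as &amp; inside XML:

<value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;characterEncoding=UTF-8</value>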

Pre-run configuration

Create the Hive paths in HDFS:

hdfs dfs -mkdir -p /user/hive/warehouse

You can verify with the following command:

hdfs dfs -ls -R / | grep 'hive'
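
Hive's getting-started documentation also recommends making the scratch and warehouse directories group-writable. Since hive-site.xml above sets hive.exec.scratchdir to /tmp/hive, a reasonable extra step is:

hdfs dfs -mkdir -p /tmp/hive
hdfs dfs -chmod g+w /tmp/hive
hdfs dfs -chmod g+w /user/hive/warehouse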

Download mysql-connector-java and copy its jar into Hive's lib directory. Download page: https://dev.mysql.com/downloads/

# Extract
tar -zxvf mysql-connector-java_8.0.30.tar.gz -C ./

# Enter the extracted directory
cd mysql-connector-java_8.0.30/

# Copy the jar into Hive's lib directory
sudo cp ./mysql-connector-java-8.0.30.jar /usr/local/hive/lib/

# Enter the Hive lib directory
cd /usr/local/hive/lib/

Hadoop and Hive bundle conflicting versions of guava, so the lower-version jar must be replaced:

# Check Hive's bundled guava version
ls -al | grep 'guava'

# Check Hadoop's bundled guava version
ls -al /usr/local/hadoop/share/hadoop/common/lib/*.jar | grep 'guava'

# Copy Hadoop's newer guava into Hive's lib
sudo cp /usr/local/hadoop/share/hadoop/common/lib/guava-27.0-jre.jar ./

# Remove the older version
sudo rm -f guava-19.0.jar

Similarly, sideline Hive's log4j-slf4j jar to avoid duplicate SLF4J bindings; here it is renamed rather than deleted so it can be restored if needed:

sudo mv log4j-slf4j-impl-2.17.1.jar log4j-slf4j-impl-2.17.1.jar-bck

Initialize the Hive schema

(Pitfall 1: this step may fail with "Could not create directory /usr/local/hive/logs"; see the fix at the end of this post.)

sudo /usr/local/hive/bin/schematool -dbType mysql -initSchema

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL:    jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&characterEncoding=UTF-8
Metastore Connection Driver :    com.mysql.cj.jdbc.Driver
Metastore connection User:   hive
Starting metastore schema initialization to 3.1.0
Initialization script hive-schema-3.1.0.mysql.sql
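
To confirm the schema actually landed in MySQL, list the metastore tables; the Hive schema creates tables such as DBS and TBLS:

mysql -u hive -p -e 'use hive; show tables;'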

Run Hive

:~$ hive
Hive Session ID = f7d8e77c-ceb4-439f-8942-7149974dd882

Logging initialized using configuration in file:/usr/local/hive/conf/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
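
A quick smoke test once the prompt appears (it should list at least the built-in default database):

hive> show databases;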

Create a database and table and insert data
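
Note that the transcript below only shows the INSERT; the target database and table must already exist. DDL along the following lines matches the columns the SELECT produces (a sketch inferred from the query, since the original session does not show this step):

hive> create database if not exists hive;
hive> use hive;
hive> create table if not exists hive_table (id int, name string, ver string, package string, path string);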

hive> insert overwrite table hive_table
    > select cast(split(t2.l, '[,]')[0] as int) as id,
    >        split(t2.l, '[,]')[1] as name,
    >        split(t2.l, '[,]')[2] as ver,
    >        split(t2.l, '[,]')[3] as package,
    >        split(t2.l, '[,]')[4] as path
    > from (select '1,hadoop,3.3.0,hadoop-3.3.0.tar.gz,/opt/hadoop:2,hive,3.2.1,apache-hive-3.1.2-bin.tar.gz,/opt/apache-hive-bin:3,mysql,8.0.20,mysql-server,/usr/local/mysql' as l) t1
    > lateral view explode(split(t1.l, '[:]')) t2 as l
    > where coalesce(t2.l, '') <> '';
Query ID = biqiangwang_20220921231025_a2d296c2-b74f-41f1-b01f-2fa771d915ea
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1663767802943_0001, Tracking URL = http://master:8088/proxy/application_1663767802943_0001/
Kill Command = /usr/local/hadoop/bin/mapred job  -kill job_1663767802943_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2022-09-21 23:13:40,683 Stage-1 map = 0%,  reduce = 0%
2022-09-21 23:14:28,980 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 8.73 sec
2022-09-21 23:15:29,204 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 8.73 sec
2022-09-21 23:16:29,379 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 8.73 sec
2022-09-21 23:17:30,281 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 8.73 sec
2022-09-21 23:18:30,455 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 8.73 sec
2022-09-21 23:19:10,298 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 12.7 sec
2022-09-21 23:19:25,624 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 20.23 sec
MapReduce Total cumulative CPU time: 20 seconds 230 msec
Ended Job = job_1663767802943_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://master:9000/user/hive/warehouse/hive.db/hive_table/.hive-staging_hive_2022-09-21_23-10-25_538_3474724099283121188-1/-ext-10000
Loading data to table hive.hive_table
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 20.23 sec   HDFS Read: 22172 HDFS Write: 1363 SUCCESS
Total MapReduce CPU Time Spent: 20 seconds 230 msec
OK
Time taken: 547.47 seconds

A bit slow... probably because too little memory was allocated; if your machine has room, give it more.

Check the data Hive wrote:

:~$ hdfs dfs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x   - username supergroup          0 2022-09-21 23:09 /user/hive/warehouse/hive.db
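
The rows can also be read back from the Hive CLI as a sanity check:

hive> select * from hive_table;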

Pitfall 1: error "Could not create directory /usr/local/hive/logs"

There are many possible causes: a bad configuration file, a username mismatch, and so on.

In this case, the cause was that the Hive files were assigned to root during extraction; changing the directory's owner with the chown command resolves it.

# switch to root
su root

# change the owner of the directory tree
chown -R username fullpath
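
For example, with the install location used throughout this post (substituting $USER for your account):

sudo chown -R $USER /usr/local/hive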
