A detailed Hive installation tutorial: Hadoop configuration for Hive, step-by-step Hive installation, detailed Hive configuration and basic usage, and migrating the Hive metastore to MySQL

Overview: this is a fairly detailed guide to installing Hive in a big-data environment. It covers the Hadoop configuration Hive needs, the Hive installation itself, the MySQL installation that follows, pointing Hive's metastore at MySQL, and common Hive property configuration.

Note: before installing Hive, make sure your Hadoop cluster is fully installed and configured.

Official Hive links

Hive homepage: https://hive.apache.org/
Hive downloads: https://hive.apache.org/downloads.html
Hive documentation: https://cwiki.apache.org/confluence/display/Hive
Hive on GitHub: https://github.com/apache/hive

Hadoop configuration before installing Hive

P.S.: if your Hadoop cluster is not set up yet, see my other blog post: a detailed walkthrough of building a fully distributed Hadoop cluster on a minimal Linux installation.

1. Add the following to Hadoop's core-site.xml. HiveServer2 impersonates the users that connect to it, so the lqs user must be allowed to act as a proxy:

<!-- Hosts from which the lqs user (as superuser) may submit proxied requests -->
<property>
        <name>hadoop.proxyuser.lqs.hosts</name>
        <value>*</value>
</property>
<!-- Groups whose members the lqs user (as superuser) may impersonate -->
<property>
        <name>hadoop.proxyuser.lqs.groups</name>
        <value>*</value>
</property>
<!-- Users that the lqs user (as superuser) may impersonate -->
<property>
        <name>hadoop.proxyuser.lqs.users</name>
        <value>*</value>
</property>

2. Add the following to Hadoop's yarn-site.xml

<!-- Memory the NodeManager may hand out to containers; default 8 GB, lowered here to 4 GB -->
<property>
	<description>Amount of physical memory, in MB, that can be allocated for containers. If set to -1 and yarn.nodemanager.resource.detect-hardware-capabilities is true, it is automatically calculated (in case of Windows and Linux). In other cases, the default is 8192 MB.</description>
	<name>yarn.nodemanager.resource.memory-mb</name>
	<value>4096</value>
</property>
<!-- Minimum container allocation; default 512 MB -->
<property>
	<description>The minimum allocation for every container request at the RM in MBs. Memory requests lower than this will be set to the value of this property. Additionally, a node manager that is configured to have less memory than this value will be shut down by the resource manager.</description>
	<name>yarn.scheduler.minimum-allocation-mb</name>
	<value>512</value>
</property>

<!-- Maximum container allocation; default 8 GB, lowered here to 4 GB -->
<property>
	<description>The maximum allocation for every container request at the RM in MBs. Memory requests higher than this will throw an InvalidResourceRequestException.</description>
	<name>yarn.scheduler.maximum-allocation-mb</name>
	<value>4096</value>
</property>

<!-- Virtual memory check; on by default, disabled here -->
<property>
	<description>Whether virtual memory limits will be enforced for containers.</description>
	<name>yarn.nodemanager.vmem-check-enabled</name>
	<value>false</value>
</property>


Note: after editing both files, remember to distribute them to every node and then restart the cluster.
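
A minimal sketch of the distribution step, assuming the other nodes are bdc113 and bdc114 and passwordless ssh is set up (adjust hostnames and paths to your cluster; if you have an xsync-style script, use that instead):

# copy the two changed files to the other nodes
[lqs@bdc112 hadoop]$ cd /home/lqs/module/hadoop-3.1.3/etc/hadoop
[lqs@bdc112 hadoop]$ for host in bdc113 bdc114; do scp core-site.xml yarn-site.xml $host:/home/lqs/module/hadoop-3.1.3/etc/hadoop/; done
# restart HDFS on the NameNode (bdc112) and YARN on the ResourceManager node (bdc113 here)
[lqs@bdc112 hadoop]$ stop-dfs.sh && start-dfs.sh
[lqs@bdc113 hadoop]$ stop-yarn.sh && start-yarn.sh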

Installing Hive

1. Extract apache-hive-3.1.2-bin.tar.gz into /home/lqs/module/

[lqs@bdc112 module]$ tar -zxvf apache-hive-3.1.2-bin.tar.gz -C /home/lqs/module/

2. Rename the extracted apache-hive-3.1.2-bin directory to hive

[lqs@bdc112 module]$ mv apache-hive-3.1.2-bin/ hive

3. Edit /etc/profile.d/my_env.sh and add Hive's environment variables

[lqs@bdc112 module]$ sudo vim /etc/profile.d/my_env.sh

Append the following:

#HIVE_HOME
export HIVE_HOME=/home/lqs/module/hive
export PATH=$PATH:$HIVE_HOME/bin
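
After saving, reload the environment and confirm that Hive resolves; a quick sanity check (version banner abbreviated; hive --version works even before the metastore is initialized):

[lqs@bdc112 module]$ source /etc/profile.d/my_env.sh
[lqs@bdc112 module]$ echo $HIVE_HOME
/home/lqs/module/hive
[lqs@bdc112 module]$ hive --version
Hive 3.1.2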

4. Resolve the log jar conflict (both Hadoop and Hive ship a log4j-slf4j binding; renaming Hive's copy silences the duplicate-binding warning)

[lqs@bdc112 module]$ cd hive/lib/
[lqs@bdc112 lib]$ pwd
/home/lqs/module/hive/lib
[lqs@bdc112 lib]$ mv log4j-slf4j-impl-2.10.0.jar log4j-slf4j-impl-2.10.0.jar.bak

5. Initialize the Hive metastore (embedded Derby for now; this creates a metastore_db directory and a derby.log file in the current directory)

[lqs@bdc112 hive]$ bin/schematool -dbType derby -initSchema

Starting and using Hive

1. Start Hive

[lqs@bdc112 hive]$ bin/hive

2. Use Hive

# List databases
hive> show databases;
OK
default
Time taken: 0.545 seconds, Fetched: 1 row(s)

# List tables
hive> show tables;
OK
Time taken: 0.048 seconds

# Create a table named test
hive> create table test(id int);
OK
Time taken: 0.627 seconds


# List tables again
hive> show tables;
OK
test
Time taken: 0.034 seconds, Fetched: 1 row(s)

# Insert the value 666 into table test
hive> insert into test values(666);
Query ID = lqs_20211219191533_25b60265-76a8-486f-9bbe-1cd0aa3d6fca
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1639912074286_0001, Tracking URL = http://bdc113:8088/proxy/application_1639912074286_0001/
Kill Command = /home/lqs/module/hadoop-3.1.3/bin/mapred job  -kill job_1639912074286_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-12-19 19:15:48,311 Stage-1 map = 0%,  reduce = 0%
2021-12-19 19:15:56,670 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 4.61 sec
2021-12-19 19:16:02,860 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.5 sec
MapReduce Total cumulative CPU time: 7 seconds 500 msec
Ended Job = job_1639912074286_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://bdc112:8020/user/hive/warehouse/test/.hive-staging_hive_2021-12-19_19-15-33_239_8332044782395566127-1/-ext-10000
Loading data to table default.test
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 7.5 sec   HDFS Read: 12690 HDFS Write: 205 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 500 msec
OK
Time taken: 31.221 seconds

# Query all rows from table test
hive> select * from test;
OK
666
Time taken: 0.177 seconds, Fetched: 1 row(s)
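
Hive stores table data as plain files under its HDFS warehouse directory, so the inserted row can also be inspected directly in HDFS; a quick check (listing output omitted):

[lqs@bdc112 hive]$ hadoop fs -ls /user/hive/warehouse/test
[lqs@bdc112 hive]$ hadoop fs -cat /user/hive/warehouse/test/*
666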

3. Open a second Xshell window and try to start Hive there, watching the hive.log file under /tmp/lqs. The second session fails, because the embedded Derby metastore allows only one client connection at a time; that limitation is exactly why we move the metastore to MySQL below. Before switching, clean up the Derby metadata and the test data created above:

[lqs@bdc112 hive]$ rm -rf derby.log metastore_db
[lqs@bdc112 hive]$ hadoop fs -rm -r /user

Installing MySQL

1. Check whether MySQL (or mariadb) is already installed on the system

[lqs@bdc112 hive]$ rpm -qa | grep mariadb
mariadb-libs-5.5.56-2.el7.x86_64
# If it is present, remove it with:
[lqs@bdc112 hive]$ sudo rpm -e --nodeps mariadb-libs
# Check again: it is gone now
[lqs@bdc112 hive]$ rpm -qa | grep mariadb
[lqs@bdc112 hive]$ 

2. Extract the MySQL bundle

[lqs@bdc112 software]$ mkdir mysql-lib-sources

[lqs@bdc112 software]$ tar -xf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar -C mysql-lib-sources

3. Install the rpm packages from inside the mysql-lib-sources directory

Note: install the packages in exactly the order shown below!

[lqs@bdc112 mysql-lib-sources]$ sudo rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-common-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-common-5.7.28-1.e################################# [100%]
[lqs@bdc112 mysql-lib-sources]$ sudo rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-libs-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-5.7.28-1.el7################################# [100%]
[lqs@bdc112 mysql-lib-sources]$ sudo rpm -ivh mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-libs-compat-5.7.2################################# [100%]
[lqs@bdc112 mysql-lib-sources]$ sudo rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-client-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-client-5.7.28-1.e################################# [100%]
[lqs@bdc112 mysql-lib-sources]$ sudo rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-server-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
error: Failed dependencies:
	libaio.so.1()(64bit) is needed by mysql-community-server-5.7.28-1.el7.x86_64
	libaio.so.1(LIBAIO_0.1)(64bit) is needed by mysql-community-server-5.7.28-1.el7.x86_64
	libaio.so.1(LIBAIO_0.4)(64bit) is needed by mysql-community-server-5.7.28-1.el7.x86_64

# The error occurs because this is a minimal Linux install; installing the missing libaio dependency fixes it
[lqs@bdc112 mysql-lib-sources]$ sudo yum install -y libaio
Loaded plugins: fastestmirror
Determining fastest mirrors
 * base: mirrors.aliyun.com
 * epel: mirror.sjtu.edu.cn
 * extras: mirrors.aliyun.com
 * updates: mirrors.aliyun.com
# (a couple of epel mirrors returned 404 here and yum retried others; harmless noise, trimmed)
Resolving Dependencies
--> Running transaction check
---> Package libaio.x86_64 0:0.3.109-13.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved

===========================================================================================================
 Package                Arch                   Version                          Repository            Size
===========================================================================================================
Installing:
 libaio                 x86_64                 0.3.109-13.el7                   base                  24 k

Transaction Summary
===========================================================================================================
Install  1 Package

Total download size: 24 k
Installed size: 38 k
Downloading packages:
libaio-0.3.109-13.el7.x86_64.rpm                                                    |  24 kB  00:00:00     
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Warning: RPMDB altered outside of yum.
  Installing : libaio-0.3.109-13.el7.x86_64                                                           1/1 
  Verifying  : libaio-0.3.109-13.el7.x86_64                                                           1/1 

Installed:
  libaio.x86_64 0:0.3.109-13.el7                                                                           

Complete!
[lqs@bdc112 mysql-lib-sources]$ sudo rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm 
warning: mysql-community-server-5.7.28-1.el7.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 5072e1f5: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:mysql-community-server-5.7.28-1.e################################# [100%]
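
Before moving on, it is worth confirming that all five packages are now installed (listing order may vary):

[lqs@bdc112 mysql-lib-sources]$ rpm -qa | grep mysql-community
mysql-community-common-5.7.28-1.el7.x86_64
mysql-community-libs-5.7.28-1.el7.x86_64
mysql-community-libs-compat-5.7.28-1.el7.x86_64
mysql-community-client-5.7.28-1.el7.x86_64
mysql-community-server-5.7.28-1.el7.x86_64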

4. Delete everything in the directory that datadir in /etc/my.cnf points to (only needed if it has contents).
First check the value of datadir:

[lqs@bdc112 mysql-lib-sources]$ cat /etc/my.cnf
# For advice on how to change settings please see
# http://dev.mysql.com/doc/refman/5.7/en/server-configuration-defaults.html

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock

# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0

log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

Delete all contents of /var/lib/mysql (be careful to run this only inside that directory):

[lqs@bdc112 mysql-lib-sources]$ cd /var/lib/mysql
[lqs@bdc112 mysql]$ sudo rm -rf *

5. Initialize the database

[lqs@bdc112 ~]$ sudo mysqld --initialize --user=mysql

6. Look up the temporary password generated for the root user

[lqs@bdc112 ~]$ sudo cat /var/log/mysqld.log
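
The password sits on the log line containing "A temporary password is generated"; grepping is quicker than paging through the whole log (timestamp and password will of course differ on your machine):

[lqs@bdc112 ~]$ sudo grep 'temporary password' /var/log/mysqld.log
2021-12-19T... [Note] A temporary password is generated for root@localhost: <password>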


7. Start the MySQL service

[lqs@bdc112 ~]$ sudo systemctl start mysqld
# Check the status of the mysql service
[lqs@bdc112 ~]$ sudo systemctl status mysqld
● mysqld.service - MySQL Server
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2021-12-19 20:26:52 CST; 8s ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 5292 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
  Process: 5274 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 5295 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─5295 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

Dec 19 20:26:51 bdc112 systemd[1]: Starting MySQL Server...
Dec 19 20:26:52 bdc112 systemd[1]: Started MySQL Server.

8. Log in to MySQL

[lqs@bdc112 ~]$ mysql -u root -p
Enter password:    # enter the temporary password from the log
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.28

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

9. The root password must be changed first; until you do, any other statement will fail

mysql> set password = password("912811");
Query OK, 0 rows affected, 1 warning (0.00 sec)

10. Update the user table in the mysql database so that root may connect from any host

mysql> update mysql.user set host='%' where user='root';
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

Configuring the Hive metastore in MySQL

Copy the JDBC driver

# Copy the MySQL JDBC driver into Hive's lib directory
[lqs@bdc112 lib]$ cp /home/lqs/software/mysql-connector-java-5.1.37.jar ./
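
A quick check that the driver landed where Hive will look for it:

[lqs@bdc112 lib]$ ls /home/lqs/module/hive/lib | grep mysql-connector
mysql-connector-java-5.1.37.jar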

Pointing the metastore at MySQL

1. Create a new hive-site.xml file under $HIVE_HOME/conf

[lqs@bdc112 conf]$ vim hive-site.xml

Add the following contents:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- JDBC connection URL -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://bdc112:3306/metastore?useSSL=false</value>
    </property>

    <!-- JDBC driver class -->
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>

    <!-- JDBC username -->
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>

    <!-- JDBC password -->
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>912811</value>
    </property>

    <!-- Hive's default warehouse directory on HDFS -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
</configuration>

2. Log in to MySQL

[lqs@bdc112 conf]$ mysql -u root -p

3. Create the Hive metastore database

mysql> create database metastore;
Query OK, 1 row affected (0.00 sec)

mysql> quit;

4. Initialize the Hive metastore schema

[lqs@bdc112 conf]$ schematool -initSchema -dbType mysql -verbose

If the run ends with lines like "Initialization script completed" and "schemaTool completed", the initialization succeeded.
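
Initialization populates the metastore database with Hive's schema tables; you can confirm from MySQL that they exist (the list should include tables such as DBS, TBLS, and VERSION):

[lqs@bdc112 conf]$ mysql -uroot -p912811 -e "show tables from metastore;"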

Starting Hive again

1. Start Hive

[lqs@bdc112 hive]$ bin/hive

2. Use Hive

hive> show databases;
OK
default
Time taken: 0.578 seconds, Fetched: 1 row(s)
hive> show tables;
OK
Time taken: 0.045 seconds
hive> create table test (id int);
OK
Time taken: 0.563 seconds
hive> show tables;
OK
test
Time taken: 0.048 seconds, Fetched: 1 row(s)

# Insert the value 1 into table test
hive> insert into test values(1);
Query ID = lqs_20211219204400_e606cdc4-700f-4811-99af-20cbfcbd2d86
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1639912074286_0002, Tracking URL = http://bdc113:8088/proxy/application_1639912074286_0002/
Kill Command = /home/lqs/module/hadoop-3.1.3/bin/mapred job  -kill job_1639912074286_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-12-19 20:44:13,205 Stage-1 map = 0%,  reduce = 0%
2021-12-19 20:44:24,640 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 12.23 sec
2021-12-19 20:44:29,823 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 14.57 sec
MapReduce Total cumulative CPU time: 14 seconds 570 msec
Ended Job = job_1639912074286_0002
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://bdc112:8020/user/hive/warehouse/test/.hive-staging_hive_2021-12-19_20-44-00_691_7023649240777723976-1/-ext-10000
Loading data to table default.test
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 14.57 sec   HDFS Read: 12686 HDFS Write: 199 SUCCESS
Total MapReduce CPU Time Spent: 14 seconds 570 msec
OK
Time taken: 30.778 seconds

# Query the row just inserted
hive> select * from test;
OK
1
Time taken: 0.19 seconds, Fetched: 1 row(s)

Accessing Hive through the metastore service

1. Add the following to hive-site.xml

<!-- Address clients use to reach the metastore service -->
<property>
    <name>hive.metastore.uris</name>
    <value>thrift://bdc112:9083</value>
</property>

2. Start the metastore service

[lqs@bdc112 hive]$ hive --service metastore

Note: this runs in the foreground, so the window can no longer be used; open a new shell window for everything else.

3. Start Hive

[lqs@bdc112 hive]$ bin/hive

Accessing Hive over JDBC

1. Add the following to hive-site.xml

<!-- Host that HiveServer2 binds to -->
<property>
    <name>hive.server2.thrift.bind.host</name>
    <value>bdc112</value>
</property>

<!-- Port that HiveServer2 listens on -->
<property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
</property>
<!-- HiveServer2 active/passive high-availability switch; enabling it also speeds up HiveServer2 startup -->
<property>
    <name>hive.server2.active.passive.ha.enable</name>
    <value>true</value>
</property>

2. Start HiveServer2

[lqs@bdc112 hive]$ bin/hive --service hiveserver2

3. Start the beeline client

[lqs@bdc112 hive]$ bin/beeline -u jdbc:hive2://bdc112:10000 -n lqs

If you hit an error here like:

Connecting to jdbc:hive2://bdc112:10000
21/12/19 21:05:23 [main]: WARN jdbc.HiveConnection: Failed to connect to bdc112:10000
Could not open connection to the HS2 server. Please check the server URI and if the URI is correct, then ask the administrator to check the server status.
Error: Could not open client transport with JDBC Uri: jdbc:hive2://bdc112:10000: java.net.ConnectException: Connection refused (state=08S01,code=0)
Beeline version 3.1.2 by Apache Hive

see my other blog post about this error for details. In most cases HiveServer2 simply has not finished starting yet (it can take a minute or more after launch), so wait a bit and retry; also verify that the hadoop.proxyuser settings added to core-site.xml earlier were distributed and the cluster restarted.

Once the connection succeeds, beeline leaves you at its prompt and the configuration is working.
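
For reference, a successful connection looks roughly like this (banner abbreviated; exact driver lines may differ):

Connecting to jdbc:hive2://bdc112:10000
Connected to: Apache Hive (version 3.1.2)
Driver: Hive JDBC (version 3.1.2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
Beeline version 3.1.2 by Apache Hive
0: jdbc:hive2://bdc112:10000>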

Writing a start/stop script for the Hive services

1. What the pieces of the command mean:

(1) Starting the services in the foreground ties up one shell window each; they can instead be launched in the background as follows:
nohup: placed at the start of a command, short for "no hang-up": the process keeps running even after the terminal that started it closes
/dev/null: a Linux file known as the black hole; everything written to it is silently discarded
2>&1: redirects stderr to stdout
&: placed at the end of a command, runs it in the background
These are usually combined as nohup <command> > file 2>&1 &, which writes the command's output to file and keeps the launched process running in the background.
(You don't have to memorize these.)

[lqs@bdc112 hive]$ nohup hive --service metastore 2>&1 &
[lqs@bdc112 hive]$ nohup hive --service hiveserver2 2>&1 &

2. Write a script to manage starting and stopping the Hive services

[lqs@bdc112 conf]$ cd /home/lqs/bin/
[lqs@bdc112 bin]$ touch myhiveservices.sh
[lqs@bdc112 bin]$ vim myhiveservices.sh

Then add the following:

#!/bin/bash
HIVE_LOG_DIR=$HIVE_HOME/logs
if [ ! -d $HIVE_LOG_DIR ]
then
	mkdir -p $HIVE_LOG_DIR
fi

# Check whether a process is healthy: $1 is the process name, $2 its listening port.
# Prints the pid found by name; returns 0 only when that pid also owns the port.
function check_process()
{
    pid=$(ps -ef 2>/dev/null | grep -v grep | grep -i $1 | awk '{print $2}')
    ppid=$(netstat -nltp 2>/dev/null | grep $2 | awk '{print $7}' | cut -d '/' -f 1)
    echo $pid
    [[ "$pid" =~ "$ppid" ]] && [ "$ppid" ] && return 0 || return 1
}

function hive_start()
{
    metapid=$(check_process HiveMetastore 9083)
    # 2>&1 merges stderr into the log file together with stdout
    cmd="nohup hive --service metastore >$HIVE_LOG_DIR/metastore.log 2>&1 &"
    [ -z "$metapid" ] && eval $cmd || echo "Metastore is already running"
    server2pid=$(check_process HiveServer2 10000)
    cmd="nohup hive --service hiveserver2 >$HIVE_LOG_DIR/hiveServer2.log 2>&1 &"
    [ -z "$server2pid" ] && eval $cmd || echo "HiveServer2 is already running"
}

function hive_stop()
{
    metapid=$(check_process HiveMetastore 9083)
    [ "$metapid" ] && kill $metapid || echo "Metastore is not running"
    server2pid=$(check_process HiveServer2 10000)
    [ "$server2pid" ] && kill $server2pid || echo "HiveServer2 is not running"
}

case $1 in
    "start")
        hive_start
        ;;
    "stop")
        hive_stop
        ;;
    "restart")
        hive_stop
        sleep 2
        hive_start
        ;;
    "status")
        check_process HiveMetastore 9083 >/dev/null && echo "Metastore is running normally" || echo "Metastore is not running properly"
        check_process HiveServer2 10000 >/dev/null && echo "HiveServer2 is running normally" || echo "HiveServer2 is not running properly"
        ;;
    *)
        echo Invalid Args!
        echo 'Usage: '$(basename $0)' start|stop|restart|status'
        ;;
esac

3. Make the script executable

[lqs@bdc112 bin]$ chmod +x myhiveservices.sh

4. Start the Hive services in the background

[lqs@bdc112 bin]$ myhiveservices.sh start
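
Both services take a little while to come up (HiveServer2 in particular can need a minute before it passes the port check); the script's status subcommand reports on them, printing the messages defined above:

[lqs@bdc112 bin]$ myhiveservices.sh status
Metastore is running normally
HiveServer2 is running normally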

Common Hive interactive commands

1. View Hive's command-line options

[lqs@bdc112 hive]$ bin/hive -help
which: no hbase in (/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/lqs/module/jdk1.8.0_212/bin:/home/lqs/module/hadoop-3.1.3/bin:/home/lqs/module/hadoop-3.1.3/sbin:/home/lqs/module/apache-maven-3.6.3/bin:/home/lqs/module/zookeeper-3.5.7/bin:/home/lqs/module/hive/bin:/home/lqs/.local/bin:/home/lqs/bin)
Hive Session ID = 86edfbff-6711-4ea8-b5b0-c27900d604bc
usage: hive
 -d,--define <key=value>          Variable substitution to apply to Hive
                                  commands. e.g. -d A=B or --define A=B
    --database <databasename>     Specify the database to use
 -e <quoted-query-string>         SQL from command line
 -f <filename>                    SQL from files
 -H,--help                        Print help information
    --hiveconf <property=value>   Use value for given property
    --hivevar <key=value>         Variable substitution to apply to Hive
                                  commands. e.g. --hivevar A=B
 -i <filename>                    Initialization SQL file
 -S,--silent                      Silent mode in interactive shell
 -v,--verbose                     Verbose mode (echo executed SQL to the
                                  console)

2. From the Hive CLI, create a table user1 and insert one row

hive> use default;
hive> create table user1(id int,name string);
hive> insert into table user1 values(666,"法外狂徒");
Query ID = lqs_20211220115945_ccf3809c-d9a5-440b-8916-3b76f4d2d659
Total jobs = 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1639965104167_0001, Tracking URL = http://bdc113:8088/proxy/application_1639965104167_0001/
Kill Command = /home/lqs/module/hadoop-3.1.3/bin/mapred job  -kill job_1639965104167_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2021-12-20 12:00:01,118 Stage-1 map = 0%,  reduce = 0%
2021-12-20 12:00:09,606 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.77 sec
2021-12-20 12:00:25,035 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 24.43 sec
MapReduce Total cumulative CPU time: 24 seconds 430 msec
Ended Job = job_1639965104167_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://bdc112:8020/user/hive/warehouse/user1/.hive-staging_hive_2021-12-20_11-59-45_450_3726745333681390326-1/-ext-10000
Loading data to table default.user1
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 24.43 sec   HDFS Read: 15211 HDFS Write: 253 SUCCESS
Total MapReduce CPU Time Spent: 24 seconds 430 msec
OK
Time taken: 43.157 seconds

3. "-e": execute SQL from the command line without entering the interactive shell

[lqs@bdc112 hive]$ bin/hive -e "select * from user1;"

4. "-f": execute the SQL statements in a script file
(1) Create a data directory under /home/lqs/module/hive/, and a hivetest.sql file inside it

[lqs@bdc112 hive]$ vim data/hivetest.sql

Write the following into hivetest.sql:

select * from user1;

(2) Execute the SQL in the file

[lqs@bdc112 hive]$ bin/hive -f /home/lqs/module/hive/data/hivetest.sql

(3) Execute the SQL in the file and write the results to a file

[lqs@bdc112 hive]$ bin/hive -f /home/lqs/module/hive/data/hivetest.sql > /home/lqs/module/hive/data/hivetest_result.txt


Other Hive commands

1. Exit the Hive shell

# In recent Hive versions the two are identical; in older versions:
# exit: implicitly commits data first, then exits
# quit: exits without committing
hive> exit;
hive> quit;

2. Browse the HDFS file system from inside the Hive CLI

hive> dfs -ls /;
Found 2 items
drwxrwx---   - lqs supergroup          0 2021-12-20 11:59 /tmp
drwxr-xr-x   - lqs supergroup          0 2021-12-20 11:59 /user

3. View the history of all commands typed into Hive

# Remember to change into your own home directory first
[lqs@bdc112 hive]$ cd /home/lqs/
[lqs@bdc112 ~]$ cat .hivehistory 
use default
;
create table user(id int,name string);
create table user1(id int,name string);
insert into table user1 values(666,"法外狂徒");
exit
;
dfs -ls /;
exit
;

4. View the history of all commands typed into beeline

# Remember to change into your own home directory first
[lqs@bdc112 hive]$ cd /home/lqs/
[lqs@bdc112 ~]$ cat .beeline/history


Common Hive property configuration (stop both Hive services before changing configuration)

Configuring Hive's run log

Note: be sure to stop all your Hive processes before making these changes!

1. By default Hive writes its log to /tmp/<current user>/hive.log (here, /tmp/lqs/hive.log)
2. To move the log to /home/lqs/module/hive/logs:
Steps:
Step 1: rename $HIVE_HOME/conf/hive-log4j2.properties.template to hive-log4j2.properties

[lqs@bdc112 conf]$ pwd
/home/lqs/module/hive/conf
[lqs@bdc112 conf]$ mv hive-log4j2.properties.template hive-log4j2.properties

Step 2: edit hive-log4j2.properties and change the log location

[lqs@bdc112 conf]$ vim hive-log4j2.properties

Change the log directory key (in Hive 3.1.2 it is property.hive.log.dir):

property.hive.log.dir = /home/lqs/module/hive/logs

After the change, restarting Hive writes hive.log under the new directory.

JVM heap size for Hive

Newer Hive versions start with a default JVM heap of only 256 MB. That is too small: with local mode enabled, complex SQL frequently fails with java.lang.OutOfMemoryError: Java heap space, so it is best to raise the HADOOP_HEAPSIZE parameter ahead of time.

1. Rename hive-env.sh.template under $HIVE_HOME/conf to hive-env.sh

[lqs@bdc112 conf]$ pwd
/home/lqs/module/hive/conf
[lqs@bdc112 conf]$ mv hive-env.sh.template hive-env.sh

2. In hive-env.sh, uncomment the line export HADOOP_HEAPSIZE=1024, as sketched below.
Note: remember to restart Hive afterwards.
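
The edit itself is a one-liner; remove the leading # so the template line

# export HADOOP_HEAPSIZE=1024

becomes

export HADOOP_HEAPSIZE=1024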

Note:
	If yarn-site.xml does not yet contain the virtual-memory-check setting, be sure to add it now; if you already added it earlier, there is nothing to do.
[lqs@bdc112 conf]$ vim /home/lqs/module/hadoop-3.1.3/etc/hadoop/yarn-site.xml

If it is missing, add the following:

<!-- Virtual memory check; on by default, disabled here -->
<property>
        <description>Whether virtual memory limits will be enforced for containers.</description>
        <name>yarn.nodemanager.vmem-check-enabled</name>
        <value>false</value>
</property>


Printing the current database and column headers

Add the following two settings to hive-site.xml:

<!-- Print column headers in query results -->
<property>
    <name>hive.cli.print.header</name>
    <value>true</value>
</property>
<!-- Show the current database in the prompt -->
<property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
</property>
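
With both options enabled, the CLI prompt carries the current database and query results include table.column headers, roughly like this:

hive (default)> select * from test;
OK
test.id
666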

Ways to set parameters

1. View all current configuration settings
hive (default)> set;
2. Three ways to set a parameter
(1) Configuration files

Default configuration file: hive-default.xml
User-defined configuration file: hive-site.xml
Note: user-defined settings override the defaults. Hive also reads in Hadoop's configuration, since Hive starts as a Hadoop client, and Hive's settings override Hadoop's. Settings made in configuration files apply to every Hive process started on this machine.

(2) Command-line parameters
When starting Hive, add -hiveconf param=value on the command line to set a parameter.
For example:

[lqs@bdc112 hive]$ bin/hive -hiveconf mapreduce.job.reduces=12


or

[lqs@bdc112 hive]$ bin/beeline -u jdbc:hive2://bdc112:10000 -n lqs -hiveconf mapreduce.job.reduces=12


Note: this applies only to the current Hive session.
To check the setting:

hive (default)> set mapreduce.job.reduces;

(3) SET statements
Parameters can be set inside HQL with the SET keyword.
For example:

hive (default)> set mapreduce.job.reduces=12;

Note: this also applies only to the current Hive session.
To check the setting:

hive (default)> set mapreduce.job.reduces;
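
Querying a parameter with set (and no value) prints its current setting, for example:

hive (default)> set mapreduce.job.reduces;
mapreduce.job.reduces=12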

Note:
The three methods above take effect in order of increasing priority: configuration files < command-line parameters < SET statements. Some system-level parameters, such as the log4j settings, must be set with the first two methods, because they are read before the session is established.
