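Original article: https://blog.csdn.net/hawkzy/article/details/86472449
1 Preparation
Install Hadoop 3.1.1
Install JDK 1.8 or later
Hadoop already starts normally, with no exception or error messages during startup
Download Hive 3.1.0: http://mirror.bit.edu.cn/apache/hive/
The download should be a tar archive: apache-hive-3.1.0-bin.tar.gz
Find an older Hive source package whose bin folder contains hive.cmd and the related files
This is fairly hard to find. I used apache-hive-0.14.0-src.tar.gz, a very old version.
Download mariadb-10.3.10 (or MySQL), plus the MySQL JDBC driver: mysql-connector-java-8.0.12.jar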
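2 Installing and configuring Hive
Extract apache-hive-3.1.0-bin.tar.gz to D:\hive-3.1.0
Configure the Hive environment variables:
[Image: configuring the Hive environment variables]
[Image: Hive path]
Back up the D:\hive-3.1.0\bin folder. Extract apache-hive-0.14.0-src.tar.gz, then copy all the cmd files from its bin directory and the ext subdirectory into D:\hive-3.1.0\bin and the corresponding subfolders.
Under D:\hive-3.1.0\conf\, create a new hive-site.xml file with the following content: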
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- The following properties configure MySQL as the metastore storage -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<!-- "hive" below is the name of the database created later in MariaDB -->
<value>jdbc:mysql://localhost:3306/hive?serverTimezone=UTC</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
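<!-- Connector/J 8.x driver class; the legacy name com.mysql.jdbc.Driver still loads but logs a deprecation warning -->
<value>com.mysql.cj.jdbc.Driver</value>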
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<!-- Database user name -->
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<!-- Database password -->
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.exec.local.scratchdir</name>
<value>/hive-3.1.0/iotmp/scratch_dir</value>
<description>Local scratch space for Hive jobs</description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/hive-3.1.0/iotmp/resources_dir/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<property>
<name>hive.querylog.location</name>
<value>/hive-3.1.0/iotmp/querylog_dir</value>
<description>Location of Hive run time structured log file</description>
</property>
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/hive-3.1.0/iotmp/operation_dir</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
</configuration>
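To double-check which database the metastore will connect to, the configuration above can be read back programmatically. The following is only a minimal sketch: the helper names read_properties and metastore_target are my own, and the embedded sample is a trimmed copy of the file above rather than the real file on disk.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Trimmed copy of the hive-site.xml above (assumption: the real file would be
# loaded with ET.parse(r"D:\hive-3.1.0\conf\hive-site.xml") instead).
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive?serverTimezone=UTC</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
</configuration>"""


def read_properties(xml_text):
    """Return {name: value} for every <property> element in the config."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def metastore_target(props):
    """Split the JDBC URL into (host, port, database name)."""
    url = props["javax.jdo.option.ConnectionURL"]
    parts = urlsplit(url[len("jdbc:"):])  # drop the "jdbc:" prefix first
    return parts.hostname, parts.port, parts.path.lstrip("/")


props = read_properties(SAMPLE_HIVE_SITE)
host, port, database = metastore_target(props)
print(host, port, database, props["javax.jdo.option.ConnectionUserName"])
# prints: localhost 3306 hive root
```

The database name, user, and port printed here are the ones the MariaDB setup later in this article has to match.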
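Note that the absolute path /hive-3.1.0 above is my install directory on drive D. A leading "/" resolves to the root of the drive of the current working environment. Because I run Hive from drive D, writing the paths this way works; adjust them to suit your own setup.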
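The drive-root behaviour described here can be illustrated with Python's pathlib, which models Windows path rules. This is only a sketch of the path semantics, not of how Hive itself resolves the setting, and resolve_on_drive is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def resolve_on_drive(working_dir, configured):
    """Join a rooted path such as "/hive-3.1.0/..." onto a working directory.

    Under Windows path rules a leading "/" replaces the directory part
    but keeps the drive letter of the working directory.
    """
    return PureWindowsPath(working_dir) / configured


# Running from drive D: the scratch directory lands on D:
print(resolve_on_drive(r"D:\hive-3.1.0\bin", "/hive-3.1.0/iotmp/scratch_dir"))
# prints: D:\hive-3.1.0\iotmp\scratch_dir

# Running from drive C: the same setting would point at C: instead
print(resolve_on_drive(r"C:\Users\me", "/hive-3.1.0/iotmp/scratch_dir"))
# prints: C:\hive-3.1.0\iotmp\scratch_dir
```

If Hive may be launched from different drives, writing the drive letter explicitly in hive-site.xml avoids this ambiguity.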
Modify the hive-log4j2.properties file
Make a copy of hive-log4j2.properties.template, rename it to hive-log4j2.properties, and change its content as follows:
status = INFO
name = HiveLog4j2
packages = org.apache.hadoop.hive.ql.log
# list of properties
property.hive.log.level = INFO
property.hive.root.logger = DRFA
property.hive.log.dir = hive_log
property.hive.log.file = hive.log
property.hive.perflogger.log.level = INFO
# list of all appenders
appenders = console, DRFA
# console appender
appender.console.type = Console
appender.console.name = console
appender.console.target = SYSTEM_ERR
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m%n
# daily rolling file appender
appender.DRFA.type = RollingRandomAccessFile
appender.DRFA.name = DRFA
appender.DRFA.fileName = ${hive.log.dir}/${hive.log.file}
# Use %pid in the filePattern to append <process-id>@<host-name> to the filename if you want separate log files for different CLI session
appender.DRFA.filePattern = ${hive.log.dir}/${hive.log.file}.%d{yyyy-MM-dd}
appender.DRFA.layout.type = PatternLayout
appender.DRFA.layout.pattern = %d{ISO8601} %5p [%t] %c{2}: %m%n
appender.DRFA.policies.type = Policies
appender.DRFA.policies.time.type = TimeBasedTriggeringPolicy
appender.DRFA.policies.time.interval = 1
appender.DRFA.policies.time.modulate = true
appender.DRFA.strategy.type = DefaultRolloverStrategy
appender.DRFA.strategy.max = 30
# list of all loggers
loggers = NIOServerCnxn, ClientCnxnSocketNIO, DataNucleus, Datastore, JPOX, PerfLogger, AmazonAws, ApacheHttp
logger.NIOServerCnxn.name = org.apache.zookeeper.server.NIOServerCnxn
logger.NIOServerCnxn.level = WARN
logger.ClientCnxnSocketNIO.name = org.apache.zookeeper.ClientCnxnSocketNIO
logger.ClientCnxnSocketNIO.level = WARN
logger.DataNucleus.name = DataNucleus
logger.DataNucleus.level = ERROR
logger.Datastore.name = Datastore
logger.Datastore.level = ERROR
logger.JPOX.name = JPOX
logger.JPOX.level = ERROR
logger.AmazonAws.name=com.amazonaws
logger.AmazonAws.level = INFO
logger.ApacheHttp.name=org.apache.http
logger.ApacheHttp.level = INFO
logger.PerfLogger.name = org.apache.hadoop.hive.ql.log.PerfLogger
logger.PerfLogger.level = ${hive.perflogger.log.level}
# root logger
rootLogger.level = ${hive.log.level}
rootLogger.appenderRefs = root
rootLogger.appenderRef.root.ref = ${hive.root.logger}
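In this file, entries of the form property.NAME define values that other entries reference as ${NAME}. The sketch below mimics that substitution for a three-line excerpt, to show where the log file ends up. It is a simplified illustration: parse_properties and resolve are my own helper names, and real Log4j 2 supports further lookup sources (system properties, environment variables) that this ignores.

```python
import re

# Three-line excerpt of the properties file above.
SAMPLE = """\
property.hive.log.dir = hive_log
property.hive.log.file = hive.log
appender.DRFA.fileName = ${hive.log.dir}/${hive.log.file}
"""


def parse_properties(text):
    """Parse key = value lines, skipping blanks and # comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


def resolve(value, entries):
    """Replace each ${name} with the value of 'property.name' when defined."""
    def lookup(match):
        # Unknown names are left untouched rather than replaced.
        return entries.get("property." + match.group(1), match.group(0))
    return re.sub(r"\$\{([^}]+)\}", lookup, value)


entries = parse_properties(SAMPLE)
print(resolve(entries["appender.DRFA.fileName"], entries))
# prints: hive_log/hive.log
```

Since hive_log is a relative path, the log directory is created relative to wherever Hive is started from.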
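Put the MySQL JDBC driver (mysql-connector-java-8.0.12.jar) into the D:\hive-3.1.0\lib folder.
3 Installing and configuring MariaDB
Extract MariaDB to D:\mariadb-10.3.10 and configure the system environment variables:
[Image: mysql home]
[Image: mysql path]
Initialize the configuration:
Note that the database name hive, the user name root, and the password 123456 used here and in the later steps must match the hive-site.xml configuration above.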
D:\mariadb-10.3.10\bin> .\mysql_install_db.exe --datadir=D:\mariadb-10.3.10\db --service=MyDB --password=123456
Running bootstrap
2018-10-18 19:49:44 0 [Note] D:\mariadb-10.3.10\bin\mysqld.exe (mysqld 10.3.10-MariaDB) starting as process 17000 ...
Removing default user
Setting root password
Creating my.ini file
Registering service 'MyDB'
Creation of the database was successful
If you see "FATAL ERROR: OpenSCManager failed", the command needs to be run as administrator.
Configure the database service:
D:\mariadb-10.3.10\bin> .\mysqld.exe --install MariaDB
Service successfully installed.
Start the service:
D:\mariadb-10.3.10\bin> net start MariaDB
The MariaDB service is starting.
The MariaDB service was started successfully.
Log in with an empty password; the login succeeds:
D:\mariadb-10.3.10\bin> .\mysql -uroot
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 9
Server version: 10.3.10-MariaDB mariadb.org binary distribution
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
Switch to the mysql database:
MariaDB [(none)]> use mysql;
Database changed
Change the password:
MariaDB [mysql]> update user set password=password("123456") where user="root";
Query OK, 3 rows affected (0.001 sec)
Rows matched: 3 Changed: 3 Warnings: 0
Flush privileges so the change takes effect immediately:
MariaDB [mysql]> flush privileges;
Query OK, 0 rows affected (0.000 sec)
Exit:
MariaDB [mysql]> quit;
Bye
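4 Initializing the Hive database in MariaDB
Create the database
The hive-site.xml edited during the Hive configuration contains the database name, user name, and password. Because it is set up to connect to MySQL, the database hive now has to be created there. After logging in to MariaDB, run:
create database hive;
Set the character set: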
MariaDB [(none)]> alter database hive default character set latin1;
Query OK, 1 row affected (0.001 sec)
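Run the schema creation script:
The Hive directory D:\hive-3.1.0\scripts\metastore\upgrade\mysql\ contains a README file that explains in detail how to upgrade the Hive metastore schema in MySQL to the latest version. For a first-time installation it is much simpler: just run one SQL script in MySQL. Make sure the Hive version in the script name matches the version actually installed: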
MariaDB [hive]> SOURCE D:/hive-3.1.0/scripts/metastore/upgrade/mysql/hive-schema-3.1.0.mysql.sql;
Query OK, 0 rows affected (0.026 sec)
... ....
The result after creation:
After running use hive:
MariaDB [hive]> show databases;
+--------------------+
| Database |
+--------------------+
| hive |
| information_schema |
| mysql |
| performance_schema |
| test |
+--------------------+
5 rows in set (0.029 sec)
MariaDB [hive]> show tables;
+-------------------------------+
| Tables_in_hive |
+-------------------------------+
| aux_table |
| bucketing_cols |
| cds |
... ...
| wm_trigger |
| write_set |
+-------------------------------+
74 rows in set (0.001 sec)
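5 Configuring the Hive warehouse in Hadoop
Following the official guide (https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-InstallingHivefromaStableRelease), set up the Hive warehouse in Hadoop. If you get "No such file or directory", create the directories one level at a time (hadoop fs -mkdir -p also creates missing parent directories in a single step):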
D:\hadoop-3.1.1\bin>hadoop fs -mkdir /tmp
D:\hadoop-3.1.1\bin>hadoop fs -mkdir /user/hive
D:\hadoop-3.1.1\bin>hadoop fs -mkdir /user/hive/warehouse
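Creating directories "one level at a time" means every parent must exist before its child. As a rough sketch, the snippet below lists the mkdir commands for a target path from the top level down. It only prints the commands and does not run them, and mkdir_commands is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def mkdir_commands(target):
    """Return one 'hadoop fs -mkdir' command per path level, parents first."""
    path = PurePosixPath(target)
    # parents is ordered nearest-first, so reverse it and skip the root "/".
    parents = [p for p in list(path.parents)[::-1] if str(p) != "/"]
    return [f"hadoop fs -mkdir {p}" for p in parents + [path]]


for command in mkdir_commands("/user/hive/warehouse"):
    print(command)
# prints:
# hadoop fs -mkdir /user
# hadoop fs -mkdir /user/hive
# hadoop fs -mkdir /user/hive/warehouse
```

Levels that already exist in HDFS can simply be skipped when running the printed commands.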
Once they are created, you can list them:
D:\hadoop-3.1.1\bin>hadoop fs -ls /
Found 3 items
drwxr-xr-x - hawkzy supergroup 0 2018-10-17 10:04 /input
drwxr-xr-x - hawkzy supergroup 0 2018-10-19 14:15 /tmp
drwxr-xr-x - hawkzy supergroup 0 2018-10-19 14:17 /user
D:\hadoop-3.1.1\bin>hadoop fs -ls /user/hive
Found 1 items
drwxr-xr-x - hawkzy supergroup 0 2018-10-19 14:17 /user/hive/warehouse
Make the directories above group-writable:
D:\hadoop-3.1.1\bin>hadoop fs -chmod g+w /tmp
D:\hadoop-3.1.1\bin>hadoop fs -chmod g+w /user/hive/warehouse
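For readers unfamiliar with the g+w notation: it adds the write bit for the owning group and leaves all other permission bits alone. A small sketch of that bit arithmetic, using the drwxr-xr-x mode shown in the listing above as the starting point (add_group_write is just an illustrative name):

```python
import stat


def add_group_write(mode):
    """Apply the equivalent of chmod g+w to a numeric file mode."""
    return mode | stat.S_IWGRP


before = stat.S_IFDIR | 0o755  # a directory with rwxr-xr-x
after = add_group_write(before)
print(stat.filemode(before), "->", stat.filemode(after))
# prints: drwxr-xr-x -> drwxrwxr-x
```

After the chmod, hadoop fs -ls should therefore show drwxrwxr-x for /tmp and /user/hive/warehouse.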
6 Starting Hive
Before starting Hive, make sure MariaDB is already running.
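One quick way to confirm that MariaDB is up is to test whether its port accepts TCP connections. The sketch below assumes the host and port from the JDBC URL in hive-site.xml (localhost:3306). port_is_open is my own helper name, and a successful connection only shows that something is listening there, not that the login will work.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


print("MariaDB reachable:", port_is_open("localhost", 3306))
```

The same check can be pointed at the metastore or hiveserver2 ports once those services are started.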
Start the Hive metastore service:
D:\hive-3.1.0\bin>hive --service metastore -hiveconf hive.root.logger=DEBUG
Start the Hive hiveserver2 service:
D:\hive-3.1.0\bin>hive --service hiveserver2
Start the Hive console:
D:\hive-3.1.0\bin>hive --service cli
In the console you can create tables and query their contents, using SQL much as you would in MySQL.
Tables you create show up in Hadoop when you run:
D:\hadoop-3.1.1\bin>hadoop fs -ls /user/hive/warehouse
Found 1 items
drwxr-xr-x - hawkzy supergroup 0 2018-10-24 00:43 /user/hive/warehouse/test_table