Before You Begin
Embedded mode stores Hive's metadata in the bundled Derby database. It requires no external database and no separate Metastore service: both the database and the Metastore run inside the main HiveServer process. This is Hive's default mode and the simplest to configure, but it allows only one client connection at a time, so it is suitable for learning rather than production.
Download Link
官网下载:http://archive.apache.org/dist/hive/hive-3.1.0/apache-hive-3.1.0-bin.tar.gz
Installation Steps
① Download
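If you prefer to download on the server itself, the archive can be fetched from the link above (a minimal sketch assuming wget is installed):
wget http://archive.apache.org/dist/hive/hive-3.1.0/apache-hive-3.1.0-bin.tar.gz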
② Extract
tar -zxvf apache-hive-3.1.0-bin.tar.gz
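The remaining steps assume the extracted directory sits under /opt/software. If tar was run elsewhere, move it into place first (a sketch; adjust the source path to wherever the archive was extracted):
mkdir -p /opt/software
mv apache-hive-3.1.0-bin /opt/software/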
③ Environment Variables
In the /etc/profile.d directory, create a file named rills-atlas.sh with the following content:
### 1. SET HADOOP ENVIRONMENT ###
HADOOP_HOME=/opt/software/hadoop-3.1.1
HIVE_HOME=/opt/software/apache-hive-3.1.0-bin
PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin
export HADOOP_HOME HIVE_HOME PATH
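For the variables to take effect in the current shell, source the new file (or open a new login shell) and spot-check the result:
source /etc/profile.d/rills-atlas.sh
echo $HIVE_HOME    # should print /opt/software/apache-hive-3.1.0-bin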
④ Verify
[root@hm profile.d]# hive --version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/software/apache-hive-3.1.0-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/software/hadoop-3.1.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 3.1.0
Git git://vgargwork.local/Users/vgarg/repos/hive.apache.master -r bcc7df95824831a8d2f1524e4048dfc23ab98c19
Compiled by vgarg on Mon Jul 23 16:02:03 PDT 2018
From source with checksum 147d8070369f8c672407753089777fd1
[root@hm profile.d]# hiveserver2 --version
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/software/apache-hive-3.1.0-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/software/hadoop-3.1.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 3.1.0
Git git://vgargwork.local/Users/vgarg/repos/hive.apache.master -r bcc7df95824831a8d2f1524e4048dfc23ab98c19
Compiled by vgarg on Mon Jul 23 16:02:03 PDT 2018
From source with checksum 147d8070369f8c672407753089777fd1
[root@hm profile.d]#
⑤ Configure
Use Derby's embedded mode and configure hive-site.xml accordingly.
[root@hm apache-hive-3.1.0-bin]# ll
total 236
drwxr-xr-x 3 root root 157 Jun 2 16:28 bin
drwxr-xr-x 2 root root 4096 Jun 2 16:28 binary-package-licenses
drwxr-xr-x 2 root root 4096 Jun 2 18:00 conf
-rw-r--r-- 1 root root 20030 Jun 2 17:55 derby.log
drwxr-xr-x 4 root root 34 Jun 2 16:28 examples
drwxr-xr-x 7 root root 68 Jun 2 16:28 hcatalog
drwxr-xr-x 2 root root 44 Jun 2 16:28 jdbc
drwxr-xr-x 4 root root 12288 Jun 2 16:28 lib
-rw-r--r-- 1 root root 20798 May 23 2018 LICENSE
drwxr-xr-x 5 root root 133 Jun 2 17:59 metastore_db
-rw-r--r-- 1 root root 230 May 23 2018 NOTICE
-rw-r--r-- 1 root root 167884 Jul 19 2018 RELEASE_NOTES.txt
drwxr-xr-x 4 root root 35 Jun 2 16:28 scripts
drwxr-xr-x 3 root root 42 Jun 2 17:55 ${system:java.io.tmpdir}
drwxr-xr-x 2 root root 6 Jun 2 17:47 warehouse_db
[root@hm apache-hive-3.1.0-bin]# pwd
/opt/software/apache-hive-3.1.0-bin
[root@hm apache-hive-3.1.0-bin]# cp conf/{hive-default.xml.template,hive-site.xml}
[root@hm apache-hive-3.1.0-bin]# vim conf/hive-site.xml
The modified file is shown below:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <!-- Metadata storage path for Derby embedded mode; created automatically if it does not exist -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:;databaseName=/opt/software/apache-hive-3.1.0-bin/metastore_db;create=true</value>
  </property>
  <!-- Driver class for Derby embedded mode -->
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.apache.derby.jdbc.EmbeddedDriver</value>
  </property>
  <!-- Local storage path for the actual table data in Derby embedded mode -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/opt/software/apache-hive-3.1.0-bin/warehouse_db</value>
  </property>
  <!-- Print column names in query results -->
  <property>
    <name>hive.cli.print.header</name>
    <value>true</value>
  </property>
  <!-- Show the current database name in the CLI prompt -->
  <property>
    <name>hive.cli.print.current.db</name>
    <value>true</value>
  </property>
  <!-- Metastore schema verification; fairly memory-intensive, so keep it off for testing, but set it to true in production -->
  <property>
    <name>hive.metastore.schema.verification</name>
    <value>false</value>
  </property>
  <!-- Metastore service address; from Hive 2 onward this is the process that provides metadata access -->
  <!-- If the value is left empty, i.e. <value/>, Hive connects to the database directly and no Metastore process needs to be started -->
  <!--
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://0.0.0.0:9083</value>
  </property>
  -->
  <!-- HiveServer2 lets JDBC, Python, and other clients submit jobs to Hive -->
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <!--
    Avoids "Internal error processing get_current_notificationEventId".
    Alternatively, add the following to Hadoop's core-site.xml:
    hadoop.proxyuser.hive.hosts=HS2_HOST
    hadoop.proxyuser.hive.groups=*
  -->
  <property>
    <name>hive.metastore.event.db.notification.api.auth</name>
    <value>false</value>
  </property>
  <property>
    <name>hive.exec.scratchdir</name>
    <value>/tmp/hive</value>
    <description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
  </property>
  <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/tmp/hive/local</value>
  </property>
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/tmp/hive/resources</value>
  </property>
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/tmp/hive/operation_logs</value>
  </property>
</configuration>
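hive.exec.scratchdir points at HDFS, while the other three temp directories are on the local disk. Hive normally creates the HDFS scratch directory on demand, but if permissions ever get in the way it can be pre-created to match the 733 mode mentioned in the description above (a sketch assuming HDFS is already running):
hdfs dfs -mkdir -p /tmp/hive
hdfs dfs -chmod 733 /tmp/hive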
⑥ Initialize
# 1. Remove the old data directories first (run from $HIVE_HOME):
rm -rf metastore_db warehouse_db
# 2. Initialize the metastore schema:
schematool -dbType derby -initSchema --verbose
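To confirm the schema was created, schematool can also report the metastore schema version from the same Derby database (assuming the initialization above succeeded):
schematool -dbType derby -info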
⑦ Verify
[root@hm ~]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/software/hadoop-3.1.1/bin:/opt/software/hadoop-3.1.1/sbin:/opt/software/apache-hive-3.1.0-bin/bin:/opt/software/jdk1.8.0_251/bin:/opt/software/jdk1.8.0_251/jre/bin:/opt/software/hadoop-3.1.1/bin:/opt/software/hadoop-3.1.1/sbin:/opt/software/apache-hive-3.1.0-bin/bin:/opt/software/jdk1.8.0_251/bin:/opt/software/jdk1.8.0_251/jre/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/software/apache-hive-3.1.0-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/software/hadoop-3.1.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = d7fcbf5f-1826-4cb1-9bb5-41bd1ea0f9e5
Logging initialized using configuration in jar:file:/opt/software/apache-hive-3.1.0-bin/lib/hive-common-3.1.0.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = 73bd111e-3824-4bf4-931b-d10065ad2620
hive (default)> show databases;
OK
database_name
default
Time taken: 0.525 seconds, Fetched: 1 row(s)
hive (default)>
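Because hive-site.xml also binds HiveServer2 to port 10000, a JDBC client check can follow. Note that in embedded Derby mode only one client can hold the database at a time, so exit the hive CLI session above first (a sketch; the log path and the root username are assumptions for this setup):
nohup hiveserver2 > /tmp/hiveserver2.log 2>&1 &
beeline -u jdbc:hive2://localhost:10000 -n root -e "show databases;"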