Table of Contents
1 Introduction
1.1 Getting Started
1.1.1 Hive Architecture
1.1.2 Hive and Hadoop
1.1.3 Hive vs. Traditional Databases
Hive is mainly used for offline analysis of massive datasets.
1.2 Client
Hive allows clients to connect via the CLI (hive shell), JDBC/ODBC (Java access to Hive), and a WebUI (browser access); JDBC access uses the Thrift framework as middleware. Hive's DDL, DQL, and DML largely imitate standard SQL.
- The CLI client requires downloading and installing the Hive package
- JDBC/ODBC can also connect to Hive
- HiveServer2/Beeline is currently the popular approach
- It performs username/password-based authentication
- Web GUI
- Hive provides a simple set of web pages
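As a quick sketch, the three access paths look like this on the command line (the hostnames and ports here are assumptions matching the defaults used later in this guide):

```shell
# 1. Local CLI (requires the Hive package installed on this node)
hive

# 2. JDBC via Beeline, going through HiveServer2's Thrift service
#    (host "master" and port 10000 are assumptions)
beeline -u jdbc:hive2://master:10000/default

# 3. Web UI: open http://master:10002 in a browser
#    (10002 is the default hive.server2.webui.port)
```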
1.3 Metastore
Metadata includes the table name, the database a table belongs to (default is default), the table owner, column/partition fields, the table type (e.g. whether it is an external table), the directory where the table data lives, and so on.
- Usually relies on an external data store (a database)
- Mainly stores table definitions (CREATE TABLE statements) and related information
- MySQL is recommended
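For illustration, when MySQL holds the metastore this metadata can be inspected directly; DBS and TBLS are tables in Hive's metastore schema, while the database name `hive` and the credentials are assumptions taken from the setup later in this guide:

```shell
# Peek at Hive metadata stored in the MySQL metastore (read-only queries)
mysql -u root -p hive -e "
  SELECT NAME, DB_LOCATION_URI FROM DBS;       -- Hive databases and their HDFS dirs
  SELECT TBL_NAME, TBL_TYPE, OWNER FROM TBLS;  -- tables; TBL_TYPE shows EXTERNAL_TABLE
"
```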
1.4 Driver
Metadata is stored in a database; the embedded Derby database is the default (limited to a single user), so MySQL is recommended.
- Parser (SQL Parser): converts the SQL string into an abstract syntax tree (AST), usually with a third-party tool such as ANTLR, then checks the AST semantically: does the table exist, do the columns exist, is the SQL semantically valid.
- Compiler (Physical Plan): compiles the AST into a logical execution plan.
- Optimizer (Query Optimizer): optimizes the logical execution plan.
- Executor (Execution): converts the logical execution plan into a runnable physical plan; for Hive, that means MR/Spark jobs.
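These stages can be observed with HiveQL's EXPLAIN statement, which prints the plan the compiler and optimizer produce (the table in this sketch is hypothetical):

```shell
# Print the stage graph and operator tree for a query
hive -e "EXPLAIN SELECT dept, COUNT(*) FROM employees GROUP BY dept;"
# EXPLAIN EXTENDED prints additional operator detail
```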
1.5 Data Processing
Hive stores its data in HDFS and delegates computation to MapReduce; the HDFS and MR integration is at the source-code level.
Together, the parser, compiler, and optimizer take an HQL query through lexical analysis, syntax analysis, compilation, optimization, and query-plan generation.
2 Cluster Installation
Before installing, confirm that MySQL and Hadoop HA + YARN are already installed.
Download: https://hive.apache.org/downloads.html
2.1 Create the Hive Warehouse Directories
Before tables can be created in Hive, the /tmp and /user/hive/warehouse directories (the latter is hive.metastore.warehouse.dir) must exist in HDFS with group write permission (chmod g+w):
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod g+w /user/hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -mkdir /user/hduser
hdfs dfs -ls /
hdfs dfs -ls /user
Alternative (equivalent hadoop fs commands; some guides use chmod 777 instead of g+w, which is more permissive than necessary):
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -mkdir /tmp
hadoop fs -chmod g+w /user/hive/warehouse
hadoop fs -chmod g+w /tmp
Final version used in this guide (the warehouse path matches hive.metastore.warehouse.dir in hive-site.xml):
hdfs dfs -mkdir -p /hive/warehouse
hdfs dfs -mkdir /tmp
hdfs dfs -chmod g+w /hive/warehouse
hdfs dfs -chmod g+w /tmp
2.2 配置系统环境变量
vim ~/.bashrc
or
vim /etc/profile
# hive
export HIVE_HOME=/usr/local/hive/apache-hive-2.3.9-bin
export PATH=$PATH:$HIVE_HOME/bin
2.3 Configure hive-env.sh
cp $HIVE_HOME/conf/hive-env.sh.template $HIVE_HOME/conf/hive-env.sh
vim $HIVE_HOME/conf/hive-env.sh
Modify these lines, or append them to the end of the file:
HADOOP_HOME=$HADOOP_HOME   # relies on HADOOP_HOME being exported; otherwise set the absolute path
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
2.4 Create hive-site.xml
2.4.1 Derby as the Metastore Database (default)
This configuration can be skipped.
Initialize the schema:
schematool -dbType derby -initSchema
2.4.2 MySQL as the Metastore Database
2.4.2.1 Install MySQL
docker run -d \
-p 3309:3306 \
-v /home/mysql/conf:/etc/mysql/conf.d \
-v /home/mysql/data:/var/lib/mysql \
-e MYSQL_ROOT_PASSWORD=root_hive \
--name mysql_hive mysql:5.7
Database version: MySQL 5.7 (via Docker)
Database name: hive
Username: root
Password: root_hive
Create the hive database:
create database hive default charset=utf8mb4 collate=utf8mb4_unicode_ci;
2.4.2.2 添加驱动mysql-connector
由于hive的metadata库不一定是MySQL,因此需要不同的库来支持,需要确认hive>lib
目录下有mysql链接驱动mysql-connector-java-版本.jar
,mysql5.6以上推荐使用connector-java 8.0以上,
下载地址:https://downloads.mysql.com/archives/c-j
cp mysql-connector-java-8.0.15.jar $HIVE_HOME/lib
Fix the permissions:
chmod 755 $HIVE_HOME/lib/mysql-connector-java-8.0.15.jar
2.4.2.3 Write hive-site.xml
vim $HIVE_HOME/conf/hive-site.xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- JDBC connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3309/hive?createDatabaseIfNotExist=true</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<!--<value>com.mysql.jdbc.Driver</value> -->
<value>com.mysql.cj.jdbc.Driver</value>
</property>
<!-- database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root_hive</value>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!-- pretty-print CLI output -->
<property>
<name>hive.cli.print.header</name>
<value>false</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>false</value>
</property>
<!-- hiveserver2 -->
<property>
<name>hive.server2.webui.host</name>
<value>slave1</value>
<description>
The host address the HiveServer2 WebUI will listen on
</description>
</property>
<property>
<name>hive.server2.webui.port</name>
<value>10002</value>
</property>
<!-- warehouse data location -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/hive/warehouse</value>
</property>
</configuration>
Initialize the schema:
schematool -dbType mysql -initSchema
2.5 Modify Hadoop core-site.xml
vim $HADOOP_HOME/etc/hadoop/core-site.xml
Add the following:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- Hadoop base configuration -->
<!-- default filesystem name -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<!-- HDFS I/O buffer size -->
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<!-- temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/bigdata/tmp</value>
</property>
<!-- Hive auxiliary settings -->
<!-- hosts from which users proxied as 'root' may access HDFS (needed for HiveServer2 impersonation) -->
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<!-- groups granted to users proxied as 'root' -->
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
</configuration>
2.6 Configure Logging and Debug Mode
Hive's log configuration lives in the conf directory, in the file hive-log4j2.properties.
Log4j2 defines eight log levels, from lowest to highest:
ALL < TRACE < DEBUG < INFO < WARN < ERROR < FATAL < OFF
In practice, DEBUG, INFO, WARN, and ERROR are the four most commonly used.
cp $HIVE_HOME/conf/hive-log4j2.properties.template $HIVE_HOME/conf/hive-log4j2.properties
vim $HIVE_HOME/conf/hive-log4j2.properties
Key setting:
# By default logs go to /tmp/<current user>/hive.log, e.g. /tmp/root/hive.log
# Note: this property is a directory; the file hive.log is created inside it
property.hive.log.dir = /tmp/hive
Enable debug mode:
cd /usr/local/hive-3.1.2
# for HiveCLI (deprecated)
bin/hive --hiveconf hive.root.logger=DEBUG,console
bin/hiveserver2 --hiveconf hive.root.logger=DEBUG,console
2.7 Copy to the Other Nodes
Hive:
scp -r /usr/local/hive/ root@slave1:/usr/local/
scp -r /usr/local/hive/ root@slave2:/usr/local/
Hadoop core-site.xml (push the modified file to the other nodes):
scp $HADOOP_HOME/etc/hadoop/core-site.xml root@slave01:$HADOOP_HOME/etc/hadoop/
scp $HADOOP_HOME/etc/hadoop/core-site.xml root@slave02:$HADOOP_HOME/etc/hadoop/
2.8 Guava Jar
Note: perform this step after completing 2.7.
Hive and Hadoop ship different Guava versions, and the two must agree. (With Hive 3.1.2 and Hadoop 3.2.x, Hadoop's Guava is usually the newer one, in which case copy Hadoop's jar into $HIVE_HOME/lib instead; see Error 1 in the troubleshooting section.) The steps below remove Hadoop's guava-* jars on every node:
rm -rf $HADOOP_HOME/share/hadoop/common/lib/guava-*.jar
rm -rf $HADOOP_HOME/share/hadoop/hdfs/lib/guava-*.jar
Then copy Hive's Guava jar into Hadoop:
cp $HIVE_HOME/lib/guava-*.jar $HADOOP_HOME/share/hadoop/common/lib/
cp $HIVE_HOME/lib/guava-*.jar $HADOOP_HOME/share/hadoop/hdfs/lib/
2.9 Client Configuration File
slave02 also acts as a client node.
vim $HIVE_HOME/conf/hive-site.xml
<!-- warehouse data location -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/hive/warehouse</value>
</property>
<!-- pretty-print CLI output -->
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://master:9083</value>
</property>
2.10 Start the Cluster
2.10.1 Start ZooKeeper and HDFS
zkServer.sh start
Start HDFS + YARN:
start-all.sh
2.10.2 Start Hive
Start:
$HIVE_HOME/bin/hive --service metastore
nohup hive --service metastore > /dev/null 2>&1 &
Connect:
$HIVE_HOME/bin/hive
2.10.3 Start HiveServer2
Start:
$HIVE_HOME/bin/hiveserver2
nohup hiveserver2 > /dev/null 2>&1 &
nohup $HIVE_HOME/bin/hiveserver2 > /usr/local/hive/logs/hive_runtime_log.log 2>&1 < /dev/null &
Web UI address (see hive.server2.webui.* in hive-site.xml):
http://slave1:10002
Connect:
beeline -u jdbc:hive2://master:10000
# specify a database
beeline -u jdbc:hive2://slave1:10000/default
# specify username and password
beeline -u jdbc:hive2://slave1:10000/default -n root -p hive123
------------- End -------------
1.4 Standalone Configuration
1.4.1 Configure hive-env.sh
sudo cp /usr/local/hive-3.1.2/conf/hive-env.sh.template /usr/local/hive-3.1.2/conf/hive-env.sh
vim /usr/local/hive-3.1.2/conf/hive-env.sh
Add the Hadoop directory:
HADOOP_HOME=/usr/local/hadoop-3.2.2
1.4.2 Download Apache Derby (this step can be skipped)
cd /tmp
wget http://archive.apache.org/dist/db/derby/db-derby-10.13.1.1/db-derby-10.13.1.1-bin.tar.gz
sudo tar xvzf db-derby-10.13.1.1-bin.tar.gz -C /usr/local
Set up the Derby environment:
vim ~/.bashrc
export DERBY_HOME=/usr/local/db-derby-10.13.1.1-bin
export PATH=$PATH:$DERBY_HOME/bin
export CLASSPATH=$CLASSPATH:$DERBY_HOME/lib/derby.jar:$DERBY_HOME/lib/derbytools.jar
source ~/.bashrc
Create a directory named data inside $DERBY_HOME to store metastore data:
sudo mkdir $DERBY_HOME/data
1.4.3 Configure the Metastore Database
Hive's metastore supports several backends; a few options are given below.
cp /usr/local/hive-3.1.2/conf/hive-default.xml.template /usr/local/hive-3.1.2/conf/hive-site.xml
vim /usr/local/hive-3.1.2/conf/hive-site.xml
1.4.3.1 Option 1
Database configuration (skip this to use the default embedded database).
1.4.3.2 Option 2: Derby
<!-- JDBC metastore connection string -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<!-- JDBC metastore driver class name -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<!-- metastore username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>APP</value>
<description>Username to use against metastore database</description>
</property>
<!-- metastore password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mine</value>
<description>password to use against metastore database</description>
</property>
1.4.3.3 Option 3: MySQL
First create the MySQL database:
create database metastore default charset=utf8mb4 collate=utf8mb4_unicode_ci;
or
create database metastore;
As above, confirm that $HIVE_HOME/lib contains the MySQL connector jar mysql-connector-java-<version>.jar.
Download: https://dev.mysql.com/downloads/windows/installer/5.7.html
Fix the permissions:
chmod 755 mysql-connector-java-8.0.13.jar
vim hive-site.xml
Example connection URLs (note: inside hive-site.xml, & must be escaped as &amp;):
jdbc:mysql://slave01:3306/metastore?useSSL=false
jdbc:mysql://node1:3306/metastore?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=UTF-8
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<!-- JDBC connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://slave01:3306/metastore?useSSL=false</value>
</property>
<!-- JDBC driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<!-- database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>slave01</value>
<description>
Bind host on which to run the HiveServer2 Thrift service
</description>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
<property>
<name>hive.cli.print.header</name>
<value>true</value>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
</property>
</configuration>
1.4.4 Create the jpox.properties File
Create a file named jpox.properties and add the following lines:
javax.jdo.PersistenceManagerFactoryClass = org.jpox.PersistenceManagerFactoryImpl
org.jpox.validateTables = false
org.jpox.validateColumns = false
org.jpox.validateConstraints = false
org.jpox.storeManagerType = rdbms
org.jpox.autoCreateSchema = true
org.jpox.autoStartMechanismMode = checked
org.jpox.transactionIsolation = read_committed
javax.jdo.option.DetachAllOnCommit = true
javax.jdo.option.NontransactionalRead = true
javax.jdo.option.ConnectionDriverName = org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL = jdbc:derby://hadoop1:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName = APP
javax.jdo.option.ConnectionPassword = mine
1.4.5 Grant Permissions on the Hive Folder
Set ownership of the Hive folder:
sudo chown -R charles /usr/local/hive-3.1.2/
1.4.6 Initialize the Metastore Schema
Run schematool for your metastore backend; the message "schemaTool completed" indicates success.
schematool -dbType <database type> -initSchema
# derby
schematool -dbType derby -initSchema
# mysql
schematool -dbType mysql -initSchema
If the configuration has changed, re-initialize:
rm derby.log
rm -rd metastore_db
schematool -dbType derby -initSchema
1.4.7 Start Hive
/usr/local/hive-3.1.2/bin/hive
1.5 Cluster Configuration
1.8.1 master
Database:
Username: hive
Password: 123456
Example connection URLs (escape & as &amp; inside the XML):
jdbc:mysql://192.168.18.1:3306/hive?serverTimezone=Asia/Shanghai
jdbc:mysql://node1:3306/hive?createDatabaseIfNotExist=true&useUnicode=true&characterEncoding=UTF-8
<!-- Hive metastore database connection -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://slave01:3306/hive?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
</property>
<!-- alternative JDBC driver for the metastore database -->
<!--
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
</property>
-->
<!-- JDBC driver -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<!-- metastore database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<!-- metastore database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in the metastore is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in the metastore doesn't match that of the Hive jars.
</description>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>slave01</value>
<description>
Bind host on which to run the HiveServer2 Thrift service
</description>
</property>
<!-- Hive metastore listener port -->
<property>
<name>hive.metastore.port</name>
<value>9083</value>
<description>Hive metastore listener port</description>
</property>
<!-- HiveServer2 Thrift port, used for JDBC connections; very handy -->
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'binary'.</description>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
<description>
Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'http'.
</description>
</property>
1.8.2 slave (client)
<!-- Hive Execution Parameters -->
<property>
<name>hive.metastore.local</name>
<value>false</value>
</property>
<!-- default warehouse database location -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<!-- Thrift URI of the remote metastore, used by metastore clients to connect -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://slave01:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
<!-- -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>Enforce metastore schema version consistency. True: verify that the version information stored in the metastore is compatible with that of the Hive jars, and disable automatic schema migration; users must migrate the schema manually after a Hive upgrade. (Default) False: only warn if the version in the metastore doesn't match that of the Hive jars.</description>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
<description>
Expects one of [binary, http]. Transport mode of HiveServer2.
</description>
</property>
<!-- bind host for the HiveServer2 Thrift service -->
<property>
<name>hive.server2.thrift.bind.host</name>
<value>slave01</value>
<description>Bind host on which to run the HiveServer2 Thrift service.</description>
</property>
<property>
<name>hive.driver.parallel.compilation</name>
<value>false</value>
<description>
Whether to enable parallel compilation of the queries between sessions and within the same session on HiveServer2. The default is false.
</description>
</property>
<property>
<name>hive.server2.metrics.enabled</name>
<value>false</value>
<description>
Enable metrics on the HiveServer2.
</description>
</property>
<property>
<name>hive.server2.thrift.http.port</name>
<value>10001</value>
<description>
Port number of HiveServer2 Thrift interface when hive.server2.transport.mode is 'http'.
</description>
</property>
<property>
<name>hive.server2.thrift.http.path</name>
<value>cliservice</value>
<description>
Path component of URL endpoint when in HTTP mode.
</description>
</property>
<property>
<name>hive.server2.thrift.max.message.size</name>
<value>104857600</value>
<description>
Maximum message size in bytes a HS2 server will accept.
</description>
</property>
<property>
<name>hive.server2.thrift.http.max.idle.time</name>
<value>1800s</value>
<description>Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is msec if not specified. Maximum idle time for a connection on the server when in HTTP mode.</description>
</property>
<property>
<name>hive.server2.thrift.http.worker.keepalive.time</name>
<value>60s</value>
<description>Expects a time value with unit (d/day, h/hour, m/min, s/sec, ms/msec, us/usec, ns/nsec), which is sec if not specified. Keepalive time for an idle http worker thread. When the number of workers exceeds min workers, excessive threads are killed after this time interval.</description>
</property>
<!-- Hive metastore listener port -->
<property>
<name>hive.metastore.port</name>
<value>9083</value>
<description>Hive metastore listener port</description>
</property>
<!-- -->
<property>
<name>mapreduce.job.queuename</name>
<value>etl</value>
<description>
Used to specify name of Hadoop queue to which
jobs will be submitted. Set to empty string to let Hadoop choose the queue.
</description>
</property>
1.8.3 Start and Connect
Start:
#!/usr/bin/env bash
# start the Hive metastore service in the background
nohup hive --service metastore >> /data/logs/hive/meta.log 2>&1 &
# start Hive's JDBC service (HiveServer2) in the background
nohup hive --service hiveserver2 >> /data/logs/hive/hiveserver2.log 2>&1 &
Connect:
beeline -u jdbc:hive2://slave01:10000
1.8 Troubleshooting
Error 1: Guava version mismatch
The stack trace mentions com.google.common.base.Preconditions; the guava.jar versions shipped by Hadoop and Hive differ.
The two jars live in these directories:
- ls /usr/local/hive-3.1.2/lib/
- ls /usr/local/hadoop-3.2.2/share/hadoop/common/lib/
Delete the lower version (here it is Hive's; which side is older can vary):
rm -rf /usr/local/hive-3.1.2/lib/guava-19.0.jar
cp /usr/local/hadoop-3.2.2/share/hadoop/common/lib/guava-27.0-jre.jar /usr/local/hive-3.1.2/lib/
Error 2: Illegal character
Exception in thread "main" java.lang.RuntimeException: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
at [row,col,system-id]: [3215,96,"file:/usr/local/hive-3.1.2/conf/hive-site.xml"]
vim /usr/local/hive-3.1.2/conf/hive-site.xml +3215
Delete the illegal character.
Error 3: (_resources)
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Unable to create directory ${system:java.io.tmpdir}/${hive.session.id}_resources
<property>
<name>hive.downloaded.resources.dir</name>
<!--
<value>${system:java.io.tmpdir}/${hive.session.id}_resources</value>
-->
<value>/home/hduser/hive/tmp/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
Error 4: (%7D)
The path cannot be resolved; an absolute path must be specified:
Exception in thread "main" java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
<property>
<name>hive.exec.local.scratchdir</name>
<!--
<value>${system:java.io.tmpdir}/${system:user.name}</value>
-->
<value>/tmp/mydir</value>
<description>Local scratch space for Hive jobs</description>
</property>
Error 5
HiveException java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
rm -rf $HIVE_HOME/metastore_db
schematool -initSchema -dbType derby
Error 6
The error mentions datanucleus.schema.autoCreateTables; enable automatic schema creation:
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
</property>
Additionally, add these properties:
<property>
<name>system:java.io.tmpdir</name>
<value>/tmp/local/hive-3.1.2/tmp</value>
</property>
<property>
<name>system:user.name</name>
<value>charles</value>
</property>