Part 5: Hive Installation and Deployment
Installing Hive
wget http://archive.apache.org/dist/hive/hive-2.3.3/apache-hive-2.3.3-bin.tar.gz
tar -zxvf apache-hive-2.3.3-bin.tar.gz -C /opt/modules
//rename the unpacked directory
$ mv apache-hive-2.3.3-bin apache-hive-2.3.3
Configure the Hive environment variables (as root):
vi /etc/profile
export HIVE_HOME="/opt/modules/apache-hive-2.3.3"
export HIVE_CONF_DIR="/opt/modules/apache-hive-2.3.3/conf"
Then append $HIVE_HOME/bin to the existing PATH export:
export PATH=$HIVE_HOME/bin:$PATH
Copyright notice: this is an original article by CSDN blogger 「数据的星辰大海」, licensed under CC 4.0 BY-SA; include the original link and this notice when reposting.
Original link: https://blog.csdn.net/qq_37554565/article/details/90477492
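The profile additions above can be sanity-checked in the current shell before logging out and back in. A minimal sketch, using the paths assumed throughout this guide:

```shell
# Simulate the /etc/profile additions in the current shell
export HIVE_HOME="/opt/modules/apache-hive-2.3.3"
export HIVE_CONF_DIR="$HIVE_HOME/conf"
export PATH="$HIVE_HOME/bin:$PATH"
# The Hive bin directory should now be the first PATH entry:
echo "$PATH" | cut -d: -f1   # prints /opt/modules/apache-hive-2.3.3/bin
```

After editing the real /etc/profile, run `source /etc/profile` so the change takes effect in the current session.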
Configuring hive-site.xml
1. Create hive-site.xml from hive-default.xml.template:
//enter the hive conf directory
$ cd apache-hive-2.3.3/conf/
//copy hive-default.xml.template to hive-site.xml
$ cp hive-default.xml.template hive-site.xml
2. Set the metastore database URL, javax.jdo.option.ConnectionURL:
//edit the hive-site.xml configuration
$ vi hive-site.xml
//1. press i to enter insert mode
//2. press / and search for javax.jdo.option.ConnectionURL
//change it as follows
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?autoReconnect=true&amp;useUnicode=true&amp;createDatabaseIfNotExist=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=UTC</value>
</property>
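One pitfall here: hive-site.xml is XML, so every `&` separating URL parameters must be written as `&amp;`, or Hive fails to parse the file at startup. A quick sketch that counts the escaped separators in the value above (five parameters follow the first one, so five `&amp;` are expected):

```shell
# The connection URL exactly as it should appear inside <value>...</value>:
url='jdbc:mysql://localhost:3306/hive?autoReconnect=true&amp;useUnicode=true&amp;createDatabaseIfNotExist=true&amp;characterEncoding=utf8&amp;useSSL=false&amp;serverTimezone=UTC'
# Count the escaped ampersands:
printf '%s\n' "$url" | grep -o '&amp;' | wc -l   # prints 5
```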
3. Set the metastore JDBC driver, javax.jdo.option.ConnectionDriverName:
//press / and search for javax.jdo.option.ConnectionDriverName
//change it as follows
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
4. Set the metastore database user name, javax.jdo.option.ConnectionUserName:
//press / and search for javax.jdo.option.ConnectionUserName
//change it as follows
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
5. Set the metastore database password, javax.jdo.option.ConnectionPassword:
//press / and search for javax.jdo.option.ConnectionPassword
//change it as follows
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>jereh123</value>
</property>
6. Set the Hive warehouse directory (where table data is stored on HDFS), hive.metastore.warehouse.dir:
//press / and search for hive.metastore.warehouse.dir
//the default is /user/hive/warehouse; we keep it unchanged
//it reads as follows
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
7. Configure the remaining local paths:
//1. set hive.querylog.location
//change it as follows
<property>
<name>hive.querylog.location</name>
<value>/opt/tmp/hive</value>
</property>
//2. set hive.server2.logging.operation.log.location
//change it as follows
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/opt/tmp/hive/operation_logs</value>
</property>
//3. set hive.exec.local.scratchdir
//change it as follows
<property>
<name>hive.exec.local.scratchdir</name>
<value>/opt/tmp/hive</value>
</property>
//4. set hive.downloaded.resources.dir
//change it as follows
<property>
<name>hive.downloaded.resources.dir</name>
<value>/opt/tmp/_resources</value>
</property>
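The local directories referenced above are not necessarily created automatically, so it is safer to pre-create them before starting Hive. A sketch of the idea, run under /tmp here rather than /opt for safety:

```shell
# Pre-create the local scratch/log directories referenced in hive-site.xml.
# Demo root under /tmp; for a real install use / so the paths match the config.
root=/tmp/demo-hive-dirs
mkdir -p "$root/opt/tmp/hive/operation_logs" "$root/opt/tmp/_resources"
ls "$root/opt/tmp"
```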
Configuring hive-log4j2.properties (Hive log output)
Copy hive-log4j2.properties.template to hive-log4j2.properties:
//press Esc, then type :wq to save the previous edit and quit vi
//list the files under conf/
$ ll
//copy hive-log4j2.properties.template to hive-log4j2.properties
$ cp hive-log4j2.properties.template hive-log4j2.properties
//edit hive-log4j2.properties
$ vi hive-log4j2.properties
//press i to enter insert mode
//set the log output directory
property.hive.log.dir = /opt/tmp/hive/operation_logs
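If you prefer not to edit the file interactively in vi, the same change can be scripted with sed (GNU sed assumed for `-i`; the file path below is a stand-in for demonstration):

```shell
# Stand-in for $HIVE_CONF_DIR/hive-log4j2.properties with the template's default line:
f=/tmp/demo-hive-log4j2.properties
printf 'property.hive.log.dir = ${sys:java.io.tmpdir}/${sys:user.name}\n' > "$f"
# Rewrite the log-dir line in place:
sed -i 's|^property\.hive\.log\.dir *=.*|property.hive.log.dir = /opt/tmp/hive/operation_logs|' "$f"
grep '^property.hive.log.dir' "$f"
```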
This parameter sets the directory Hive writes its logs to (if the directory does not exist, create it first).
Modifying hive-env.sh
cp hive-env.sh.template hive-env.sh
Because Hive runs on top of Hadoop, the Hadoop installation path must be set in hive-env.sh:
vim hive-env.sh
Add the following lines to the file:
export JAVA_HOME=/opt/modules/jdk1.8.0_261/
export HADOOP_HOME=/opt/modules/hadoop-2.8.5/
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/opt/modules/apache-hive-2.3.3
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
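The same exports can be written non-interactively with a heredoc; this sketch writes to a stand-in file under /tmp (for a real install, write to $HIVE_CONF_DIR/hive-env.sh instead):

```shell
# Write the hive-env.sh additions in one shot (quoted EOF prevents expansion,
# so $HADOOP_HOME etc. land in the file literally, to be expanded at runtime):
f=/tmp/demo-hive-env.sh
cat > "$f" <<'EOF'
export JAVA_HOME=/opt/modules/jdk1.8.0_261/
export HADOOP_HOME=/opt/modules/hadoop-2.8.5/
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HIVE_HOME=/opt/modules/apache-hive-2.3.3
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HIVE_AUX_JARS_PATH=$HIVE_HOME/lib
EOF
grep -c '^export' "$f"   # prints 6
```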
Initializing the Hive Metastore
- Initialize the schema with schematool -dbType mysql -initSchema (copy the MySQL connector jar into lib/ first):
//use schematool to initialize the schema
[hadoop@bigdata-senior01 apache-hive-2.3.3]$ pwd
/opt/modules/apache-hive-2.3.3
[hadoop@bigdata-senior01 modules]$ cp mysql-connector-java-5.1.49.jar apache-hive-2.3.3/lib/
[hadoop@bigdata-senior01 modules]$ ll apache-hive-2.3.3/lib/mysql*
-rw-r--r-- 1 hadoop hadoop 1006904 Jan 4 22:53 apache-hive-2.3.3/lib/mysql-connector-java-5.1.49.jar
-rw-r--r-- 1 hadoop hadoop 7954 Dec 19 2016 apache-hive-2.3.3/lib/mysql-metadata-storage-0.9.2.jar
[hadoop@bigdata-senior01 modules]$
[hadoop@bigdata-senior01 apache-hive-2.3.3]$ bin/schematool -dbType mysql -initSchema
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/modules/apache-hive-2.3.3/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/modules/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Metastore connection URL: jdbc:mysql://localhost:3306/hive?autoReconnect=true&useUnicode=true&createDatabaseIfNotExist=true&characterEncoding=utf8&useSSL=false&serverTimezone=UTC
Metastore Connection Driver : com.mysql.jdbc.Driver
Metastore connection User: root
Starting metastore schema initialization to 2.3.0
Initialization script hive-schema-2.3.0.mysql.sql
Initialization script completed
schemaTool completed
[hadoop@bigdata-senior01 apache-hive-2.3.3]$
Handling the Driver Exception
- If schema initialization fails with `Underlying cause: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver`, the MySQL JDBC driver jar is missing from Hive's lib directory; here mysql-connector-java 5.1.38 is used:
//download the mysql-connector jar
cd /opt/modules/apache-hive-2.3.3/lib
[hadoop@hadoop000 lib]$ wget http://central.maven.org/maven2/mysql/mysql-connector-java/5.1.38/mysql-connector-java-5.1.38.
//check that the MySQL driver downloaded successfully
[hadoop@localhost lib]$ ll | grep mysql
-rw-rw-r--. 1 hadoop hadoop 983911 Dec 16 2015 mysql-connector-java-5.1.38.jar
-rw-r--r--. 1 hadoop hadoop 7954 Dec 20 2016 mysql-metadata-storage-0.9.2.jar
Run the initialization again; `schemaTool completed` means it succeeded.
3. Log in to MySQL and inspect the initialized hive database:
//log in to MySQL
# mysql -uroot -pjereh123
//list the databases
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| hive |
| mysql |
| performance_schema |
| sys |
+--------------------+
5 rows in set (0.00 sec)
//switch to the hive database
mysql> use hive;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
//list the tables in hive; 57 tables means the initialization succeeded
mysql> show tables;
+---------------------------+
| Tables_in_hive |
+---------------------------+
| AUX_TABLE |
| BUCKETING_COLS |
| CDS |
| COLUMNS_V2 |
| COMPACTION_QUEUE |
| COMPLETED_COMPACTIONS |
| COMPLETED_TXN_COMPONENTS |
| DATABASE_PARAMS |
| DBS |
| DB_PRIVS |
| DELEGATION_TOKENS |
| FUNCS |
| FUNC_RU |
| GLOBAL_PRIVS |
| HIVE_LOCKS |
| IDXS |
| INDEX_PARAMS |
| KEY_CONSTRAINTS |
| MASTER_KEYS |
| NEXT_COMPACTION_QUEUE_ID |
| NEXT_LOCK_ID |
| NEXT_TXN_ID |
| NOTIFICATION_LOG |
| NOTIFICATION_SEQUENCE |
| NUCLEUS_TABLES |
| PARTITIONS |
| PARTITION_EVENTS |
| PARTITION_KEYS |
| PARTITION_KEY_VALS |
| PARTITION_PARAMS |
| PART_COL_PRIVS |
| PART_COL_STATS |
| PART_PRIVS |
| ROLES |
| ROLE_MAP |
| SDS |
| SD_PARAMS |
| SEQUENCE_TABLE |
| SERDES |
| SERDE_PARAMS |
| SKEWED_COL_NAMES |
| SKEWED_COL_VALUE_LOC_MAP |
| SKEWED_STRING_LIST |
| SKEWED_STRING_LIST_VALUES |
| SKEWED_VALUES |
| SORT_COLS |
| TABLE_PARAMS |
| TAB_COL_STATS |
| TBLS |
| TBL_COL_PRIVS |
| TBL_PRIVS |
| TXNS |
| TXN_COMPONENTS |
| TYPES |
| TYPE_FIELDS |
| VERSION |
| WRITE_SET |
+---------------------------+
57 rows in set (0.00 sec)
Starting Hive
- Hive is served through hiveserver2; before starting it, Hadoop's proxy-user permissions must be granted.
Locate Hadoop's core-site.xml and add the following:
[hadoop@bigdata-senior01 hadoop]$ pwd
/opt/modules/hadoop-2.8.5/etc/hadoop
[hadoop@bigdata-senior01 hadoop]$ vim core-site.xml
<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
2. Hive is accessed in two parts:
1). hiveserver2, the server process that local and remote clients connect to; start it with:
$ bin/hiveserver2
Check whether port 10000 is open: netstat -ant | grep 10000
2). beeline, the JDBC client used to connect to hiveserver2:
$ bin/beeline -u jdbc:hive2://127.0.0.1:10000 -n hadoop
-n : proxy user name
-u : JDBC connection URL
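On newer distributions netstat may be absent; ss(8) provides the same check. A small sketch that reports the state of port 10000 either way:

```shell
# Report whether anything is listening on HiveServer2's default port 10000.
# (stderr discarded so the check degrades gracefully if ss is unavailable)
if ss -lnt 2>/dev/null | grep -q ':10000 '; then
  echo "port 10000: listening"
else
  echo "port 10000: not listening"
fi
```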
[hadoop@bigdata-senior01 apache-hive-2.3.3]$ pwd
/opt/modules/apache-hive-2.3.3
[hadoop@bigdata-senior01 apache-hive-2.3.3]$ bin/hiveserver2
which: no hbase in (/opt/modules/hadoop-2.8.5/bin:/opt/modules/hadoop-2.8.5/sbin:/opt/modules/jdk1.8.0_261/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/mysql/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/opt/modules/jdk1.8.0_261//bin:/opt/modules/hadoop-2.8.5//bin:/opt/modules/hadoop-2.8.5//sbin)
2022-01-04 22:58:41: Starting HiveServer2
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/modules/apache-hive-2.3.3/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/modules/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive 2.3.3 (for other Hive versions, Linkis must be compiled accordingly); the machine Hive is installed on must be able to run hive -e "show databases":
[hadoop@bigdata-senior01 apache-hive-2.3.3]$ hive -e "show databases"
which: no hbase in (/opt/modules/apache-hive-2.3.3/bin:/opt/modules/hadoop-2.8.5/bin:/opt/modules/hadoop-2.8.5/sbin:/opt/modules/jdk1.8.0_261/bin:/opt/modules/hadoop-2.8.5/bin:/opt/modules/hadoop-2.8.5/sbin:/opt/modules/jdk1.8.0_261/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/mysql/bin:/home/hadoop/.local/bin:/home/hadoop/bin:/opt/modules/jdk1.8.0_261//bin:/opt/modules/hadoop-2.8.5//bin:/opt/modules/hadoop-2.8.5//sbin:/opt/mysql/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/modules/apache-hive-2.3.3/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/modules/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in file:/opt/modules/apache-hive-2.3.3/conf/hive-log4j2.properties Async: true