Installing apache-hive-3.1.3 on Windows 10

1. Introduction to Hive

What is Hive: Hive is a data warehouse tool built on top of Hadoop for data extraction, transformation, and loading (ETL); it provides a mechanism to store, query, and analyze large-scale data kept in Hadoop. Hive maps structured data files to database tables and offers SQL-style querying, translating SQL statements into MapReduce jobs for execution. In short, Hive is a data-warehouse layer over Hadoop, queried through the SQL-like HiveQL language.

2. Download Hive

https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/

3. Configure Hive environment variables

(Screenshots of the Windows environment-variable dialogs are omitted here.)
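The screenshots are not reproduced here; as a rough sketch of what they typically configure (the variable names and D:\bigdata paths are assumptions, chosen to match the hive-env.sh settings below):

```shell
:: Hypothetical sketch of the environment variables the screenshots set up.
:: The D:\bigdata paths are assumptions; point them at your actual install.
setx HIVE_HOME "D:\bigdata\apache-hive-3.1.3-bin"
setx HADOOP_HOME "D:\bigdata\hadoop"
:: Add Hive's bin directory to Path so hive.cmd can be run from any prompt.
setx Path "%Path%;%HIVE_HOME%\bin"
```

Note that setx takes effect in new command prompts, not the one it ran in.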

4. Modify Hive configuration parameters

4.1 Edit the parameters in hive-env.sh

# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=D:\bigdata\hadoop
 
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=D:\bigdata\apache-hive-3.1.3-bin\conf
 
# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=D:\bigdata\apache-hive-3.1.3-bin\lib

4.2 Create a database named hive in MySQL; it is referenced by the configuration below
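A minimal sketch of creating that database from the command line (assuming a local MySQL server and the root account that hive-site.xml is configured with below):

```shell
# Create the metastore database; the name "hive" must match the database in
# the JDBC URL set later in hive-site.xml. Assumes a local MySQL as root.
mysql -u root -p -e "CREATE DATABASE IF NOT EXISTS hive;"
```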

4.3 Hive runs on top of Hadoop, so a few directories must be created in HDFS via Hadoop first
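The post does not list the exact commands; a common sketch, matching the warehouse and scratch paths used in hive-site.xml below (run after Hadoop is up; the blanket 777 permissions are a convenience for a single-user local setup, not a production choice):

```shell
# Create the warehouse directory (hive.metastore.warehouse.dir) and the HDFS
# scratch directory (hive.exec.scratchdir), then loosen their permissions so
# any local Hive session can write to them.
hadoop fs -mkdir -p /user/hive/warehouse
hadoop fs -mkdir -p /tmp/hive
hadoop fs -chmod -R 777 /user/hive/warehouse
hadoop fs -chmod -R 777 /tmp/hive
```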

4.4 Edit the hive-site.xml configuration

<property>
		<name>hive.metastore.warehouse.dir</name>
		<value>/user/hive/warehouse</value>
		<description>location of default database for the warehouse</description>
	</property>
 
<!-- Hive's temporary data directory; this path lives on HDFS -->
	<property>
		<name>hive.exec.scratchdir</name>
		<value>/tmp/hive</value>
		<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
	</property>
 
<!-- Local scratch directory -->
	<property>
		<name>hive.exec.local.scratchdir</name>
		<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/scratch_dir</value>
		<description>Local scratch space for Hive jobs</description>
	</property>
 
<!-- Local directory for downloaded resources -->
	<property>
		<name>hive.downloaded.resources.dir</name>
		<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/resources_dir/${hive.session.id}_resources</value>
		<description>Temporary local directory for added resources in the remote file system.</description>
	</property>
 
<!-- Local query-log directory -->
	<property>
		<name>hive.querylog.location</name>
		<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/querylog_dir</value>
		<description>Location of Hive run time structured log file</description>
	</property>
 
<!-- Local operation-logs directory -->
	<property>
		<name>hive.server2.logging.operation.log.location</name>
		<value>D:/bigdata/apache-hive-3.1.3-bin/my_hive/operation_logs_dir</value>
		<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
	</property>
 
<!-- Metastore database connection URL -->
	<property>
		<name>javax.jdo.option.ConnectionURL</name>
		<value>jdbc:mysql://localhost:3306/hive?serverTimezone=UTC&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
		<description>
		JDBC connect string for a JDBC metastore.
		</description>
	</property>
 
<!-- Metastore JDBC driver class -->
	<property>
		<name>javax.jdo.option.ConnectionDriverName</name>
		<value>com.mysql.cj.jdbc.Driver</value>
		<description>Driver class name for a JDBC metastore</description>
	</property>
 
<!-- Metastore database username -->
	<property>
		<name>javax.jdo.option.ConnectionUserName</name>
		<value>root</value>
		<description>Username to use against metastore database</description>
	</property>
 
<!-- Metastore database password -->
	<property>
		<name>javax.jdo.option.ConnectionPassword</name>
		<value>123456</value>
		<description>password to use against metastore database</description>
	</property>
 
<!-- Works around: Caused by: MetaException(message:Version information not found in metastore.) -->
	<property>
		<name>hive.metastore.schema.verification</name>
		<value>false</value>
		<description>
		Enforce metastore schema version consistency.
		True: Verify that version information stored in is compatible with one from Hive jars. Also disable automatic
		schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
		proper metastore schema migration. (Default)
		False: Warn if the version information stored in metastore doesn't match with one from in Hive jars.
		</description>
	</property>

<!-- Auto-create the full metastore schema -->
<!-- Fixes the error: Required table missing : "DBS" in Catalog "" Schema "" -->
	<property>
		<name>datanucleus.schema.autoCreateAll</name>
		<value>true</value>
		<description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once.To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
	</property>
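As the autoCreateAll description above notes, the recommended alternative for anything beyond a quick local setup is to initialize the metastore schema once with schematool; a sketch, run from Hive's bin directory against the MySQL settings above:

```shell
# One-time initialization of the metastore schema in the MySQL "hive" database.
# After this succeeds, datanucleus.schema.autoCreateAll can be set back to false.
hive --service schematool -dbType mysql -initSchema
```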

5. Start Hive

By default, version 3.1.3 does not ship the Windows cmd scripts. Download the apache-hive-2.2.0-src.tar.gz package and copy the corresponding cmd scripts from its bin directory into Hive's bin directory.
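Assuming the 2.2.0 source package has been extracted under D:\bigdata (the exact path is an assumption; adjust to wherever you extracted it), the copy step might look like:

```shell
:: Copy the Windows command scripts that the 3.1.3 binary package no longer ships.
:: The apache-hive-2.2.0-src path is an assumption; adjust to your extraction dir.
copy D:\bigdata\apache-hive-2.2.0-src\bin\*.cmd D:\bigdata\apache-hive-3.1.3-bin\bin\
xcopy /E /I D:\bigdata\apache-hive-2.2.0-src\bin\ext D:\bigdata\apache-hive-3.1.3-bin\bin\ext
```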

Before starting Hive, start Hadoop first. Then, in Hive's bin directory, run hive.cmd start.

After running the command, the following error appeared:

Beeline version 3.1.3 by Apache Hive
Hive Session ID = 5f6198d8-1183-4500-9cba-830da0127197
2023-11-09 01:24:22,203 INFO SessionState: Hive Session ID = 5f6198d8-1183-4500-9cba-830da0127197
Error applying authorization policy on hive configuration: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /tmp/hive/<username>/5f6198d8-1183-4500-9cba-830da0127197. Name node is in safe mode.
The reported blocks 49 has reached the threshold 0.9990 of total blocks 49. The minimum number of live datanodes is not required. In safe mode extension. Safe mode will be turned off automatically in 3 seconds. NamenodeHostName:localhost
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1612)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1599)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3437)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1166)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:742)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1213)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1089)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1012)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3026)

Manually turn off safe mode (note the DEPRECATED warning in the output: hdfs dfsadmin is the current form of this command):

D:\bigdata\hadoop\sbin>hadoop dfsadmin -safemode leave
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Safe mode is OFF

Start Hive again: in Hive's bin directory, run hive.cmd start.


If you see messages like the following during startup, they can be ignored for now; they come from a version mismatch between Hive and Hadoop.

2023-11-09 01:28:25,800 INFO session.SessionState: Resetting thread name to  main
2023-11-09 01:28:25,807 INFO conf.HiveConf: Using the default value passed in for log id: 255c0661-6506-4a5f-bbad-f48b74f972dd
2023-11-09 01:28:25,807 INFO session.SessionState: Updating thread name to 255c0661-6506-4a5f-bbad-f48b74f972dd main
2023-11-09 01:28:25,836 INFO conf.HiveConf: Using the default value passed in for log id: 255c0661-6506-4a5f-bbad-f48b74f972dd
2023-11-09 01:28:25,837 INFO session.SessionState: Resetting thread name to  main
Beeline version 3.1.3 by Apache Hive
hive>

The hive> prompt means the startup succeeded.
