1. Introduction to Hive
Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables and provides SQL-like querying through HQL.
2. Downloading Hive
https://mirrors.tuna.tsinghua.edu.cn/apache/hive/hive-3.1.3/
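To fetch and unpack the release from the command line, something like the sketch below might work. The tarball name apache-hive-3.1.3-bin.tar.gz, the use of wget, and the helper name download_hive are assumptions, not something stated above; the install directory /usr/local/bigdata is taken from the HIVE_HOME setting used later in this post.

```shell
# Assumed values: adjust the version, mirror, and install directory to your setup.
HIVE_VERSION=3.1.3
MIRROR=https://mirrors.tuna.tsinghua.edu.cn/apache/hive
TARBALL="apache-hive-${HIVE_VERSION}-bin.tar.gz"
HIVE_URL="${MIRROR}/hive-${HIVE_VERSION}/${TARBALL}"
INSTALL_DIR=/usr/local/bigdata

# download_hive is a hypothetical helper: download the tarball, then unpack it.
download_hive() {
    wget "$HIVE_URL" &&
        mkdir -p "$INSTALL_DIR" &&
        tar -xzf "$TARBALL" -C "$INSTALL_DIR"
}
```

Nothing is downloaded until download_hive is called, so the variables can be reviewed first with echo "$HIVE_URL".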
3. Configuring the JDK and Hive environment variables
export JAVA_HOME=/usr/local/jdk1.8.0_391
export JRE_HOME=/usr/local/jdk1.8.0_391/jre
export HBASE_HOME=/usr/local/bigdata/hbase-2.5.6-hadoop3
export HADOOP_HOME=/usr/local/bigdata/hadoop-3.3.6
export HIVE_HOME=/usr/local/bigdata/apache-hive-3.1.3-bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=.:$JAVA_HOME/bin:$JRE_HOME/bin:$HADOOP_HOME/bin:$HIVE_HOME/bin:$HBASE_HOME/bin:$PYTHON_HOME/bin:$PATH
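A quick way to confirm that the variables point at real directories is a small check like the one below. This is only a sketch: check_home is a hypothetical helper, not part of Hive or Hadoop.

```shell
# check_home NAME VALUE: report whether VALUE is an existing directory.
# Hypothetical helper for illustration only.
check_home() {
    if [ -d "$2" ]; then
        echo "OK $1=$2"
    else
        echo "MISSING $1=$2"
        return 1
    fi
}
```

After reloading the shell configuration (for example with source /etc/profile, assuming that is where the exports were added), run check_home JAVA_HOME "$JAVA_HOME" and repeat for HADOOP_HOME and HIVE_HOME.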
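4. Configuring Hive's internal files
4.1 Create the directories
Create a hive_log directory under the Hive installation directory, then create the following subdirectories inside it: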
drwxrwxrwx. 3 root root 23 Nov 10 10:23 operation_logs
drwxrwxrwx. 2 root root 6 Nov 10 09:26 querylog
drwxrwxrwx. 2 root root 6 Nov 10 09:26 resources
drwxrwxrwx. 2 root root 6 Nov 10 11:35 scratch
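The directory layout above can be created in one step. A minimal sketch, assuming the four directory names and the rwxrwxrwx permissions shown in the listing; make_hive_log_dirs is a hypothetical helper name.

```shell
# make_hive_log_dirs HIVE_DIR: create hive_log and its four subdirectories
# under HIVE_DIR, then open the permissions to match the listing (777).
make_hive_log_dirs() {
    hive_log_base="$1/hive_log"
    for sub in operation_logs querylog resources scratch; do
        mkdir -p "$hive_log_base/$sub" || return 1
    done
    chmod -R 777 "$hive_log_base"
}
```

With the environment variables from section 3 loaded, the call would be make_hive_log_dirs "$HIVE_HOME".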
4.2 Configuring hive-site.xml
<!-- Hive warehouse directory (location of the default database), a path on HDFS -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<!-- Hive scratch (temporary data) directory, a path on HDFS -->
<property>
<name>hive.exec.scratchdir</name>
<value>/tmp/hive</value>
<description>HDFS root scratch dir for Hive jobs which gets created with write all (733) permission. For each connecting user, an HDFS scratch dir: ${hive.exec.scratchdir}/&lt;username&gt; is created, with ${hive.scratch.dir.permission}.</description>
</property>
<!-- Local scratch directory -->
<property>
<name>hive.exec.local.scratchdir</name>
<value>/usr/local/bigdata/apache-hive-3.1.3-bin/hive_log/scratch</value>
<description>Local scratch space for Hive jobs</description>
</property>
<!-- Local directory for downloaded resources -->
<property>
<name>hive.downloaded.resources.dir</name>
<value>/usr/local/bigdata/apache-hive-3.1.3-bin/hive_log/resources/${hive.session.id}_resources</value>
<description>Temporary local directory for added resources in the remote file system.</description>
</property>
<!-- Local query log directory -->
<property>
<name>hive.querylog.location</name>
<value>/usr/local/bigdata/apache-hive-3.1.3-bin/hive_log/querylog</value>
<description>Location of Hive run time structured log file</description>
</property>
<!-- Local operation logs directory -->
<property>
<name>hive.server2.logging.operation.log.location</name>
<value>/usr/local/bigdata/apache-hive-3.1.3-bin/hive_log/operation_logs</value>
<description>Top level directory where operation logs are stored if logging functionality is enabled</description>
</property>
<!-- Metastore database connection URL -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hive?serverTimezone=UTC&amp;useSSL=false&amp;allowPublicKeyRetrieval=true</value>
<description>
JDBC connect string for a JDBC metastore.
</description>
</property>
<!-- Database driver class -->
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.cj.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<!-- Database username -->
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>Username to use against metastore database</description>
</property>
<!-- Database password -->
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>
</property>
<!-- Fixes: Caused by: MetaException(message:Version information not found in metastore. ) -->
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>
Enforce metastore schema version consistency.
True: Verify that version information stored in metastore is compatible with one from Hive jars. Also disable automatic
schema migration attempt. Users are required to manually migrate schema after Hive upgrade which ensures
proper metastore schema migration. (Default)
False: Warn if the version information stored in metastore doesn't match with one from Hive jars.
</description>
</property>
<!-- Auto-create the full metastore schema -->
<!-- Fixes the Hive error: Required table missing : "DBS" in Catalog "" Schema "" -->
<property>
<name>datanucleus.schema.autoCreateAll</name>
<value>true</value>
<description>Auto creates necessary schema on a startup if one doesn't exist. Set this to false, after creating it once. To enable auto create also set hive.metastore.schema.verification=false. Auto creation is not recommended for production use cases, run schematool command instead.</description>
</property>
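Two follow-up steps go with this configuration. The HDFS paths set above (/user/hive/warehouse and /tmp/hive) typically need to exist and be group-writable, and the description of datanucleus.schema.autoCreateAll recommends running schematool rather than relying on auto-creation. The sketch below follows the pattern from the Hive Getting Started guide. The helpers run, prepare_hdfs_dirs and init_metastore_schema are hypothetical names, and DRY_RUN is an illustrative variable that makes the script print commands instead of executing them.

```shell
# run CMD...: execute CMD, or only print it when DRY_RUN is set (hypothetical helper).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create the HDFS warehouse and scratch directories and make them group-writable.
prepare_hdfs_dirs() {
    run hdfs dfs -mkdir -p /user/hive/warehouse /tmp/hive &&
        run hdfs dfs -chmod g+w /user/hive/warehouse /tmp/hive
}

# Initialize the metastore schema in the MySQL database named in hive-site.xml.
init_metastore_schema() {
    run schematool -dbType mysql -initSchema
}
```

Running DRY_RUN=1 prepare_hdfs_dirs first shows the exact commands before anything touches HDFS. If the schema is initialized with schematool, datanucleus.schema.autoCreateAll can be set back to false, as its description suggests.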
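5. Starting Hive
Hadoop must be running before you start Hive. For installing and starting Hadoop, refer to the following article.
From the hive/bin directory, run ./hive start. After a successful start, the output looks like this: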
[root@node4 bin]# ./hive start
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/bigdata/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/bigdata/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/bigdata/apache-hive-3.1.3-bin/lib/log4j-slf4j-impl-2.17.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/bigdata/hadoop-3.3.6/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = 67bf5ef7-3368-45d4-9827-122eb23706d8
Logging initialized using configuration in jar:file:/usr/local/bigdata/apache-hive-3.1.3-bin/lib/hive-common-3.1.3.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive>
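Once the hive> prompt appears, a non-interactive smoke test can confirm that the CLI reaches the metastore. A minimal sketch using the CLI's -e option, which runs the quoted statement and exits; hive_smoke_test is a hypothetical helper, and HIVE_BIN is an illustrative override for cases where hive is not on the PATH.

```shell
# hive_smoke_test: run a trivial HiveQL statement and return the CLI's exit status.
# HIVE_BIN (assumed, optional) overrides the command name, e.g. "$HIVE_HOME/bin/hive".
hive_smoke_test() {
    "${HIVE_BIN:-hive}" -e 'SHOW DATABASES;'
}
```

If the metastore connection settings in hive-site.xml are wrong, this is usually where the error surfaces.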
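6. Notes on Hive
6.1 JDK version: JDK 1.8 is required; versions above 1.8 cause compatibility problems.
6.2 The Hive version must also be compatible with the Hadoop version.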
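The JDK requirement in 6.1 can be checked before starting Hive. A small sketch, assuming version strings in the usual forms (1.8.0_391 for JDK 8, 17.0.9 for later releases); java_major is a hypothetical helper name.

```shell
# java_major VERSION: print the major Java version for a version string.
# "1.8.0_391" -> 8 (legacy 1.x scheme), "17.0.9" -> 17.
java_major() {
    case "$1" in
        1.*) echo "$1" | cut -d. -f2 ;;
        *)   echo "$1" | cut -d. -f1 ;;
    esac
}
```

The version string itself can be read with java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}' and passed to java_major; any result other than 8 points to the compatibility problem described in 6.1.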