一、Hive安装部署
(1)把apache-hive-3.1.3-bin.tar.gz上传到Linux的/opt/software目录下
(2)解压apache-hive-3.1.3-bin.tar.gz到/opt/moudle/目录下面
[appview@master software]$ tar -zxvf apache-hive-3.1.3-bin.tar.gz -C /opt/moudle/
(3)修改apache-hive-3.1.3-bin.tar.gz的名称为hive-3.1.3
[appview@master moudle]$ cd /opt/moudle/ [appview@master moudle]$ mv apache-hive-3.1.3-bin hive-3.1.3
(4)修改/home/appview/.bashrc,添加环境变量
[appview@master ~]$ vim .bashrc ## 添加以下内容 #HIVE_HOME export HIVE_HOME=/opt/moudle/hive-3.1.3 export PATH=$PATH:$HIVE_HOME/bin ### source以下,使环境变量生效 [appview@master ~]$ source ~/.bashrc
(5)解决日志Jar包冲突,进入/opt/moudle/hive-3.1.3/lib目录
[appview@master lib]$ mv log4j-slf4j-impl-2.17.1.jar log4j-slf4j-impl-2.17.1.jar.bak 或者直接删掉 [appview@master lib]$ rm log4j-slf4j-impl-2.17.1.jar
二、Hive元数据配置到postgresql
(1)将postgresql-42.2.14.jar的JDBC驱动包传到/opt/moudle/hive-3.1.3/lib
(2)新建Hive元数据库
--创建用户并指定密码 create user hive with password 'Psbc_2024'; --创建数据库 create database hivemeta_db owner hive; --赋予权限 grant all privileges on database hivemeta_db to hive; ### 连接psql登录创建数据库,执行以上命令,看到以下内容我们就创建成功了 [appview@master ~]$ psql "host=自己的ip地址 port=5432 user=postgres password=Psbc_2024 dbname=postgres" psql (14.7) Type "help" for help. postgres=# create user hive with password 'Psbc_2024'; CREATE ROLE postgres=# create database hivemeta_db owner hive; CREATE DATABASE postgres=# grant all privileges on database hivemeta_db to hive; GRANT postgres=# postgres=# \l List of databases Name | Owner | Encoding | Collate | Ctype | Access privileges -------------+------------+----------+-------------+-------------+--------------------------- grafana | grafana | UTF8 | zh_CN.UTF-8 | zh_CN.UTF-8 | =Tc/grafana + | | | | | grafana=CTc/grafana hivemeta_db | hive | UTF8 | zh_CN.UTF-8 | zh_CN.UTF-8 | =Tc/hive + | | | | | hive=CTc/hive postgres | postgres | UTF8 | zh_CN.UTF-8 | zh_CN.UTF-8 | template0 | postgres | UTF8 | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres template1 | postgres | UTF8 | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/postgres + | | | | | postgres=CTc/postgres
(3)配置Metastore到postgresql
1)在$HIVE_HOME/conf目录下新建hive-site.xml文件,以下是我的配置文件,记得改成自己的ip和数据库名
<?xml version="1.0" encoding="UTF-8" standalone="no"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <configuration> <!-- pg用户名 --> <property> <name>javax.jdo.option.ConnectionUserName</name> <value>hive</value> </property> <property> <name>javax.jdo.option.ConnectionPassword</name> <value>Psbc_2024</value> </property> <property> <name>hive.server2.thrift.client.user</name> <value>hive</value> </property> <property> <name>hive.server2.thrift.client.password</name> <value>Psbc_2024</value> </property> <!-- pg库存储元数据 --> <property> <name>javax.jdo.option.ConnectionURL</name> <value>jdbc:postgresql://你的ip:5432/hivemeta_db</value> </property> <!-- pg driver --> <property> <name>javax.jdo.option.ConnectionDriverName</name> <value>org.postgresql.Driver</value> </property> <!-- 数仓存储路径(HDFS) --> <property> <name>hive.metastore.warehouse.dir</name> <value>/user/hive/warehouse</value> </property> <!-- 指定metastore地址和端口 --> <property> <name>hive.metastore.uris</name> <value>thrift://master:9083</value> </property> <!-- hiveserver2服务绑定的host及端口号 --> <property> <name>hive.server2.thrift.bind.host</name> <value>master</value> </property> <property> <name>hive.server2.thrift.port</name> <value>10000</value> </property> <property> <name>hive.server2.enable.doAS</name> <value>false</value> </property> <property> <name>metastore.metastore.event.db.notification.api.auth</name> <value>false</value> </property> <!-- hive指定多字符分隔符的依赖jar --> <property> <name>hive.aux.jars.path</name> <value>/opt/moudle/hive-3.1.3/lib/hive-contrib-3.1.3.jar</value> </property> <property> <name>hive.support.concurrency</name> <value>true</value> </property> <property> <name>hive.enforce.bucketing</name> <value>true</value> </property> <property> <name>hive.exec.dynamic.partition.mode</name> <value>nonstrict</value> </property> <property> <name>hive.txn.manager</name> <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value> </property> <property> <name>hive.compactor.initiator.on</name> <value>true</value> </property> <property> <name>hive.compactor.worker.threads</name> <value>1</value> </property> <property> <name>datanucleus.schema.autoCreateAll</name> <value>true</value> </property> <property> <name>hive.metastore.schema.verification</name> <value>false</value> </property> <property> <name>hive.security.authorization.enabled</name> <value>true</value> <description>开启hive权限认证</description> </property> <property> <name>hive.security.authorization.createtable.owner.grants</name> <value>ALL</value> <description>设置用户对自己创建的表拥有所有权限</description> </property> <property> <name>hive.security.authorization.task.factory</name> <value>org.apache.hadoop.hive.ql.parse.authorization.HiveAuthorizationTaskFactoryImpl</value> <description>出现报错The current builtin authorization in Hive is incomplete and disabled 解决方法</description> </property> <property> <name>hive.users.in.admin.role</name> <value>appview</value> </property> <property> <name>hive.server2.support.dynamic.service.discovery</name> <value>true</value> </property> <property> <name>hive.server2.zookeeper.namespace</name> <value>hive2_zk</value> </property> <property> <name>hive.zookeeper.quorum</name> <value>master:2181,salve1:2181,salve2:2181</value> </property> <property> <name>hive.zookeeper.client.port</name> <value>2181</value> </property> <!-- 解决找不到TezConfiguartion类的WARN异常 --> <property> <name>hive.server2.active.passive.ha.enable</name> <value>true</value> </property> <!-- hive查询日志记录 --> <property> <name>hive.querylog.location</name> <value>/opt/moudle/hive-3.1.3/tmp/querylog</value> <description>Location of Hive run time structured log file</description> </property> <!-- hive执行job的本地tmp目录 --> <property> <name>hive.exec.local.scratchdir</name> <value>/opt/moudle/hive-3.1.3/tmp/scratch</value> <description>Local scratch space for Hive jobs</description> </property> <!-- hive下载资源临时目录 --> <property> <name>hive.downloaded.resources.dir</name> <value>/opt/moudle/hive-3.1.3/tmp/${hive.session.id}_resources</value> <description>Temporary local directory for added resources in the remote file system.</description> </property> <!-- hive on tez配置 --> <property> <name>hive.execution.engine</name> <value>tez</value> </property> <property> <name>hive.tez.container.size</name> <value>1024</value> </property> <!-- 查询结果显示列名 --> <property> <name>hive.cli.print.header</name> <value>true</value> </property> <!-- 在命令行显示当前所使用的数据库 --> <property> <name>hive.cli.print.current.db</name> <value>true</value> </property> </configuration>
(3)初始化PostgreSQL
[appview@master hive-3.1.3]$ cd /opt/moudle/hive-3.1.3/ [appview@master hive-3.1.3]$ bin/schematool -dbType postgres -initSchema -verbose
三、启动Hive客户端
(1)因为不是配置本地的库,需要先启动hivemetastore执行hive命令报错,想要远程连接的可以启动hiveserver2
[appview@master hive-3.1.3]$ bin/hive --service metastore & [appview@master hive-3.1.3]$ bin/hive --service hiveserver2 & [appview@master hive-3.1.3]$ hive ###启动hive
(2) 测试
$ bin/hive hive> show databases; OK default Time taken: 0.067 seconds, Fetched: 1 row(s) hive> show tables; OK Time taken: 0.092 seconds
四、结语
看到以上测试的已经完成了,注意的是我在hive-site.xml中配的计算引擎是tez,如果不安装tez的话,把上面的的<!-- hive on tez配置 -->下那一段配置内容删掉或注释就可以了