Prerequisites
Hadoop installed; see: Installing Hadoop 3
MySQL installed and reachable over remote connections; see: Installing MySQL 5.7
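Before continuing, it can save time to sanity-check both prerequisites from the node where Hive will run. A minimal sketch, assuming the hadoop binaries are already on the PATH and MySQL listens locally on the default port with the root account used later in this guide:

# Confirm the Hadoop binaries are available
hadoop version

# Confirm MySQL is up and accepts connections (enter the password when prompted)
mysql -h localhost -P 3306 -u root -p -e "SELECT VERSION();"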
Steps
Download apache-hive-3.1.2-bin.tar.gz
Download link: Index of /dist/hive
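If you prefer to fetch the tarball from the command line, something like the following should work; the archive URL is an assumption based on the usual Apache layout, so verify it against the dist page before running:

cd ~/installfile
# Assumed archive URL for the 3.1.2 release
wget https://archive.apache.org/dist/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz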
Extract
[hadoop@node2 installfile]$ tar -zxvf apache-hive-3.1.2-bin.tar.gz -C ~/soft
Rename
[hadoop@node2 installfile]$ cd ~/soft/
[hadoop@node2 soft]$ ls
apache-hive-3.1.2-bin  flume  hadoop-3.1.3  hbase-2.4.11  jdk1.8.0_212  kafka  sqoop  zookeeper-3.5.7
[hadoop@node2 soft]$ mv apache-hive-3.1.2-bin hive
Add environment variables
[hadoop@node2 soft]$ sudo nano /etc/profile.d/my_env.sh
Add the following:
#HIVE_HOME
export HIVE_HOME=/home/hadoop/soft/hive
export PATH=$PATH:$HIVE_HOME/bin
Apply the environment variables
[hadoop@node2 soft]$ source /etc/profile
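A quick check that the variables took effect (a minimal sketch; the expected values match the paths set above):

echo $HIVE_HOME      # should print /home/hadoop/soft/hive
which hive           # should resolve to /home/hadoop/soft/hive/bin/hive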
Copy the MySQL JDBC driver into Hive's lib directory. Driver download: https://mvnrepository.com/
[hadoop@node2 soft]$ cp ~/installfile/06.Mysql/mysql-connector-java-5.1.49.jar $HIVE_HOME/lib
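To confirm the driver actually landed on Hive's classpath:

ls $HIVE_HOME/lib | grep mysql
# expected output: mysql-connector-java-5.1.49.jar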
Configure hive-site.xml
[hadoop@node2 conf]$ nano hive-site.xml
Create the file under $HIVE_HOME/conf with the following content:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>000000</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
    <property>
        <name>hive.server2.thrift.bind.host</name>
        <value>node2</value>
        <description>Bind host on which to run the HiveServer2 Thrift service.</description>
    </property>
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
        <description>Whether to print the names of the columns in query output.</description>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
        <description>Whether to include the current database in the Hive prompt.</description>
    </property>
    <property>
        <name>hive.metastore.event.db.notification.api.auth</name>
        <value>false</value>
        <description>
            Should metastore do authorization against database notification related APIs such as get_next_notification.
            If set to true, then only the superusers in proxy settings have the permission.
        </description>
    </property>
</configuration>
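Since schematool will connect with the URL, user, and password above, it can save a debugging round-trip to verify those credentials first. A minimal sketch, assuming the root/000000 account from the config:

# Should connect and list existing databases; the hive database itself
# is created automatically later thanks to createDatabaseIfNotExist=true
mysql -h localhost -u root -p000000 -e "SHOW DATABASES;"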
Initialize the Hive metastore
[hadoop@node2 conf]$ schematool -initSchema -dbType mysql -verbose
Seeing schemaTool completed in the output means the initialization succeeded.
What the initialization actually does is create the tables the Hive metastore needs in MySQL; you can browse MySQL with Navicat and confirm that those tables now exist.
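If you don't have Navicat handy, the same check works from the mysql client; table names such as DBS, TBLS, and VERSION are part of the standard metastore schema:

mysql -u root -p -e "USE hive; SHOW TABLES;"
# expect metastore tables such as DBS, TBLS, COLUMNS_V2, VERSION, ...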
Start the Hadoop cluster
[hadoop@node2 conf]$ myhadoop.sh start
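myhadoop.sh is a convenience script from the Hadoop setup referenced in the prerequisites; if you don't have it, starting HDFS and YARN directly achieves the same thing:

start-dfs.sh    # NameNode / DataNodes
start-yarn.sh   # ResourceManager / NodeManagers
jps             # verify the daemons are running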
Enter the Hive CLI
[hadoop@node2 conf]$ hive
Test
# Create a database
hive (default)> create database test;
2022-05-06 09:39:22,889 INFO [bed6ef3e-bf70-4009-8a8a-4d3f03371005 main] conf.HiveConf: Using the default value passed in for log id: bed6ef3e-bf70-4009-8a8a-4d3f03371005
...
OK
2022-05-06 09:39:23,407 INFO [bed6ef3e-bf70-4009-8a8a-4d3f03371005 main] ql.Driver: OK
2022-05-06 09:39:23,407 INFO [bed6ef3e-bf70-4009-8a8a-4d3f03371005 main] ql.Driver: Concurrency mode is disabled, not creating a lock manager
Time taken: 0.475 seconds

# Show databases
hive (default)> show databases;
...
2022-05-06 09:39:42,944 INFO [bed6ef3e-bf70-4009-8a8a-4d3f03371005 main] Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
2022-05-06 09:39:42,984 INFO [bed6ef3e-bf70-4009-8a8a-4d3f03371005 main] mapred.FileInputFormat: Total input files to process : 1
2022-05-06 09:39:43,009 INFO [bed6ef3e-bf70-4009-8a8a-4d3f03371005 main] exec.ListSinkOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_LIST_SINK_0:2,
default
test
Time taken: 0.158 seconds, Fetched: 2 row(s)

# Quit the CLI
hive (default)> quit;
2022-05-06 09:43:35,048 INFO [main] conf.HiveConf: Using the default value passed in for log id: bed6ef3e-bf70-4009-8a8a-4d3f03371005
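To exercise the metastore and HDFS a bit further, you can also create a table in the new database and query it. A small sketch (the student table and its columns are made up for illustration; the insert launches a MapReduce job, so the cluster must be running):

use test;
create table student(id int, name string);
insert into student values (1, 'zhangsan');   -- runs a MapReduce job
select * from student;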
As shown above, running the Hive CLI produces a lot of INFO-level log output, so it helps to raise the log level to WARN, as follows:
cd $HIVE_HOME/conf
cat > log4j.properties <<EOL
log4j.rootLogger=WARN, CA
log4j.appender.CA=org.apache.log4j.ConsoleAppender
log4j.appender.CA.layout=org.apache.log4j.PatternLayout
log4j.appender.CA.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
EOL
Test again: the INFO-level log output is gone.
[hadoop@node2 conf]$ hive
...
hive (default)> show databases;
OK
database_name
default
test
Time taken: 0.451 seconds, Fetched: 2 row(s)
hive (default)> quit;
[hadoop@node2 conf]$
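Since hive-site.xml above binds the HiveServer2 Thrift service to node2, a natural follow-up is to start HiveServer2 and connect with Beeline. A sketch, assuming the default Thrift port 10000 (HiveServer2 can take a minute to accept connections, and if the connection is rejected with an impersonation error, Hadoop's proxyuser settings in core-site.xml need to allow the hadoop user):

# Start HiveServer2 in the background
nohup hive --service hiveserver2 > hiveserver2.log 2>&1 &

# Connect with Beeline once the service is up
beeline -u jdbc:hive2://node2:10000 -n hadoop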
Done. Enjoy it!