Notes on setting up a local Hive test environment.
1. Download the required packages; the prebuilt binary distributions are fine:
- Hive
- Hadoop
- MySQL
- JDK
- JDBC connector (MySQL Connector/J)
2. Move them to any directory on your machine (I chose /opt/local/), then extract the archives and rename the folders.
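For reference, a minimal sketch of this step; the archive filenames are assumptions inferred from the versions used later in this post:
$ cd /opt/local
$ tar -zxf hadoop-2.10.0.tar.gz
$ tar -zxf apache-hive-2.3.7-bin.tar.gz && mv apache-hive-2.3.7-bin hive-2.3.7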
3. Configure Hadoop
- Edit the configuration files
core-site.xml
<configuration>
    <!-- Address of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <!-- Directory for files Hadoop generates at runtime; default: /tmp/hadoop-${user.name} -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/local/hadoop-2.10.0/data/tmp</value>
    </property>
</configuration>
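Hadoop normally creates hadoop.tmp.dir on demand, but creating it up front avoids permission surprises:
$ mkdir -p /opt/local/hadoop-2.10.0/data/tmp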
hdfs-site.xml
<configuration>
    <!-- HDFS replication factor; the default is 3, but 1 is enough for a pseudo-distributed setup -->
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
- Format HDFS
In /opt/local/hadoop-2.10.0/bin, run:
./hdfs namenode -format
- Start the NameNode
In /opt/local/hadoop-2.10.0/sbin, run:
bash hadoop-daemon.sh start namenode
- Start the DataNode
In the same /opt/local/hadoop-2.10.0/sbin directory, run:
bash hadoop-daemon.sh start datanode
- Check that both daemons are running:
jps
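If both daemons came up, jps should show something like this (PIDs will differ):
$ jps
21312 NameNode
21398 DataNode
21456 Jps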
- Visit the Hadoop web UI at http://localhost:50070/
- Run the MapReduce word-count example as a smoke test:
$ mkdir input
$ cd input
$ touch f{1..3}.txt
$ echo "hello hadoop" > f1.txt
$ echo "hello java" > f2.txt
$ echo "hello world" > f3.txt
$ hadoop fs -mkdir -p /hadoop_test/input/
$ hadoop fs -put input/* /hadoop_test/input/
$ yarn jar hadoop-mapreduce-examples-2.10.0.jar wordcount /hadoop_test/input/ /hadoop_test/output/
$ hadoop fs -cat /hadoop_test/output/part-r-00000
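Given the three input files above, the word counts are deterministic, so the final cat should print:
hadoop	1
hello	3
java	1
world	1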
4. Install MySQL
See https://blog.csdn.net/WYF209594/article/details/105807430 for details.
The only difference: after logging in with the generated temporary root password, MySQL prompts you to reset it before running anything else.
The statement to change the password:
ALTER USER USER() IDENTIFIED BY '123456';
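For context, the statement is issued from the mysql client after logging in with the temporary password:
$ mysql -uroot -p    # enter the temporary password from the install log
mysql> ALTER USER USER() IDENTIFIED BY '123456';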
5. Install Hive
1) Configure Hive environment variables
Add the following to ~/.zshrc:
export HIVE_HOME='/opt/local/hive-2.3.7'
export PATH=$PATH:$HIVE_HOME/bin
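Reload the shell configuration so the variables take effect, and sanity-check the result:
$ source ~/.zshrc
$ echo $HIVE_HOME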
2) Configure hive-env.sh
Nothing special here:
export HADOOP_HOME=/opt/local/hadoop-2.10.0
export HIVE_CONF_DIR=/opt/local/hive-2.3.7/conf
export HIVE_AUX_JARS_PATH=/opt/local/hive-2.3.7/lib
3) Copy the JDBC connector jar into Hive's lib directory:
/opt/local/hive-2.3.7/lib/mysql-connector-java-8.0.21.jar
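A sketch of the copy, assuming the connector was downloaded to ~/Downloads (adjust the source path to wherever yours landed):
$ cp ~/Downloads/mysql-connector-java-8.0.21.jar /opt/local/hive-2.3.7/lib/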
4) Configure hive-site.xml
The key point: use MySQL as the storage backend for Hive's metadata.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <!-- The default warehouse location is on HDFS under /user/hive/warehouse -->
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/hive/warehouse</value>
        <!-- create this directory on HDFS first -->
        <description>location of default database for the warehouse</description>
    </property>
    <!-- Directory where Hive stores the MapReduce execution plans for each stage, as well as intermediate results -->
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/tmp/hive</value>
        <!-- create this directory on HDFS first -->
    </property>
    <property>
        <name>hive.scratch.dir.permission</name>
        <value>777</value>
        <description>The permission for the user specific scratch directories that get created.</description>
    </property>
    <!-- Used when Hive runs in local mode -->
    <property>
        <name>hive.exec.local.scratchdir</name>
        <value>/opt/local/hive-2.3.7/tmp/hive/root</value>
        <!-- create this local directory first -->
        <description>Local scratch space for Hive jobs</description>
    </property>
    <!-- Temporary local directory for downloaded remote resources -->
    <property>
        <name>hive.downloaded.resources.dir</name>
        <value>/opt/local/hive-2.3.7/tmp/resources</value>
        <!-- create this local directory first -->
        <description>Temporary local directory for added resources in the remote file system.</description>
    </property>
    <!-- Point the metastore at MySQL -->
    <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true&amp;useSSL=false</value>
        <!-- in XML the & must be escaped as &amp; -->
        <description>JDBC connect string for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.cj.jdbc.Driver</value>
        <!-- Connector/J 8.x renamed the driver class; com.mysql.jdbc.Driver is deprecated -->
        <description>Driver class name for a JDBC metastore</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>root</value>
        <description>username to use against metastore database</description>
    </property>
    <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>123456</value>
        <description>password to use against metastore database</description>
    </property>
    <!-- Show the current database in the prompt and column headers in query results -->
    <property>
        <name>hive.cli.print.header</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.cli.print.current.db</name>
        <value>true</value>
    </property>
    <property>
        <name>hive.metastore.schema.verification</name>
        <value>false</value>
    </property>
</configuration>
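As the inline comments note, the scratch and warehouse directories should exist before the first run. A sketch, with the 777 mirroring hive.scratch.dir.permission above:
$ hadoop fs -mkdir -p /hive/warehouse /tmp/hive
$ hadoop fs -chmod 777 /tmp/hive
$ mkdir -p /opt/local/hive-2.3.7/tmp/hive/root /opt/local/hive-2.3.7/tmp/resources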
That's it. The first time you start Hive, it will automatically create the tables that hold its metadata in MySQL, under the metastore database.
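If the tables do not appear on their own (Hive 2.x generally expects the schema to be initialized explicitly), the bundled schematool can create them; afterwards a quick query confirms everything is wired up:
$ schematool -dbType mysql -initSchema
$ hive
hive (default)> show databases;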