Prerequisites: JDK, Hadoop, and MySQL installed
JDK 8+
Hadoop 2.7.7
Download and extract the packages
apache-hive-1.2.2-bin.tar.gz
apache-tez-0.9.1-bin.tar.gz
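Extract both archives into /home/hadoop/module. The paths used below assume the extracted directories are renamed to hive-1.2.2 and tez-0.9.1.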
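The later paths use hive-1.2.2 and tez-0.9.1 rather than the archive names, so a rename after extraction is implied. A minimal sketch of that step follows, assuming each archive unpacks to a single top-level directory; unpack_as is a hypothetical helper name, not part of Hive or Tez.

```shell
#!/usr/bin/env bash
# Sketch: unpack an archive into a target directory and rename the result.
# unpack_as is a hypothetical helper, not something shipped with Hive or Tez.
unpack_as() {
  local archive="$1" dest="$2" new_name="$3"
  local top
  # Name of the top-level directory inside the archive (first path component).
  top="$(tar -tzf "$archive" | head -n 1 | cut -d/ -f1)" || return 1
  mkdir -p "$dest" || return 1
  tar -xzf "$archive" -C "$dest" || return 1
  # Rename only if the names differ and the target does not exist yet.
  if [ "$top" != "$new_name" ] && [ ! -e "$dest/$new_name" ]; then
    mv "$dest/$top" "$dest/$new_name"
  fi
}
```

For example: unpack_as apache-hive-1.2.2-bin.tar.gz /home/hadoop/module hive-1.2.2, and likewise for the Tez archive with tez-0.9.1.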
Install Hive
1. Configure hive-env.sh
export HADOOP_HOME=/home/hadoop/module/hadoop-2.7.7
export HIVE_CONF_DIR=/home/hadoop/module/hive-1.2.2/conf
export TEZ_HOME=/home/hadoop/module/tez-0.9.1
export TEZ_JARS=""
for jar in `ls $TEZ_HOME |grep jar`; do
export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/$jar
done
for jar in `ls $TEZ_HOME/lib`; do
export TEZ_JARS=$TEZ_JARS:$TEZ_HOME/lib/$jar
done
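# I added LZO compression here; build and install hadoop-lzo yourself if you need it, otherwise remove the LZO jar from the line below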
export HIVE_AUX_JARS_PATH=$HADOOP_HOME/share/hadoop/common/hadoop-lzo-0.4.21-SNAPSHOT.jar$TEZ_JARS
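The two loops above build a colon-separated jar list by parsing ls output. As an alternative sketch, a shell glob does the same job without depending on ls; join_jars is a hypothetical helper name. Note one difference: it picks up only files ending in .jar, whereas the original loops take any name containing "jar" in $TEZ_HOME and every file in $TEZ_HOME/lib.

```shell
#!/usr/bin/env bash
# Sketch: join every *.jar found directly under the given directories with ':'.
# join_jars is a hypothetical helper name.
join_jars() {
  local result="" dir jar
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      [ -e "$jar" ] || continue   # skip the literal pattern when nothing matches
      result="$result:$jar"
    done
  done
  # The result keeps a leading ':' so it can be appended to another jar path.
  printf '%s\n' "$result"
}
```

In hive-env.sh this could be used as: export TEZ_JARS="$(join_jars "$TEZ_HOME" "$TEZ_HOME/lib")"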
2. Add the MySQL driver jar mysql-connector-java-5.1.47.jar to /home/hadoop/module/hive-1.2.2/lib
3. Create hive-site.xml in /home/hadoop/module/hive-1.2.2/conf
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://{your-mysql-host}:3306/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>root</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<!-- Use Tez as Hive's execution engine -->
<property>
<name>hive.execution.engine</name>
<value>tez</value>
</property>
</configuration>
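Before moving on, it can help to confirm that the files from steps 1 to 3 are in place. The following is a rough sketch only; check_hive_setup is a hypothetical helper name, and it checks for file presence, not for correct contents.

```shell
#!/usr/bin/env bash
# Sketch: verify the files from steps 1-3 exist before starting Hive.
# check_hive_setup is a hypothetical helper name.
check_hive_setup() {
  local hive_home="$1" status=0 f
  for f in conf/hive-env.sh conf/hive-site.xml; do
    if [ ! -f "$hive_home/$f" ]; then
      echo "missing: $hive_home/$f"
      status=1
    fi
  done
  # Any mysql-connector-java jar in lib/ counts as the JDBC driver.
  if ! ls "$hive_home"/lib/mysql-connector-java-*.jar >/dev/null 2>&1; then
    echo "missing: MySQL JDBC driver in $hive_home/lib"
    status=1
  fi
  return "$status"
}
```

For example: check_hive_setup /home/hadoop/module/hive-1.2.2 prints one line per missing file and returns non-zero if anything is absent.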
Install Tez
1. Create tez-site.xml in /home/hadoop/module/hive-1.2.2/conf
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>tez.lib.uris</name>
<value>${fs.defaultFS}/share/tez/tez-0.9.1,${fs.defaultFS}/share/tez/tez-0.9.1/lib</value>
</property>
<property>
<name>tez.lib.uris.classpath</name>
<value>${fs.defaultFS}/share/tez/tez-0.9.1,${fs.defaultFS}/share/tez/tez-0.9.1/lib</value>
</property>
<property>
<name>tez.use.cluster.hadoop-libs</name>
<value>true</value>
</property>
<property>
<name>tez.history.logging.service.class</name>
<value>org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService</value>
</property>
</configuration>
2. Upload the Tez libraries to HDFS so YARN can share them
Run hadoop fs -mkdir -p /share/tez to create the shared directory
Run hadoop fs -put /home/hadoop/module/tez-0.9.1/ /share/tez to upload
Run hadoop fs -ls /share/tez to verify
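The three commands above can be wrapped into one function that stops at the first failure. This is a sketch under assumptions: upload_tez and the HADOOP_CMD override are hypothetical names introduced here so the function can be dry-run with a stub in place of the real hadoop binary.

```shell
#!/usr/bin/env bash
# Sketch: create the shared HDFS directory, upload the Tez directory, list it.
# HADOOP_CMD is a hypothetical override that defaults to the hadoop CLI.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

upload_tez() {
  local local_dir="$1" hdfs_dir="$2"
  "$HADOOP_CMD" fs -mkdir -p "$hdfs_dir" || return 1
  "$HADOOP_CMD" fs -put "$local_dir" "$hdfs_dir" || return 1
  # Listing the directory confirms the upload landed where tez.lib.uris expects it.
  "$HADOOP_CMD" fs -ls "$hdfs_dir"
}
```

For example: upload_tez /home/hadoop/module/tez-0.9.1 /share/tez. Because the whole local directory is uploaded, the files end up under /share/tez/tez-0.9.1, which matches the tez.lib.uris value in tez-site.xml.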
Test
1. Start Hive
[hadoop@saas-mid-02 hive-1.2.2]$ bin/hive
2. Create a table
hive> create table person(
id int,
name string);
OK
Time taken: 0.307 seconds
3. Insert data
hive> insert into person values(1,'laozhang');
If the following error appears:
Status: Failed
Application application_1582349588934_0001 failed 2 times due to AM Container for appattempt_1582349588934_0001_000002 exited with exitCode: -103
For more detailed output, check application tracking page:http://saas-mid-03:8088/cluster/app/application_1582349588934_0001Then, click on links to logs of each attempt.
Diagnostics: Container [pid=5962,containerID=container_1582349588934_0001_02_000001] is running beyond virtual memory limits. Current usage: 258.0 MB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container.
Edit mapred-site.xml in /home/hadoop/module/hadoop-2.7.7/etc/hadoop and add the following inside <configuration>:
<!-- Adjust the map and reduce task memory to match your machine's physical memory -->
<property>
<name>mapreduce.map.memory.mb</name>
<value>1536</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx1024M</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>3072</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx2560M</value>
</property>
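Edit yarn-site.xml in /home/hadoop/module/hadoop-2.7.7/etc/hadoop, add the following before the closing </configuration> tag, and restart YARN so the NodeManagers pick up the change:
<!-- Whether to kill a container when it exceeds its virtual memory limit -->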
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
<!-- Ratio of virtual to physical memory used when setting container memory limits -->
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
</configuration>
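The numbers in the error message follow from the ratio setting: YARN's default yarn.nodemanager.vmem-pmem-ratio is 2.1, so a 1 GB container is allowed roughly 2.1 GB of virtual memory, and the 2.7 GB actually used exceeded it. A small sketch of that arithmetic follows; vmem_limit_mb is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: virtual memory limit = physical allocation (MB) * vmem-pmem ratio.
# vmem_limit_mb is a hypothetical helper name.
vmem_limit_mb() {
  local pmem_mb="$1" ratio="$2"
  # awk handles the fractional ratio; the result is truncated to whole MB.
  awk -v p="$pmem_mb" -v r="$ratio" 'BEGIN { printf "%d\n", p * r }'
}

vmem_limit_mb 1024 2.1   # default ratio with a 1 GB container
vmem_limit_mb 1024 4     # ratio configured above
```

The first call gives 2150 MB (the 2.1 GB limit in the log); the second gives 4096 MB, which is above the 2.7 GB the container used.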
4. Query the data
hive> select * from person;
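The same check can be run without the interactive prompt by passing the statement to the Hive CLI with its -e option. This is a sketch only: run_query and the HIVE_CMD override are hypothetical names, and the default path assumes the install location used in this guide.

```shell
#!/usr/bin/env bash
# Sketch: run a single HiveQL statement non-interactively via `hive -e`.
# HIVE_CMD is a hypothetical override; the default assumes this guide's install path.
HIVE_CMD="${HIVE_CMD:-/home/hadoop/module/hive-1.2.2/bin/hive}"

run_query() {
  # Pass the whole statement as one argument so the shell does not split it.
  "$HIVE_CMD" -e "$1"
}
```

For example: run_query "select * from person;" to read the table back, or run_query "set hive.execution.engine;" to print the engine currently configured.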