Environment
- Hadoop-2.6.0-cdh5.7.0
- JDK1.7
- MySQL5.6
- mysql-connector-java-5.1.45
Download and extract the installation package
Download URL: http://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.7.0.tar.gz
Extract: tar -zxvf hive-1.1.0-cdh5.7.0.tar.gz
Configure environment variables
hadoop:hadoop:/home/hadoop:>vi .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export HIVE_HOME=/home/hadoop/app/hive-1.1.0-cdh5.7.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PATH
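Changes to .bash_profile only take effect for new login shells; to use the variables in the current session, either reload the file with `source ~/.bash_profile` or apply the exports directly. A quick sanity check, repeating the same paths for illustration:

```shell
# Apply the exports to the current shell session
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export HIVE_HOME=/home/hadoop/app/hive-1.1.0-cdh5.7.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HIVE_HOME/bin:$PATH

# Sanity check: PATH should now contain Hive's bin directory
echo "$PATH" | grep -o "$HIVE_HOME/bin"   # prints /home/hadoop/app/hive-1.1.0-cdh5.7.0/bin
```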
Edit the configuration files
- hive-env.sh
hadoop:hadoop:/home/hadoop/app/hive-1.1.0-cdh5.7.0/conf:>vi hive-env.sh
# else
# export HADOOP_OPTS="$HADOOP_OPTS -XX:NewRatio=12 -Xms10m -XX:MaxHeapFreeRatio=40 -XX:MinHeapFreeRatio=15 -XX:-UseGCOverheadLimit"
# fi
# fi
# The heap size of the jvm started by hive shell script can be controlled via:
#
# export HADOOP_HEAPSIZE=1024
#
# Larger heap size may be required when running queries over large number of files or partitions.
# By default hive shell scripts use a heap size of 256 (MB). Larger heap size would also be
# appropriate for hive server (hwi etc).
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
# Hive Configuration Directory can be controlled by:
- hive-site.xml (MySQL should have a root user with the password 123456; this file does not ship with Hive, so simply create it)
hadoop:hadoop:/home/hadoop/app/hive-1.1.0-cdh5.7.0/conf:>vi hive-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost:3306/hivedb?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
</configuration>
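Because of createDatabaseIfNotExist=true, Hive creates the hivedb metastore database in MySQL on first start, so no manual CREATE DATABASE is needed. If the root account has not yet been granted access with that password, a grant along these lines sets it up (MySQL 5.6 syntax; a sketch, adjust user and host to your setup):

```shell
# Grant the metastore account access to hivedb (prompts for the current root password)
mysql -uroot -p -e "GRANT ALL PRIVILEGES ON hivedb.* TO 'root'@'localhost' IDENTIFIED BY '123456'; FLUSH PRIVILEGES;"
```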
MySQL JDBC driver (mysql-connector-java-5.1.45)
- Move the driver jar into Hive's lib directory
hadoop:hadoop:/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib:>mv ../../../software/mysql-connector-java-5.1.45-bin.jar ../lib/
Start Hive
- Start Hadoop
hadoop:hadoop:/home/hadoop:>start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
18/01/02 22:18:38 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop]
hadoop: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop.out
hadoop: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop.out
18/01/02 22:18:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-resourcemanager-hadoop.out
hadoop: starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/yarn-hadoop-nodemanager-hadoop.out
hadoop:hadoop:/home/hadoop:>jps
4554 ResourceManager
4761 Jps
4378 SecondaryNameNode
4120 NameNode
4652 NodeManager
4229 DataNode
- Check that MySQL is running
hadoop:mysqladmin:/usr/local/mysql:>service mysql status
MySQL running (5671) [ OK ]
- Start Hive
hadoop:hadoop:/home/hadoop:>hive
which: no hbase in (/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/bin:/home/hadoop/app/hadoop-2.6.0-cdh5.7.0/sbin:/home/hadoop/app/hive-1.1.0-cdh5.7.0/bin:/usr/lib64/qt-3.3/bin:/usr/java/jdk1.7.0_80/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin)
Logging initialized using configuration in jar:file:/home/hadoop/app/hive-1.1.0-cdh5.7.0/lib/hive-common-1.1.0-cdh5.7.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive>
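A successful first start also initializes the metastore schema in MySQL. One way to confirm this, assuming the hivedb name and credentials from hive-site.xml above:

```shell
# Metastore tables such as TBLS, DBS and COLUMNS_V2 should now exist
mysql -uroot -p123456 -e "SHOW TABLES;" hivedb
```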
Quick test
- Create a table named helloword
hive> create table helloword(id int,name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
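The load step below expects a tab-separated file at /home/hadoop/data/hello; the delimiter must match the '\t' declared on the table. A minimal way to produce such a file (the rows here are just a sample):

```shell
# Create the data directory and a tab-separated sample file
mkdir -p /home/hadoop/data
printf '1\tspark\n2\thello\n3\ttao\n' > /home/hadoop/data/hello
cat /home/hadoop/data/hello
```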
- Load data
hive> load data local inpath '/home/hadoop/data/hello' into table helloword;
Loading data to table default.helloword
Table default.helloword stats: [numFiles=1, totalSize=73]
OK
Time taken: 3.388 seconds
- Query
hive> select * from helloword;
OK
1 spark
2 hello
3 tao
4 bao
5 hello
6 i
7 am
8 spark
9 to
10 hadoop
Time taken: 0.87 seconds, Fetched: 10 row(s)
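LOAD DATA LOCAL INPATH copies the file into the table's directory in HDFS, so the raw rows can also be inspected there directly (assuming Hive's default warehouse location, /user/hive/warehouse):

```shell
# Managed tables in the default database live under the warehouse root
hdfs dfs -ls /user/hive/warehouse/helloword/
hdfs dfs -cat /user/hive/warehouse/helloword/hello
```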