1. Prerequisites:
Hadoop, ZooKeeper, HBase, and a MySQL database are already deployed.
2. Download Hive:
http://archive.cloudera.com/cdh4/cdh/4/ — the version used here is hive-0.10.0-cdh4.4.0.tar.gz.
CDH is the Hadoop distribution provided by Cloudera.
3. Extract the software
[hadoop@VM6 hadoop]$ tar -xvf hive-0.10.0-cdh4.4.0.tar.gz
[hadoop@VM6 hadoop]$ ln -s hive-0.10.0-cdh4.4.0 hive
4. Set environment variables in hive-env.sh
[hadoop@vm6 conf]$ pwd
/app/hadoop/hive/conf
[hadoop@vm6 conf]$ vi hive-env.sh
export JAVA_HOME=/app/hadoop/jdk    -- append this line: the JDK directory
HADOOP_HOME=${bin}/../../hadoop     -- the Hadoop installation directory
5. Configure hive-site.xml
[hadoop@cnsz032232 conf]$ pwd
/app/hadoop/hive/conf
[hadoop@cnsz032232 conf]$ vi hive-site.xml
All of the <property> entries below go inside the file's <configuration> root element.
<!--
MySQL connection settings for the metastore:
-->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3888/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>xxxxx</value>
  <description>password to use against metastore database</description>
</property>
<!--
HBase cluster nodes: the ZooKeeper quorum of the HBase cluster that Hive will access.
-->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>CNSZ032232,CNSZ032233,CNSZ032234,CNSZ032235,CNSZ032236</value>
</property>
<!--
Default filesystem for Hive. Without this setting, Hive fails with:
java.io.FileNotFoundException: File does not exist: hdfs://VM6:30000/app/hadoop/hive-0.10.0-cdh4.4.0/lib/hive-builtins-0.10.0-cdh4.4.0.jar
-->
<property>
  <name>fs.default.name</name>
  <value>file:///</value>
</property>
<!--
Location of the default Hive warehouse database.
-->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://CNSZ032232:30000/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
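Hand-edited XML is a frequent source of Hive startup failures, so it is worth checking that hive-site.xml is well-formed before launching. A minimal sketch using Python's standard library (the inline sample stands in for the real file; point `ET.parse()` at your own /app/hadoop/hive/conf/hive-site.xml instead):

```python
import xml.etree.ElementTree as ET

# Inline stand-in for hive-site.xml; for the real file use
# root = ET.parse("/app/hadoop/hive/conf/hive-site.xml").getroot()
sample = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3888/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>CNSZ032232,CNSZ032233,CNSZ032234,CNSZ032235,CNSZ032236</value>
  </property>
</configuration>"""

root = ET.fromstring(sample)  # raises ParseError if the XML is malformed
for prop in root.findall("property"):
    print(prop.findtext("name"), "=", prop.findtext("value"))
```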
6. Copy the jars under HBase's lib directory into Hive's lib directory. Hive needs the HBase class libraries to access HBase.
[hadoop@cnsz032232 hadoop]$ pwd
/wls/hadoop/hadoop
[hadoop@cnsz032232 hadoop]$ cp hbase/lib/* hive/lib
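Copying every HBase jar works, but it can shadow versions Hive already ships. A more conservative sketch copies only the HBase and ZooKeeper jars; the glob patterns and the HBASE_HOME/HIVE_HOME defaults are assumptions based on this article's layout:

```shell
#!/bin/sh
# Copy only the HBase and ZooKeeper jars into Hive's lib directory.
# Defaults follow this article's paths; override via the environment.
HBASE_HOME=${HBASE_HOME:-/app/hadoop/hbase}
HIVE_HOME=${HIVE_HOME:-/app/hadoop/hive}
for jar in "$HBASE_HOME"/lib/hbase-*.jar "$HBASE_HOME"/lib/zookeeper-*.jar; do
  if [ -e "$jar" ]; then
    cp "$jar" "$HIVE_HOME/lib/"
  fi
done
```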
7. Verify the installation
[hadoop@cnsz032232 hive]$ pwd
/app/hadoop/hive
[hadoop@cnsz032232 hive]$ bin/hive shell    -- log in to Hive
hive>show databases;
OK
default
Time taken: 6.429 seconds
hive>
8. Verify access through an HBase-backed table
Create the table hive_tab in the HBase shell:
create 'hive_tab','username','userdesc'
put 'hive_tab','1','username','peter'
put 'hive_tab','1','userdesc','worker'
put 'hive_tab','2','username','james'
put 'hive_tab','2','userdesc','president'
put 'hive_tab','3','username','oscar'
put 'hive_tab','3','userdesc','diamond'
Create the corresponding external table in Hive:
create external table hive_tab
(key int, username map<string,string>, userdesc map<string,string>)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ("hbase.columns.mapping" = ":key,username:,userdesc:")
tblproperties ("hbase.table.name" = "hive_tab");
Query the table hive_tab:
hive> select * from hive_tab;
OK
1 {"":"peter"} {"":"worker"}
2 {"":"james"} {"":"president"}
3 {"":"oscar"} {"":"diamond"}
Time taken: 0.148 seconds
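The empty-string keys in the results come from the puts above: each value was written to a column family with an empty qualifier, and Hive exposes a whole family mapped with a trailing `:` as a map keyed by qualifier. A toy model of that mapping in plain Python (no Hive or HBase involved; the cell tuples mirror the puts above):

```python
# Each HBase cell is modeled as (row, family, qualifier, value).
# The puts above used a bare family name, so the qualifier is "".
cells = [
    ("1", "username", "", "peter"), ("1", "userdesc", "", "worker"),
    ("2", "username", "", "james"), ("2", "userdesc", "", "president"),
    ("3", "username", "", "oscar"), ("3", "userdesc", "", "diamond"),
]

# Group cells into row -> family -> {qualifier: value}, as Hive's
# family-to-map exposure does.
rows = {}
for row, family, qualifier, value in cells:
    rows.setdefault(row, {}).setdefault(family, {})[qualifier] = value

print(rows["1"]["username"])  # {'': 'peter'}
```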
hive> select count(*) from hive_tab;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hadoop/hadoop_20161228150202_b0da2f9a-0b8d-499d-986b-030a3c7d6d10.log
Job running in-process (local Hadoop)
2016-12-28 15:02:56,245 null map = 100%, reduce = 100%
Ended Job = job_local1860113246_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
3
Time taken: 10.738 seconds
From the "ITPUB Blog", link: http://blog.itpub.net/25105315/viewspace-2131531/ — please credit the source when reposting; otherwise legal liability may be pursued.