1. Prerequisites:
Hadoop, ZooKeeper, HBase, and a MySQL database are already deployed.
2. Download Hive:
http://archive.cloudera.com/cdh4/cdh/4/ — the version used here is hive-0.10.0-cdh4.4.0.tar.gz.
CDH is the Hadoop distribution provided by Cloudera.
3. Extract the software
[hadoop@VM6 hadoop]$ tar -xvf hive-0.10.0-cdh4.4.0.tar.gz
[hadoop@VM6 hadoop]$ ln -s hive-0.10.0-cdh4.4.0 hive
4. Set environment variables in hive-env.sh
[hadoop@vm6 conf]$ pwd
/app/hadoop/hive/conf
[hadoop@vm6 conf]$ vi hive-env.sh
export JAVA_HOME=/app/hadoop/jdk    -- append this line: the JDK directory
HADOOP_HOME=${bin}/../../hadoop     -- the Hadoop installation directory
5. Configure hive-site.xml
[hadoop@cnsz032232 conf]$ pwd
/app/hadoop/hive/conf
[hadoop@cnsz032232 conf]$ vi hive-site.xml
All of the <property> entries below go inside the file's <configuration> root element.
<!--
MySQL connection settings for the metastore:
-->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3888/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive</value>
  <description>username to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>xxxxx</value>
  <description>password to use against metastore database</description>
</property>
<!--
HBase cluster nodes: the ZooKeeper quorum of the HBase cluster that Hive will access.
-->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>CNSZ032232,CNSZ032233,CNSZ032234,CNSZ032235,CNSZ032236</value>
</property>
<!--
Default filesystem for Hive. Without this setting, Hive fails with:
java.io.FileNotFoundException: File does not exist: hdfs://VM6:30000/app/hadoop/hive-0.10.0-cdh4.4.0/lib/hive-builtins-0.10.0-cdh4.4.0.jar
-->
<property>
  <name>fs.default.name</name>
  <value>file:///</value>
</property>
<!--
Location of the default Hive warehouse database.
-->
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>hdfs://CNSZ032232:30000/user/hive/warehouse</value>
  <description>location of default database for the warehouse</description>
</property>
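Hand-edited XML is a frequent source of Hive startup failures, so it is worth checking that hive-site.xml is well-formed before launching. A minimal sketch using Python's standard library (the inline sample stands in for the real file; point `ET.parse()` at your own /app/hadoop/hive/conf/hive-site.xml instead):

```python
import xml.etree.ElementTree as ET

# Inline stand-in for hive-site.xml; for the real file use
# root = ET.parse("/app/hadoop/hive/conf/hive-site.xml").getroot()
sample = """<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3888/hive?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>CNSZ032232,CNSZ032233,CNSZ032234,CNSZ032235,CNSZ032236</value>
  </property>
</configuration>"""

root = ET.fromstring(sample)  # raises ParseError if the XML is malformed
for prop in root.findall("property"):
    print(prop.findtext("name"), "=", prop.findtext("value"))
```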
6. Copy the jars under HBase's lib directory into Hive's lib directory. Hive needs the HBase class libraries to access HBase.
[hadoop@cnsz032232 hadoop]$ pwd
/wls/hadoop/hadoop
[hadoop@cnsz032232 hadoop]$ cp hbase/lib/* hive/lib
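Copying every HBase jar works, but it can shadow versions Hive already ships. A more conservative sketch copies only the HBase and ZooKeeper jars; the glob patterns and the HBASE_HOME/HIVE_HOME defaults are assumptions based on this article's layout:

```shell
#!/bin/sh
# Copy only the HBase and ZooKeeper jars into Hive's lib directory.
# Defaults follow this article's paths; override via the environment.
HBASE_HOME=${HBASE_HOME:-/app/hadoop/hbase}
HIVE_HOME=${HIVE_HOME:-/app/hadoop/hive}
for jar in "$HBASE_HOME"/lib/hbase-*.jar "$HBASE_HOME"/lib/zookeeper-*.jar; do
  if [ -e "$jar" ]; then
    cp "$jar" "$HIVE_HOME/lib/"
  fi
done
```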
7. Verify the installation
[hadoop@cnsz032232 hive]$ pwd
/app/hadoop/hive
[hadoop@cnsz032232 hive]$ bin/hive shell    -- log in to Hive
hive>show databases;
OK
default
Time taken: 6.429 seconds
hive>
8. Verify access through an HBase-backed table
Create the table hive_tab in the HBase shell:
create 'hive_tab','username','userdesc'
put 'hive_tab','1','username','peter'
put 'hive_tab','1','userdesc','worker'
put 'hive_tab','2','username','james'
put 'hive_tab','2','userdesc','president'
put 'hive_tab','3','username','oscar'
put 'hive_tab','3','userdesc','diamond'
Create the corresponding external table in Hive:
create external table hive_tab
(key int, username map<string,string>, userdesc map<string,string>)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
with serdeproperties ("hbase.columns.mapping" = ":key,username:,userdesc:")
tblproperties ("hbase.table.name" = "hive_tab");
Query the table hive_tab:
hive> select * from hive_tab;
OK
1 {"":"peter"} {"":"worker"}
2 {"":"james"} {"":"president"}
3 {"":"oscar"} {"":"diamond"}
Time taken: 0.148 seconds
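The empty-string keys in the results come from the puts above: each value was written to a column family with an empty qualifier, and Hive exposes a whole family mapped with a trailing `:` as a map keyed by qualifier. A toy model of that mapping in plain Python (no Hive or HBase involved; the cell tuples mirror the puts above):

```python
# Each HBase cell is modeled as (row, family, qualifier, value).
# The puts above used a bare family name, so the qualifier is "".
cells = [
    ("1", "username", "", "peter"), ("1", "userdesc", "", "worker"),
    ("2", "username", "", "james"), ("2", "userdesc", "", "president"),
    ("3", "username", "", "oscar"), ("3", "userdesc", "", "diamond"),
]

# Group cells into row -> family -> {qualifier: value}, as Hive's
# family-to-map exposure does.
rows = {}
for row, family, qualifier, value in cells:
    rows.setdefault(row, {}).setdefault(family, {})[qualifier] = value

print(rows["1"]["username"])  # {'': 'peter'}
```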
hive> select count(*) from hive_tab;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
WARNING: org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties files.
Execution log at: /tmp/hadoop/hadoop_20161228150202_b0da2f9a-0b8d-499d-986b-030a3c7d6d10.log
Job running in-process (local Hadoop)
2016-12-28 15:02:56,245 null map = 100%, reduce = 100%
Ended Job = job_local1860113246_0001
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
OK
3
Time taken: 10.738 seconds
From the "ITPUB Blog", link: http://blog.itpub.net/25105315/viewspace-2131531/ — please credit the source when reposting; otherwise legal liability may be pursued.