Integrating apache-hive-1.2.1 with hbase-1.2.2 (pseudo-distributed)

My environment: Hadoop 2.6.0 in pseudo-distributed mode, HBase in pseudo-distributed mode
Reference: HBase: The Definitive Guide, p. 240
1. Start Hadoop and HBase
2. Download apache-hive-1.2.1
3. Edit hive-env.sh under Hive's conf directory so Hive can locate the Hadoop install plus the HBase jars and configuration (HIVE_AUX_JARS_PATH puts the HBase client jars on Hive's classpath, and the HBase conf directory supplies the ZooKeeper quorum):
# Set HADOOP_HOME to point to a specific hadoop install directory
HADOOP_HOME=/home/hadoop/hadoop
HBASE_HOME=/home/hadoop/hbase-1.2.2

# Hive Configuration Directory can be controlled by:
# export HIVE_CONF_DIR=
export HIVE_CLASSPATH=/home/hadoop/hbase-1.2.2/conf

# Folder containing extra libraries required for hive compilation/execution can be controlled by:
export HIVE_AUX_JARS_PATH=/home/hadoop/hbase-1.2.2/lib
4. Start Hive

Note: when creating an HBase-backed table through Hive, if the error below appears, you need to recompile hive-hbase-handler-1.2.1.jar and replace the original jar under hive/lib (a rebuild sketch follows the first failure in the transcript):
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.HTableDescriptor.addFamily(Lorg/apache/hadoop/hbase/HColumnDescriptor;)V

Session transcript:
hadoop@ubuntu:~/apache-hive-1.2.1-bin/bin$ ./hive
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/spark-1.6.1-bin-hadoop2.6/lib/spark-assembly-1.6.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-1.2.2/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Logging initialized using configuration in jar:file:/home/hadoop/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
hive> create table pokes(foo int,bar string);
OK
Time taken: 3.432 seconds
hive> load data local inpath '/home/hadoop/apache-hive-1.2.1-bin/examples/files/kv1.txt' overwrite into table pokes;
Loading data to table default.pokes
Table default.pokes stats: [numFiles=1, numRows=0, totalSize=5812, rawDataSize=0]
OK
Time taken: 1.353 seconds
hive> select * from pokes;
OK
238    val_238
86    val_86
311    val_311
27    val_27
165    val_165
409    val_409
...
Time taken: 1.143 seconds, Fetched: 500 row(s)
hive> create table hbase_table_1(key int,value string)
    > stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > with serdeproperties("hbase.columns.mapping"=":key,cf1:val")
    > tblproperties("hbase.table.name"="hbase_hive_t1");
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org.apache.hadoop.hbase.HTableDescriptor.addFamily(Lorg/apache/hadoop/hbase/HColumnDescriptor;)V

This is a binary-incompatibility error, widely reported online: hive-hbase-handler-1.2.1 was compiled against an older HBase (0.98), where HTableDescriptor.addFamily returned void (the trailing V in the signature above), while HBase 1.x changed the method to return HTableDescriptor, so the method Hive was compiled against no longer exists at runtime. Two workarounds are commonly suggested:
1. Switch to a newer Hive, e.g. 2.x. In my tests this did not resolve the problem.
2. Recompile hive-hbase-handler-1.2.1.jar against HBase 1.2.2 and replace the jar of the same name under hive/lib (this works).
A prebuilt jar can also be downloaded directly: http://download.csdn.net/download/gao634209276/9530079
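
A rough sketch of that rebuild, assuming Maven 3 and the apache-hive-1.2.1-src release; the hbase.hadoop2.version property and the hadoop-2 profile are how I recall Hive 1.2.1's root pom.xml, so verify them against your source tree:

# Rebuild hive-hbase-handler against the HBase 1.2.2 client (sketch)
wget https://archive.apache.org/dist/hive/hive-1.2.1/apache-hive-1.2.1-src.tar.gz
tar xzf apache-hive-1.2.1-src.tar.gz && cd apache-hive-1.2.1-src
# Point the HBase dependency at 1.2.2 in the root pom.xml
sed -i 's#<hbase.hadoop2.version>.*</hbase.hadoop2.version>#<hbase.hadoop2.version>1.2.2</hbase.hadoop2.version>#' pom.xml
# Build only the hbase-handler module (and what it depends on), skipping tests
mvn clean package -pl hbase-handler -am -DskipTests -Phadoop-2
# Swap the rebuilt jar into the Hive installation
cp hbase-handler/target/hive-hbase-handler-1.2.1.jar /home/hadoop/apache-hive-1.2.1-bin/lib/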

hive> create table hbase_table_1(key int,value string)
    >     stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    >      with serdeproperties("hbase.columns.mapping"=":key,cf1:val")
    >      tblproperties("hbase.table.name"="hbase_hive_t1");
OK
Time taken: 4.788 seconds
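
With the rebuilt handler in place the DDL succeeds, and the table should also be visible from the HBase side; a quick check from the HBase shell (table name per the tblproperties above):

hbase shell
hbase(main):001:0> list
hbase(main):002:0> describe 'hbase_hive_t1'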
hive> insert overwrite table hbase_table_1 select * from pokes;
Query ID = hadoop_20170117004636_520fee8b-9d6c-4b41-88a5-a58402e0b6af
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1484619043631_0001, Tracking URL = http://ubuntu:8088/proxy/application_1484619043631_0001/
Kill Command = /home/hadoop/hadoop/bin/hadoop job  -kill job_1484619043631_0001
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2017-01-17 00:47:53,388 Stage-0 map = 0%,  reduce = 0%
2017-01-17 00:48:21,381 Stage-0 map = 100%,  reduce = 0%, Cumulative CPU 6.54 sec
MapReduce Total cumulative CPU time: 6 seconds 540 msec
Ended Job = job_1484619043631_0001
MapReduce Jobs Launched:
Stage-Stage-0: Map: 1   Cumulative CPU: 7.34 sec   HDFS Read: 15889 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 340 msec
OK
Time taken: 108.485 seconds
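
The loaded rows can be spot-checked from the HBase shell; LIMIT caps the scan output:

hbase shell
hbase(main):001:0> scan 'hbase_hive_t1', {LIMIT => 5}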
hive> select count(*) from pokes;
Query ID = hadoop_20170117004939_099ed588-fbb4-4b9a-ac1c-1fb6259e7d11
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1484619043631_0002, Tracking URL = http://ubuntu:8088/proxy/application_1484619043631_0002/
Kill Command = /home/hadoop/hadoop/bin/hadoop job  -kill job_1484619043631_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-01-17 00:50:10,356 Stage-1 map = 0%,  reduce = 0%
2017-01-17 00:50:30,514 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.94 sec
2017-01-17 00:50:49,055 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.38 sec
MapReduce Total cumulative CPU time: 6 seconds 380 msec
Ended Job = job_1484619043631_0002
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 6.38 sec   HDFS Read: 12409 HDFS Write: 4 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 380 msec
OK
500
Time taken: 72.3 seconds, Fetched: 1 row(s)
hive> select count(*) from hbase_table_1;
Query ID = hadoop_20170117005103_2fa584c7-0c2f-4b40-bc86-093f01e35a00
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1484619043631_0003, Tracking URL = http://ubuntu:8088/proxy/application_1484619043631_0003/
Kill Command = /home/hadoop/hadoop/bin/hadoop job  -kill job_1484619043631_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-01-17 00:51:53,774 Stage-1 map = 0%,  reduce = 0%
2017-01-17 00:52:16,564 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 6.42 sec
2017-01-17 00:52:36,997 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 9.93 sec
MapReduce Total cumulative CPU time: 9 seconds 930 msec
Ended Job = job_1484619043631_0003
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 9.93 sec   HDFS Read: 13551 HDFS Write: 4 SUCCESS
Total MapReduce CPU Time Spent: 9 seconds 930 msec
OK
309
Time taken: 95.345 seconds, Fetched: 1 row(s)
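
The two counts differ (500 rows in pokes, 309 in hbase_table_1) because the Hive key column is mapped to the HBase row key (:key) and kv1.txt contains duplicate keys: rows sharing a key overwrite one another in HBase, leaving one row per distinct key, and kv1.txt holds 309 distinct keys. The HBase shell's own count should agree, reporting 309 row(s):

hbase shell
hbase(main):001:0> count 'hbase_hive_t1'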
hive> drop table pokes;
OK
Time taken: 3.374 seconds
hive> select * from pokes;
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'pokes'
hive> drop table hbase_table_1;
OK
Time taken: 4.64 seconds
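
Note that dropping hbase_table_1 also deletes hbase_hive_t1 on the HBase side: the table was created as a Hive-managed table, so Hive owns the underlying HBase table's lifecycle. To map an HBase table that already exists, and keep it when the Hive table is dropped, create an external table instead. A sketch, assuming hbase_hive_t1 and its cf1 column family already exist in HBase (hbase_table_ext is a hypothetical name):

hive> create external table hbase_table_ext(key int, value string)
    > stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > with serdeproperties("hbase.columns.mapping"=":key,cf1:val")
    > tblproperties("hbase.table.name"="hbase_hive_t1");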