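Overview
Apache HBase is the Hadoop database, a distributed, scalable big data store.
Use Apache HBase™ when you need random, real-time read/write access to big data. The project's goal is to host very large tables (billions of rows by millions of columns) on clusters of commodity hardware. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable builds on the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.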
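Features:
- Linear and modular scalability.
- Strictly consistent reads and writes.
- Automatic and configurable sharding of tables.
- Automatic failover support between RegionServers.
- Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables.
- Easy-to-use Java API for client access.
- Block cache and Bloom filters for real-time queries.
- Query predicate push-down via server-side filters.
- Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options.
- Extensible JRuby-based (JIRB) shell.
- Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX.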
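Distributed installation and deployment
Edit hbase-env.sh
export JAVA_HOME=/path/to/java    # replace with your own Java installation path
export HBASE_MANAGES_ZK=false    # do not use the ZooKeeper instance bundled with HBase
Edit hbase-site.xml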
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>192.168.8.1xx:2181,192.168.8.x1x:2181,192.168.8.xx1:2181</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://192.168.8.128:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.port</name>
<value>16000</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/zookeeper-3.4.10/data</value>
</property>
</configuration>
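A quick sanity check of the file can catch missing or misspelled property names before the cluster is started. The following is a minimal Python sketch and is not part of HBase: the function names load_properties and check_distributed_config, the sample host names (zk1, zk2, zk3, namenode), and the rule that hbase.rootdir should start with hdfs:// are assumptions that fit this guide's setup only.

```python
import xml.etree.ElementTree as ET

# Sample content in the same shape as hbase-site.xml.
# Host names are placeholders (assumptions), not values from this guide.
SAMPLE_SITE_XML = """
<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1:2181,zk2:2181,zk3:2181</value>
  </property>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
</configuration>
"""

# Properties this guide relies on for a distributed deployment.
REQUIRED = ("hbase.zookeeper.quorum", "hbase.rootdir", "hbase.cluster.distributed")


def load_properties(xml_text):
    """Return {name: value} for every <property> element in the config text."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def check_distributed_config(props):
    """Return a list of problems; an empty list means none were found."""
    problems = ["missing property: " + key for key in REQUIRED if key not in props]
    distributed = props.get("hbase.cluster.distributed")
    if distributed is not None and distributed.lower() != "true":
        problems.append("hbase.cluster.distributed should be true for a distributed cluster")
    rootdir = props.get("hbase.rootdir")
    if rootdir is not None and not rootdir.startswith("hdfs://"):
        problems.append("hbase.rootdir does not point at HDFS: " + rootdir)
    return problems


props = load_properties(SAMPLE_SITE_XML)
for problem in check_distributed_config(props) or ["no problems found"]:
    print(problem)
```

To check a real file, read its text and pass it to load_properties in place of SAMPLE_SITE_XML.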
1) Start and stop commands:
start-hbase.sh
stop-hbase.sh
2) HBase shell
COMMAND GROUPS:
Group name: general
Commands: status, version
Group name: ddl
Commands: alter, create, describe, disable,drop, enable, exists, is_disabled, is_enabled, list
Group name: dml
Commands: count, delete, deleteall, get,get_counter, incr, put, scan, truncate
Group name: tools
Commands: assign, balance_switch, balancer,close_region, compact, flush, major_compact, move, split, unassign, zk_dump
Group name: replication
Commands: add_peer, disable_peer,enable_peer, remove_peer, start_replication, stop_replication
3) Check the server status
status, or
status 'master'
1 active master, 2 backup masters, 1 servers, 2 dead, 5.0000 average load
4) Show the version
version
1.3.1, r930b9a55528fe45d8edce7af42fef2d35e77677a, Thu Apr 6 19:36:54 PDT 2017
5) List tables
hbase(main):011:0> list
list
TABLE
School
member
my_ns:Company
3 row(s) in 0.0090 seconds
=> ["School", "member", "my_ns:Company"]
6) Show the current user
whoami
hbase(main):013:0> whoami
root (auth:SIMPLE)
groups: root
7) Table structure
Column families: info, info2
row   | name | age | address  | phone    | high | wight |
00001 | Alex | 20  | shanghai | 88886666 | 180  | 160   | 2354431375
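8) Create a table
create 'table_name', 'column_family'
9) Full table scan
scan 'table_name'
Terms used in scan output: rowkey (the unique key of a row), timestamp (the version time of a cell), cell, column family, and column.
10) Insert data into a table
put 'table_name','rowkey','column_family:column_name','value'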
put 'student','1000000002','info:age','1'
0 row(s) in 0.0090 seconds
hbase(main):005:0> put 'student','1000000002','info:name','mia'
0 row(s) in 0.0100 seconds
hbase(main):006:0> put 'student','1000000002','info:adress','xiamen'
0 row(s) in 0.0060 seconds
hbase(main):007:0> scan 'student'
ROW COLUMN+CELL
1000000000 column=info:address, timestamp=1546092439298, value=shanghai
1000000000 column=info:age, timestamp=1546092414576, value=30
1000000000 column=info:name, timestamp=1546092292621, value=alex
1000000001 column=info:address, timestamp=1546092580547, value=shanghai
1000000001 column=info:age, timestamp=1546092629692, value=3
1000000001 column=info:name, timestamp=1546092597788, value=Amy
1000000002 column=info:adress, timestamp=1546092813710, value=xiamen
1000000002 column=info:age, timestamp=1546092771821, value=1
1000000002 column=info:name, timestamp=1546092787730, value=mia
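The scan output above lists cells by row key first and by column name second, regardless of the order in which they were put. The following is a toy Python model of that behaviour, a sketch rather than how HBase stores data: the class name TinyTable is hypothetical, it keeps only one version per cell, and it sorts Python strings, which matches byte order only for ASCII keys like the ones used here.

```python
import time


class TinyTable:
    """Toy in-memory stand-in for a single-version table (illustration only)."""

    def __init__(self):
        # row key -> {"family:qualifier": (timestamp_ms, value)}
        self.rows = {}

    def put(self, row, column, value, timestamp=None):
        """Store one cell; a later put to the same row and column replaces it."""
        ts = int(time.time() * 1000) if timestamp is None else timestamp
        self.rows.setdefault(row, {})[column] = (ts, value)

    def scan(self):
        """Yield (row, column, timestamp, value) ordered by row key, then column."""
        for row in sorted(self.rows):
            for column in sorted(self.rows[row]):
                ts, value = self.rows[row][column]
                yield row, column, ts, value


# Same puts as in the shell session, in the same order (age, name, adress).
table = TinyTable()
table.put("1000000002", "info:age", "1", timestamp=1546092771821)
table.put("1000000002", "info:name", "mia", timestamp=1546092787730)
table.put("1000000002", "info:adress", "xiamen", timestamp=1546092813710)

for row, column, ts, value in table.scan():
    print(f"{row} column={column}, timestamp={ts}, value={value}")
```

The printed order is info:adress, info:age, info:name, the same order the shell shows for row 1000000002.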
11) Describe a table
describe 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL =>
'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
1 row(s) in 0.0640 seconds
12) Filtered scan
scan 'student',{STARTROW=>'1000000002'}
ROW COLUMN+CELL
1000000002 column=info:adress, timestamp=1546092813710, value=xiamen
1000000002 column=info:age, timestamp=1546092771821, value=1
1000000002 column=info:name, timestamp=1546092787730, value=mia
1 row(s) in 0.2220 seconds
hbase(main):002:0> scan 'student',{STARTROW=>'1000000000',STOPROW=>'10000000001'}
ROW COLUMN+CELL
1000000000 column=info:address, timestamp=1546092439298, value=shanghai
1000000000 column=info:age, timestamp=1546092414576, value=30
1000000000 column=info:name, timestamp=1546092292621, value=alex
1 row(s) in 0.0180 seconds
hbase(main):003:0> scan 'student',{STARTROW=>'1000000000',STOPROW=>'1000000002'}
ROW COLUMN+CELL
1000000000 column=info:address, timestamp=1546092439298, value=shanghai
1000000000 column=info:age, timestamp=1546092414576, value=30
1000000000 column=info:name, timestamp=1546092292621, value=alex
1000000001 column=info:address, timestamp=1546092580547, value=shanghai
1000000001 column=info:age, timestamp=1546092629692, value=3
1000000001 column=info:name, timestamp=1546092597788, value=Amy
2 row(s) in 0.0240 seconds
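The three scans above are consistent with a range that includes STARTROW, excludes STOPROW, and compares row keys as byte strings rather than as numbers. That is why the 11-character stop key '10000000001' returns only row 1000000000: byte by byte, '1000000001' sorts after '10000000001'. A minimal Python sketch of this comparison follows; the function name scan_rows is hypothetical and the code only models the range check, not a real scan.

```python
def scan_rows(row_keys, start_row=None, stop_row=None):
    """Return the row keys in [start_row, stop_row), compared as byte strings."""
    start = None if start_row is None else start_row.encode("utf-8")
    stop = None if stop_row is None else stop_row.encode("utf-8")
    selected = []
    for key in sorted(k.encode("utf-8") for k in row_keys):
        if start is not None and key < start:
            continue  # before the inclusive start key
        if stop is not None and key >= stop:
            break  # the stop key itself is excluded
        selected.append(key.decode("utf-8"))
    return selected


ROWS = ["1000000000", "1000000001", "1000000002"]

print(scan_rows(ROWS, start_row="1000000002"))
print(scan_rows(ROWS, start_row="1000000000", stop_row="10000000001"))
print(scan_rows(ROWS, start_row="1000000000", stop_row="1000000002"))
```

The three calls return one, one, and two row keys, which matches the row counts reported by the shell.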
13) Alter table information
alter
14) Delete data from a table
hbase(main):009:0> deleteall 'student','1000000000'
0 row(s) in 0.0430 seconds
hbase(main):009:0> scan 'student'
ROW COLUMN+CELL
1000000001 column=info:address, timestamp=1546092580547, value=shanghai
1000000001 column=info:age, timestamp=1546092629692, value=3
1000000001 column=info:name, timestamp=1546092597788, value=Amy
1000000002 column=info:adress, timestamp=1546092813710, value=xiamen
1000000002 column=info:age, timestamp=1546092771821, value=1
1000000002 column=info:name, timestamp=1546092787730, value=mia
2 row(s) in 0.0190 seconds
15) Delete the data of a specific column
delete 'student','1000000001','info:name'
0 row(s) in 0.2140 seconds
hbase(main):002:0> scan 'student'
ROW COLUMN+CELL
1000000001 column=info:address, timestamp=1546092580547, value=shanghai
1000000001 column=info:age, timestamp=1546092629692, value=3
1000000002 column=info:adress, timestamp=1546092813710, value=xiamen
1000000002 column=info:age, timestamp=1546092771821, value=1
1000000002 column=info:name, timestamp=1546092787730, value=mia
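The difference between deleteall (step 14) and delete (step 15) can be sketched with a plain dictionary: deleteall removes every cell of a row, while delete removes a single cell and leaves the rest of the row in place. The Python below is an illustration under simplifying assumptions. The function names delete_all and delete_column are hypothetical, and the dictionary is edited in place, whereas HBase records delete markers that are cleaned up later during compaction.

```python
def delete_all(rows, row):
    """Drop every cell of one row, like deleteall 'table','row'."""
    rows.pop(row, None)


def delete_column(rows, row, column):
    """Drop one cell, like delete 'table','row','family:qualifier'."""
    cells = rows.get(row)
    if cells is None:
        return
    cells.pop(column, None)
    if not cells:
        del rows[row]  # a row with no cells left does not appear in a scan


# Table content as shown by the scan in step 10.
student = {
    "1000000000": {"info:address": "shanghai", "info:age": "30", "info:name": "alex"},
    "1000000001": {"info:address": "shanghai", "info:age": "3", "info:name": "Amy"},
    "1000000002": {"info:adress": "xiamen", "info:age": "1", "info:name": "mia"},
}

delete_all(student, "1000000000")                  # step 14
delete_column(student, "1000000001", "info:name")  # step 15

for row in sorted(student):
    print(row, sorted(student[row]))
```

After both calls the model holds two rows, and row 1000000001 keeps only info:address and info:age, as in the final scan above.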
16) Truncate a table
truncate 'student'
Truncating 'student' table (it may take a while):
- Disabling table...
- Truncating table...
0 row(s) in 3.5340 seconds
hbase(main):004:0> list
TABLE
School
member
my_ns:Company
student
4 row(s) in 0.0170 seconds
=> ["School", "member", "my_ns:Company", "student"]
hbase(main):005:0> count 'student'
0 row(s) in 0.1460 seconds
=> 0
17) Drop a table
A table must be disabled before it can be dropped:
disable 'table_name'
drop 'table_name'
disable 'student'
0 row(s) in 2.3040 seconds
drop 'student'
0 row(s) in 1.2480 seconds
hbase(main):010:0> list
TABLE
School
member
my_ns:Company
18) Query table data
get 'worker','info01:name'
COLUMN CELL
0 row(s) in 0.2000 seconds
hbase(main):002:0> scan 'worker'
ROW COLUMN+CELL
1111 column=info01:name, timestamp=1546095293608, value=Alex
1 row(s) in 0.0430 seconds
hbase(main):003:0> get 'worker','info01:name'
COLUMN CELL
0 row(s) in 0.0030 seconds
hbase(main):004:0> get 'worker','1111', 'info01:name'
COLUMN CELL
info01:name timestamp=1546095293608, value=Alex
1 row(s) in 0.0170 seconds
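The first two get calls above return 0 rows because the second argument of get is always the row key: 'info01:name' is looked up as a row, and no such row exists. Only the call that passes the row key '1111' before the column finds the cell. The lookup order can be sketched in Python as follows; the function name get and the dictionary layout are assumptions made for illustration.

```python
def get(rows, row, column=None):
    """Return {column: value} for one row; the row key is always the first lookup."""
    cells = rows.get(row, {})
    if column is None:
        return dict(cells)  # whole row
    return {column: cells[column]} if column in cells else {}


# Table content as shown by scan 'worker'.
worker = {"1111": {"info01:name": "Alex"}}

print(get(worker, "info01:name"))          # column name used as a row key: empty
print(get(worker, "1111", "info01:name"))  # row key first, then the column
```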
HBase read and write operations
HBase write flow