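1. Basic concepts
HBase: a highly reliable, high-performance, column-oriented, scalable distributed storage system. It is a distributed column-family key/value store built on top of HDFS.
Single data type: HBase stores every value as an uninterpreted byte array (usually handled as a string); columns have no declared types.
RowKey: the "primary key" of each record in a table, used for fast lookups.
Column Family: a group of one or more related columns.
Column: belongs to exactly one column family.
Region: the smallest unit of distributed storage and load balancing in HBase. A table is split by row into multiple Regions. Regions are split by size: each table starts with a single region, the region grows as data is written, and once it reaches a threshold it is split into two new regions of roughly equal size, so the number of regions keeps growing over time.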
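The split behavior described above can be sketched with a toy model. This is only an illustration, not HBase code: the `Region` class, the `insert` function, and the `max_rows` threshold are all hypothetical, and row count stands in for the real size threshold, which is measured in bytes.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A toy region holding a sorted list of row keys (hypothetical model)."""
    rows: list = field(default_factory=list)


def insert(regions, row_key, max_rows):
    """Insert row_key into the region covering it; split that region in half
    once it holds more than max_rows keys (assumes max_rows >= 1)."""
    # Regions are kept in key order: pick the last region whose first key <= row_key.
    idx = 0
    for i, region in enumerate(regions):
        if region.rows and region.rows[0] <= row_key:
            idx = i
    region = regions[idx]
    region.rows.append(row_key)
    region.rows.sort()
    if len(region.rows) > max_rows:
        # Split at the middle key into two regions of roughly equal size.
        mid = len(region.rows) // 2
        left, right = Region(region.rows[:mid]), Region(region.rows[mid:])
        regions[idx:idx + 1] = [left, right]
    return regions


# A table starts with one region and gains more as rows arrive.
regions = [Region()]
for n in range(10):
    insert(regions, f"row{n:02d}", max_rows=4)
sizes = [len(r.rows) for r in regions]
print(len(regions), sizes)
```

Writing ten sequential keys with a threshold of four ends with several regions instead of one, which mirrors the "one region at first, more over time" description.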
Roles in an HBase cluster
One or more master nodes: HMaster;
Multiple worker nodes: HRegionServer;
HBase dependency: ZooKeeper;
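2. namespace
In HBase, a namespace is a logical grouping of tables, similar to a database in an RDBMS. HBase predefines two namespaces:
(1) hbase: system built-in tables, including the namespace table and the meta table
(2) default: tables created without an explicit namespace are placed here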
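The naming rule can be illustrated with a small helper. It is a sketch, not part of any HBase client: `parse_table_name` is a hypothetical function that splits a `namespace:table` name and falls back to `default` when no namespace is given.

```python
def parse_table_name(name):
    """Split 'namespace:table' into (namespace, table).

    A name without a ':' is treated as belonging to the 'default' namespace.
    """
    namespace, sep, table = name.partition(":")
    if not sep:
        return "default", name
    return namespace, table


print(parse_table_name("test:ts"))
print(parse_table_name("ts"))
```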
3. Common HBase commands
1. Log in from a terminal using hbase shell:
4. Basic common commands
(1) Create a namespace:
create_namespace 'namespace_name'
hbase(main):002:0> create_namespace 'test'
Took 1.6539 seconds
(2) Drop a namespace
drop_namespace 'namespace_name'
hbase(main):003:0> drop_namespace 'test'
Took 0.9255 seconds
(3) Describe a namespace
describe_namespace 'namespace_name'
hbase(main):005:0> describe_namespace 'test'
DESCRIPTION
{NAME => 'test'}
Took 0.0304 seconds
=> 1
(4) List all namespaces
list_namespace
hbase(main):006:0> list_namespace
NAMESPACE
default
hbase
test
3 row(s)
Took 0.0263 seconds
(5) List the tables in a namespace
list_namespace_tables 'namespace_name'
hbase(main):011:0> list_namespace_tables 'test'
TABLE
a
ts
2 row(s)
Took 0.0277 seconds
=> ["a", "ts"]
(6) Create a table in a namespace
create 'namespace_name:table_name', 'column_family_1'
hbase(main):010:0> create 'test:ts', 'a1','a2'
Created table test:ts
Took 1.2572 seconds
=> Hbase::Table - test:ts
(7) Create a table in the default namespace
create 'table_name', 'column_family_1', 'column_family_2', 'column_family_N'
hbase(main):012:0> create 'ts', 'a1','a2'
Created table ts
Took 1.2603 seconds
=> Hbase::Table - ts
Regions can be pre-split when the table is created:
create 'test_hbase', {NAME => 'CASE1'}, {NAME => 'CASE2'}, {NAME => 'CASE3'}, SPLITS => ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
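To see what a list of split keys means for row placement, here is a minimal sketch. `region_ranges` and `region_index` are hypothetical helpers, not HBase APIs. The sketch sorts the keys itself and compares Python strings, which for ASCII keys matches byte-wise lexicographic order; an empty string marks an open start or end.

```python
from bisect import bisect_right


def region_ranges(split_keys):
    """Return the [start, end) key range of each region implied by split_keys.

    An empty string stands for an unbounded start or end.
    """
    keys = sorted(set(split_keys))
    bounds = [""] + keys + [""]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]


def region_index(split_keys, row_key):
    """Return the index of the region whose range contains row_key."""
    keys = sorted(set(split_keys))
    # A row equal to a split key belongs to the region that starts at that key.
    return bisect_right(keys, row_key)


splits = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
ranges = region_ranges(splits)
print(len(ranges))                       # N split keys give N + 1 regions
print(ranges[region_index(splits, '001')])
```

With the ten split keys from the command above, the table starts with eleven regions, and a row key such as `001` falls into the region between `0` and `1`.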
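(8) List all tables
list: lists all user tables; tables in the default namespace are shown without a namespace prefix
(9) Describe a table
describe 'table_name'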
hbase(main):014:0> describe 'ts'
Table ts is ENABLED
ts
COLUMN FAMILIES DESCRIPTION
{NAME => 'a1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'a2', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
2 row(s)
Took 0.1769 seconds
(10) Check whether a table exists
exists 'table_name'
hbase(main):015:0> exists 'ts'
Table ts does exist
Took 0.0069 seconds
=> true
(11) Check whether a table is enabled or disabled
is_enabled 'table_name'
is_disabled 'table_name'
hbase(main):016:0> is_enabled 'ts'
true
Took 0.0064 seconds
=> true
(12) Add a record
put 'table_name', 'rowkey', 'column_family:column', 'value'
hbase(main):018:0* put 'ts' ,'001' ,'a1:address','NANJING'
Took 0.2075 seconds
(13) View all records
scan 'table_name'
hbase(main):019:0> scan 'ts'
ROW COLUMN+CELL
001 column=a1:address, timestamp=1615451052979, value=NANJING
1 row(s)
Took 0.0319 seconds
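The put and scan output above reflects HBase's logical data model: a value is addressed by row key, then `family:column`, then timestamp. The following is a rough sketch of that model using nested dictionaries. `ToyTable` is a hypothetical class, it stores Python strings rather than bytes, and it has nothing to do with the real storage format.

```python
import time
from collections import defaultdict


class ToyTable:
    """Toy model of a table: row key -> 'family:column' -> timestamp -> value."""

    def __init__(self):
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp=None):
        # Without an explicit timestamp, use the current time in milliseconds.
        ts = timestamp if timestamp is not None else int(time.time() * 1000)
        self.rows[row][column][ts] = value

    def scan(self):
        """Yield (row, column, timestamp, value) in row-key order,
        returning only the newest version of each cell."""
        for row in sorted(self.rows):
            for column in sorted(self.rows[row]):
                versions = self.rows[row][column]
                ts = max(versions)
                yield row, column, ts, versions[ts]


table = ToyTable()
table.put('001', 'a1:address', 'NANJING', timestamp=1615451052979)
for cell in table.scan():
    print(cell)
```

Writing the same cell again with a later timestamp would make the scan return the newer value, while the older version stays in the nested dictionary.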
(14) Count the records in a table
count 'table_name'
hbase(main):020:0> count 'ts'
1 row(s)
Took 0.0165 seconds
=> 1
The row count of a table can also be computed with:
hbase org.apache.hadoop.hbase.mapreduce.RowCounter 'hbase_mobile_1Q_test_001'
(15) Delete a table
Step 1: disable 'table_name'
Step 2: drop 'table_name'
hbase(main):024:0> disable 'ts'
Took 0.8326 seconds
hbase(main):025:0> drop 'ts'
Took 0.4713 seconds
(16) Truncate a table
truncate 'table_name'
hbase(main):027:0> truncate 'ts'
Truncating 'ts' table (it may take a while):
Disabling table...
Truncating table...
Took 2.0970 seconds