HBase Study Notes, Part 2: Shell Commands

Installing HBase

Downloading HBase

    HBase download page: Apache Downloads

Configuring HBase

ZBMAC-2f32839f6:server yangyanping$ tar zxvf hbase-2.3.5-bin.tar.gz
ZBMAC-2f32839f6:server yangyanping$ cd hbase-2.3.5
ZBMAC-2f32839f6:hbase-2.3.5 yangyanping$ ls
CHANGES.md	LICENSE.txt	README.txt	bin		docs		lib
LEGAL		NOTICE.txt	RELEASENOTES.md	conf		hbase-webapps
ZBMAC-2f32839f6:hbase-2.3.5 yangyanping$ bin/start-hbase.sh 
+======================================================================+
|                    Error: JAVA_HOME is not set                       |
+----------------------------------------------------------------------+
| Please download the latest Sun JDK from the Sun Java web site        |
|     > http://www.oracle.com/technetwork/java/javase/downloads        |
|                                                                      |
| HBase requires Java 1.8 or later.                                    |
+======================================================================+
ZBMAC-2f32839f6:hbase-2.3.5 yangyanping$ /usr/libexec/java_home -V
Matching Java Virtual Machines (1):
    1.8.0_181, x86_64:	"Java SE 8"	/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home

/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home

Edit bin/hbase-config.sh and export JAVA_HOME there (conf/hbase-env.sh is the file HBase conventionally provides for this setting):

export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home

Starting HBase

ZBMAC-2f32839f6:hbase-2.3.5 yangyanping$ bin/start-hbase.sh 
running master, logging to /Users/yangyanping/Downloads/server/hbase-2.3.5/bin/../logs/hbase-yangyanping-master-ZBMAC-2f32839f6.out
yangyanping@ZBMac-WP2HJYDWY ~ % jps
20816 HMaster
3200 Launcher
2596 
24652 Jps

HBase directory structure

yangyanping@ZBMac-WP2HJYDWY hbase % tree 
.
├── CHANGES.md
├── LEGAL
├── LICENSE.txt
├── NOTICE.txt
├── README.txt
├── RELEASENOTES.md
├── bin
│   ├── chaos-daemon.sh
│   ├── considerAsDead.sh
│   ├── draining_servers.rb
│   ├── get-active-master.rb
│   ├── graceful_stop.sh
│   ├── hbase
│   ├── hbase-cleanup.sh
│   ├── hbase-common.sh
│   ├── hbase-config.cmd
│   ├── hbase-config.sh
│   ├── hbase-daemon.sh
│   ├── hbase-daemons.sh
│   ├── hbase-jruby
│   ├── hbase.cmd
│   ├── hirb.rb
│   ├── local-master-backup.sh
│   ├── local-regionservers.sh
│   ├── master-backup.sh
│   ├── region_mover.rb
│   ├── region_status.rb
│   ├── regionservers.sh
│   ├── replication
│   │   └── copy_tables_desc.rb
│   ├── rolling-restart.sh
│   ├── shutdown_regionserver.rb
│   ├── start-hbase.cmd
│   ├── start-hbase.sh
│   ├── stop-hbase.cmd
│   ├── stop-hbase.sh
│   ├── test
│   │   └── process_based_cluster.sh
│   └── zookeepers.sh
├── conf
│   ├── hadoop-metrics2-hbase.properties
│   ├── hbase-env.cmd
│   ├── hbase-env.sh
│   ├── hbase-policy.xml
│   ├── hbase-site.xml
│   ├── log4j-hbtop.properties
│   ├── log4j.properties
│   └── regionservers
├── docs
├── lib
├── logs
│   ├── SecurityAuth.audit
│   ├── hbase-yangyanping-master-ZBMac-WP2HJYDWY.log
│   └── hbase-yangyanping-master-ZBMac-WP2HJYDWY.out
└── tmp
    ├── hbase
    │   ├── MasterData
    │   │   ├── WALs
    │   │   │   └── 192.168.1.102,16000,1654267930582
    │   │   │       └── 192.168.1.102%2C16000%2C1654267930582.1654267933426
    │   │   ├── archive
    │   │   ├── data
    │   │   │   └── master
    │   │   │       └── store
    │   │   │           └── 1595e783b53d99cd5eef43b6debb2682
    │   │   │               ├── proc
    │   │   │               └── recovered.edits
    │   │   │                   └── 1.seqid
    │   │   └── oldWALs
    │   ├── WALs
    │   │   └── 192.168.1.102,16020,1654267932601
    │   │       ├── 192.168.1.102%2C16020%2C1654267932601.1654267937735
    │   │       └── 192.168.1.102%2C16020%2C1654267932601.meta.1654267934970.meta
    │   ├── archive
    │   ├── corrupt
    │   ├── data
    │   │   ├── default
    │   │   └── hbase
    │   │       ├── meta
    │   │       │   └── 1588230740
    │   │       │       ├── info
    │   │       │       ├── recovered.edits
    │   │       │       │   └── 1.seqid
    │   │       │       ├── rep_barrier
    │   │       │       └── table
    │   │       └── namespace
    │   │           └── 356d99eede0feaa0e99e593028f609a0
    │   │               ├── info
    │   │               └── recovered.edits
    │   │                   └── 1.seqid
    │   ├── hbase.id
    │   ├── hbase.version
    │   ├── mobdir
    │   ├── oldWALs
    │   └── staging
    └── zookeeper
        └── zookeeper_0
            └── version-2
                ├── log.1
                └── snapshot.0

332 directories, 2394 files

hbase shell

ZBMAC-2f32839f6:bin yangyanping$ ./hbase shell
2021-06-07 18:09:54,843 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.3.5, rfd3fdc08d1cd43eb3432a1a70d31c3aece6ecabe, Thu Mar 25 20:50:15 UTC 2021
Took 0.0010 seconds                                                                                                                                                           
hbase(main):001:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
Took 0.5337 seconds

WebUI

URL: http://localhost:16010/master-status

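Shell commands

Creating a table

    Syntax: create <table>, {NAME => <family>, VERSIONS => <VERSIONS>}

       HBase uses the create command to create a table. You must give the table name and at least one column family name. For example, the Student table shown below can be created as follows: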

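Example: the Student table
Row key | Column family info (name, gender, class) | Column family course (chines, english, math) | Timestamp
0001    | Tom, man                                  | 80, 90, 85                                    | T2
0002    | Amy, 01                                   | 95, 89                                        | T1
0003    | Allen, man, 02                            | 90, 88                                        | T1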
hbase:001:0> create 'student','info','course'
Created table student
Took 1.6881 seconds                                                                                                                       
=> Hbase::Table - student

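This command creates a table named student with two column families, info and course. Note that in the HBase shell, table and column names are passed as quoted strings (single quotes by convention) and are case-sensitive: student and Student are two different tables.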

exists

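After creating the table, use the exists command to check whether it exists, or the list command to list all tables in the database, as shown below.

Command format: exists 'table_name'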

hbase:002:0> exists 'student'
Table student does exist                                                                                                                  
Took 0.2100 seconds                                                                                                                       
=> true

Command format: list

hbase:005:0> list
TABLE                                                                                                                                     
student                                                                                                                                   
1 row(s)
Took 0.0112 seconds                                                                                                                       
=> ["student"]

Viewing basic table information

Use the describe command to view the column family settings of a table, as shown below.

Command format: describe 'table_name'

hbase:004:0> describe 'student'
Table student is ENABLED                                                                                                                  
student                                                                                                                                   
COLUMN FAMILIES DESCRIPTION                                                                                                               
{NAME => 'course', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

{NAME => 'info', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', COMPRESSION => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}

2 row(s)
Quota is disabled
Took 0.1486 seconds  

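Modifying a column family

The alter command can change a column family's parameters, such as the number of versions it keeps. For example, the info column family of the student table above has VERSIONS set to 1; to keep the 3 most recent versions instead, use the following command:

Command format: alter 'table_name', {NAME => <family>, VERSIONS => <VERSIONS>}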

hbase:006:0> alter 'student',{NAME =>'info',VERSIONS => 3}
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.0905 seconds  

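To change the parameters of several column families at once, pass several descriptors, in the same form as the create command.

Note that when you change the properties of a column family that already holds data, HBase has to apply the change to all of that column family's data, so the operation can take a long time on a large table.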
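To make the effect of VERSIONS concrete, here is a minimal Python sketch of how a single cell might retain only its newest N timestamped values. It is an illustrative in-memory model, not HBase's storage code; the class name VersionedCell and its methods are hypothetical.

```python
class VersionedCell:
    """Toy model of one HBase cell that keeps at most `max_versions` values."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self._versions = {}  # timestamp (int) -> value (str)

    def put(self, value, timestamp):
        # A put with an existing timestamp overwrites that version.
        self._versions[timestamp] = value
        # Drop the oldest versions beyond the retention limit.
        for ts in sorted(self._versions, reverse=True)[self.max_versions:]:
            del self._versions[ts]

    def get(self, versions=1):
        # Newest first, at most `versions` entries.
        newest = sorted(self._versions.items(), reverse=True)
        return newest[:versions]


cell = VersionedCell(max_versions=3)
for ts, value in enumerate(["01", "02", "03", "04"], start=1):
    cell.put(value, ts)

print(cell.get(versions=3))  # [(4, '04'), (3, '03'), (2, '02')]
```

With max_versions=1, which matches the VERSIONS => '1' shown by describe, only the latest write to a cell would remain readable in this model.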

Adding a column family

To add a new column family named location to the student table, use the following command:

hbase:007:0> alter 'student','location'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.8666 seconds    

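Deleting a column family

To remove an existing column family, either of the following two forms works (shown here for a table named user):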

alter 'user','delete' => 'location'

alter 'user', { NAME => 'location', METHOD => 'delete' }

hbase:008:0> alter 'student','delete' => 'location'
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 1.8798 seconds    

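Also, an HBase table must contain at least one column family, so the last remaining column family of a table cannot be deleted.

Dropping a table

HBase uses the drop command to delete a table, but the table must first be disabled with the disable command. For the student table, the complete sequence is: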

hbase:009:0> disable 'student'
Took 0.7296 seconds                                                                                                                       
hbase:010:0> drop 'student'
Took 0.4195 seconds                                                                                                                       
hbase:011:0> exists 'student'
Table student does not exist                                                                                                              
Took 0.0137 seconds                                                                                                                       
=> false 

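After disabling a table, you can use is_disabled to check whether it was disabled successfully. If you only want to clear all data from a table, use the truncate command, which is equivalent to disabling the table, dropping it, and re-creating it with the original schema:

truncate 'table_name'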

hbase:013:0> truncate 'student'
Truncating 'student' table (it may take a while):
Disabling table...
Truncating table...
Took 1.6392 seconds  

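Inserting data

HBase uses the put command to write data into a table. A put adds a new row or overwrites the data at the specified row and column. For the table structure above, a single value is inserted like this: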

hbase:014:0> put 'student','0001','info:name','Tom',1

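In the command above:

  • The first argument, student, is the table name.
  • The second argument, 0001, is the row key, given as a string.
  • The third argument, info:name, is the column family and column qualifier, separated by a colon. The column family must already exist or HBase reports an error; the qualifier is defined on the fly, so the columns within a family can be extended freely.
  • The fourth argument, Tom, is the cell value. HBase stores every value as an uninterpreted byte array; the shell passes it in as a string.
  • The last argument, 1, is the timestamp. If it is omitted, the system uses the current time as the timestamp.

The student table has two column families, info and course. info holds three columns: name, gender, and class. course holds three columns: chines, english, and math. Note that the shell's put command writes one column at a time; it cannot add several columns in a single call.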

hbase:002:0> put 'student','0001','info:name','Tom',1
Took 0.0814 seconds                                                             
hbase:003:0> put 'student','0001','info:gender','man',1
Took 0.0308 seconds                                                             
hbase:004:0> put 'student','0001','info:class','01'
Took 0.0262 seconds                                                             
hbase:006:0> put 'student','0001','course:chines',80
Took 0.0151 seconds                                                             
hbase:007:0> put 'student','0002','info:name','杨艳平'
Took 0.0195 seconds    
hbase:008:0> put 'student','0002','info:gender','man'
Took 0.0660 seconds                                                             
hbase:009:0> put 'student','0002','info:class',02
Took 0.0085 seconds                                                             
hbase:010:0> put 'student','0002','course:math',100
Took 0.0149 seconds                                                                                                                                                                                                                                                                                                         

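That builds up the table. The structure is quite flexible: new columns can be added to a column family at any time. If a column family has no qualifiers, the colon after the family name is optional. Note that the unquoted 02 in the info:class put above is evaluated by the Ruby-based shell as a number before it is stored, which is why the scans below show value=2 for that cell; quote a value to store it verbatim.

To set the timestamp manually, append it as the last argument of put. The timestamp is a long, so it is not quoted:   put 'user','001','info:name','tom',1478053832459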
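Because each put writes one column, loading a whole row means issuing one command per column. The Python sketch below generates those commands as text that could be pasted into the shell. The helper names ruby_quote and put_commands are hypothetical, and the quoting handles only backslashes and single quotes, so treat it as a starting point rather than a complete escaper.

```python
def ruby_quote(text):
    """Wrap text in single quotes, escaping backslashes and single quotes."""
    escaped = str(text).replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def put_commands(table, row_key, cells, timestamp=None):
    """Return one shell `put` line per column in `cells`.

    `cells` maps 'family:qualifier' to a value; `timestamp` is an optional
    integer in epoch milliseconds and is emitted unquoted.
    """
    lines = []
    for column, value in cells.items():
        args = [ruby_quote(table), ruby_quote(row_key),
                ruby_quote(column), ruby_quote(value)]
        if timestamp is not None:
            args.append(str(int(timestamp)))
        lines.append("put " + ",".join(args))
    return lines


row = {"info:name": "Amy", "info:class": "02", "course:math": "100"}
for line in put_commands("student", "0002", row):
    print(line)
```

Every value is quoted, so a class such as 02 keeps its leading zero instead of being evaluated as a number.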

Viewing all data in a table

The scan command reads the whole table; only the table name is required.

hbase:012:0> scan 'student'
ROW                                           COLUMN+CELL                                                                                                                        
 0001                                         column=course:chines, timestamp=2022-06-07T00:10:46.843, value=80                                                                  
 0001                                         column=info:class, timestamp=2022-06-07T00:09:54.431, value=01                                                                     
 0001                                         column=info:gender, timestamp=1970-01-01T08:00:00.001, value=man                                                                   
 0001                                         column=info:name, timestamp=1970-01-01T08:00:00.001, value=Tom                                                                     
 0002                                         column=course:math, timestamp=2022-06-07T00:12:49.227, value=100                                                                   
 0002                                         column=info:class, timestamp=2022-06-07T00:12:34.077, value=2                                                                      
 0002                                         column=info:gender, timestamp=2022-06-07T00:12:17.515, value=man                                                                   
 0002                                         column=info:name, timestamp=2022-06-07T00:11:14.077, value=\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3                                    
2 row(s)
Took 0.0683 seconds   
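The 1970-01-01T08:00:00.001 timestamps above come from the explicit timestamp 1 passed to put: HBase timestamps are milliseconds since the Unix epoch, and the shell output here appears to render them in local time. The Python sketch below reproduces the conversion, assuming a UTC+8 time zone (inferred from the 08:00 offset in the output); the function name format_hbase_ts is hypothetical.

```python
from datetime import datetime, timedelta, timezone

UTC_PLUS_8 = timezone(timedelta(hours=8))  # assumed local time zone


def format_hbase_ts(millis, tz=UTC_PLUS_8):
    """Render an epoch-millisecond timestamp like the scan output above."""
    seconds, ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=tz)
    return moment.strftime("%Y-%m-%dT%H:%M:%S") + ".%03d" % ms


print(format_hbase_ts(1))  # 1970-01-01T08:00:00.001
print(format_hbase_ts(1478053832459))
```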

Getting data from a table

  • View the data for a single row key
hbase:014:0> get 'student','0002'
COLUMN                                        CELL                                                                                                                               
 course:math                                  timestamp=2022-06-07T00:12:49.227, value=100                                                                                       
 info:class                                   timestamp=2022-06-07T00:12:34.077, value=2                                                                                         
 info:gender                                  timestamp=2022-06-07T00:12:17.515, value=man                                                                                       
 info:name                                    timestamp=2022-06-07T00:11:14.077, value=\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3                                                      
1 row(s)
Took 0.0316 seconds  
  • Get all columns of one column family in a given row
    hbase:015:0> get 'student','0002','info'
    COLUMN                                        CELL                                                                                                                               
     info:class                                   timestamp=2022-06-07T00:12:34.077, value=2                                                                                         
     info:gender                                  timestamp=2022-06-07T00:12:17.515, value=man                                                                                       
     info:name                                    timestamp=2022-06-07T00:11:14.077, value=\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3                                                      
    1 row(s)
    Took 0.0638 seconds    
  • Get a single column in a given row

    hbase:016:0> get 'student','0002','info:name'
    COLUMN                                        CELL                                                                                                                               
     info:name                                    timestamp=2022-06-07T00:11:14.077, value=\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3                                                      
    1 row(s)
    Took 0.0933 seconds  
  •  Count the rows in a table
    hbase:018:0> count 'student'
    2 row(s)
    Took 0.0623 seconds                                                                                                                                                              
    => 2
  •  Displaying Chinese text stored in HBase
hbase:021:0> get 'student','0002',{FORMATTER => 'toString'}
COLUMN                                        CELL                                                                                                                               
 course:math                                  timestamp=2022-06-07T00:12:49.227, value=100                                                                                       
 info:class                                   timestamp=2022-06-07T00:12:34.077, value=2                                                                                         
 info:gender                                  timestamp=2022-06-07T00:12:17.515, value=man                                                                                       
 info:name                                    timestamp=2022-06-07T00:11:14.077, value=杨艳平                                                                                       
1 row(s)
Took 0.1044 seconds                                                                                                           
hbase:022:0> get 'student','0002','info:name:toString'
COLUMN                                        CELL                                                                                                                               
 info:name                                    timestamp=2022-06-07T00:11:14.077, value=杨艳平                                                                                       
1 row(s)
Took 0.0333 seconds  
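The \xE6\x9D\xA8... sequences shown when no FORMATTER is given are the raw UTF-8 bytes of the stored value, printed as hex escapes. The Python sketch below decodes such an escaped string back to text. The function name decode_shell_value is hypothetical, and the sketch assumes the value was written as UTF-8.

```python
import re


def decode_shell_value(escaped):
    """Turn shell output such as '\\xE6\\x9D\\xA8' back into readable text.

    Each \\xHH escape becomes one byte; any other character is kept as-is.
    Assumes the stored bytes are UTF-8.
    """
    raw = bytearray()
    pos = 0
    for match in re.finditer(r"\\x([0-9A-Fa-f]{2})", escaped):
        raw += escaped[pos:match.start()].encode("utf-8")
        raw.append(int(match.group(1), 16))
        pos = match.end()
    raw += escaped[pos:].encode("utf-8")
    return raw.decode("utf-8")


value = r"\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3"
print(decode_shell_value(value))  # 杨艳平
```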

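Deleting data

The delete command removes a cell from a table. Its syntax is similar to put: give the table name, row key, and column, optionally followed by a timestamp. To remove an entire row, use deleteall instead.

  • For example, the following command was run against the info column family of row 0001 in the student table. The scans later in this post still list that row's info cells, so naming only the column family did not remove them in this session; give the full column (for example info:name) to delete a cell: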

hbase:023:0> delete 'student','0001','info'
Took 0.0741 seconds 

Updating data

Syntax: put 'tableName', 'rowName', 'colFamily:column', 'new value'

hbase:024:0> put 'student','0002','info:class','03'
Took 0.0185 seconds    

Filters

In HBase, both get and scan accept filters that narrow the output, much like a WHERE clause in SQL.

Use the show_filters command to list the filter types the current HBase supports, as shown below.

hbase:002:0> show_filters
DependentColumnFilter                                                                                                                                            
KeyOnlyFilter                                                                                                                                                    
ColumnCountGetFilter                                                                                                                                             
SingleColumnValueFilter                                                                                                                                          
PrefixFilter                                                                                                                                                     
SingleColumnValueExcludeFilter                                                                                                                                   
FirstKeyOnlyFilter                                                                                                                                               
ColumnRangeFilter                                                                                                                                                
ColumnValueFilter                                                                                                                                                
TimestampsFilter                                                                                                                                                 
FamilyFilter                                                                                                                                                     
QualifierFilter                                                                                                                                                  
ColumnPrefixFilter                                                                                                                                               
RowFilter                                                                                                                                                        
MultipleColumnPrefixFilter                                                                                                                                       
InclusiveStopFilter                                                                                                                                              
PageFilter                                                                                                                                                       
ValueFilter                                                                                                                                                      
ColumnPaginationFilter                                                                                                                                           
Took 0.0706 seconds                                                                                                                                              
=> #<Java::JavaUtil::HashMap::KeySet:0x66d3b881>

LIMIT

hbase:003:0> scan 'student',{LIMIT => 2,FORMATTER => 'toString'}
ROW                                                  COLUMN+CELL                                                                                                                                            
 0001                                                column=course:chines, timestamp=2022-06-07T00:10:46.843, value=80                                                                                      
 0001                                                column=info:class, timestamp=2022-06-07T00:09:54.431, value=01                                                                                         
 0001                                                column=info:gender, timestamp=1970-01-01T08:00:00.001, value=man                                                                                       
 0001                                                column=info:name, timestamp=1970-01-01T08:00:00.001, value=Tom                                                                                         
 0002                                                column=course:math, timestamp=2022-06-07T00:12:49.227, value=100                                                                                       
 0002                                                column=info:class, timestamp=2022-06-07T00:28:43.673, value=03                                                                                         
 0002                                                column=info:gender, timestamp=2022-06-07T00:12:17.515, value=man                                                                                       
 0002                                                column=info:name, timestamp=2022-06-07T00:11:14.077, value=杨艳平                                                                                         
2 row(s)
Took 0.0919 seconds 

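Row key filter

RowFilter combines a comparator with a comparison operator to compare and filter row keys. For example, to match row keys greater than 0001, use the binary comparator; to match row keys that contain 0001, use the substring comparator (binaryprefix matches keys that start with a given value). Note that substring supports only the = and != operators, not greater-than or less-than.

A filter command of this kind and its output are shown below.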

hbase:001:0> scan 'student',FILTER =>"RowFilter(=,'substring:0002')"
ROW                                 COLUMN+CELL                                                                                         
 0002                               column=course:math, timestamp=2022-06-07T00:12:49.227, value=100                                    
 0002                               column=info:class, timestamp=2022-06-07T00:28:43.673, value=03                                      
 0002                               column=info:gender, timestamp=2022-06-07T00:12:17.515, value=man                                    
 0002                               column=info:name, timestamp=2022-06-07T00:11:14.077, value=\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3     
1 row(s)
Took 0.6226 seconds   

  Fixing garbled Chinese output

hbase:002:0> scan 'student',{FILTER =>"RowFilter(=,'substring:0002')",FORMATTER => 'toString'}
ROW                                 COLUMN+CELL                                                                                         
 0002                               column=course:math, timestamp=2022-06-07T00:12:49.227, value=100                                    
 0002                               column=info:class, timestamp=2022-06-07T00:28:43.673, value=03                                      
 0002                               column=info:gender, timestamp=2022-06-07T00:12:17.515, value=man                                    
 0002                               column=info:name, timestamp=2022-06-07T00:11:14.077, value=杨艳平                                      
1 row(s)
Took 0.0838 seconds  
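The difference between the comparators is easiest to see on a few sample keys. The Python sketch below models the matching rules only; it is not HBase code. The function names are hypothetical, the row keys 0003 and 10002 are made-up test data, and the case-insensitive behavior of substring follows my reading of the HBase documentation, so verify it against your version.

```python
def binary(op, row_key, operand):
    """Lexicographic byte comparison, as with the 'binary:' comparator."""
    a, b = row_key.encode("utf-8"), operand.encode("utf-8")
    return {"=": a == b, "!=": a != b, "<": a < b,
            "<=": a <= b, ">": a > b, ">=": a >= b}[op]


def binaryprefix(row_key, operand):
    """Equality case of 'binaryprefix:': the key starts with the operand."""
    return row_key.encode("utf-8").startswith(operand.encode("utf-8"))


def substring(row_key, operand):
    """Equality case of 'substring:': the operand occurs anywhere in the key."""
    return operand.lower() in row_key.lower()


row_keys = ["0001", "0002", "0003", "10002"]
print([k for k in row_keys if binary(">", k, "0001")])    # ['0002', '0003', '10002']
print([k for k in row_keys if binaryprefix(k, "0002")])   # ['0002']
print([k for k in row_keys if substring(k, "0002")])      # ['0002', '10002']
```

In this model, substring also matches 10002, which is why binaryprefix is the better fit when only keys beginning with a value are wanted.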

Value filters

HBase also has filters that test cell values while scanning, known as value filters, listed below.

  • ValueFilter: matches the key-value pairs whose value satisfies the condition.
    scan 'student', FILTER => "ValueFilter(=,'substring:curry')"
    get 'student', '0001', FILTER => "ValueFilter(=,'substring:curry')"
  • SingleColumnValueFilter: compares the value of one specified column family and column.
    scan 'student', FILTER => "SingleColumnValueFilter('info', 'gender', =,'binary:man')"
  • SingleColumnValueExcludeFilter: same comparison, but the matched column is excluded from the result.
    scan 'student', FILTER => "SingleColumnValueExcludeFilter('info', 'gender', =, 'binary:man')"

hbase:003:0> scan 'student',{FILTER => "SingleColumnValueFilter('info','gender',=,'binary:man')"}
ROW                                  COLUMN+CELL                                                                                              
 0001                                column=course:chines, timestamp=2022-06-07T00:10:46.843, value=80                                        
 0001                                column=info:class, timestamp=2022-06-07T00:09:54.431, value=01                                           
 0001                                column=info:gender, timestamp=1970-01-01T08:00:00.001, value=man                                         
 0001                                column=info:name, timestamp=1970-01-01T08:00:00.001, value=Tom                                           
 0002                                column=course:math, timestamp=2022-06-07T00:12:49.227, value=100                                         
 0002                                column=info:class, timestamp=2022-06-07T00:28:43.673, value=03                                           
 0002                                column=info:gender, timestamp=2022-06-07T00:12:17.515, value=man                                         
 0002                                column=info:name, timestamp=2022-06-07T00:11:14.077, value=\xE6\x9D\xA8\xE8\x89\xB3\xE5\xB9\xB3          
2 row(s)
Took 0.0785 seconds 
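Note that the scan above returns every column of each matching row, not only info:gender. A minimal Python model of that row-level selection follows. The function name is hypothetical, the data mirrors the scan output, and the include_if_missing flag reflects my understanding that HBase by default keeps rows lacking the tested column; treat that default as an assumption to verify.

```python
def single_column_value_filter(rows, column, expected, include_if_missing=True):
    """Return the rows whose `column` equals `expected` (binary '=' only)."""
    kept = {}
    for row_key, cells in rows.items():
        if column not in cells:
            # Assumed default: rows without the column pass through.
            if include_if_missing:
                kept[row_key] = cells
            continue
        if cells[column] == expected:
            kept[row_key] = cells  # the whole row is returned
    return kept


rows = {
    "0001": {"course:chines": "80", "info:class": "01",
             "info:gender": "man", "info:name": "Tom"},
    "0002": {"course:math": "100", "info:class": "03",
             "info:gender": "man", "info:name": "杨艳平"},
}
matched = single_column_value_filter(rows, "info:gender", "man")
print(sorted(matched))  # ['0001', '0002']
```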
