The HBase shell is an important tool for working with HBase.
Below we walk through what the HBase shell can do.
Starting the HBase shell
The HBase shell is started with the following command:
hbase shell
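The HBase shell is based on JRuby, an implementation of Ruby that runs on the Java virtual machine. More precisely, it uses the interactive Ruby shell (IRB), where you enter a command and quickly get a response. HBase extends the Ruby environment with commands built on its Java API, and inherits IRB's built-in support for command history and completion, as well as all Ruby commands.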
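Because the prompt is an IRB session, ordinary Ruby expressions should work alongside the HBase commands. The sketch below is plain Ruby that builds a few row keys. The helper name row_keys, the key format, and the table and column names in the trailing comment are illustrative assumptions, not something the shell requires.

```ruby
# Plain Ruby: runs in any Ruby interpreter, and so presumably at the
# HBase shell prompt as well.
def row_keys(count)
  # Zero-padded keys sort lexicographically in the same order as numerically.
  (1..count).map { |i| format('row-%03d', i) }
end

puts row_keys(3).inspect

# Inside the HBase shell the same array could then drive HBase commands, e.g.
#   row_keys(3).each { |k| put 'testtable', k, 'colfam1:qual1', 'value' }
# (the table and column names here are hypothetical).
```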
Help
Once inside the HBase shell, enter help to see the help text.
[root@cdh-manager ~]# hbase shell
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0-cdh6.0.1, rUnknown, Wed Sep 19 09:14:00 PDT 2018
Took 0.0082 seconds
hbase(main):001:0> help
Output
HBase Shell, version 2.0.0-cdh6.0.1, rUnknown, Wed Sep 19 09:14:00 PDT 2018
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.
COMMAND GROUPS:
Group name: general
Commands: processlist, status, table_help, version, whoami
Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters
Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables
Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve
Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, flush, is_in_maintenance_mode, list_deadservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, trace, unassign, wal_roll, zk_dump
Group name: replication
Commands: add_peer, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_tableCFs, show_peer_tableCFs, update_peer_config
Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot
Group name: configuration
Commands: update_all_config, update_config
Group name: quotas
Commands: list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota
Group name: security
Commands: grant, list_security_capabilities, revoke, user_permission
Group name: procedures
Commands: abort_procedure, list_locks, list_procedures
Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
Group name: rsgroup
Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup
SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit
command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are
Ruby Hashes. They look like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
and are opened and closed with curley-braces. Key/values are delimited by the
'=>' character combination. Usually keys are predefined constants such as
NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type
'Object.constants' to see a (messy) list of all constants in the environment.
If you are using binary keys or values and need to enter them in the shell, use
double-quote'd hexadecimal representation. For example:
hbase> get 't1', "key\x03\x3f\xcd"
hbase> get 't1', "key\003\023\011"
hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"
The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
You can also pass -h or --help when starting the HBase shell:
[root@cdh-manager ~]# hbase shell --help
Java HotSpot(TM) 64-Bit Server VM warning: Using incremental CMS is deprecated and will likely be removed in a future release
Usage: shell [OPTIONS] [SCRIPTFILE [ARGUMENTS]]
-d | --debug Set DEBUG log levels.
-h | --help This help.
-n | --noninteractive Do not run within an IRB session
and exit with non-zero status on
first error.
Exiting
To leave the shell, enter exit or quit.
hbase(main):002:0> exit
You have mail in /var/spool/mail/root
[root@cdh-manager ~]#
hbase(main):001:0> quit
[root@cdh-manager ~]#
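Debug mode
When using the HBase shell, you can turn on debug mode to make problems easier to locate.
You can enter debug mode directly when starting the HBase shell, using either --debug or -d:
hbase shell --debug
The shell in debug mode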
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:os.version=3.10.0-957.1.3.el7.x86_64
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:user.name=root
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Client environment:user.dir=/root
20/04/30 21:09:39 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=cdh-node2:2181,cdh-node1:2181,cdh-manager:2181 sessionTimeout=60000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$23/1428038829@3140dcc8
20/04/30 21:09:39 DEBUG zookeeper.ClientCnxn: zookeeper.disableAutoWatchReset is false
20/04/30 21:09:39 INFO zookeeper.ClientCnxn: Opening socket connection to server cdh-node2/192.168.75.132:2181. Will not attempt to authenticate using SASL (unknown error)
20/04/30 21:09:39 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.75.133:50606, server: cdh-node2/192.168.75.132:2181
20/04/30 21:09:39 DEBUG zookeeper.ClientCnxn: Session establishment request sent on cdh-node2/192.168.75.132:2181
20/04/30 21:09:39 INFO zookeeper.ClientCnxn: Session establishment complete on server cdh-node2/192.168.75.132:2181, sessionid = 0x171b02bc2e03139, negotiated timeout = 60000
20/04/30 21:09:39 DEBUG zookeeper.ClientCnxn: Reading reply sessionid:0x171b02bc2e03139, packet:: clientPath:/hbase/hbaseid serverPath:/hbase/hbaseid finished:false header:: 1,4 replyHeader:: 1,25769939879,0 request:: '/hbase/hbaseid,F response:: #ffffffff000146d61737465723a3136303030ffffffc3ffffffc6483fffffff455ffffff9d5550425546a2437316436613962372d353565342d343336632d613765382d623535633636393531636431,s{858906,25769803839,1587283923742,1587798705460,3,0,0,0,67,0,858906}
20/04/30 21:09:40 DEBUG util.ClassSize: Using Unsafe to estimate memory layout
20/04/30 21:09:40 DEBUG ipc.AbstractRpcClient: Codec=org.apache.hadoop.hbase.codec.KeyValueCodec@426c0486, compressor=null, tcpKeepAlive=true, tcpNoDelay=true, connectTO=10000, readTO=20000, writeTO=60000, minIdleTimeBeforeClose=120000, maxRetries=3, fallbackAllowed=false, bind address=null
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
Version 2.0.0-cdh6.0.1, rUnknown, Wed Sep 19 09:14:00 PDT 2018
Took 0.0223 seconds
hbase(main):001:0>
To check whether debug mode is currently on, enter debug?
hbase(main):014:0> debug?
Debug mode is OFF
To toggle debug mode, enter debug
hbase(main):015:0> debug
Debug mode is ON
Ending input
While using the HBase shell you may enter a wrong command and want to abandon it. Entering EOF does this.
EOF
=================================
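Basic command rules
Follow these rules when entering commands.
Quoting names
The shell requires table and column names to be quoted, with either single or double quotes.
Quoting values
The shell supports binary, octal, and hexadecimal input and output. Such values must be enclosed in double quotes; otherwise the shell treats them as literal text.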
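The difference between the two quote styles is ordinary Ruby string behavior and can be checked outside HBase. The following sketch uses a hypothetical helper, hex_dump, to show which bytes a literal actually contains. The sample keys are the ones from the shell's own help text.

```ruby
# Renders each byte of a string as two hex digits.
def hex_dump(str)
  str.bytes.map { |b| format('%02x', b) }.join(' ')
end

# Double quotes: \xNN (hex) and \NNN (octal) escapes each become one byte.
puts hex_dump("key\x03\x3f\xcd")
puts hex_dump("key\003\023\011")

# Single quotes: the backslash sequences stay as literal characters,
# so the key is longer and contains no binary bytes at all.
puts hex_dump('key\x03\x3f\xcd')
```

With double quotes the first key is six bytes long. With single quotes the same text is fifteen bytes, which is why a get with a single-quoted binary key would look up a different row.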
Separating parameters with commas
Parameters must be separated by commas.
get 'testtable', 'row-1', 'colfam1:qual1'
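Ruby hash properties
Some commands take key/value properties. These are written as a Ruby hash, like this:
{'key1' => 'value1', 'key2' => 'value2', ...}
The key/value pairs are enclosed in curly braces, and each key is separated from its value by "=>". Keys are usually predefined constants such as NAME, VERSIONS, or COMPRESSION, which do not need to be quoted.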
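As a rough illustration of such a dictionary, here is a plain-Ruby sketch. Outside the shell the constants do not exist, so the block defines stand-ins first. That each constant simply holds its own name as a string is an assumption made for this example, and family_options is a hypothetical helper.

```ruby
# Stand-ins for constants the HBase shell predefines. In the shell itself
# they already exist; their values here are an assumption of this sketch.
%w[NAME VERSIONS COMPRESSION].each do |c|
  Object.const_set(c, c) unless Object.const_defined?(c)
end

# Builds a column-family dictionary in the {KEY => value} form.
# The constant keys are written without quotes; the values are quoted
# strings or plain numbers.
def family_options(name, versions)
  { NAME => name, VERSIONS => versions }
end

puts family_options('colfam1', 3).inspect
```

In the shell, a hash of this shape is what gets passed to commands such as create, for example create 'testtable', {NAME => 'colfam1', VERSIONS => 3} (table and family names are hypothetical).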
=================================
General commands
Here we cover three commands: help, status, and version.
help
Detailed usage for any command is available through help 'command'.
For example:
hbase(main):030:0> help 'status'
Show cluster status. Can be 'summary', 'simple', 'detailed', or 'replication'. The
default is 'summary'. Examples:
hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
hbase> status 'replication'
hbase> status 'replication', 'source'
hbase> status 'replication', 'sink'
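status
Returns information from the ClusterStatus class at various levels of detail. As the help text shows, the available levels include simple, summary, and detailed.
version
Returns the current version, the repository revision, and build information.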
============================================
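Data definition
The HBase shell's data definition (DDL) commands include the following.
alter
Modifies the schema of an existing table using modifyTable.
create
Creates a new table.
describe
Prints the table's HTableDescriptor.
disable
Disables a table.
drop
Drops (deletes) a table.
enable
Enables a table.
exists
Checks whether a table exists.
is_disabled
Checks whether a table is disabled.
is_enabled
Checks whether a table is enabled.
list
Lists all tables.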
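To see how these commands fit together, the sketch below generates a short script that walks a table through its lifecycle. It is plain Ruby that only produces text. The helper name ddl_lifecycle and the table and family names are made up for the example. The output could be saved to a file and passed to hbase shell as the SCRIPTFILE argument shown in the usage text.

```ruby
# Returns the shell commands for a simple table lifecycle, one per line.
def ddl_lifecycle(table, family)
  [
    "create '#{table}', '#{family}'", # new table with one column family
    'list',                           # the new table should now be listed
    "describe '#{table}'",            # print its schema
    "disable '#{table}'",             # a table must be disabled before drop
    "drop '#{table}'"
  ].join("\n")
end

puts ddl_lifecycle('testtable', 'colfam1')
```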
============================================
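Data manipulation
count
Counts the rows in a table. Uses a scan internally.
delete
Deletes a single cell.
deleteall
Like delete, but does not require a column: it deletes all cells in a row, optionally restricted to one column.
get
Retrieves a cell.
get_counter
Returns the value of a counter.
incr
Increments a counter (by 1 unless another amount is given).
put
Stores a cell.
scan
Scans a range of rows. Relies on the Scan class.
truncate
Removes all data from a table. Equivalent to running disable, drop, and create in sequence with the same schema.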
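The description of truncate can be made concrete by spelling out the three commands it stands for. The sketch below is a simplification under stated assumptions: truncate_equivalent is a hypothetical helper, and it recreates the table from column family names only, whereas the real command keeps the full schema.

```ruby
# Expands a truncate into the disable / drop / create sequence it is
# described as being equivalent to. Only family names are carried over
# here; other schema attributes are ignored in this sketch.
def truncate_equivalent(table, families)
  family_args = families.map { |f| "'#{f}'" }.join(', ')
  [
    "disable '#{table}'",
    "drop '#{table}'",
    "create '#{table}', #{family_args}"
  ]
end

puts truncate_equivalent('testtable', %w[colfam1 colfam2])
```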
============================================
Tools
This group contains administrative utility commands.
assign
Assigns a region to a region server.
hbase(main):040:0> help 'assign'
Assign a region. Use with caution. If region already assigned,
this command will do a force reassign. For experts only.
Examples:
hbase> assign 'REGIONNAME'
hbase> assign 'ENCODED_REGIONNAME'
balance_switch
Enables or disables the load balancer.
hbase(main):041:0> help 'balance_switch'
Enable/Disable balancer. Returns previous balancer state.
Examples:
hbase> balance_switch true
hbase> balance_switch false
balancer
Triggers the load balancer.
hbase(main):043:0> help 'balancer'
Trigger the cluster balancer. Returns true if balancer ran and was able to
tell the region servers to unassign all the regions to balance (the re-assignment itself is async).
Otherwise false (Will not run if regions in transition).
Parameter tells master whether we should force balance even if there is region in transition.
WARNING: For experts only. Forcing a balance may do more damage than repair
when assignment is confused
Examples:
hbase> balancer
hbase> balancer "force"
close_region
Closes a region.
hbase(main):044:0> help 'close_region'
---------------------------------------------
DEPRECATED!!! Use 'unassign' command instead.
---------------------------------------------
unassign
hbase(main):045:0> help 'unassign'
Unassign a region. Unassign will close region in current location and then
reopen it again. Pass 'true' to force the unassignment ('force' will clear
all in-memory state in master before the reassign. If results in
double assignment use hbck -fix to resolve. To be used by experts).
Use with caution. For expert use only. Examples:
hbase> unassign 'REGIONNAME'
hbase> unassign 'REGIONNAME', true
hbase> unassign 'ENCODED_REGIONNAME'
hbase> unassign 'ENCODED_REGIONNAME', true
compact
Starts an asynchronous compaction of a region or a table.
hbase(main):047:0> help 'compact'
Compact all regions in passed table or pass a region row
to compact an individual region. You can also compact a single column
family within a region.
You can also set compact type, "NORMAL" or "MOB", and default is "NORMAL"
Examples:
Compact all regions in a table:
hbase> compact 'ns1:t1'
hbase> compact 't1'
Compact an entire region:
hbase> compact 'r1'
Compact only a column family within a region:
hbase> compact 'r1', 'c1'
Compact a column family within a table:
hbase> compact 't1', 'c1'
Compact table with type "MOB"
hbase> compact 't1', nil, 'MOB'
Compact a column family using "MOB" type within a table
hbase> compact 't1', 'c1', 'MOB'
flush
Starts an asynchronous flush of a region or a table.
hbase(main):039:0> help 'flush'
Flush all regions in passed table or pass a region row to
flush an individual region or a region server name whose format
is 'host,port,startcode', to flush all its regions.
For example:
hbase> flush 'TABLENAME'
hbase> flush 'REGIONNAME'
hbase> flush 'ENCODED_REGIONNAME'
hbase> flush 'REGION_SERVER_NAME'
major_compact
Starts an asynchronous major compaction of a region or a table.
hbase(main):048:0> help 'major_compact'
Run major compaction on passed table or pass a region row
to major compact an individual region. To compact a single
column family within a region specify the region name
followed by the column family name.
Examples:
Compact all regions in a table:
hbase> major_compact 't1'
hbase> major_compact 'ns1:t1'
Compact an entire region:
hbase> major_compact 'r1'
Compact a single column family within a region:
hbase> major_compact 'r1', 'c1'
Compact a single column family within a table:
hbase> major_compact 't1', 'c1'
Compact table with type "MOB"
hbase> major_compact 't1', nil, 'MOB'
Compact a column family using "MOB" type within a table
hbase> major_compact 't1', 'c1', 'MOB'
move
Moves a region to a different server.
hbase(main):049:0> help 'move'
Move a region. Optionally specify target regionserver else we choose one
at random. NOTE: You pass the encoded region name, not the region name so
this command is a little different to the others. The encoded region name
is the hash suffix on region names: e.g. if the region name were
TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396. then
the encoded region name portion is 527db22f95c8a9e0116f0cc13c680396
A server name is its host, port plus startcode. For example:
host187.example.com,60020,1289493121758
Examples:
hbase> move 'ENCODED_REGIONNAME'
hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
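Since move wants the encoded region name rather than the full one, it can help to extract it programmatically. The sketch below follows the description in the help text: the encoded name is the hash suffix between the last two dots, and a server name is host, port, and startcode separated by commas. Both helper names are hypothetical.

```ruby
# Returns the encoded region name (the hash suffix), or nil if the
# input does not end in ".<hex>.".
def encoded_region_name(region_name)
  match = region_name.match(/\.([0-9a-f]+)\.\z/)
  match && match[1]
end

# Splits a server name of the form host,port,startcode.
def parse_server_name(server_name)
  host, port, startcode = server_name.split(',')
  { host: host, port: Integer(port), startcode: Integer(startcode) }
end

region = 'TestTable,0094429456,1289497600452.527db22f95c8a9e0116f0cc13c680396.'
puts encoded_region_name(region)
puts parse_server_name('host187.example.com,60020,1289493121758').inspect
```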
split
Splits a region or a table.
hbase(main):050:0> help 'split'
Split entire table or pass a region to split individual region. With the
second parameter, you can specify an explicit split key for the region.
Examples:
split 'tableName'
split 'namespace:tableName'
split 'regionName' # format: 'tableName,startKey,id'
split 'tableName', 'splitKey'
split 'regionName', 'splitKey'
zk_dump
Dumps the ZooKeeper information that relates to HBase. This is a special function provided by an internal class; the HBase Master web UI exposes similar information.
hbase(main):051:0> help 'zk_dump'
Dump status of HBase cluster as seen by ZooKeeper.
============================================
Replication
add_peer
Adds a replication peer.
hbase(main):052:0> help 'add_peer'
A peer can either be another HBase cluster or a custom replication endpoint. In either case an id
must be specified to identify the peer.
For a HBase cluster peer, a cluster key must be provided and is composed like this:
hbase.zookeeper.quorum:hbase.zookeeper.property.clientPort:zookeeper.znode.parent
This gives a full path for HBase to connect to another HBase cluster.
An optional parameter for state identifies the replication peer's state is enabled or disabled.
And the default state is enabled.
An optional parameter for namespaces identifies which namespace's tables will be replicated
to the peer cluster.
An optional parameter for table column families identifies which tables and/or column families
will be replicated to the peer cluster.
Notice: Set a namespace in the peer config means that all tables in this namespace
will be replicated to the peer cluster. So if you already have set a namespace in peer config,
then you can't set this namespace's tables in the peer config again.
Examples:
hbase> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase"
hbase> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase", STATE => "ENABLED"
hbase> add_peer '1', CLUSTER_KEY => "server1.cie.com:2181:/hbase", STATE => "DISABLED"
hbase> add_peer '2', CLUSTER_KEY => "zk1,zk2,zk3:2182:/hbase-prod",
TABLE_CFS => { "table1" => [], "table2" => ["cf1"], "table3" => ["cf1", "cf2"] }
hbase> add_peer '2', CLUSTER_KEY => "zk1,zk2,zk3:2182:/hbase-prod",
NAMESPACES => ["ns1", "ns2", "ns3"]
hbase> add_peer '2', CLUSTER_KEY => "zk1,zk2,zk3:2182:/hbase-prod",
NAMESPACES => ["ns1", "ns2"], TABLE_CFS => { "ns3:table1" => [], "ns3:table2" => ["cf1"] }
For a custom replication endpoint, the ENDPOINT_CLASSNAME can be provided. Two optional arguments
are DATA and CONFIG which can be specified to set different either the peer_data or configuration
for the custom replication endpoint. Table column families is optional and can be specified with
the key TABLE_CFS.
hbase> add_peer '6', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint'
hbase> add_peer '7', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
DATA => { "key1" => 1 }
hbase> add_peer '8', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
CONFIG => { "config1" => "value1", "config2" => "value2" }
hbase> add_peer '9', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" },
hbase> add_peer '10', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
TABLE_CFS => { "table1" => [], "ns2:table2" => ["cf1"], "ns3:table3" => ["cf1", "cf2"] }
hbase> add_peer '11', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
DATA => { "key1" => 1 }, CONFIG => { "config1" => "value1", "config2" => "value2" },
TABLE_CFS => { "table1" => [], "ns2:table2" => ["cf1"], "ns3:table3" => ["cf1", "cf2"] }
hbase> add_peer '12', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.MyReplicationEndpoint',
CLUSTER_KEY => "server2.cie.com:2181:/hbase"
Note: Either CLUSTER_KEY or ENDPOINT_CLASSNAME must be specified. If ENDPOINT_CLASSNAME is specified, CLUSTER_KEY is
optional and should only be specified if a particular custom endpoint requires it.
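The cluster key format described in the help text (quorum, client port, and parent znode joined by colons) is easy to get wrong by hand, so here is a small sketch that assembles it. The helper names cluster_key and add_peer_command are hypothetical, and the hosts and path are the sample values from the help output. The code only builds strings and does not talk to a cluster.

```ruby
# Composes hbase.zookeeper.quorum:clientPort:znode.parent.
def cluster_key(quorum_hosts, client_port, znode_parent)
  "#{quorum_hosts.join(',')}:#{client_port}:#{znode_parent}"
end

# Formats an add_peer command line for a cluster peer.
def add_peer_command(peer_id, key)
  "add_peer '#{peer_id}', CLUSTER_KEY => \"#{key}\""
end

key = cluster_key(%w[zk1 zk2 zk3], 2182, '/hbase-prod')
puts add_peer_command('2', key)
```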
disable_peer
hbase(main):054:0> help 'disable_peer'
Stops the replication stream to the specified cluster, but still
keeps track of new edits to replicate.
Examples:
hbase> disable_peer '1'
enable_peer
hbase(main):056:0> help 'enable_peer'
Restarts the replication to the specified peer cluster,
continuing from where it was disabled.
Examples:
hbase> enable_peer '1'
remove_peer
hbase(main):057:0> help 'remove_peer'
Stops the specified replication stream and deletes all the meta
information kept about it. Examples:
hbase> remove_peer '1'
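start_replication
Starts the replication process.
stop_replication
Stops the replication process.
Note: neither of these two commands is listed in the replication group of the help output shown above for version 2.0.0-cdh6.0.1.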
============================================
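Miscellaneous
Sometimes you want to run a command from a script rather than typing it interactively, and get the result back immediately. One way is to pipe the command into the shell: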
echo "status" | bin/hbase shell