HBase Shell 常用操作

最新推荐文章于 2018-10-20 17:28:00 发布

Janvn

最新推荐文章于 2018-10-20 17:28:00 发布

阅读量3.1k

点赞数

分类专栏： Hbase

Hbase 专栏收录该内容

2 篇文章 0 订阅

订阅专栏

转自：http://debugo.com/hbase-shell-cmds/

HBase Shell是HBase的一个命令行工具，我们可以通过它对HBase进行维护操作。我们可以使用sudo -u hbase hbase shell来进入HBase shell。
在HBase shell中，可以使用status, version和whoami分别获得当前服务的状态、版本、登录用户和验证方式。

HBase shell中的帮助命令非常强大，使用help获得全部命令的列表，使用help ‘command_name’获得某一个命令的详细信息。例如：

1. 命名空间

在HBase系统中，命名空间namespace指的是一个HBase表的逻辑分组，同一个命名空间中的表有类似的用途，也用于配额和权限等设置进行安全管控。
HBase默认定义了两个系统内置的预定义命名空间：
• hbase：系统命名空间，用于包含hbase的内部表
• default：所有未指定命名空间的表都自动进入该命名空间
我们可以通过create_namespace命令来建立命名空间

 
         1 
       
         2 
       
        > 
          
        create 
        _namespace 
          
        'debugo_ns' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        2.0910 
          
        seconds

通过drop_namespace来删除命名空间

 
         1 
       
         2 
       
        > 
          
        drop 
        _namespace 
          
        'debugo_ns' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        1.9540 
          
        seconds

通过alter_namespac改变表的属性，其格式如下：
alter_namespace 'my_ns', {METHOD => 'set', 'PROPERTY_NAME' => 'PROPERTY_VALUE'}
显示命名空间以及设定的元信息：

 
         1 
       
         2 
       
         3 
       
         4 
       
        > 
          
        describe 
        _namespace 
          
        'debugo_ns' 
       
        DESCRIPTION 
                                                                     
        { 
        NAME 
          
        = 
        > 
          
        'debugo_ns' 
        } 
                                                           
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        1.9540 
          
        seconds

显示所有命名空间

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
        > 
          
        list_namespace 
       
        NAMESPACE 
                                                                       
        debugo_ns                                                               
       
        default 
                                                                         
        hbase 
                                                                           
        3 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0910 
          
        seconds

在HBase下建表需要使用create table_name, column_family1, 这个命令:

这个时候这个表是创建在default下面。如果需要在debugo_ns这个命名空间下面建表，则需要使用create namespace:table_name这种方式:

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
        > 
          
        create 
        _namespace 
          
        'debugo_ns' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        2.0910 
          
        seconds 
       
        create 
          
        'debugo_ns:users' 
        , 
          
        'info' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.4640 
          
        seconds 
       
        = 
        > 
          
        Hbase 
        :: 
        Table 
          
        - 
          
        debugo_ns 
        : 
        users

List命令可以列出当前HBase实例中的所有表，支持使用正则表达式来匹配。

 
         1 
       
         2 
       
         3 
       
         4 
       
        > 
          
        list_namespace 
        _tables 
          
        'debugo_ns' 
       
        TABLE                                                                   
       
        users 
                                                                           
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0400 
          
        seconds

使用list_namespace_tables也可以直接输出某个命名空间下的所有表

 
         1 
       
         2 
       
         3 
       
         4 
       
        > 
          
        list_namespace 
        _tables 
          
        'debugo_ns' 
       
        TABLE                                                                   
       
        users 
                                                                           
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0400 
          
        seconds

2. DDL语句

首先是建立HBase表，上面我们已经用过create命令了。它后面的第一个参数是表名，然后是一系列列簇的列表。每个列簇中可以独立指定它使用的版本数，数据有效保存时间（TTL），是否开启块缓存等信息。

表也可以在创建时指定它预分割(pre-splitting)的region数和split方法。在表初始建立时，HBase只分配给这个表一个region。这就意味着当我们访问这个表数据时，我们只会访问一个region server，这样就不能充分利用集群资源。HBase提供了一个工具来管理表的region数，即org.apache.hadoop.hbase.util.RegionSplitter和HBase shell中create中的split的配置项。例如：

 
         1 
       
        > 
          
        create 
          
        't2' 
        , 
          
        'f1' 
        , 
          
        { 
        NUMREGIONS 
          
        = 
        > 
          
        3 
        , 
          
        SPLITALGO 
          
        = 
        > 
          
        'HexStringSplit' 
        }

我们通过describe 来查看这个表中的元信息：

 
         1 
       
         2 
       
         3 
       
         4 
       
        > 
          
        describe 
          
        't2' 
       
        DESCRIPTION                                                  
        ENABLED 
       
        't2' 
        , 
          
        { 
        NAME 
          
        = 
        > 
          
        'f1' 
        , 
          
        DATA_BLOCK_ENCODING 
          
        = 
        > 
          
        'NONE' 
        , 
          
        BLOOMFILTER 
          
        = 
        > 
          
        'ROW' 
        , 
          
        REPLIC  
        true 
          
        ATION_SCOPE 
          
        = 
        > 
          
        '0' 
        , 
          
        VERSIONS 
          
        = 
        > 
          
        '1' 
        , 
          
        COMPRESSION 
          
        = 
        > 
          
        'NONE' 
        , 
          
        MIN_VERSIONS 
          
        = 
        > 
          
        '0' 
        , 
          
        TTL 
          
        = 
        > 
          
        'FOREVER' 
        , 
          
        KEEP_DELETED_CELLS 
          
        = 
        > 
          
        'false' 
        , 
          
        BLOCKSIZE 
          
        = 
        > 
          
        '65536' 
        , 
          
        IN 
        _MEMOR 
          
        Y 
          
        = 
        > 
          
        'false' 
        , 
          
        BLOCKCACHE 
          
        = 
        > 
          
        'true' 
        } 
       
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0690 
          
        seconds

通过enable和disable来启用/禁用这个表,相应的可以通过is_enabled和is_disabled来检查表是否被禁用。

使用exists来检查表是否存在

使用alter来改变表的属性，比如改变列簇的属性, 这涉及将信息更新到所有的region。在过去的版本中，alter操作需要先把table禁用，而在当前版本已经不需要。

另外一个非常常用的操作是添加和删除列簇：

或者：

删除表需要先将表disable。

3. put与get

在HBase shell中，我们可以通过put命令来插入数据。例如我们新创建一个表，它拥有id、address和info三个列簇，并插入一些数据。列簇下的列不需要提前创建，在需要时通过
:
来指定即可。

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
         19 
       
         20 
       
         21 
       
        > 
          
        create 
          
        'member' 
        , 
        'id' 
        , 
        'address' 
        , 
        'info' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.4570 
          
        seconds 
       
        = 
        > 
          
        Hbase 
        :: 
        Table 
         – 
          
        member 
       
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'id' 
        , 
        '11' 
       
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'info:age' 
        , 
        '27' 
       
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'info:birthday' 
        , 
        '1987-04-04' 
       
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'info:industry' 
        , 
          
        'it' 
       
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'address:city' 
        , 
        'beijing' 
       
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'address:country' 
        , 
        'china' 
       
        put 
          
        'member' 
        , 
          
        'Sariel' 
        , 
          
        'id' 
        , 
          
        '21' 
       
        put 
          
        'member' 
        , 
          
        'Sariel' 
        , 
        'info:age' 
        , 
          
        '26' 
       
        put 
          
        'member' 
        , 
          
        'Sariel' 
        , 
        'info:birthday' 
        , 
          
        '1988-05-09 ' 
       
        put 
          
        'member' 
        , 
          
        'Sariel' 
        , 
        'info:industry' 
        , 
          
        'it' 
       
        put 
          
        'member' 
        , 
          
        'Sariel' 
        , 
        'address:city' 
        , 
          
        'beijing' 
       
        put 
          
        'member' 
        , 
          
        'Sariel' 
        , 
        'address:country' 
        , 
          
        'china' 
       
        put 
          
        'member' 
        , 
          
        'Elvis' 
        , 
          
        'id' 
        , 
          
        '22' 
       
        put 
          
        'member' 
        , 
          
        'Elvis' 
        , 
        'info:age' 
        , 
          
        '26' 
       
        put 
          
        'member' 
        , 
          
        'Elvis' 
        , 
        'info:birthday' 
        , 
          
        '1988-09-14 ' 
       
        put 
          
        'member' 
        , 
          
        'Elvis' 
        , 
        'info:industry' 
        , 
          
        'it' 
       
        put 
          
        'member' 
        , 
          
        'Elvis' 
        , 
        'address:city' 
        , 
          
        'beijing' 
       
        put 
          
        'member' 
        , 
          
        'Elvis' 
        , 
        'address:country' 
        , 
          
        'china'

获取一个id的所有数据

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
        > 
          
        get 
          
        'member' 
        , 
          
        'Sariel' 
       
        COLUMN                            
        CELL                                                                                         
       
        address 
        : 
        city                     
        timestamp 
        = 
        1425871035382 
        , 
          
        value 
        = 
        beijing                                                       
       
        address 
        : 
        country                  
        timestamp 
        = 
        1425871035424 
        , 
          
        value 
        = 
        china                                                         
       
        id 
        : 
                                      
        timestamp 
        = 
        1425871035176 
        , 
          
        value 
        = 
        21 
                                                                    
        info 
        : 
        age                         
        timestamp 
        = 
        1425871035225 
        , 
          
        value 
        = 
        26 
                                                                    
        info 
        : 
        birthday                    
        timestamp 
        = 
        1425871035296 
        , 
          
        value 
        = 
        1988 
        - 
        05 
        - 
        09 
                                                            
        info 
        : 
        industry                    
        timestamp 
        = 
        1425871035334 
        , 
          
        value 
        = 
        it 
                                                                    
        6 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0530 
          
        seconds

获得一个id，一个列簇（一个列）中的所有数据:

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
        > 
          
        get 
          
        'member' 
        , 
          
        'Sariel' 
        , 
          
        'info' 
       
        COLUMN                            
        CELL                                                                                         
       
        info 
        : 
        age                         
        timestamp 
        = 
        1425871035225 
        , 
          
        value 
        = 
        26 
                                                                    
        info 
        : 
        birthday                    
        timestamp 
        = 
        1425871035296 
        , 
          
        value 
        = 
        1988 
        - 
        05 
        - 
        09 
                                                            
        info 
        : 
        industry                    
        timestamp 
        = 
        1425871035334 
        , 
          
        value 
        = 
        it 
                                                                    
        3 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0320 
          
        seconds 
       
        > 
          
        get 
          
        'member' 
        , 
          
        'Sariel' 
        , 
          
        'info:age' 
       
        COLUMN                            
        CELL                                                                                         
       
        info 
        : 
        age                         
        timestamp 
        = 
        1425871035225 
        , 
          
        value 
        = 
        26 
                                                                    
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0270 
          
        seconds

通过describe ‘member’可以看到，默认情况下列簇只保存1个version。我们先将其修改到2,然后update一些信息。

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
        > 
          
        alter 
          
        'member' 
        , 
          
        { 
        NAME 
        = 
        > 
          
        'info' 
        , 
          
        VERSIONS 
          
        = 
        > 
          
        2 
        } 
       
        Updating  
        all  
        regions  
        with  
        the  
        new 
          
        schema 
        . 
        . 
        . 
       
        0 
        / 
        1 
          
        regions  
        updated 
        . 
       
        1 
        / 
        1 
          
        regions  
        updated 
        . 
       
        Done 
        . 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        2.2580 
          
        seconds 
       
        > 
          
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'info:age' 
        , 
        '29' 
       
        > 
          
        put 
          
        'member' 
        , 
          
        'debugo' 
        , 
        'info:age' 
        , 
        '28' 
       
        > 
          
        get 
          
        'member' 
        , 
          
        'debugo' 
        , 
          
        { 
        COLUMN 
        = 
        > 
        'info:age' 
        , 
          
        VERSIONS 
        = 
        > 
        2 
        } 
       
        COLUMN                            
        CELL                                                                                         
       
        info 
        : 
        age                         
        timestamp 
        = 
        1425884510241 
        , 
          
        value 
        = 
        28 
                                                                    
        info 
        : 
        age                         
        timestamp 
        = 
        1425884510195 
        , 
          
        value 
        = 
        29 
                                                                    
        2 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0400 
          
        seconds

4. 其他DML语句

通过delete命令，我们可以删除id为某个值的‘info:age’字段，接下来的get就无视了

通过deleteall来删除整行

给’Sariel’的’info:age’字段添加，并使用incr实现递增。但需要注意的是，这个value需要是一个数值，如果使用单引号标识的字符串就无法使用incr。在使用Java API开发时，我们可以使用toBytes函数讲数值转换成byte字节。在HBase shell中我们只能通过incr来初始化这个列，

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
        > 
          
        delete 
          
        'member' 
        , 
        'Sariel' 
        , 
        'info:age' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0270 
          
        seconds 
       
        > 
          
        incr 
          
        'member' 
        , 
        'Sariel' 
        , 
        'info:age' 
        , 
        26 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0290 
          
        seconds 
       
        > 
          
        incr 
          
        'member' 
        , 
        'Sariel' 
        , 
        'info:age' 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0290 
          
        seconds 
       
        > 
          
        incr 
          
        'member' 
        , 
        'Sariel' 
        , 
        'info:age' 
        , 
          
        - 
        1 
       
        0 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0230 
          
        seconds 
       
        > 
          
        get 
          
        'member' 
        , 
        'Sariel' 
        , 
        'info:age' 
       
        COLUMN    
        CELL                                                                                         
       
        info 
        : 
        age    
        timestamp 
        = 
        1425890213341 
        , 
          
        value 
        = 
        \ 
        x00 
        \ 
        x00 
        \ 
        x00 
        \ 
        x00 
        \ 
        x00 
        \ 
        x00 
        \ 
        x00 
        \ 
        x1A 
                                      
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0280 
          
        seconds

十六进制1A是26，通过上面增1再减1后得到的结果。下面通过count统计行数。

通过truncate来截断表。hbase是先将掉disable掉，然后drop掉后重建表来实现truncate的功能的。

5. scan和filter

通过scan来对全表进行扫描。我们将之前put的数据恢复。

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
         6 
       
         7 
       
         8 
       
         9 
       
         10 
       
         11 
       
         12 
       
         13 
       
         14 
       
         15 
       
         16 
       
         17 
       
         18 
       
         19 
       
         20 
       
         21 
       
         22 
       
         23 
       
         24 
       
         25 
       
         26 
       
         27 
       
         28 
       
         29 
       
         30 
       
         31 
       
         32 
       
         33 
       
        > 
          
        scan 
          
        'member' 
       
        ROW                 
        COLUMN 
        + 
        CELL                                          
       
        Elvis              
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891057211 
        , 
          
        value 
        = 
       
        beijing                                              
       
        Elvis              
        column 
        = 
        address 
        : 
        country 
        , 
          
        timestamp 
        = 
        1425891057258 
        , 
          
        val 
       
        ue 
        = 
        china                                             
       
        Elvis              
        column 
        = 
        id 
        : 
        , 
          
        timestamp 
        = 
        1425891057038 
        , 
          
        value 
        = 
        22 
                
        Elvis              
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891057083 
        , 
          
        value 
        = 
        26 
           
        Elvis              
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891057129 
        , 
          
        value 
       
        = 
        1988 
        - 
        09 
        - 
        14 
                                                  
        Elvis              
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891057172 
        , 
          
        value 
       
        = 
        it                                                  
       
        Sariel             
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891056965 
        , 
          
        value 
        = 
       
        beijing                                              
       
        Sariel             
        column 
        = 
        address 
        : 
        country 
        , 
          
        timestamp 
        = 
        1425891057003 
        , 
          
        val 
       
        ue 
        = 
        china                                             
       
        Sariel             
        column 
        = 
        id 
        : 
        , 
          
        timestamp 
        = 
        1425891056767 
        , 
          
        value 
        = 
        21 
                
        Sariel             
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056808 
        , 
          
        value 
        = 
        26 
           
        Sariel             
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056883 
        , 
          
        value 
       
        = 
        1988 
        - 
        05 
        - 
        09 
                                                  
        Sariel             
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891056924 
        , 
          
        value 
       
        = 
        it                                                  
       
        debugo             
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891056642 
        , 
          
        value 
        = 
       
        beijing                                              
       
        debugo             
        column 
        = 
        address 
        : 
        country 
        , 
          
        timestamp 
        = 
        1425891056726 
        , 
          
        val 
       
        ue 
        = 
        china                                             
       
        debugo             
        column 
        = 
        id 
        : 
        , 
          
        timestamp 
        = 
        1425891056419 
        , 
          
        value 
        = 
        11 
                
        debugo             
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056499 
        , 
          
        value 
        = 
        27 
           
        debugo             
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056547 
        , 
          
        value 
       
        = 
        1987 
        - 
        04 
        - 
        04 
                                                  
        debugo             
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891056597 
        , 
          
        value 
       
        = 
        it 
                                                          
        3 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0660 
          
        seconds3  
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0590 
          
        seconds

指定扫描其中的某个列：

 
         1 
       
        > 
          
        scan 
          
        'member' 
        , 
          
        { 
        COLUMNS 
        = 
        > 
          
        'info:birthday' 
        }

或者整个列簇：

 
   
 
 
  
         1 
       

         2 
       

         3 
       

         4 
       

         5 
       

         6 
       

         7 
       

         8 
       

         9 
       

         10 
       

         11 
       

         12 
       
 
        > 
          
        scan 
          
        'member' 
        , 
          
        { 
        COLUMNS 
        = 
        > 
          
        'info' 
        } 
       
 
        ROW                               
        COLUMN 
        + 
        CELL                                                                                  
       
 
          
        Elvis                            
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891057083 
        , 
          
        value 
        = 
        26 
                                                   
       
 
          
        Elvis                            
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891057129 
        , 
          
        value 
        = 
        1988 
        - 
        09 
        - 
        14 
                                      
       
 
          
        Elvis                            
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891057172 
        , 
          
        value 
        = 
        it                                      
       
 
          
        Sariel                           
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056808 
        , 
          
        value 
        = 
        26 
                                                   
       
 
          
        Sariel                           
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056883 
        , 
          
        value 
        = 
        1988 
        - 
        05 
        - 
        09 
                                      
       
 
          
        Sariel                           
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891056924 
        , 
          
        value 
        = 
        it                                      
       
 
          
        debugo                           
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056499 
        , 
          
        value 
        = 
        27 
                                                   
       
 
          
        debugo                           
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056547 
        , 
          
        value 
        = 
        1987 
        - 
        04 
        - 
        04 
                                      
       
 
          
        debugo                           
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891056597 
        , 
          
        value 
        = 
        it 
                                              
       
 
        3 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0650 
          
        seconds 
       
 
 

除了列（COLUMNS）修饰词外，HBase还支持Limit（限制查询结果行数），STARTROW （ROWKEY起始行。会先根据这个key定位到region，再向后扫描）、STOPROW(结束行)、TIMERANGE（限定时间戳范围）、VERSIONS（版本数）、和FILTER（按条件过滤行）等。比如我们从Sariel这个rowkey开始，找下一个行的最新版本：

 
   
 
 
  
         1 
       

         2 
       

         3 
       

         4 
       

         5 
       

         6 
       

         7 
       

         8 
       

         9 
       
 
        > 
          
        scan 
          
        'member' 
        , 
          
        { 
          
        STARTROW 
          
        = 
        > 
          
        'Sariel' 
        , 
          
        LIMIT 
        = 
        > 
        1 
        , 
          
        VERSIONS 
        = 
        > 
        1 
        } 
       
 
        ROW     
        COLUMN 
        + 
        CELL 
       
 
         
        Sariel    
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891056965 
        , 
          
        value 
        = 
        beijing 
       
 
         
        Sariel    
        column 
        = 
        address 
        : 
        country 
        , 
          
        timestamp 
        = 
        1425891057003 
        , 
          
        value 
        = 
        china 
       
 
         
        Sariel    
        column 
        = 
        id 
        : 
        , 
          
        timestamp 
        = 
        1425891056767 
        , 
          
        value 
        = 
        21 
       
 
         
        Sariel    
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056808 
        , 
          
        value 
        = 
        26 
       
 
         
        Sariel    
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056883 
        , 
          
        value 
        = 
        1988 
        - 
        05 
        - 
        09 
       
 
         
        Sariel    
        column 
        = 
        info 
        : 
        industry 
        , 
          
        timestamp 
        = 
        1425891056924 
        , 
          
        value 
        = 
        it 
       
 
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0410 
          
        seconds 
       
 
 

Filter是一个非常强大的修饰词，可以设定一系列条件来进行过滤。比如我们要限制某个列的值等于26：

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "ValueFilter(=,'binary:26')" 
       
        ROW     
        COLUMN 
        + 
        CELL 
       
        Elvis     
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891057083 
        , 
          
        value 
        = 
        26 
       
        Sariel    
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056808 
        , 
          
        value 
        = 
        26 
       
        2 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0620 
          
        seconds

值包含6这个值：

 
         1 
       
         2 
       
         3 
       
         4 
       
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "ValueFilter(=,'substring:6')" 
       
        Elvis     
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891057083 
        , 
          
        value 
        = 
        26 
       
        Sariel    
        column 
        = 
        info 
        : 
        age 
        , 
          
        timestamp 
        = 
        1425891056808 
        , 
          
        value 
        = 
        26 
       
        2 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0620 
          
        seconds

列名中的前缀为birthday的：

 
   
 
 
  
         1 
       

         2 
       

         3 
       

         4 
       

         5 
       

         6 
       
 
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "ColumnPrefixFilter('birth') " 
       
 
        ROW                               
        COLUMN 
        + 
        CELL                                                                                  
       
 
          
        Elvis                            
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891057129 
        , 
          
        value 
        = 
        1988 
        - 
        09 
        - 
        14 
                                      
       
 
          
        Sariel       
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056883 
        , 
          
        value 
        = 
        1988 
        - 
        05 
        - 
        09 
       
 
          
        debugo                           
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056547 
        , 
          
        value 
        = 
        1987 
        - 
        04 
        - 
        04 
                                      
       
 
        3 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0450 
          
        seconds 
       
 
 

FILTER中支持多个过滤条件通过括号、AND和OR的条件组合。

 
         1 
       
         2 
       
         3 
       
         4 
       
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "ColumnPrefixFilter('birth') AND ValueFilter ValueFilter(=,'substring:1987')" 
       
        ROW        
        COLUMN 
        + 
        CELL                                                                                  
       
        Debugo     
        column 
        = 
        info 
        : 
        birthday 
        , 
          
        timestamp 
        = 
        1425891056547 
        , 
          
        value 
        = 
        1987 
        - 
        04 
        - 
        04 
       
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0450 
          
        seconds

同一个rowkey的同一个column有多个version，根据timestamp来区分。而每一个列簇有多个column。而FIRSTKEYONLY仅取出每个列簇的第一个column的第一个版本。而KEYONLY则是对于每一个column只去取出key，把VALUE的信息丢弃,一般和其他filter结合使用。例如：

 
   
 
 
  
         1 
       

         2 
       

         3 
       

         4 
       

         5 
       

         6 
       

         7 
       

         8 
       

         9 
       

         10 
       

         11 
       

         12 
       
 
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "FirstKeyOnlyFilter()" 
       
 
        ROW     
        COLUMN 
        + 
        CELL                                                                                  
       
 
          
        Elvis     
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891057211 
        , 
          
        value 
        = 
        beijing                                  
       
 
          
        Sariel    
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891056965 
        , 
          
        value 
        = 
        beijing                                  
       
 
          
        debugo  
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891056642 
        , 
          
        value 
        = 
        beijing 
                                          
       
 
        3 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0230 
          
        seconds 
       
 
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "KeyOnlyFilter()" 
       
 
        hbase 
        ( 
        main 
        ) 
        : 
        055 
        : 
        0 
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "KeyOnlyFilter()" 
       
 
        ROW     
        COLUMN 
        + 
        CELL  
       
 
        Elvis     
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891057211 
        , 
          
        value 
        = 
       
 
        Elvis     
        column 
        = 
        id 
        : 
        , 
          
        timestamp 
        = 
        1425891057038 
        , 
          
        value 
        = 
          
       

         …… 
       
 
 

PrefixFilter是对Rowkey的前缀进行判断,这是一个非常常用的功能。

 
         1 
       
         2 
       
         3 
       
         4 
       
         5 
       
        > 
          
        scan 
          
        'member' 
        , 
          
        FILTER 
        = 
        > 
        "PrefixFilter('E')" 
       
        ROW     
        COLUMN 
        + 
        CELL                                                                                  
       
        Elvis     
        column 
        = 
        address 
        : 
        city 
        , 
          
        timestamp 
        = 
        1425891057211 
        , 
          
        value 
        = 
        beijing 
                                          
         …… 
       
        1 
          
        row 
        ( 
        s 
        ) 
          
        in 
          
        0.0460 
          
        seconds