HBase: Shell Commands, Java API, MapReduce on HBase, and CRUD Operations on Data

Jiang's Study Notes

Article link: https://blog.csdn.net/weixin_45764675/article/details/105618586

Shell Commands (must master)

Basic commands

1. Open the HBase shell

[root@CentOS hbase-1.2.4]#  ./bin/hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/hadoop-2.9.2/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/hbase-2.2.2/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2020-01-05 04:32:25,440 WARN  [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.2, re6513a76c91cceda95dad7af246ac81d46fa2589, Sat Oct 19 10:10:12 UTC 2019
Took 0.0023 seconds                                                             
hbase(main):001:0>

2. Get help

hbase(main):004:0> help
hbase(main):005:0> help 'get'

3. Check the server status

hbase(main):002:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load

4. Check the version

hbase(main):003:0> version
1.2.4, rUnknown, Wed Feb 15 18:58:00 CST 2017

Namespace (analogous to a database)

alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

1. Create a namespace

hbase(main):007:0> create_namespace 'baizhi',{'user'=>'jiangzz'}
0 row(s) in 0.0390 seconds

2. Describe a namespace

hbase(main):009:0> describe_namespace 'baizhi'
DESCRIPTION                                                                     
{NAME => 'baizhi', user => 'jiangzz'}                            
1 row(s) in 0.0110 seconds

3. Alter a namespace

hbase(main):008:0>  alter_namespace 'baizhi', {METHOD => 'set', 'sex' => 'true'}
0 row(s) in 0.0350 seconds

hbase(main):010:0> describe_namespace 'baizhi'
DESCRIPTION                                                                     
{NAME => 'baizhi', sex => 'true', user => 'jiangzz'}                            
1 row(s) in 0.0080 seconds

hbase(main):011:0>  alter_namespace 'baizhi',{METHOD => 'unset', NAME=>'sex'}
0 row(s) in 0.1140 seconds

hbase(main):012:0> describe_namespace 'baizhi'
DESCRIPTION                                                                     
{NAME => 'baizhi', user => 'jiangzz'}                                           
1 row(s) in 0.0090 seconds

4. List all namespaces

hbase(main):013:0> list_namespace
NAMESPACE                                                                       
baizhi                                                                          
default                                                                         
hbase                                                                           
3 row(s) in 0.0270 seconds

hbase(main):020:0>  list_namespace '^b.*'
NAMESPACE                                                                       
baizhi                                                                          
1 row(s) in 0.0140 seconds

5. List the tables in a namespace

hbase(main):021:0> list_namespace_tables 'hbase'
TABLE                                                                           
meta                                                                            
namespace                                                                       
2 row(s) in 0.0210 seconds

6. Drop a namespace

hbase(main):022:0> drop_namespace 'baizhi'
0 row(s) in 0.0260 seconds

hbase(main):023:0> list_namespace
NAMESPACE                                                                       
default                                                                         
hbase                                                                           
2 row(s) in 0.0100 seconds

Table operations

1. List all tables (user tables)

hbase(main):004:0> list
TABLE                                                                           
0 row(s) in 0.0350 seconds

=> []

2. Create a table

hbase(main):045:0> create 'baizhi:t_user',{NAME=>'cf1',VERSIONS=>3,BLOCKCACHE => true},{NAME=>'cf2',TTL=>300}
0 row(s) in 2.3100 seconds

=> Hbase::Table - baizhi:t_user

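Column family attributes:
VERSIONS: number of cell versions to retain; the default is 1.
TTL: time to live, in seconds, for cells in the column family; the default is FOREVER.
BLOCKCACHE: whether blocks read from this column family are cached to speed up reads; the default is true.
IN_MEMORY: whether the column family's blocks get higher priority in the block cache, which speeds up reads; the default is false.
BLOOMFILTER: Bloom filter type, a mechanism for skipping data files that cannot contain the requested key; the default is ROW, and ROWCOL and NONE are also accepted. ROWCOL also indexes the column qualifier, so the filter costs extra storage.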
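To make VERSIONS concrete, here is a minimal in-memory sketch in plain Java. It is not HBase code: `VersionedCell` and its methods are hypothetical names for illustration. It also trims surplus versions at write time for simplicity, whereas HBase hides them on reads and physically removes them during compaction.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical model of a single cell; not part of the HBase API.
class VersionedCell {
    private final int maxVersions;
    // Newest timestamp first, mirroring the order in which HBase returns versions.
    private final TreeMap<Long, String> versions = new TreeMap<>(Comparator.reverseOrder());

    VersionedCell(int maxVersions) {
        this.maxVersions = maxVersions;
    }

    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        // Drop the oldest versions once the limit is exceeded.
        while (versions.size() > maxVersions) {
            versions.pollLastEntry();
        }
    }

    // Returns up to `limit` values, newest first.
    List<String> get(int limit) {
        List<String> out = new ArrayList<>();
        for (String value : versions.values()) {
            if (out.size() == limit) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedCell age = new VersionedCell(3); // like VERSIONS => 3
        age.put(1L, "18");
        age.put(2L, "20");
        age.put(3L, "21");
        age.put(4L, "22"); // pushes out the version written at timestamp 1
        System.out.println(age.get(3)); // [22, 21, 20]
        System.out.println(age.get(1)); // [22]
    }
}
```

With VERSIONS => 1, the default, only the newest value would remain visible.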

3. Describe a table

hbase(main):001:0> desc 'baizhi:t_user'
Table baizhi:t_user is ENABLED                                                                                        
baizhi:t_user                                                                                                         
COLUMN FAMILIES DESCRIPTION                                                                                           
{NAME => 'cf1', BLOOMFILTER => 'ROW', VERSIONS => '3', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK
_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => 
'65536', REPLICATION_SCOPE => '0'}                                                                                    
{NAME => 'cf2', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK
_ENCODING => 'NONE', TTL => '300 SECONDS (5 MINUTES)', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true
', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}                                                                    
2 row(s) in 0.9210 seconds

4. Disable tables

hbase(main):027:0> disable_all 'baizhi:t_u.*'
baizhi:t_user                                                                   

Disable the above 1 tables (y/n)?
y
1 tables successfully disabled

5. Enable tables

hbase(main):028:0> enable_all 'baizhi:t_u.*'
baizhi:t_user                                                                   

Enable the above 1 tables (y/n)?
y
1 tables successfully enabled

6. Truncate a table

hbase(main):034:0> truncate 'baizhi:t_user'
Truncating 'baizhi:t_user' table (it may take a while):
 - Disabling table...
 - Truncating table...
0 row(s) in 3.3620 seconds

7. Drop a table

hbase(main):035:0> drop 'baizhi:t_user'

ERROR: Table baizhi:t_user is enabled. Disable it first.

hbase(main):037:0> disable 'baizhi:t_user'
0 row(s) in 2.2450 seconds

hbase(main):038:0> drop 'baizhi:t_user'
0 row(s) in 1.2590 seconds

DML operations

1. put

hbase(main):004:0> put 'baizhi:t_user','001','cf1:name','zhangsan'
0 row(s) in 0.7430 seconds

hbase(main):005:0> put 'baizhi:t_user','001','cf1:age',18
0 row(s) in 0.1120 seconds

hbase(main):006:0> put 'baizhi:t_user','001','cf1:age',20 
0 row(s) in 0.0720 seconds

2. get

hbase(main):008:0> get 'baizhi:t_user','001'
COLUMN                              CELL
 cf1:age                            timestamp=1553961219305, value=20
 cf1:name                           timestamp=1553961181804, value=zhangsan

hbase(main):009:0> get 'baizhi:t_user','001',{COLUMN=>'cf1',VERSIONS=>3}
COLUMN                              CELL
 cf1:age                            timestamp=1553961219305, value=20
 cf1:age                            timestamp=1553961198084, value=18
 cf1:name                           timestamp=1553961181804, value=zhangsan
3 row(s) in 0.1540 seconds

hbase(main):010:0> get 'baizhi:t_user','001',{COLUMN=>'cf1',TIMESTAMP=>1553961198084}
COLUMN                              CELL
 cf1:age                            timestamp=1553961198084, value=18
1 row(s) in 0.0900 seconds

hbase(main):015:0> get 'baizhi:t_user','001',{COLUMN=>'cf1',TIMERANGE=>[1553961198084,1553961219306],VERSIONS=>3}
COLUMN                              CELL
 cf1:age                            timestamp=1553961219305, value=20
 cf1:age                            timestamp=1553961198084, value=18
2 row(s) in 0.0180 seconds

hbase(main):018:0> get 'baizhi:t_user','001',{COLUMN=>'cf1',FILTER => "ValueFilter(=, 'binary:zhangsan')"}
COLUMN                              CELL
 cf1:name                           timestamp=1553961181804, value=zhangsan
1 row(s) in 0.0550 seconds

hbase(main):019:0> get 'baizhi:t_user','001',{COLUMN=>'cf1',FILTER => "ValueFilter(=, 'substring:zhang')"}
COLUMN                              CELL
 cf1:name                           timestamp=1553961181804, value=zhangsan
1 row(s) in 0.0780 seconds
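The TIMERANGE option above takes a half-open interval: the lower bound is included and the upper bound is excluded, which is why the transcript passes 1553961219306 to pick up the version written at 1553961219305. A small plain-Java sketch of that selection rule follows; `TimeRangeDemo` and `inRange` are hypothetical names, and the map stands in for the versions of one cell.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration of half-open time ranges; not HBase code.
class TimeRangeDemo {
    // Returns the values whose timestamp lies in [minStamp, maxStamp), newest first.
    static List<String> inRange(TreeMap<Long, String> versions, long minStamp, long maxStamp) {
        return new ArrayList<>(
            versions.subMap(minStamp, true, maxStamp, false).descendingMap().values());
    }

    public static void main(String[] args) {
        TreeMap<Long, String> age = new TreeMap<>();
        age.put(1553961198084L, "18");
        age.put(1553961219305L, "20");
        // Upper bound one past the newest timestamp: both versions are selected.
        System.out.println(inRange(age, 1553961198084L, 1553961219306L)); // [20, 18]
        // Upper bound equal to the newest timestamp: that version is excluded.
        System.out.println(inRange(age, 1553961198084L, 1553961219305L)); // [18]
    }
}
```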

3. delete/deleteall

# Delete all cells at or before the specified timestamp
hbase(main):027:0> delete 'baizhi:t_user','001','cf1:age',1553961899630
0 row(s) in 0.1020 seconds
# Delete all cells of cf1:age
hbase(main):031:0> delete 'baizhi:t_user','001','cf1:age'
0 row(s) in 0.0180 seconds

hbase(main):034:0> deleteall 'baizhi:t_user','001'
0 row(s) in 0.0360 seconds

hbase(main):035:0> t.count
0 row(s) in 0.0450 seconds
=> 0
hbase(main):036:0> get 'baizhi:t_user','001',{COLUMN=>'cf1',VERSIONS=>3}
COLUMN                              CELL
0 row(s) in 0.0130 seconds

4. scan

hbase(main):045:0> scan 'baizhi:t_user'
ROW                                 COLUMN+CELL
 001                                column=cf1:age, timestamp=1553962118964, value=21
 001                                column=cf1:name, timestamp=1553962147916, value=zs
 002                                column=cf1:age, timestamp=1553962166894, value=19
 002                                column=cf1:name, timestamp=1553962157743, value=ls
 003                                column=cf1:name, timestamp=1553962203754, value=zl
 005                                column=cf1:age, timestamp=1553962179379, value=19
 005                                column=cf1:name, timestamp=1553962192054, value=ww

hbase(main):054:0> scan 'baizhi:t_user',{ LIMIT => 2,STARTROW=>"003",REVERSED=>true}
ROW                                 COLUMN+CELL
 003                                column=cf1:name, timestamp=1553962203754, value=zl
 002                                column=cf1:age, timestamp=1553962166894, value=19
 002                                column=cf1:name, timestamp=1553962157743, value=ls

hbase(main):058:0> scan 'baizhi:t_user',{ LIMIT => 2,STARTROW=>"003",REVERSED=>true,VERSIONS=>3,TIMERANGE=>[1553962157743,1553962203790]}
ROW                                 COLUMN+CELL
 003                                column=cf1:name, timestamp=1553962203754, value=zl
 002                                column=cf1:age, timestamp=1553962166894, value=19
 002                                column=cf1:name, timestamp=1553962157743, value=ls
2 row(s) in 0.0810 seconds
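The reversed scans above start at STARTROW, walk the sorted row keys downward, and stop after LIMIT rows. A rough plain-Java sketch of that traversal is given here; `ReverseScanDemo` is a hypothetical name, and Java string ordering is assumed to match HBase's byte ordering, which holds for the ASCII keys used in these notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration of STARTROW + REVERSED + LIMIT; not HBase code.
class ReverseScanDemo {
    // Walks row keys downward from startRow (inclusive) and stops after `limit` rows.
    static List<String> reverseScan(TreeSet<String> rowKeys, String startRow, int limit) {
        List<String> rows = new ArrayList<>();
        for (String row : rowKeys.headSet(startRow, true).descendingSet()) {
            if (rows.size() == limit) {
                break;
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Same row keys as the sample table in the scan transcript.
        TreeSet<String> keys = new TreeSet<>(Arrays.asList("001", "002", "003", "005"));
        System.out.println(reverseScan(keys, "003", 2)); // [003, 002]
    }
}
```

LIMIT counts rows, not cells, which is why the transcript shows three cells for two rows.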

Reference: https://blog.csdn.net/weixin_38231448/article/details/89357104

HBase Java API

Add the HBase client dependency

<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.4</version>
</dependency>

Create the Admin and Connection

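// Imports used by the setup code below
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.junit.After;
import org.junit.Before;

private Admin admin;// performs DDL operations on the database
private Connection  conn;// the connection, the entry point for accessing HBase

@Before
public void before() throws IOException {
    Configuration conf= HBaseConfiguration.create();
    conf.set(HConstants.ZOOKEEPER_QUORUM,"CentOS");
    conn= ConnectionFactory.createConnection(conf);
    admin=conn.getAdmin();
}
@After
public void after() throws IOException {
    admin.close();
    conn.close();
}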

Namespace operations

List all namespaces

@Test
public void testListNamespace() throws IOException {
    NamespaceDescriptor[] descriptors = admin.listNamespaceDescriptors();
    for (int i = 0; i < descriptors.length; i++) {
        NamespaceDescriptor descriptor = descriptors[i];
        System.out.println(descriptor.getName());
    }
}

Create a namespace

@Test
public void testCreateNamespace() throws IOException {
    NamespaceDescriptor nd=NamespaceDescriptor.create("zpark")
        .addConfiguration("author","jiangzz")
        .build();
    admin.createNamespace(nd);
}

Modify a namespace

@Test
public void testModifyNamespace() throws IOException {
    NamespaceDescriptor nd=NamespaceDescriptor.create("zpark")
        .addConfiguration("key","lisi")
        .removeConfiguration("author")
        .build();
    admin.modifyNamespace(nd);
}

Delete a namespace

@Test
public void testDropNamespace() throws IOException {
    admin.deleteNamespace("zpark");
}

Only an empty namespace can be deleted.

Table operations

List all tables

@Test
public void testListTables() throws IOException {
    HTableDescriptor[] tableDescriptors = admin.listTables();
    for (int i = 0; i < tableDescriptors.length; i++) {
        HTableDescriptor descriptor = tableDescriptors[i];
        System.out.println(descriptor);
    }
}
@Test
public void testListNamespaceWithTable() throws IOException {
    HTableDescriptor[] tableDescriptors = admin.listTableDescriptorsByNamespace("hbase");
    for (int i = 0; i < tableDescriptors.length; i++) {
        HTableDescriptor descriptor = tableDescriptors[i];
        System.out.println(descriptor);
    }
}

Create a table

@Test
public void testCreateTable() throws IOException {
    TableName tName=TableName.valueOf("baizhi:t_user");
    HTableDescriptor tableDescriptor=new HTableDescriptor(tName);

    HColumnDescriptor cf1=new HColumnDescriptor("cf1");
    cf1.setMaxVersions(3);
    cf1.setInMemory(true);
    cf1.setBloomFilterType(BloomType.ROWCOL);

    HColumnDescriptor cf2=new HColumnDescriptor("cf2");
    cf2.setMaxVersions(3);
    cf2.setTimeToLive(300);
    cf2.setBlockCacheEnabled(true);

    tableDescriptor.addFamily(cf1);
    tableDescriptor.addFamily(cf2);

    admin.createTable(tableDescriptor);
}

Delete a table

@Test
public void dropTable() throws IOException {
    TableName tName=TableName.valueOf("baizhi:t_user");
    if(!admin.isTableDisabled(tName)){
        admin.disableTable(tName);
    }
    admin.deleteTable(tName);
}

CRUD (key topic)

Unbuffered writes

Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));
// Department names: R&D, Sales, Administration, Finance, Logistics
String[] depts=new String[]{"研发部","销售部","行政部","财务部","后勤部"};
// Zero-pad row keys to three digits so that they sort in numeric order
DecimalFormat decimalFormat = new DecimalFormat("000");
for(int i=0;i<100;i++){
    String rowKey=decimalFormat.format(i);
    Put put = new Put(rowKey.getBytes());
    put.addColumn("cf1".getBytes(),"name".getBytes(),("user"+rowKey).getBytes());
    put.addColumn("cf1".getBytes(),"age".getBytes(),(i+"").getBytes());
    put.addColumn("cf1".getBytes(),"salary".getBytes(),(i*100+"").getBytes());
    put.addColumn("cf1".getBytes(),"dept".getBytes(),(depts[i%5]).getBytes());
    table.put(put);// each Put is sent to the server immediately
}
table.close();
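The loop above formats row keys with `DecimalFormat("000")`. HBase orders rows by comparing the key bytes, so unpadded numbers would sort as text rather than as numbers. The following plain-Java sketch shows the difference; `RowKeyPaddingDemo` is a hypothetical name, and string ordering stands in for byte ordering, which is equivalent for ASCII digits.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration of why row keys are zero-padded; not HBase code.
class RowKeyPaddingDemo {
    // Formats each id as a row key and returns the keys in lexicographic order.
    static List<String> sortedKeys(int[] ids, boolean padded) {
        // Locale.ROOT keeps the digits ASCII regardless of the JVM's default locale.
        DecimalFormat format = new DecimalFormat("000", DecimalFormatSymbols.getInstance(Locale.ROOT));
        List<String> keys = new ArrayList<>();
        for (int id : ids) {
            keys.add(padded ? format.format(id) : String.valueOf(id));
        }
        Collections.sort(keys);
        return keys;
    }

    public static void main(String[] args) {
        int[] ids = {2, 10, 100};
        System.out.println(sortedKeys(ids, false)); // [10, 100, 2]
        System.out.println(sortedKeys(ids, true));  // [002, 010, 100]
    }
}
```

Three digits only cover ids up to 999; a larger table would need a wider pattern.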

Batched writes

// Department names: R&D, Sales, Administration, Finance, Logistics
String[] depts=new String[]{"研发部","销售部","行政部","财务部","后勤部"};
// BufferedMutator collects mutations on the client and sends them in batches
BufferedMutator bufferedMutator=conn.getBufferedMutator(TableName.valueOf("baizhi:t_user"));
DecimalFormat decimalFormat = new DecimalFormat("000");
for(int i=100;i<200;i++){
    String rowKey=decimalFormat.format(i);
    Put put = new Put(rowKey.getBytes());
    put.addColumn("cf1".getBytes(),"name".getBytes(),("user"+rowKey).getBytes());
    put.addColumn("cf1".getBytes(),"age".getBytes(),(i+"").getBytes());
    put.addColumn("cf1".getBytes(),"salary".getBytes(),(i*100+"").getBytes());
    put.addColumn("cf1".getBytes(),"dept".getBytes(),(depts[i%5]).getBytes());

    bufferedMutator.mutate(put);
    if(i%50==0){
        bufferedMutator.flush();// explicit flush on every multiple of 50
    }
}
bufferedMutator.close();// close() flushes any remaining mutations
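It helps to trace which batches the `i%50==0` condition above actually produces. The sketch below is a plain-Java simulation, not HBase code: `FlushPatternDemo` is a hypothetical name, and it only counts mutations per explicit flush. It ignores the automatic flush that a real BufferedMutator performs when its write buffer fills up.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the flush pattern in the batched-write loop.
class FlushPatternDemo {
    // Returns the size of each flushed batch for the loop "for i in [start, end)".
    static List<Integer> batchSizes(int start, int end, int flushEvery) {
        List<Integer> batches = new ArrayList<>();
        int buffered = 0;
        for (int i = start; i < end; i++) {
            buffered++;                  // stands in for bufferedMutator.mutate(put)
            if (i % flushEvery == 0) {   // same condition as the loop above
                batches.add(buffered);   // stands in for bufferedMutator.flush()
                buffered = 0;
            }
        }
        if (buffered > 0) {
            batches.add(buffered);       // stands in for the final flush in close()
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(batchSizes(100, 200, 50)); // [1, 50, 49]
    }
}
```

Because the loop starts at a multiple of 50, the first flush carries a single mutation. Testing `(i + 1) % 50 == 0` instead would give two even batches of 50.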

Get with filter conditions

@Test
public void testGetOneRow01() throws IOException {
    Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));
    String rowKey="000";
    // Build the query for a single row (key step)
    Get get=new Get(rowKey.getBytes());
    get.setMaxVersions(3);// fetch up to the three newest versions
    //get.setTimeStamp(1578330481699L);// query by an exact timestamp
    get.setTimeRange(1578330481699L,1578345936177L);// half-open range [min, max): the upper bound is excluded
    Filter filter1=new QualifierFilter(CompareFilter.CompareOp.EQUAL,new SubstringComparator("ge"));
    Filter filter2=new QualifierFilter(CompareFilter.CompareOp.EQUAL,new BinaryComparator("name".getBytes()));
    Filter list=new FilterList(FilterList.Operator.MUST_PASS_ONE,filter1,filter2);
    get.setFilter(list);// filter the returned cells

    // Iterate over the result (key step)
    Result result = table.get(get);

    CellScanner cellScanner = result.cellScanner();
    while(cellScanner.advance()){
        Cell cell = cellScanner.current();
        String family = Bytes.toString(CellUtil.cloneFamily(cell));
        String qualifier = Bytes.toString(CellUtil.cloneQualifier(cell));
        String value = Bytes.toString(CellUtil.cloneValue(cell));
        long version= cell.getTimestamp();
        String key=Bytes.toString(CellUtil.cloneRow(cell));// cell.getRow() is deprecated
        System.out.println(key+"\t"+family+":"+qualifier+"\t"+value+" "+version);
    }

    table.close();
}

Get the latest version of a row

@Test
public void testGetOneRow02() throws IOException {
    Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));
    String rowKey="002";
    Get get=new Get(rowKey.getBytes());
    // Read the result (key step)
    Result result = table.get(get);
    String key = Bytes.toString(result.getRow());
    String name = Bytes.toString(result.getValue("cf1".getBytes(),"name".getBytes()));
    String age =  Bytes.toString(result.getValue("cf1".getBytes(),"age".getBytes()));
    String salary = Bytes.toString(result.getValue("cf1".getBytes(),"salary".getBytes()));
    String dept = Bytes.toString(result.getValue("cf1".getBytes(),"dept".getBytes()));
    System.out.println(key+"\t"+name+","+age+","+salary+","+dept);
    table.close();
}
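One detail worth checking in these examples: values are written with `String.getBytes()`, which uses the JVM's default charset, but read back with `Bytes.toString`, which decodes UTF-8. On a JVM whose default charset is not UTF-8, non-ASCII values such as the department names would not round-trip. The plain-Java sketch below reproduces the effect with standard charsets only. `CharsetRoundTripDemo` is a hypothetical name, and ISO-8859-1 merely stands in for any non-UTF-8 default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of a charset mismatch between write and read.
class CharsetRoundTripDemo {
    // Encodes with writeCharset, then decodes as UTF-8, as Bytes.toString would.
    static String roundTrip(String text, Charset writeCharset) {
        byte[] stored = text.getBytes(writeCharset);
        return new String(stored, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The first department name from the write examples, as Unicode escapes.
        String dept = "\u7814\u53d1\u90e8";
        System.out.println(roundTrip(dept, StandardCharsets.UTF_8).equals(dept));      // true
        System.out.println(roundTrip(dept, StandardCharsets.ISO_8859_1).equals(dept)); // false
    }
}
```

Encoding with `Bytes.toBytes(String)` or `getBytes(StandardCharsets.UTF_8)` on the write side avoids depending on the platform default.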

Get all versions of a cell

@Test
public void testGetOneRow03() throws IOException {
    Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));
    String rowKey="000";
    Get get=new Get(rowKey.getBytes());
    get.setMaxVersions();
    // Read the result (key step)
    Result result = table.get(get);
    List<Cell> nameCells = result.getColumnCells("cf1".getBytes(), "name".getBytes());
    for (Cell nameCell : nameCells) {
        System.out.println(Bytes.toString(CellUtil.cloneValue(nameCell)));
    }
    table.close();
}

DELETE|DELETEALL

Delete all cells of a row

@Test
public void testDeleteAll() throws IOException {
    Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));
    String rowKey="001";
    Delete delete=new Delete(rowKey.getBytes());
    table.delete(delete);
    table.close();
}

Delete a specific version

@Test
public void testDeleteOneCell() throws IOException {
    Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));
    String rowKey="000";
    Delete delete=new Delete(rowKey.getBytes());
    delete.addColumn("cf1".getBytes(),"name".getBytes(),1578330481699L);
    table.delete(delete);
    table.close();
}
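`Delete.addColumn(family, qualifier, timestamp)`, used above, targets only the version stored at exactly that timestamp. The similarly named `Delete.addColumns(family, qualifier, timestamp)` targets every version at or before it. The plain-Java sketch below models the difference on a map of timestamp to value. `DeleteSemanticsDemo` is a hypothetical name, and the model removes entries directly, whereas HBase writes delete markers and cleans up during compaction.

```java
import java.util.TreeMap;

// Hypothetical model of the two column-delete variants; not HBase code.
class DeleteSemanticsDemo {
    // Like Delete.addColumn(family, qualifier, ts): removes only the version at ts.
    static void deleteExact(TreeMap<Long, String> versions, long ts) {
        versions.remove(ts);
    }

    // Like Delete.addColumns(family, qualifier, ts): removes every version with timestamp <= ts.
    static void deleteUpTo(TreeMap<Long, String> versions, long ts) {
        versions.headMap(ts, true).clear();
    }

    static TreeMap<Long, String> sample() {
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(1L, "a");
        versions.put(2L, "b");
        versions.put(3L, "c");
        return versions;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> first = sample();
        deleteExact(first, 2L);
        System.out.println(first);  // {1=a, 3=c}

        TreeMap<Long, String> second = sample();
        deleteUpTo(second, 2L);
        System.out.println(second); // {3=c}
    }
}
```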

SCAN

@Test
public void testScan01() throws IOException {
    Table table = conn.getTable(TableName.valueOf("baizhi:t_user"));

    Scan scan = new Scan();
    //        scan.addFamily("cf1".getBytes());// restrict the scan to one column family
    //        scan.setStartRow("010".getBytes());
    //        scan.setStopRow("020".getBytes());
    //        scan.setTimeRange(157830000000L,1578350584987L);
    Filter filter1=new FuzzyRowFilter(Arrays.asList(new Pair<byte[], byte[]>("000".getBytes(),new byte[]{0,0,1})));
    //new RowFilter(CompareFilter.CompareOp.GREATER,new BinaryComparator("190".getBytes()));
    //new PrefixFilter("01".getBytes());
    Filter filter2=new RowFilter(CompareFilter.CompareOp.GREATER,new BinaryComparator("005".getBytes()));

    Filter list=new FilterList(FilterList.Operator.MUST_PASS_ALL,filter1,filter2);
    scan.setFilter(list);

    ResultScanner resultScanner = table.getScanner(scan);
    Iterator<Result> resultIterator = resultScanner.iterator();
    while(resultIterator.hasNext()){
        Result result = resultIterator.next();
        String key = Bytes.toString(result.getRow());
        String name = Bytes.toString(result.getValue("cf1".getBytes(),"name".getBytes()));
        String age =  Bytes.toString(result.getValue("cf1".getBytes(),"age".getBytes()));
        String salary = Bytes.toString(result.getValue("cf1".getBytes(),"salary".getBytes()));
        String dept = Bytes.toString(result.getValue("cf1".getBytes(),"dept".getBytes()));
        System.out.println(key+"\t"+name+"\t"+age+"\t"+salary+"\t"+dept);
    }

    resultScanner.close();// release the server-side scanner
    table.close();
}
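The FuzzyRowFilter in the scan above pairs the pattern "000" with the mask {0,0,1}: a 0 means the byte at that position must equal the pattern byte, and a 1 means any byte is accepted. Combined with the RowFilter (row key greater than "005") under MUST_PASS_ALL, it should select rows 006 through 009. The plain-Java sketch below walks through that logic. `FuzzyMatchDemo` is a hypothetical name, and it simplifies by assuming row keys at least as long as the pattern.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical re-implementation of the filter logic for illustration; not HBase code.
class FuzzyMatchDemo {
    // mask[i] == 0: byte i must equal pattern[i]; mask[i] == 1: any byte is accepted.
    static boolean fuzzyMatches(byte[] row, byte[] pattern, byte[] mask) {
        if (row.length < pattern.length) {
            return false;
        }
        for (int i = 0; i < pattern.length; i++) {
            if (mask[i] == 0 && row[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }

    // Keeps rows that pass both conditions, like FilterList with MUST_PASS_ALL.
    static List<String> select(List<String> rows) {
        byte[] pattern = "000".getBytes(StandardCharsets.UTF_8);
        byte[] mask = {0, 0, 1};
        List<String> selected = new ArrayList<>();
        for (String row : rows) {
            boolean fuzzy = fuzzyMatches(row.getBytes(StandardCharsets.UTF_8), pattern, mask);
            boolean greater = row.compareTo("005") > 0; // stands in for the RowFilter
            if (fuzzy && greater) {
                selected.add(row);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            rows.add(String.format(Locale.ROOT, "%03d", i)); // same keys as the write examples
        }
        System.out.println(select(rows)); // [006, 007, 008, 009]
    }
}
```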

MapReduce on HBase (key topic)

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.9.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.9.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
    <version>2.9.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.9.2</version>
</dependency>
<!-- HBase dependencies -->
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.4</version>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-server</artifactId>
    <version>1.2.4</version>
</dependency>

Configure YARN

1. Configure yarn-site.xml

<!-- Shuffle, the core auxiliary service that the MapReduce framework relies on -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<!-- Host that runs the ResourceManager -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>CentOS</value>
</property>

2. Configure mapred-site.xml

<!-- Run MapReduce jobs on YARN -->
<property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
</property>

3. Start YARN

[root@CentOS ~]# start-yarn.sh

4. Configure HADOOP_CLASSPATH

HBASE_HOME=/usr/hbase-1.2.4
HADOOP_HOME=/usr/hadoop-2.9.2
JAVA_HOME=/usr/java/latest
CLASSPATH=.
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$HBASE_HOME/bin
export JAVA_HOME
export CLASSPATH
export PATH
export HADOOP_HOME
export HBASE_HOME
export HADOOP_CLASSPATH=`hbase classpath`

5. Submit the job

[root@CentOS ~]# hadoop jar xxxx.jar <main-class>

Appendix:

public class MapReduceTotalSalaryCount extends Configured implements Tool {
    public int run(String[] args) throws Exception {
        // 1. Build the Job object
        Configuration conf=getConf();
        conf.set(HConstants.ZOOKEEPER_QUORUM,"CentOS");
        Job job=Job.getInstance(conf,"MapReduceTotalSalaryCount");
        job.setJarByClass(MapReduceTotalSalaryCount.class);
        TableMapReduceUtil.initTableMapperJob(
            "baizhi:t_user",
            new Scan(),
            UserMapper.class,
            Text.class,
            DoubleWritable.class,
            job
        );
        // Pre-aggregate on the map side with the UserCombiner class defined in this appendix
        job.setCombinerClass(UserCombiner.class);
        // The output table must already exist: create 'baizhi:t_result','cf1'
        TableMapReduceUtil.initTableReducerJob(
            "baizhi:t_result",
            UserReducer.class,
            job
        );
        // Submit the job and report success or failure through the return code
        return job.waitForCompletion(true) ? 0 : 1;
    }
    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MapReduceTotalSalaryCount(),args));
    }
}

public class UserMapper extends TableMapper<Text, DoubleWritable> {
    @Override
    protected void map(ImmutableBytesWritable key, Result value, Context context) throws IOException, InterruptedException {
        String dept = Bytes.toString(value.getValue("cf1".getBytes(), "dept".getBytes()));
        String salary = Bytes.toString(value.getValue("cf1".getBytes(), "salary".getBytes()));
        context.write(new Text(dept),new DoubleWritable(Double.parseDouble(salary)));
    }
}

public class UserReducer extends TableReducer<Text,DoubleWritable,Text> {

    @Override
    protected void reduce(Text key, Iterable<DoubleWritable> values, Context context) throws IOException, InterruptedException {
        double total=0.0;// primitive type avoids boxing on every addition
        for (DoubleWritable value : values) {
            total += value.get();
        }
        Put put = new Put(key.toString().getBytes());
        put.addColumn("cf1".getBytes(),"sum".getBytes(),(total+"").getBytes());
        context.write(null,put);
    }
}

public class UserCombiner extends Reducer<Text, DoubleWritable, Text,DoubleWritable> {
    @Override
    protected void reduce(Text key, Iterable<DoubleWritable> values, Context context) throws IOException, InterruptedException {
        double total=0.0;// primitive type avoids boxing on every addition
        for (DoubleWritable value : values) {
            total += value.get();
        }
        context.write(key,new DoubleWritable(total));
    }
}
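The job above sums salaries per department, and a combiner is safe for it because addition is associative and commutative: partial sums computed on each map task add up to the same totals. The plain-Java sketch below checks this without a cluster. `SalaryByDeptDemo` and the English department names are hypothetical. It assumes the 200 rows written by the CRUD examples (salary = i * 100, department = depts[i % 5]), with the two halves standing in for two map tasks.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone model of the map/combine/reduce aggregation; not Hadoop code.
class SalaryByDeptDemo {
    static final String[] DEPTS = {"R&D", "Sales", "Admin", "Finance", "Logistics"};

    // Map plus combine for rows [from, to): emits (dept, salary) and sums per department.
    static Map<String, Double> partialSums(int from, int to) {
        Map<String, Double> totals = new TreeMap<>();
        for (int i = from; i < to; i++) {
            totals.merge(DEPTS[i % DEPTS.length], i * 100.0, Double::sum);
        }
        return totals;
    }

    // Reduce: adds up the partial sums that arrive for each department.
    static Map<String, Double> merge(Map<String, Double> a, Map<String, Double> b) {
        Map<String, Double> result = new TreeMap<>(a);
        b.forEach((dept, sum) -> result.merge(dept, sum, Double::sum));
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> combined = merge(partialSums(0, 100), partialSums(100, 200));
        Map<String, Double> direct = partialSums(0, 200);
        System.out.println(combined.equals(direct)); // true
        System.out.println(combined.get("R&D"));     // 390000.0
    }
}
```

An average, by contrast, could not reuse the reducer logic as a combiner, since averaging partial averages does not give the overall average.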
