hbase java api count_hbase Java API操作实例

最新推荐文章于 2022-03-02 20:51:02 发布

杨朝伟

最新推荐文章于 2022-03-02 20:51:02 发布

阅读量707

点赞数

文章标签： hbase java api count

本文链接：https://blog.csdn.net/weixin_32769015/article/details/114502644

版权

本文详细介绍了如何使用HBase Java API进行表格的创建、删除以及数据的插入、查询和筛选。包括创建表格的HTableDescriptor配置，删除表格的步骤，使用Put进行数据插入，通过Get和Scan进行数据查询，以及运用RowFilter和SingleColumnValueFilter进行数据筛选。

摘要由CSDN通过智能技术生成

DDL(创建及删除表格) 如何在Hbase中创建表格以及删除表格。可通过Java和Hbase Shell两种方法实现。创建表格 HBase中表格的创建是通过对操作HBaseAdmin这一对象使其调用createTable()这一方法来实现。其中HTableDescriptor描述了表的schema，可在其上通过

DDL(创建及删除表格)

如何在Hbase中创建表格以及删除表格。可通过Java和Hbase Shell两种方法实现。

创建表格

HBase中表格的创建是通过对操作HBaseAdmin这一对象使其调用createTable()这一方法来实现。

其中HTableDescriptor描述了表的schema，可在其上通过addFamily()这一方法增加列族。

以下Java代码实现了建立一张简易的Hbase表格‘table1’，该表有两个列族，分别为f1和f2。

public class createTable{

private static Configuration config;

private static HBaseAdmin ha;

public static void main(String[] args){

try{

config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

ha = new HBaseAdmin(config);

//create table descriptor

String tableName = "table1";

HTableDescriptor htd = new HTableDescriptor(Bytes.toBytes(tableName));

//create and configure column families

HColumnDescriptor hcd1 = new HColumnDescriptor(Bytes.toBytes("family1"));

hcd1.setBlocksize(65536);

hcd1.setMaxVersions(1);

hcd1.setBloomFilterType(BloomType.ROW);

hcd1.setCompressionType(Algorithm.SNAPPY);

hcd1.setDataBlockEncoding(DataBlockEncoding.PREFIX);

hcd1.setTimeToLive(36000);

hcd1.setInMemory(false);

HColumnDescriptor hcd2 = new HColumnDescriptor(Bytes.toBytes("family2"));

hcd2.setBlocksize(65536);

hcd2.setMaxVersions(1);

hcd2.setBloomFilterType(BloomType.ROW);

hcd2.setCompressionType(Algorithm.SNAPPY);

hcd2.setDataBlockEncoding(DataBlockEncoding.PREFIX);

hcd2.setTimeToLive(36000);

hcd2.setInMemory(false);

//add column families to table descriptor

htd.addFamily(hcd1);

htd.addFamily(hcd2);

//create table

ha.createTable(htd);

System.out.println("Hbase table created.");

}catch (TableExistsException e){

System.out.println("ERROR: attempting to create existing table!");

}catch (IOException e){

e.printStackTrace();

}finally{

try{

ha.close();

}catch(IOException e){

e.printStackTrace();

}

在Hbase Shell中，创建表格功能由create ‘Hbase表名’，[‘列族名’...]来实现。

例如，create ‘table1’，‘family1’，‘family2’同样可创建上述表格。

删除表格

删除表也是通过HBaseAdmin来操作，删除表之前首先要disable表。这是一个比较耗时的操作，所以不建议频繁删除表。

以下Java代码实现了对表格“table1”的删除操作：

public class deleteTable{

private static Configuration config;

private static HBaseAdmin ha;

public static void main(String[] args){

try{

config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

ha = new HBaseAdmin(config);

String tableName = "table1";

//Only an existing table can be dropped

if (ha.tableExists(tableName)){

//read&write denied

ha.disableTable(tableName);

ha.deleteTable(tableName);

System.out.println("Hbase table dropped!");

}

}catch(IOException e){

e.printStackTrace();

}finally{

try{

ha.close();

}catch(IOException e){

e.printStackTrace();

}

在Hbase Shell中，删除表格功能由drop ‘Hbase表名’来实现。

例如，先disable ‘table1’再drop ‘table1’同样可删除上述表格。

数据插入

在Java操作中，put方法被用做插入数据。

put方法可以传递单个Put对象: public void put(Put put) throws IOException，也可以对很多Put对象进行批量插入: public void put(List puts) throws IOException

以下Java代码实现了对表格"table1"的批量数据插入操作。插入数据后，表格有10000行，列族“family1”，“family2”中都包含“q1”，“q2”两个列，其中列族“family1”储存整型数据(int)，列族“family2”储存字符串(string)。

ATTENTION：虽然Hbase支持多种类型储存，但为了应用高性能优化的hbase，表格值的储存类型建议一致使用为String。如上例所示，“family1：q1”中原为整数类型，须转制成string后再录入表中

public class insertTable{

private static Configuration config;

public static void main(String[] args) throws IOException{

config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

String tableName = "table1";

HTable table = new HTable(config, tableName);

//set AutoFlush

table.setAutoFlush(true);

int count = 10000;

String familyName1 = "family1";

String familyName2 = "family2";

String qualifier1 = "q1";

String qualifier2 = "q2";

//data to be inserted

String[] f1q1 = new String[count];

String[] f1q2 = new String[count];

String[] f2q1 = new String[count];

String[] f2q2 = new String[count];

for(int i = 0; i < count; i++){

f1q1[i] = Integer.toString(i);

f1q2[i] = Integer.toString(i+10000);

f2q1[i] = String.format("f2q1%d", i);

f2q2[i] = String.format("f2q2%d", i);

}

List puts = new ArrayList();

//insert by rows

for (int j = 0; j < count; j++){

//Create a Put object for a specified row-key

Put p = new Put(Bytes.toBytes(String.format("Row%05d",j)));

//fill columns

p.add(Bytes.toBytes(familyName1), Bytes.toBytes(qualifier1), Bytes.toBytes(f1q1[j]));

p.add(Bytes.toBytes(familyName1), Bytes.toBytes(qualifier2), Bytes.toBytes(f1q2[j]));

p.add(Bytes.toBytes(familyName2), Bytes.toBytes(qualifier1), Bytes.toBytes(f2q1[j]));

p.add(Bytes.toBytes(familyName2), Bytes.toBytes(qualifier2), Bytes.toBytes(f2q2[j]));

puts.add(p);

//put for every 100 rows

if((j+1)%100==0){

table.put(puts);

puts.clear();

}

table.close();

System.out.println("Data inserted！");

}

在Hbase Shell中，单条数据插入功能由put ‘Hbase表名’，‘rowKey’，‘列族名：列名’，‘数据值’来实现。

数据查询

Hbase表格的数据查询可分为单条查询与批量查询。

单条查询

单条查询是通过匹配rowkey在表格中查询某一行的数据。在Java中可通过get()这一方法来实现。

下列Java代码实现了在表格“table1”中取出指定rowkey一行的所有列的数据：

public class getFromTable{

private static Configuration config;

public static void main(String[] args) throws IOException{

String tableName = "table1";

config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

HTable table = new HTable(config, tableName);

Get get = new Get(Bytes.toBytes("Row01230"));

//add target columns for get

get.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));

get.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q2"));

get.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q1"));

get.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q2"));

Result result = table.get(get);

//get results

byte[] rowKey = result.getRow();

byte[] val1 = result.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));

byte[] val2 = result.getValue(Bytes.toBytes("family1"),Bytes.toBytes("q2"));

byte[] val3 = result.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q1"));

byte[] val4 = result.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q2"));

System.out.println("Row key: " + Bytes.toString(rowKey));

System.out.println("value1: " + Bytes.toString(val1));

System.out.println("value2: " + Bytes.toString(val2));

System.out.println("value3: " + Bytes.toString(val3));

System.out.println("value4: " + Bytes.toString(val4));

table.close();

}

在Hbase Shell中，单条数据查找功能由get ‘Hbase表名’，‘rowKey’，‘列族名：列名’来实现。

批量查询

批量查询是通过制定一段rowkey的范围来查询。可通过Java中getScanner()这一方法来实现。

下列Java代码实现了在表格“table1”中取出指定一段rowkey范围的所有列的数据：

public class scanFromTable {

private static Configuration config;

public static void main(String[] args) throws IOException{

config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

String tableName = "table1";

HTable table = new HTable(config, tableName);

//Scan according to rowkey range

Scan scan = new Scan();

//set starting row(included), if not set, start from the first row

scan.setStartRow(Bytes.toBytes("Row01000"));

//set stopping row(excluded), if not set, stop at the last row

scan.setStopRow(Bytes.toBytes("Row01100"));

//specify columns to scan, if not specified, return all columns；

scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));

scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q2"));

scan.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q1"));

scan.addColumn(Bytes.toBytes("family2"), Bytes.toBytes("q2"));

//specify maximum versions for one cell, if called without arguments, get all versions, if not called, get only the latest version

scan.setMaxVersions();

//specify maximum number of cells to avoid OutOfMemory error caused by huge amount of data in a single row

scan.setBatch(10000);

ResultScanner rs = table.getScanner(scan);

for(Result r:rs){

byte[] rowKey = r.getRow();

byte[] val1 = r.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));

byte[] val2 = r.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q2"));

byte[] val3 = r.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q1"));

byte[] val4 = r.getValue(Bytes.toBytes("family2"), Bytes.toBytes("q2"));

System.out.print(Bytes.toString(rowKey)+": ");

System.out.print(Bytes.toString(val1)+" ");

System.out.print(Bytes.toString(val2)+" ");

System.out.print(Bytes.toString(val3)+" ");

System.out.println(Bytes.toString(val4));

}

rs.close();

table.close();

}

在Hbase Shell中，批量数据查找功能由scan ‘Hbase表名’，{COLUMNS=>‘列族名：列名’，STARTROW=>‘起始rowkey’，STOPROW=>‘终止rowkey’}来实现。

利用过滤器筛选

过滤器是在Hbase服务器端上执行筛选操作，可以应用到行键(RowFilter)，列限定符(QualifierFilter)以及数据值(ValueFilter)。

这里列举了两个常用的过滤器：RowFilter和SingleColumnValueFilter。

RowFilter

RowFilter通过行键(rowkey)来筛选数据。

其中BinaryComparator直接比较两个byte array，可选的比较符(CompareOp)有EQUAL,NOT_EQUAL,GREATER,GREATER_OR_EQUAL,LESS,LESS_OR_EQUAL。

public class rowFilter{

public static void main(String[] args) throws IOException{

String tableName = "table1";

Configuration config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

HTable table = new HTable(config, tableName);

Scan scan = new Scan();

scan.addColumn(Bytes.toBytes("family1"), Bytes.toBytes("q1"));

Filter filter = new RowFilter(CompareFilter.CompareOp.EQUAL, new BinaryComparator(Bytes.toBytes("Row01234")));

scan.setFilter(filter);

ResultScanner scanner = table.getScanner(scan);

for(Result res:scanner){

byte[] value = res.getValue(Bytes.toBytes("family1"),Bytes.toBytes("q1"));

System.out.println(new String(res.getRow())+" value is: "+Bytes.toString(value));

}

scanner.close();

table.close();

}

SingleColumnValueFilter

SingleColumnValueFilter对某一具体列的值进行筛选。

其中SubstringComparator检查给定的字符串是否是列值的子字符串，可选的比较符(CompareOp)有EQUAL和NOT_EQUAL。

public class singleColumnValueFilter{

public static void main(String[] args) throws IOException{

Configuration config = HBaseConfiguration.create();

config.addResource("core-site.xml");

config.addResource("hdfs-site.xml");

config.addResource("yarn-site.xml");

config.addResource("mapred-site.xml");

String tableName = "table1";

HTable table = new HTable(config,tableName);

SingleColumnValueFilter filter = new SingleColumnValueFilter(

Bytes.toBytes("family2"),

Bytes.toBytes("q1"),

CompareFilter.CompareOp.NOT_EQUAL,

new SubstringComparator("45"));

//when setting setFilterIfMissing(true), rows with "null" values are filtered

filter.setFilterIfMissing(true);

Scan scan = new Scan();

scan.setFilter(filter);

ResultScanner scanner = table.getScanner(scan);

for (Result res:scanner){

byte[] val = res.getValue(Bytes.toBytes("family1"), Bytes.toBytes("q1"));

System.out.println(new String(res.getRow()));

System.out.println("value: " + Bytes.toString(val));

}

scanner.close();

table.close();

}

杨朝伟

关注

0
点赞
踩
1

收藏

觉得还不错? 一键收藏
0
评论
hbase java api count_hbase Java API操作实例

DDL(创建及删除表格) 如何在Hbase中创建表格以及删除表格。可通过Java和Hbase Shell两种方法实现。创建表格 HBase中表格的创建是通过对操作HBaseAdmin这一对象使其调用createTable()这一方法来实现。其中HTableDescriptor描述了表的schema，可在其上通过DDL(创建及删除表格)如何在Hbase中创建表格以及删除表格。可通过Java和Hb...
复制链接

扫一扫