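For a broader introduction to HBase, see the HBase getting-started series.
Link: https://blog.csdn.net/java_soldier/article/details/80708346
Our cluster was recently upgraded and Kerberos authentication was turned on, so every application had to be adapted. I took the opportunity to review the HBase client API; the code is shown below.
First, a walk through the main API classes.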
HBaseConfiguration
org.apache.hadoop.hbase.HBaseConfiguration
Adds HBase configuration files to a Configuration
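We usually obtain a Configuration through HBaseConfiguration.create() and then set a few parameters on it, such as the ZooKeeper quorum address, the client port, and whether Kerberos authentication is enabled.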
Configuration configuration = HBaseConfiguration.create();
configuration.set("hbase.zookeeper.quorum","bd36,bd37,bd38,bd66,bd67");
configuration.set("hbase.zookeeper.property.clientPort","2181");
configuration.set("zookeeper.znode.parent", "/hbase-unsecure");
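Since the motivation here is a cluster with Kerberos enabled, the following is a minimal sketch of what the extra settings and the login step might look like. It assumes keytab-based login through Hadoop's UserGroupInformation; the class name KerberosConfigSketch, the method name, and the principal and keytab parameters are hypothetical. A real cluster usually also needs its own server principals (for example hbase.master.kerberos.principal and hbase.regionserver.kerberos.principal) and its own znode parent, which are not filled in here.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosConfigSketch {

    /**
     * Builds a client Configuration for a Kerberos-enabled cluster and logs in
     * from a keytab. The principal and keytabPath arguments are placeholders
     * supplied by the caller; nothing here is specific to the author's cluster
     * except the ZooKeeper settings copied from the snippet above.
     */
    public static Configuration secureConfiguration(String principal, String keytabPath)
            throws IOException {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "bd36,bd37,bd38,bd66,bd67");
        conf.set("hbase.zookeeper.property.clientPort", "2181");

        // Tell both the Hadoop and the HBase client layers to use Kerberos.
        conf.set("hadoop.security.authentication", "kerberos");
        conf.set("hbase.security.authentication", "kerberos");

        // Assumption: zookeeper.znode.parent and the server principals are set
        // elsewhere (for example in hbase-site.xml on the classpath).

        // Log in once per process; later connections reuse this login.
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab(principal, keytabPath);
        return conf;
    }
}
```

The returned Configuration can then be passed to ConnectionFactory.createConnection in the same way as the unsecured one.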
Connection
org.apache.hadoop.hbase.client.Connection
A cluster connection encapsulating lower level individual connections to actual servers and a connection to zookeeper. Connections are instantiated through the ConnectionFactory class. The lifecycle of the connection is managed by the caller, who has to close() the connection to release the resources.
The connection object contains logic to find the master, locate regions out on the cluster, keeps a cache of locations and then knows how to re-calibrate after they move. The individual connections to servers, meta cache, zookeeper connection, etc are all shared by the Table and Admin instances obtained from this connection.
Connection creation is a heavy-weight operation. Connection implementations are thread-safe, so that the client can create a connection once, and share it with different threads. Table and Admin instances, on the other hand, are light-weight and are not thread-safe. Typically, a single connection per client application is instantiated and every thread will obtain its own Table instance. Caching or pooling of Table and Admin is not recommended.
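The official description is rather involved. In one sentence: this class is how you obtain a connection to HBase, and it is also created through a factory, ConnectionFactory.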
connection = ConnectionFactory.createConnection(configuration);
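The lifecycle described in the Javadoc above (one heavyweight Connection shared by the application, lightweight Table instances obtained and closed per use) can be sketched with try-with-resources. This is only an illustration: the class name ConnectionLifecycleSketch, the table name demo_table and the row key rowkey_0 are made up, and the Configuration is assumed to pick up its settings from the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class ConnectionLifecycleSketch {

    public static void main(String[] args) throws IOException {
        Configuration configuration = HBaseConfiguration.create();

        // Create the Connection once; it is thread-safe and expensive to build.
        try (Connection connection = ConnectionFactory.createConnection(configuration)) {

            // Table is lightweight and not thread-safe: obtain it where it is
            // needed and close it when done. "demo_table" is a placeholder.
            try (Table table = connection.getTable(TableName.valueOf("demo_table"))) {
                Result result = table.get(new Get(Bytes.toBytes("rowkey_0")));
                System.out.println("row found: " + !result.isEmpty());
            }
        } // Closing the Connection releases the server and ZooKeeper resources.
    }
}
```

In a long-running application the Connection would normally live for the whole process rather than for a single method call.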
Admin
org.apache.hadoop.hbase.client.Admin
Admin can be used to create, drop, list, enable and disable and otherwise modify tables, as well as perform other administrative operations.
Since:0.99.0
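This interface is mainly used to create, delete, enable and disable tables. HBase has an older, deprecated way of doing this through HBaseAdmin; prefer the current Admin interface. So how do we obtain an Admin instance?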
Admin admin = connection.getAdmin();
TableName
org.apache.hadoop.hbase.TableName
This class represents a table name; it converts our string (the table name) into the form HBase understands.
TableName tname = TableName.valueOf(tablename);
HTableDescriptor
org.apache.hadoop.hbase.HTableDescriptor
HTableDescriptor contains the details about an HBase table such as the descriptors of all the column families, is the table a catalog table, hbase:meta , if the table is read only, the maximum size of the memstore, when the region split should occur, coprocessors associated with it etc…
However, this class is deprecated:
Deprecated.
As of release 2.0.0, this will be removed in HBase 3.0.0. Use TableDescriptorBuilder to build HTableDescriptor.
This class holds the descriptive information for a table.
HTableDescriptor tDescriptor = new HTableDescriptor(tname);
HColumnDescriptor
org.apache.hadoop.hbase.HColumnDescriptor
An HColumnDescriptor contains information about a column family such as the number of versions, compression settings, etc. It is used as input when creating a table or adding a column.
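This class describes a column family, for example its number of versions and its compression settings; it is used when creating a table or adding a column family.
HColumnDescriptor family = new HColumnDescriptor(cf);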
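Given the deprecation notice quoted above for HTableDescriptor (HColumnDescriptor is deprecated in 2.0.0 as well), here is a rough sketch of the builder-based equivalent on HBase 2.x, using TableDescriptorBuilder and ColumnFamilyDescriptorBuilder. The class name CreateTableSketch, the method name and the choice of one version per cell are assumptions made for illustration.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.TableDescriptor;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;
import org.apache.hadoop.hbase.util.Bytes;

public class CreateTableSketch {

    /** Creates a table with a single column family if it does not exist yet. */
    public static void createTable(Connection connection, String tablename, String cf)
            throws IOException {
        TableName tname = TableName.valueOf(tablename);

        // Admin is lightweight; obtain it per operation and close it afterwards.
        try (Admin admin = connection.getAdmin()) {
            if (admin.tableExists(tname)) {
                return; // nothing to do
            }

            // Replacement for new HColumnDescriptor(cf).
            ColumnFamilyDescriptor family = ColumnFamilyDescriptorBuilder
                    .newBuilder(Bytes.toBytes(cf))
                    .setMaxVersions(1) // assumption: keep one version per cell
                    .build();

            // Replacement for new HTableDescriptor(tname).
            TableDescriptor tDescriptor = TableDescriptorBuilder
                    .newBuilder(tname)
                    .setColumnFamily(family)
                    .build();

            admin.createTable(tDescriptor);
        }
    }
}
```

On HBase 1.x clients the builder classes are not available, so the HTableDescriptor and HColumnDescriptor form shown in the text still applies there.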
Put
org.apache.hadoop.hbase.client.Put
Used to perform Put operations for a single row.
To perform a Put, instantiate a Put object with the row to insert to, and for each column to be inserted, call addColumn (or the addColumn overload that takes a timestamp, if setting the timestamp).
When adding data you can insert rows one at a time or in a batch; for a batch insert, create a List and add the Put objects to it.
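Table table = connection.getTable(tableName);
List<Put> batPut = new ArrayList<Put>();
for (int i = 0; i < 10; i++) { // 10 is an example row count
    Put put = new Put(Bytes.toBytes("rowkey_" + i)); // row key to insert
    put.addColumn(Bytes.toBytes("i"), Bytes.toBytes("username"), Bytes.toBytes("un_" + i)); // column family, column qualifier, value
    batPut.add(put);
}
table.put(batPut); // send the whole batch in one call
table.close();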
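When the list of Put objects grows large, one option is to send it in fixed-size chunks instead of one very large call. The chunking logic is sketched below in plain Java on a generic List so that it runs without an HBase cluster. The helper name Batches.partition and the batch size are hypothetical; in real code each chunk would be a List<Put> passed to table.put(chunk).

```java
import java.util.ArrayList;
import java.util.List;

public class Batches {

    /** Splits items into consecutive sublists of at most batchSize elements. */
    public static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> rowKeys = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            rowKeys.add("rowkey_" + i);
        }
        // prints [[rowkey_0, rowkey_1], [rowkey_2, rowkey_3], [rowkey_4]]
        System.out.println(partition(rowKeys, 2));
    }
}
```

A suitable batch size depends on row size and cluster settings, so it is best treated as a tunable value rather than a fixed constant.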