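HBase Review, Part 3: Java API Operations
Environment Setup
To operate an HBase database from Java code, first add the client JARs that HBase provides to the project,
as in the following Maven pom.xml: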
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.chanzany</groupId>
<artifactId>hbaseplugins</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>1.3.1</version>
</dependency>
</dependencies>
</project>
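A First Look at the API
First, recall the programming pattern for operating a MySQL database from Java. It goes roughly like this:
1. Load the database driver
2. Get a database connection
3. Get the database operation object (Statement)
4. Write the SQL statement
5. Execute the operation (iterate over the ResultSet)
6. Commit the transaction and release the connection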
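For reference, the six steps above map onto JDBC roughly as in the sketch below. The JDBC URL, the credentials, and the emp table with its id and name columns are placeholders chosen for illustration, and a matching driver would have to be on the classpath for it to run against a real database.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

class JdbcSteps {
    // Step 4: the SQL text (table and column names are placeholders)
    static final String QUERY = "SELECT name FROM emp WHERE id = ?";

    static List<String> findNames(String url, String user, String password, int id)
            throws SQLException {
        List<String> names = new ArrayList<>();
        // Step 1 is implicit since JDBC 4.0: drivers on the classpath register themselves.
        // Step 2: get a connection. Step 3: get a statement object.
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement stmt = conn.prepareStatement(QUERY)) {
            stmt.setInt(1, id);
            // Step 5: execute the query and iterate over the ResultSet
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
            }
        }
        // Step 6: try-with-resources has released the statement and the connection;
        // in the default auto-commit mode there is nothing to commit explicitly.
        return names;
    }
}
```

Calling findNames with a URL for which no driver is installed fails with an SQLException, which is the JDBC counterpart of pointing the HBase client at the wrong cluster.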
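HBase follows much the same pattern, except that there is no JDBC-style driver to load: the client library is itself written in Java and talks to the cluster directly. The pattern for operating HBase from Java is therefore as follows.
First create a configuration object. The configuration tells the HBase client where the target cluster is; with it you can then obtain an HBase connection.
Stepping into the method shows that it loads two configuration files, hbase-default.xml and hbase-site.xml. hbase-default.xml ships inside the HBase JARs, so only hbase-site.xml has to be supplied.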
Configuration conf = HBaseConfiguration.create();
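[Figure: call flow of HBaseConfiguration.create()]
Copy hbase-site.xml from the conf directory of the HBase installation on the Linux host
and place it on the classpath: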
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
/**
*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
-->
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop102:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<!-- New since 0.98: earlier versions had no .port property, and the default port was 60000 -->
<property>
<name>hbase.master.port</name>
<value>16000</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/opt/module/zookeeper-3.4.10/zkData</value>
</property>
</configuration>
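Because the file is picked up from the classpath, a quick check that it is actually visible there can save some debugging when the client cannot find the cluster. The sketch below uses only the standard class loader; the class name ClasspathCheck is made up for this example, and in a Maven project the usual place for the file would be src/main/resources (an assumption about the project layout).

```java
import java.net.URL;

class ClasspathCheck {
    /** Returns the URL of the named resource if the class loader can see it, otherwise null. */
    static URL locate(String resourceName) {
        return ClasspathCheck.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        URL site = locate("hbase-site.xml");
        System.out.println(site == null
                ? "hbase-site.xml is NOT on the classpath"
                : "hbase-site.xml found at " + site);
    }
}
```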
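Getting the Connection Object
With the configuration in place, we can obtain a connection. HBase provides a connection factory class, and the connection object is created directly through that factory by passing in the configuration.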
Connection connection = ConnectionFactory.createConnection(conf);
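Getting the Database Operation Object: Admin
Unlike the operation object in JDBC, the database operation object in HBase is an implementation of the Admin interface. It is likewise obtained from the connection object. You could also instantiate HBaseAdmin directly with new, but this is officially discouraged (the constructor is deprecated).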
Admin admin = connection.getAdmin();
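Operating on the Database and Getting Results
From the analysis of HBase's data structure in the previous part, the "fully qualified name" of a piece of data, read from top to bottom, looks like this: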
namespace-->table-->rowkey-->columnfamily-->column-->value
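One way to picture this path is as a chain of nested sorted maps. The sketch below is a toy in-memory model only, not how HBase stores data: it leaves out timestamps and versions, folds the namespace into the table name, and the class name CellPathModel is invented for the example.

```java
import java.util.Map;
import java.util.TreeMap;

class CellPathModel {
    // "namespace:table" -> rowkey -> column family -> column -> value
    private final Map<String, Map<String, Map<String, Map<String, String>>>> tables = new TreeMap<>();

    void put(String table, String rowkey, String family, String column, String value) {
        tables.computeIfAbsent(table, k -> new TreeMap<>())
              .computeIfAbsent(rowkey, k -> new TreeMap<>())
              .computeIfAbsent(family, k -> new TreeMap<>())
              .put(column, value);
    }

    /** Walks the path one level at a time; returns null as soon as a level is missing. */
    String get(String table, String rowkey, String family, String column) {
        Map<String, Map<String, Map<String, String>>> rows = tables.get(table);
        if (rows == null) return null;
        Map<String, Map<String, String>> families = rows.get(rowkey);
        if (families == null) return null;
        Map<String, String> columns = families.get(family);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        CellPathModel model = new CellPathModel();
        model.put("chanzany:emp", "10001", "info", "age", "24");
        System.out.println(model.get("chanzany:emp", "10001", "info", "age"));
    }
}
```

Every read and write has to name each level of the path, which is why the HBase calls take a table, a rowkey, a column family, and a column.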
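So, tedious as it is, each of these coordinates has to be passed to the objects that operate on the data. Following the official documentation and IDEA's hints, a single listing that covers DDL, DML, and DQL operations looks like this: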
try {
NamespaceDescriptor ns = admin.getNamespaceDescriptor("chanzany");
} catch (NamespaceNotFoundException e) {
//getNamespaceDescriptor throws NamespaceNotFoundException when the namespace does not exist
//create namespace
admin.createNamespace(NamespaceDescriptor.create("chanzany").build());
}
TableName tableName = TableName.valueOf("chanzany:emp");
boolean flag = admin.tableExists(tableName);
if (flag) {
/*DDL(create drop alter)--Admin
* DML(update insert delete)--Table
* DQL(select)--Table*/
//query data
// get the table object for the given table name
Table t_emp = connection.getTable(tableName);
String rowkey = "10001";
/*encode the rowkey string as bytes*/
Get get = new Get(Bytes.toBytes(rowkey));
//the query result is a Result object
Result result = t_emp.get(get);
boolean isempty = result.isEmpty();
System.out.println("Row is empty: " + isempty);
if (isempty) {
//insert data: add a column
Put put = new Put(Bytes.toBytes(rowkey));
String family = "info";
String column = "age";
String value = "24";
put.addColumn(Bytes.toBytes(family), Bytes.toBytes(column), Bytes.toBytes(value));
t_emp.put(put);
System.out.println("Data added");
} else {
//read the cells of the existing row
Cell[] cells = result.rawCells();
for (Cell cell : cells) {
byte[] value = CellUtil.cloneValue(cell);
byte[] row = CellUtil.cloneRow(cell);
byte[] family = CellUtil.cloneFamily(cell);
byte[] col = CellUtil.cloneQualifier(cell);
System.out.println("family=" + Bytes.toString(family));
System.out.println("row=" + Bytes.toString(row));
System.out.println("col=" + Bytes.toString(col));
System.out.println("value=" + Bytes.toString(value));
}
}
} else {
//create the table descriptor
HTableDescriptor td = new HTableDescriptor(tableName);
//add column families
HColumnDescriptor cd = new HColumnDescriptor("info");
HColumnDescriptor cd2 = new HColumnDescriptor("basic");
td.addFamily(cd);
td.addFamily(cd2);
admin.createTable(td);
System.out.println("Table emp created");
}
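The listing above passes every rowkey, family, column, and value through Bytes.toBytes and reads them back with Bytes.toString, because HBase stores plain byte arrays. The sketch below reproduces that round trip with the standard library only. It assumes, as I understand the Bytes utility, that strings are encoded as UTF-8 and that an int becomes four big-endian bytes; the class ByteEncodingDemo is a stand-in written for this example, not part of HBase.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteEncodingDemo {
    /** String to bytes, UTF-8 (assumed equivalent of Bytes.toBytes(String)). */
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Bytes back to a string, UTF-8 (assumed equivalent of Bytes.toString(byte[])). */
    static String decode(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    /** int to four big-endian bytes (assumed equivalent of Bytes.toBytes(int)). */
    static byte[] encodeInt(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    public static void main(String[] args) {
        System.out.println(encode("24").length);   // the text "24" takes two bytes
        System.out.println(encodeInt(24).length);  // the number 24 takes four bytes
        System.out.println(decode(encode("24")));  // round trip back to the text
    }
}
```

The practical consequence is that the age value "24" in the listing is stored as text. A reader has to decode it the same way it was written, since the text "24" and the number 24 produce different bytes.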
Points to Note
The three kinds of database operation use different operation objects:
- DDL (create, drop, alter): Admin
  - How to obtain it:
    Admin admin = connection.getAdmin();
  - How to use it:
    //get a namespace
    NamespaceDescriptor ns = admin.getNamespaceDescriptor("chanzany");
    //create a namespace
    admin.createNamespace(NamespaceDescriptor.create("chanzany").build());
    //create a table
    //1. create the table descriptor
    HTableDescriptor td = new HTableDescriptor(tableName);
    //2. add column families
    HColumnDescriptor cd = new HColumnDescriptor("info");
    HColumnDescriptor cd2 = new HColumnDescriptor("basic");
    td.addFamily(cd);
    td.addFamily(cd2);
    admin.createTable(td);
    System.out.println("Table emp created");
- DML (update, insert, delete): Table
  - How to obtain it:
    TableName tableName = TableName.valueOf("chanzany:emp");
    Table t_emp = connection.getTable(tableName);
  - How to use it:
    //insert data (a Put on an existing cell modifies it)
    Put put = new Put(Bytes.toBytes(rowkey));
    String family = "info";
    String column = "age";
    String value = "24";
    put.addColumn(Bytes.toBytes(family), Bytes.toBytes(column), Bytes.toBytes(value));
    t_emp.put(put);
    //delete data
    String row = "1002";
    Delete del = new Delete(Bytes.toBytes(row));
    t_emp.delete(del);
- DQL (select): Table.get --> Result.rawCells()
  - How to obtain it:
    Table t_emp = connection.getTable(tableName);
    String rowkey = "10001";
    Get get = new Get(Bytes.toBytes(rowkey));
    Result result = t_emp.get(get);
  - How to use it:
    //read the data
    Cell[] cells = result.rawCells();
    for (Cell cell : cells) {
    byte[] value = CellUtil.cloneValue(cell);
    byte[] row = CellUtil.cloneRow(cell);
    byte[] family = CellUtil.cloneFamily(cell);
    byte[] col = CellUtil.cloneQualifier(cell);
    System.out.println("family=" + Bytes.toString(family));
    System.out.println("row=" + Bytes.toString(row));
    System.out.println("col=" + Bytes.toString(col));
    System.out.println("value=" + Bytes.toString(value));
    }
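- Many of the objects HBase provides are not created directly with new. They are obtained through creational design patterns instead, such as static factory methods and builders.
For example: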
TableName tableName = TableName.valueOf("chanzany:emp");
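Inspecting the source shows that TableName has several valueOf overloads, each of which returns the TableName object we need.
It is worth noting that if the namespace passed in is empty, the system falls back to the default namespace, named default.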
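The static-factory idea and the default-namespace fallback can be sketched in a few lines. SimpleTableName below is a hypothetical, simplified model written for this post, not HBase's actual TableName implementation (which, among other things, validates names).

```java
final class SimpleTableName {
    static final String DEFAULT_NAMESPACE = "default";

    private final String namespace;
    private final String qualifier;

    // Private constructor: callers must go through the valueOf factory methods.
    private SimpleTableName(String namespace, String qualifier) {
        this.namespace = namespace;
        this.qualifier = qualifier;
    }

    /** Parses "namespace:table"; a name without ':' lands in the default namespace. */
    static SimpleTableName valueOf(String fullName) {
        int idx = fullName.indexOf(':');
        if (idx < 0) {
            return new SimpleTableName(DEFAULT_NAMESPACE, fullName);
        }
        return valueOf(fullName.substring(0, idx), fullName.substring(idx + 1));
    }

    /** A null or empty namespace falls back to the default namespace. */
    static SimpleTableName valueOf(String namespace, String qualifier) {
        String ns = (namespace == null || namespace.isEmpty()) ? DEFAULT_NAMESPACE : namespace;
        return new SimpleTableName(ns, qualifier);
    }

    String getNamespace() { return namespace; }

    String getQualifier() { return qualifier; }
}
```

Keeping the constructor private is what lets the factory apply the fallback in one place: no caller can build a name that skips it.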
A Custom Utility Class
package com.chanzany.hbase.com.chanzany.util;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;
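/**
* HBase utility class
*/
public class HBaseUtil {
//ThreadLocal gives each thread its own slot, so all code running on the same thread can share one Connection without passing it around
//Note: the slots are per-thread, but the objects themselves still live on the shared JVM heap, so ThreadLocal alone does not make an object thread-safe once its reference reaches another thread
private static ThreadLocal<Connection> connHolder=new ThreadLocal<Connection>();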
private HBaseUtil(){
}
/**
* Creates an HBase connection for the current thread if it does not have one yet
*/
public static void makeHbaseConnection() throws IOException {
Connection conn = connHolder.get();
if (conn == null){
Configuration conf = HBaseConfiguration.create();
conn =ConnectionFactory.createConnection(conf);
connHolder.set(conn);
}
}
/**
* Inserts data
* @param tablename table name
* @param rowkey rowkey
* @param family column family
* @param column column qualifier
* @param value value
* @throws Exception
*/
public static void insertData(String tablename,String rowkey,String family,String column,String value)throws Exception{
Connection conn = connHolder.get();
Table hTable = conn.getTable(TableName.valueOf(tablename));
Put put = new Put(Bytes.toBytes(rowkey));
put.addColumn(Bytes.toBytes(family),Bytes.toBytes(column),Bytes.toBytes(value));
hTable.put(put);
hTable.close();
}
/**
* Queries data
* @param tableName tableName
* @param rowKey rowKey
* @return the Result for the row (empty if the row does not exist)
* @throws IOException
*/
public static Result getData(String tableName,String rowKey) throws IOException {
Connection conn = connHolder.get();
Table hTable = conn.getTable(TableName.valueOf(tableName));
Get get = new Get(Bytes.toBytes(rowKey));
Result result = hTable.get(get);
hTable.close();
return result;
}
public static void showData(Result data){
Cell[] cells = data.rawCells();
for (Cell cell : cells) {
System.out.println("family:\t"+Bytes.toString(CellUtil.cloneFamily(cell)));
System.out.println("row:\t"+Bytes.toString(CellUtil.cloneRow(cell)));
System.out.println("column:\t"+Bytes.toString(CellUtil.cloneQualifier(cell)));
System.out.println("value:\t"+Bytes.toString(CellUtil.cloneValue(cell)));
}
}
public static void close() throws IOException {
Connection conn = connHolder.get();
if (conn!=null){
conn.close();
connHolder.remove();
}
}
}
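The comments on connHolder describe what ThreadLocal does; the sketch below shows it with the standard library only. A value set on one thread is read back on that thread, while another thread sees nothing. ThreadLocalDemo and the string standing in for a connection are made up for the example.

```java
class ThreadLocalDemo {
    private static final ThreadLocal<String> HOLDER = new ThreadLocal<>();

    /** Sets a value on a new thread and returns what that same thread reads back. */
    static String setAndReadInNewThread(String value) {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> {
            HOLDER.set(value);
            seen[0] = HOLDER.get();
            HOLDER.remove(); // clean up, as HBaseUtil.close() does
        });
        worker.start();
        try {
            worker.join(); // join makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    /** Reads the slot of the calling thread, which never set anything. */
    static String readInCurrentThread() {
        return HOLDER.get();
    }

    public static void main(String[] args) {
        System.out.println(setAndReadInNewThread("worker-connection")); // worker-connection
        System.out.println(readInCurrentThread());                      // null
    }
}
```

For HBaseUtil this means every thread that uses insertData or getData has to call makeHbaseConnection itself first; otherwise connHolder.get() returns null on that thread. Since the HBase documentation describes Connection as heavyweight and thread-safe, sharing a single instance across threads may be worth considering as an alternative design.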
Testing the Utility Class
package com.chanzany.hbase;
import com.chanzany.hbase.com.chanzany.util.HBaseUtil;
import org.apache.hadoop.hbase.client.Result;
public class TestHBaseAPI_3 {
public static void main(String[] args) throws Exception {
//open the connection
HBaseUtil.makeHbaseConnection();
//insert data
HBaseUtil.insertData("chanzany:emp","1002","info","name","zhangsan");
//read the data back
Result data = HBaseUtil.getData("chanzany:emp", "1002");
HBaseUtil.showData(data);
//close the connection
HBaseUtil.close();
}
}
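One possible refinement: insertData and getData in the utility class call close() only on the success path, so an exception thrown by put or get would leave the table open. Table is closeable, so try-with-resources is a natural fit. The sketch below shows the mechanism with a hypothetical FakeTable standing in for a real Table, so that it runs without a cluster.

```java
class CloseOnFailureDemo {
    /** Hypothetical stand-in for a closeable resource such as an HBase Table. */
    static class FakeTable implements AutoCloseable {
        boolean closed = false;

        void put(String value) {
            if (value == null) {
                throw new IllegalArgumentException("value must not be null");
            }
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Attempts a write and reports whether the table ended up closed. */
    static boolean writeAndReportClosed(String value) {
        FakeTable table = new FakeTable();
        try (FakeTable t = table) {
            t.put(value);
        } catch (IllegalArgumentException e) {
            // By the time the catch block runs, close() has already been called.
        }
        return table.closed;
    }

    public static void main(String[] args) {
        System.out.println(writeAndReportClosed("zhangsan")); // true
        System.out.println(writeAndReportClosed(null));       // true, closed despite the failure
    }
}
```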