Big Data Lab: Installing and Deploying HBase and Calling It Through the Java API

东方桑

Category: Notes. Tags: zookeeper, hbase, hadoop

Article link: https://blog.csdn.net/qq_43505865/article/details/109456247

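Purpose and Requirements
1.1 Purpose
- Understand the role of HBase in the Hadoop architecture;
- Become proficient with the common HBase Shell commands;
- Become familiar with the common HBase Java API.
1.2 Software and Hardware Environment
- Operating system: Ubuntu 16.04;
- Hadoop version: 3.1.3;
- HBase version: 2.2.1;
- JDK version: jdk-1.8;
- IDE: Eclipse.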
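Experiment Record
2.1 Installing HBase
I recommend installing ZooKeeper yourself first. Not relying on the ZooKeeper instance bundled with HBase avoids quite a few problems. Download it, unpack it, and configure it much as you would Hadoop. I won't go into detail; below are the configuration points, the essentials, and the common pitfalls.
Download: https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.6.2/
The steps are fairly simple:
The configuration file is /zookeeper/conf/zoo.cfg (copy zoo_sample.cfg to create it).
In it, change the data directory (the dataDir entry) to a directory you created yourself. The HBase configuration needs it later.
Then edit /zookeeper/bin/zkEnv.sh to export your own JAVA_HOME and ZOOKEEPER_HOME.

ZooKeeper is started and stopped with zkServer.sh start and zkServer.sh stop.

Finally, add ZOOKEEPER_HOME to .bashrc and run source to apply it.

Download the installation package hbase-2.2.2-bin.tar.gz and unpack it into /usr/local.

Rename the unpacked directory hbase-2.2.2 to hbase for convenience, then give the hadoop user ownership of the hbase directory.

Configure environment variables
Add HBase's bin directory to PATH so that HBase can be started without first changing to /usr/local/hbase, which makes it much more convenient to use. (The rest of this tutorial still switches to /usr/local/hbase before running commands.)
Edit ~/.bashrc and add: export PATH=$PATH:/usr/local/hbase/bin

After editing, run source so the setting takes effect in the current terminal.

Add HBase permissions.

Check the HBase version (hbase version) to confirm that HBase is installed correctly.

Note that a warning about jars duplicated with Hadoop may appear here; the configuration below resolves it.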
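2.2 Pseudo-Distributed Mode Configuration

Configure /usr/local/hbase/conf/hbase-env.sh

Because we use our own ZooKeeper installation, set HBase's built-in ZooKeeper management to false (HBASE_MANAGES_ZK=false).
Resolve the jar conflict here as well.

Configure /usr/local/hbase/conf/hbase-site.xml
Set hbase.rootdir to the HDFS path where HBase stores its data, and set hbase.cluster.distributed to true. This assumes the Hadoop cluster runs in pseudo-distributed mode on the local machine with the NameNode on port 9000.

The port at the end of the configuration must match the clientPort value in ZooKeeper's zoo.cfg.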
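Since the original screenshots of the configuration are not available, the following is only a rough sketch of what the hbase-site.xml described above might contain. It reads clientPort from zoo.cfg-style text and prints matching property entries. The class name HBaseSiteSketch, the sample dataDir path, and the choice of hbase.zookeeper.quorum=localhost are assumptions for illustration, not the author's actual file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

public class HBaseSiteSketch {
    // Reads the clientPort entry from zoo.cfg-style text (plain key=value lines).
    public static String clientPort(String zooCfg) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(zooCfg));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("clientPort");
    }

    // Renders name/value pairs in the <configuration> layout used by hbase-site.xml.
    public static String render(Map<String, String> settings) {
        StringBuilder xml = new StringBuilder("<configuration>\n");
        for (Map.Entry<String, String> e : settings.entrySet()) {
            xml.append("  <property>\n")
               .append("    <name>").append(e.getKey()).append("</name>\n")
               .append("    <value>").append(e.getValue()).append("</value>\n")
               .append("  </property>\n");
        }
        return xml.append("</configuration>\n").toString();
    }

    // Builds the settings named in the text, taking the ZooKeeper port from zoo.cfg.
    public static String build(String zooCfg) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hbase.rootdir", "hdfs://localhost:9000/hbase");
        settings.put("hbase.cluster.distributed", "true");
        settings.put("hbase.zookeeper.quorum", "localhost"); // assumption: ZooKeeper runs locally
        settings.put("hbase.zookeeper.property.clientPort", clientPort(zooCfg));
        return render(settings);
    }

    public static void main(String[] args) {
        // Sample zoo.cfg content; the dataDir path is a placeholder.
        String zooCfg = "tickTime=2000\ndataDir=/usr/local/zookeeper/data\nclientPort=2181\n";
        System.out.print(build(zooCfg));
    }
}
```

Deriving the port from zoo.cfg rather than typing it twice is one way to keep the two files from drifting apart.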
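3. Test-Running HBase
Step 1: log in over ssh (passwordless login was set up earlier, so no password is needed), change to /usr/local/hadoop, and start Hadoop. Skip this step if Hadoop is already running.

Start ZooKeeper (zkServer.sh start).

Change to /usr/local/hbase and start HBase (bin/start-hbase.sh).

Enter the shell (bin/hbase shell).

Stop HBase (bin/stop-hbase.sh).

To summarize, Hadoop and HBase must be started and stopped in this order:
start ZooKeeper -> start Hadoop -> start HBase -> stop HBase -> stop Hadoop -> stop ZooKeeper
I ran into errors from getting the order wrong, so be sure to follow it!
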
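2.3 HBase Shell Commands
(I) Implement the following functions programmatically, and complete the same tasks with the HBase Shell commands provided by Hadoop:
(1) List information about all tables in HBase, such as their names:
list

(2) Print all records of a given table in the terminal:
scan 'table_name'

(3) Add and delete a given column family or column in an existing table:
First create table s1 in the shell as an example table.

Then add data to s1.

After that, the specified column can be deleted.

(4) Clear all records from a given table:
Scanning the table shows that it contains data.
Then clear all of its records: truncate 'table_name'
Scanning again shows that the data is gone.

(5) Count the rows of a table:
count 'table_name'
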
(II) HBase database operations

The following tables and data come from a relational database. Convert them into tables suited to HBase storage and insert the data:

Student table (Student)
Student No. (S_No)   Name (S_Name)   Sex (S_Sex)   Age (S_Age)
2015001              Zhangsan        male          23
2015002              Mary            female        22
2015003              Lisi            male          24

Course table (Course)
Course No. (C_No)   Course name (C_Name)   Credit (C_Credit)
123001              Math                   2.0
123002              Computer Science       5.0
123003              English                3.0

Course selection table (SC)
Student No. (SC_Sno)   Course No. (SC_Cno)   Score (SC_Score)
2015001                123001                86
2015001                123003                69
2015002                123002                77
2015002                123003                99
2015003                123001                98
2015003                123002                95

Create each table with create, then insert the data with put. After loading, check each table with scan 'Student', scan 'Course', and scan 'SC'.
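The exact create and put commands were in screenshots that are no longer available, so the following is only one possible mapping, not necessarily the one used here. It assumes the first relational column (S_No) becomes the row key and all remaining columns go into a single column family, hypothetically named info. The class name StudentShellCommands is likewise made up, and the second student's number is taken as 2015002, the value the SC table refers to. The program prints HBase shell commands that could be pasted into the shell.

```java
import java.util.ArrayList;
import java.util.List;

public class StudentShellCommands {
    // Builds HBase shell commands for one relational table.
    // The first column supplies the row key; every other column becomes family:column.
    public static List<String> commands(String table, String family,
                                        String[] columns, String[][] rows) {
        List<String> out = new ArrayList<>();
        out.add(String.format("create '%s', '%s'", table, family));
        for (String[] row : rows) {
            String rowKey = row[0];
            for (int i = 1; i < columns.length; i++) {
                out.add(String.format("put '%s', '%s', '%s:%s', '%s'",
                        table, rowKey, family, columns[i], row[i]));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] columns = {"S_No", "S_Name", "S_Sex", "S_Age"};
        String[][] rows = {
            {"2015001", "Zhangsan", "male", "23"},
            {"2015002", "Mary", "female", "22"},
            {"2015003", "Lisi", "male", "24"},
        };
        // "info" is a hypothetical column family name chosen for this sketch.
        for (String command : commands("Student", "info", columns, rows)) {
            System.out.println(command);
        }
    }
}
```

The same helper could serve the Course table as is. The SC table has no single-column key, so it would need a different row key design, for example one combining SC_Sno and SC_Cno.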

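2.4 Implementing the Required Functions in Code
(1) createTable(String tableName, String[] fields)
Create a table. The parameter tableName is the table name, and the string array fields holds the names of the record's fields. If a table named tableName already exists in HBase, delete it first and then create the new table.
Create a project and import all jars under lib and under the lib/client-facing-thirdparty subdirectory of the hbase-2.2.2 directory.

Code: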

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
public class HBaseDemo {
	public static Configuration configuration;
	public static Connection connection;
	public static Admin admin;
	public static void main(String[] args)throws IOException{
		String fields[]= {"a","b","c"};
		createTable("Java_test",fields);
	}
	public static void init(){
	    configuration  = HBaseConfiguration.create();
	    configuration.set("hbase.rootdir","hdfs://localhost:9000/hbase");
	    try{
	        connection = ConnectionFactory.createConnection(configuration);
	        admin = connection.getAdmin();
	    }catch (IOException e){
	        e.printStackTrace();
	    }
	}
	// Close the connection
	public static void close(){
	    try{
	        if(admin != null){
	            admin.close();
	        }
	        if(null != connection){
	            connection.close();
	        }
	    }catch (IOException e){
	        e.printStackTrace();
	    }
	}
	public static void createTable(String tableName, String[] fields) throws IOException {
		init();
		TableName tablename = TableName.valueOf(tableName);
		if (admin.tableExists(tablename)) {
			System.out.println("Table already exists, deleting it");
			admin.disableTable(tablename); // a table must be disabled before it can be deleted
			admin.deleteTable(tablename);
			System.out.println("Table deleted");
		}
		// Each entry of fields becomes one column family (HBase 2.x builder API)
		TableDescriptorBuilder builder = TableDescriptorBuilder.newBuilder(tablename);
		for (String str : fields) {
			builder.setColumnFamily(ColumnFamilyDescriptorBuilder.of(str));
		}
		admin.createTable(builder.build());
		System.out.println("Table created");
		close();
	}
}

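Output
When the table does not exist, the program only reports that the table was created.
On a second run, when the table already exists, it first reports deleting the old table and then creates the new one.
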

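(2) addRecord(String tableName, String row, String[] fields, String[] values)
Add the data in values to the cells identified by table tableName, row row (given by S_Name), and the string array fields. If an element of fields names a column qualifier under a column family, it is written as "columnFamily:column". For example, to add scores to the three columns "Math", "Computer Science", and "English" at once, fields is {"Score:Math", "Score:Computer Science", "Score:English"}, and values holds the scores for those three courses.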
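To make the "columnFamily:column" convention concrete, here is a small standalone sketch of how such a field string can be split. It does not use the HBase API, and the class name ColumnSpec is hypothetical. It splits on the first colon only, since a column family name cannot contain a colon while a qualifier may, and it treats a string without a colon as a bare column family instead of failing.

```java
import java.nio.charset.StandardCharsets;

public class ColumnSpec {
    public final String family;
    public final String qualifier; // empty when only a column family was given

    private ColumnSpec(String family, String qualifier) {
        this.family = family;
        this.qualifier = qualifier;
    }

    // Splits "family:qualifier" at the first colon; a bare "family" gets an empty qualifier.
    public static ColumnSpec parse(String field) {
        int idx = field.indexOf(':');
        if (idx < 0) {
            return new ColumnSpec(field, "");
        }
        return new ColumnSpec(field.substring(0, idx), field.substring(idx + 1));
    }

    // Byte forms in an explicit charset, the shape HBase calls expect.
    public byte[] familyBytes() {
        return family.getBytes(StandardCharsets.UTF_8);
    }

    public byte[] qualifierBytes() {
        return qualifier.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] fields = {"Score:Math", "Score:Computer Science", "Score:English"};
        for (String field : fields) {
            ColumnSpec spec = parse(field);
            System.out.println("family=" + spec.family + ", qualifier=" + spec.qualifier);
        }
    }
}
```

With a helper like this, a call such as put.addColumn could take familyBytes() and qualifierBytes() directly, which avoids an ArrayIndexOutOfBoundsException from split(":") when a field has no colon.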
Create a new table test2 and add the column family.

Code:

import java.io.IOException;
import java.util.Scanner;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class Test2 {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    public static void main(String[] args) throws IOException {
        String[] fields = {"Score:Math", "Score:Computer Science", "Score:English"};
        System.out.println("Enter the name, math score, computer science score, and English score");
        Scanner sc = new Scanner(System.in);
        String name = sc.next();
        String[] values = {sc.next(), sc.next(), sc.next()};
        addRecord("test2", name, fields, values);
    }

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void addRecord(String tableName, String row, String[] fields, String[] values)
            throws IOException {
        init();
        Table table = connection.getTable(TableName.valueOf(tableName));
        Put put = new Put(Bytes.toBytes(row)); // one Put carries every cell of the row
        for (int i = 0; i < fields.length; i++) {
            String[] cols = fields[i].split(":"); // column family, column qualifier
            put.addColumn(Bytes.toBytes(cols[0]), Bytes.toBytes(cols[1]), Bytes.toBytes(values[i]));
        }
        table.put(put);
        System.out.println("Record added");
        table.close();
        close();
    }
}


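(3) scanColumn(String tableName, String column)
Browse the data of one column of table tableName; if a row has no data in that column, return null. When the parameter column is a column family name and the family has several column qualifiers, list the data of the column under each qualifier. When column is a specific column name (for example "Score:Math"), list only that column's data.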
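The expected behaviour above, in particular reporting null for rows that lack the column, can be illustrated without a cluster. The sketch below is an in-memory stand-in and not HBase code: the table is modelled as nested maps, and the class name ScanColumnModel and the output format are assumptions. It is worth noting that an HBase Scan restricted to a column simply skips rows that have no cell in it, so reporting null per row takes extra handling along these lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ScanColumnModel {
    // table: row key -> ("family:qualifier" -> value), an in-memory stand-in for an HBase table.
    public static List<String> scanColumn(Map<String, Map<String, String>> table, String column) {
        List<String> out = new ArrayList<>();
        boolean wholeFamily = !column.contains(":");
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            if (wholeFamily) {
                // List every qualifier stored under the family for this row.
                boolean found = false;
                for (Map.Entry<String, String> cell : row.getValue().entrySet()) {
                    if (cell.getKey().startsWith(column + ":")) {
                        out.add(row.getKey() + " " + cell.getKey() + " = " + cell.getValue());
                        found = true;
                    }
                }
                if (!found) {
                    out.add(row.getKey() + " " + column + " = null");
                }
            } else {
                // A missing cell yields a null value, which is printed as "null".
                out.add(row.getKey() + " " + column + " = " + row.getValue().get(column));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> zhangsan = new TreeMap<>();
        zhangsan.put("Score:Math", "86");
        zhangsan.put("Score:English", "69");
        Map<String, String> mary = new TreeMap<>();
        mary.put("Score:English", "99");

        // TreeMap keeps row keys sorted, as HBase does.
        Map<String, Map<String, String>> table = new TreeMap<>();
        table.put("Zhangsan", zhangsan);
        table.put("Mary", mary);

        for (String line : scanColumn(table, "Score:Math")) {
            System.out.println(line);
        }
    }
}
```

Run as written, this prints "Mary Score:Math = null" followed by "Zhangsan Score:Math = 86".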
Code:

import java.io.IOException;
import java.util.Scanner;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class Test3 {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    public static void main(String[] args) throws IOException {
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the table name and the column name:");
        String tablename = sc.next();
        String columname = sc.next();
        scanColumn(tablename, columname);
    }

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void scanColumn(String tableName, String column) throws IOException {
        init();
        Table table = connection.getTable(TableName.valueOf(tableName));
        try {
            Scan scan = new Scan();
            if (column.lastIndexOf(":") == -1) {
                // A bare name is a column family: scan every qualifier under it
                scan.addFamily(Bytes.toBytes(column));
            } else {
                String[] cols = column.split(":"); // column family, column qualifier
                scan.addColumn(Bytes.toBytes(cols[0]), Bytes.toBytes(cols[1]));
            }
            ResultScanner scanner = table.getScanner(scan);
            for (Result result = scanner.next(); result != null; result = scanner.next()) {
                showCell(result);
            }
            scanner.close();
        } catch (Exception e) {
            // For example, the requested column family does not exist
            System.out.println("null");
        }
        table.close();
        close();
    }

    // Print every cell of one row
    public static void showCell(Result result) {
        Cell[] cells = result.rawCells();
        for (Cell cell : cells) {
            System.out.println("RowName:" + Bytes.toString(CellUtil.cloneRow(cell)) + " ");
            System.out.println("Timestamp:" + cell.getTimestamp() + " ");
            System.out.println("column Family:" + Bytes.toString(CellUtil.cloneFamily(cell)) + " ");
            System.out.println("column Name:" + Bytes.toString(CellUtil.cloneQualifier(cell)) + " ");
            System.out.println("value:" + Bytes.toString(CellUtil.cloneValue(cell)) + " ");
        }
    }
}

The program was run twice, once with a column family as the argument and once with a single column.


(4) modifyData(String tableName, String row, String column)
Modify the data in the cell of table tableName identified by row row (the student name S_Name can be used) and column column.
Code:

import java.io.IOException;
import java.util.Scanner;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class Test4 {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    public static void main(String[] args) throws IOException {
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the table name, row key, column name, and new value:");
        String tableName = sc.next();
        String rowname = sc.next();
        String columname = sc.next();
        String val = sc.next();
        modifyData(tableName, rowname, columname, val);
    }

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void modifyData(String tableName, String row, String column, String val)
            throws IOException {
        init();
        Table table = connection.getTable(TableName.valueOf(tableName));
        Put put = new Put(Bytes.toBytes(row));
        if (column.lastIndexOf(":") != -1) {
            String[] cols = column.split(":"); // column family, column qualifier
            put.addColumn(Bytes.toBytes(cols[0]), Bytes.toBytes(cols[1]), Bytes.toBytes(val));
        } else {
            // Only a column family was given: write to its empty qualifier
            put.addColumn(Bytes.toBytes(column), null, Bytes.toBytes(val));
        }
        table.put(put); // writing to an existing cell overwrites its current value
        System.out.println("Value changed to: " + val);
        table.close();
        close();
    }
}

After running the program, check the new value from the shell.


(5) deleteRow(String tableName, String row)
Delete the record of the row identified by row from table tableName.
Code:

import java.io.IOException;
import java.util.Scanner;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

public class Test5 {
    public static Configuration configuration;
    public static Connection connection;
    public static Admin admin;

    public static void main(String[] args) throws IOException {
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the table name and row key:");
        String tableName = sc.next();
        String rowname = sc.next();
        deleteRow(tableName, rowname);
    }

    // Open the connection
    public static void init() {
        configuration = HBaseConfiguration.create();
        configuration.set("hbase.rootdir", "hdfs://localhost:9000/hbase");
        try {
            connection = ConnectionFactory.createConnection(configuration);
            admin = connection.getAdmin();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Close the connection
    public static void close() {
        try {
            if (admin != null) {
                admin.close();
            }
            if (connection != null) {
                connection.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void deleteRow(String tableName, String row) throws IOException {
        init();
        Table table = connection.getTable(TableName.valueOf(tableName));
        Delete delete = new Delete(Bytes.toBytes(row)); // a Delete on the row key removes the whole row
        table.delete(delete);
        table.close();
        close();
    }
}

After running the program, checking from the shell shows that the row has been deleted.

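Problems Encountered
At first I installed HBase directly without installing ZooKeeper first, which made it very unstable, and HMaster crashed frequently. I also ran into the jar conflict between HBase and Hadoop, as well as some small ZooKeeper and HBase configuration problems. All of them were resolved and are recorded here.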
