Using HBase
一、ZooKeeper
1、Installing ZooKeeper
(1) Upload and extract the archive;
(2) Edit zookeeper/conf/zoo.cfg and check or adjust the dataDir=/tmp/zookeeper entry (the data directory);
(3) In zoo.cfg, set:
clientPort=2181
dataDir=/usr/zookeeper-3.5.3-beta/datadir
syncLimit=5
tickTime=2000
initLimit=10
(4) At the end of the file, add a line such as server.1=192.168.80.12:2888:3888
The exact format is:
server(literal text).<the myid value you set in step (6)>=<node IP>:<ZooKeeper internal communication port>:<ZooKeeper leader-election port>
(5) Create the dataDir directory, e.g. mkdir -p /tmp/zookeeper (or whatever directory you chose);
(6) In that dataDir directory, create a file named myid; its content is a single number, different on each node;
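Putting steps (2) through (6) together, a minimal three-node zoo.cfg could look like the following sketch; the IPs for the second and third servers and the data directory are example values to substitute with your own:

```properties
# zookeeper/conf/zoo.cfg -- example values, adjust paths and IPs for your cluster
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/tmp/zookeeper
clientPort=2181
server.1=192.168.80.12:2888:3888
server.2=192.168.80.13:2888:3888
server.3=192.168.80.14:2888:3888
```

With this file, the myid file on the node 192.168.80.12 would contain 1, on 192.168.80.13 it would contain 2, and so on.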
2、Starting ZooKeeper
(1) In zookeeper/bin, run ./zkServer.sh start to start the server (ZooKeeper must be started separately on every machine);
(2) Run ./zkServer.sh status to check the server's state;
(3) Run ./zkCli.sh to start the client;
(4) Run help to see the available commands;
(5) Common commands: create set get delete ls quit
With ls/get, the -w flag sets a watcher, i.e. you are notified when the watched node changes
create -s creates a sequential node (an increasing sequence number is appended to its name)
-e creates an ephemeral node (deleted when the creating client disconnects)
with no flag, create makes an ordinary persistent node
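For example, a short zkCli.sh session exercising these flags (the /app paths are only illustrations; the flag syntax is as in the ZooKeeper 3.5+ CLI):

```
create /app "cfg"           # no flag: persistent node
create -e /app/worker "w1"  # ephemeral: removed when this client disconnects
create -s /app/task- "t"    # sequential: actual name becomes e.g. /app/task-0000000000
ls -w /app                  # list children and watch for child changes
get -w /app                 # read data and watch for data changes
delete /app/worker
```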
3、Connecting to ZooKeeper from Java
ZooKeeper zooKeeper = new ZooKeeper("192.168.80.12:2181", 2000, null); // connect string, session timeout (ms), default watcher
Nodes are then manipulated through the ZooKeeper API (create, setData, getData, getChildren).
Note: create also takes an ACL (permissions) argument; Ids.OPEN_ACL_UNSAFE is the usual choice.
A watcher can also be registered:
new Watcher() {
    public void process(WatchedEvent watchedEvent) {
        // EventType.NodeChildrenChanged is the event type being watched,
        // i.e. which action triggers the callback (other types cover setData, delete, etc.)
        // KeeperState.SyncConnected is the client's current connection state
        if (watchedEvent.getType() == Event.EventType.NodeChildrenChanged &&
                watchedEvent.getState() == Event.KeeperState.SyncConnected) {
            try {
                getNode();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }
}
二、HBase
1、Installation
(1) Upload and extract
(2) Edit hbase/conf/hbase-env.sh:
set JAVA_HOME and HBASE_MANAGES_ZK
(3) Edit hbase/conf/hbase-site.xml and configure:
hbase.tmp.dir
hbase.rootdir
hbase.cluster.distributed
hbase.zookeeper.quorum
hbase.zookeeper.property.dataDir
Note: if a zoo.cfg file is on HBase's classpath, its settings override the corresponding properties in hbase-site.xml
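For reference, a fully-distributed hbase-site.xml setting the properties above might look like this sketch; the HDFS address, directories, and quorum hosts are example values:

```xml
<configuration>
  <!-- local temporary directory used by HBase -->
  <property><name>hbase.tmp.dir</name><value>/usr/hbase/tmp</value></property>
  <!-- where HBase stores its data, normally on HDFS -->
  <property><name>hbase.rootdir</name><value>hdfs://master:9000/hbase</value></property>
  <!-- true for a cluster, false for standalone mode -->
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <!-- the ZooKeeper ensemble HBase connects to -->
  <property><name>hbase.zookeeper.quorum</name><value>192.168.80.12,192.168.80.13</value></property>
  <!-- ZooKeeper data directory (matches dataDir in zoo.cfg) -->
  <property><name>hbase.zookeeper.property.dataDir</name><value>/tmp/zookeeper</value></property>
</configuration>
```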
(4) Edit hbase/conf/regionservers:
list the hostnames or IP addresses of all cluster nodes
(5) In hbase/bin, run ./start-hbase.sh to start the cluster services (open ip:16010 in a browser for the web UI, which lists the existing tables)
(6) Run ./hbase shell to enter the shell
2、Using HBase
(1) Create a namespace: create_namespace 'namespace' (view with list_namespace, delete with drop_namespace)
(2) Create a table: create 'namespace:table', 'cf1', 'cf2' (truncate empties a table; to delete it, disable it first, then drop)
(3) Check whether a table exists: exists 'namespace:table'
(4) Add column families: alter 'namespace:table', 'cf1', 'cf2'
(5) Disable/enable a table: disable/enable 'namespace:table'
(6) Check whether a table is disabled/enabled: is_disabled/is_enabled 'namespace:table'
(7) List tables: list
(8) Insert data: put 'namespace:table', 'rowkey', 'cf:column', 'value'
(9) Read data: get 'namespace:table', 'rowkey', 'cf:column'
get 'namespace:table', 'rowkey' returns the whole row
(10) Count rows: count 'namespace:table'
(11) Delete data: delete 'namespace:table', 'rowkey', 'cf:column'
(12) Update data: put 'namespace:table', 'rowkey', 'cf:column', 'new value' (a put on an existing cell overwrites it)
(13) Scan a table: scan 'namespace:table'
(14) Scan a whole column family: scan 'namespace:table', {COLUMNS => 'cf'}
(15) Scan a single column: scan 'namespace:table', {COLUMNS => 'cf:column'}
(16) Describe a table: describe/desc 'namespace:table'
(17) Use help for details on any command
More details: https://www.cnblogs.com/ityouknow/p/7344001.html
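The commands above can be tied together in a short hbase shell session; ns1, t1, and cf are example names:

```
create_namespace 'ns1'
create 'ns1:t1', 'cf'
put 'ns1:t1', 'row1', 'cf:name', 'zhangsan'
get 'ns1:t1', 'row1'
scan 'ns1:t1', {COLUMNS => 'cf:name'}
disable 'ns1:t1'
drop 'ns1:t1'
drop_namespace 'ns1'
```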
3、HBase programming in Java
(1) CRUD example:
public class MyHbase {
    Connection connection;

    /**
     * Get a connection
     * @throws IOException
     */
    @Before
    public void createConnection() throws IOException {
        Configuration configuration = HBaseConfiguration.create();
        // Required setting: tells the client which ZooKeeper quorum to connect to.
        // The quorum hostnames must also resolve on the machine running this program
        // (add them to its hosts file), otherwise the connection fails!
        configuration.set("hbase.zookeeper.quorum", "192.168.80.12,192.168.80.13");
        connection = ConnectionFactory.createConnection(configuration);
    }

    /**
     * Create a table and define its schema
     * @throws IOException
     */
    public void createTables() throws IOException {
        Admin admin = connection.getAdmin();
        ColumnFamilyDescriptor build = ColumnFamilyDescriptorBuilder
                .newBuilder("family_name".getBytes())
                .setBlocksize(131072)
                .setMinVersions(1)
                .setMaxVersions(3)
                .build();
        TableDescriptor my_table = TableDescriptorBuilder
                .newBuilder(TableName.valueOf("my_table"))
                .setColumnFamily(build)
                .build();
        admin.createTable(my_table);
        admin.close();
        connection.close();
    }

    /**
     * Create a namespace
     * @throws IOException
     */
    public void createNameSpace() throws IOException {
        Admin admin = connection.getAdmin();
        NamespaceDescriptor namespaceDescriptor = NamespaceDescriptor.create("namespace_name").build();
        admin.createNamespace(namespaceDescriptor);
        admin.close();
        connection.close();
    }

    /**
     * Write data
     * @throws IOException
     */
    public void putData() throws IOException {
        Table my_table = connection.getTable(TableName.valueOf("my_table"));
        Put put = new Put("zk001".getBytes()); // row key
        put.addColumn("family_name".getBytes(), "col_name".getBytes(), Bytes.toBytes(24)); // value: the int 24
        my_table.put(put);
        my_table.close();
        connection.close();
    }

    /**
     * Read data
     * @throws IOException
     */
    public void getData() throws IOException {
        Table my_table = connection.getTable(TableName.valueOf("my_table"));
        /* Alternative, more verbose approach:
        Get get = new Get("zk1".getBytes());
        Result result = my_table.get(get);
        CellScanner cellScanner = result.cellScanner();
        while (cellScanner.advance()) {
            Cell current = cellScanner.current();
            byte[] rowArray = current.getRowArray();
            byte[] familyArray = current.getFamilyArray();
            byte[] qualifierArray = current.getQualifierArray();
            System.out.println(Bytes.toString(rowArray, current.getRowOffset(), current.getRowLength()));
            System.out.println(Bytes.toString(familyArray, current.getFamilyOffset(), current.getFamilyLength()));
            System.out.println(Bytes.toString(qualifierArray, current.getQualifierOffset(), current.getQualifierLength()));
        } */
        Scan scan = new Scan();
        scan.withStartRow("zk001".getBytes()).withStopRow("zk005".getBytes(), true); // stop row inclusive
        ResultScanner scanner = my_table.getScanner(scan);
        // ResultScanner scanner = my_table.getScanner("name".getBytes()); // scans all rows of column family "name"
        Iterator<Result> iterator = scanner.iterator();
        while (iterator.hasNext()) {
            Result next = iterator.next();
            byte[] row = next.getRow();
            byte[] value = next.value();
            System.out.print(Bytes.toString(row));
            System.out.println(Bytes.toString(value));
        }
        scanner.close();
        my_table.close();
    }
}
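One pitfall in putData above: Bytes.toBytes(24) stores the int as its 4 raw big-endian bytes, so printing the cell back with Bytes.toString yields unprintable control characters rather than "24"; an int cell must be decoded with Bytes.toInt. The sketch below mimics that encoding using only the JDK (java.nio.ByteBuffer), to show what actually lands in the cell; IntCellEncoding is a made-up illustration class, not part of the HBase API:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class IntCellEncoding {
    // Same layout as HBase's Bytes.toBytes(int): 4 bytes, big-endian
    static byte[] toBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Same as HBase's Bytes.toInt(byte[])
    static int toInt(byte[] b) {
        return ByteBuffer.wrap(b).getInt();
    }

    public static void main(String[] args) {
        byte[] cell = toBytes(24);
        System.out.println(Arrays.toString(cell)); // [0, 0, 0, 24]
        System.out.println(toInt(cell));           // 24
        // Decoding those raw bytes as a string does NOT give "24":
        System.out.println(new String(cell).trim().isEmpty()); // true: control characters only
    }
}
```

So when writing Bytes.toBytes("24") (a string) and Bytes.toBytes(24) (an int), the stored bytes differ, and reads must use the matching decoder.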
(2) Bulk import into HBase
1) The main method
public static void main(String[] args) throws InterruptedException, IOException, ClassNotFoundException {
    Configuration configuration1 = HBaseConfiguration.create();
    configuration1.set("hbase.zookeeper.quorum", "192.168.80.12");
    Job job = Job.getInstance(configuration1);
    job.setJarByClass(MyHadoop.class);
    job.setMapperClass(MySecondMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    FileInputFormat.setInputPaths(job, new Path("hdfs://master:9000/INFO/Input"));
    FileOutputFormat.setOutputPath(job, new Path("hdfs://master:9000/INFO/Output"));
    Connection connection = ConnectionFactory.createConnection(configuration1);
    Table my_table = connection.getTable(TableName.valueOf("my_table"));
    HFileOutputFormat2.configureIncrementalLoad(job, my_table,
            connection.getRegionLocator(TableName.valueOf("my_table")));
    boolean b1 = job.waitForCompletion(true);
    // Use bulkload to import the generated HFiles
    LoadIncrementalHFiles load = new LoadIncrementalHFiles(configuration1);
    load.doBulkLoad(new Path("hdfs://master:9000/INFO/Output"), connection.getAdmin(),
            my_table, connection.getRegionLocator(TableName.valueOf("my_table")));
    // exit
    System.exit(b1 ? 0 : 1);
}
2) The map method
protected void map(Text rowkey, IntWritable value, Context context) throws IOException, InterruptedException {
    // Row key of the target HBase row, taken from the mapper's input key
    ImmutableBytesWritable immutableBytesWritable = new ImmutableBytesWritable(rowkey.toString().getBytes());
    byte[] families = Bytes.toBytes("family");  // a column family that already exists in the table
    byte[] keys = Bytes.toBytes("count");       // column qualifier
    byte[] values = Bytes.toBytes(value.get()); // value
    Put put = new Put(rowkey.toString().getBytes());
    put.addColumn(families, keys, values);
    // put.addImmutable(families, keys, values); // same effect (deprecated in newer versions)
    context.write(immutableBytesWritable, put);
}
3) The batch method
public class MyHbase {
    public static void main(String[] args) throws IOException, InterruptedException {
        System.setProperty("hadoop.home.dir", "J:\\hadoop-2.7.7");
        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.quorum", "hadoop1,hadoop2");
        Connection connection = ConnectionFactory.createConnection(configuration);
        putBatchData(connection);
        connection.close();
    }

    public static void putBatchData(Connection connection) throws IOException, InterruptedException {
        Table my_java_table = connection.getTable(TableName.valueOf("my_java_table"));
        ArrayList<Row> rows = new ArrayList<Row>();
        Put put1 = new Put("rowkey1".getBytes());
        put1.addColumn("myfamily".getBytes(), "mycolumn12".getBytes(), "hahahahahahaha".getBytes());
        rows.add(put1);
        Object[] res = new Object[rows.size()]; // holds the per-operation results
        my_java_table.batch(rows, res);
        my_java_table.close();
    }
}