HBase - Overview

Original article: hbase_overview

Limitations of Hadoop

  • Hadoop can perform only batch processing, and data is accessed only sequentially. Even the simplest job therefore has to scan the entire dataset.

  • Processing a huge dataset produces another huge dataset, which must also be processed sequentially. What is needed is a way to access any point of data in a single unit of time (random access).

  • Existing solutions that offer random access include HBase, Cassandra, CouchDB, Dynamo, and MongoDB.

Data can be stored in HDFS either directly or through HBase. Data consumers read and access the data in HDFS randomly using HBase. HBase sits on top of the Hadoop file system and provides read and write access.

HBase and HDFS

HDFS | HBase
HDFS is a distributed file system suitable for storing large files. | HBase is a database built on top of HDFS.
HDFS does not support fast individual record lookups. | HBase provides fast lookups in large tables.
It provides high-latency batch processing. | It provides low-latency access to single rows from billions of records (random access).
It provides only sequential access to data. | HBase provides random access, storing its data in indexed HDFS files (HFiles, sorted by row key) for faster lookups.

Installing HBase on a Single Machine

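Download the binary tarball:
http://www.interior-dsgn.com/apache/hbase/stable/hbase-0.98.24-hadoop2-bin.tar.gz
Extract it:
tar -zxvf hbase-0.98.24-hadoop2-bin.tar.gz
Then move the extracted directory to your installation directory.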
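
The archive name passed to tar has to match the release named in the URL, so it can help to derive both from one variable. A minimal sketch, not a definitive installer: HBASE_VERSION, MIRROR and INSTALL_DIR are names assumed here for illustration, wget is just one possible download tool, and the commands are only printed so they can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical variable names; adjust them to your environment.
HBASE_VERSION="0.98.24"
MIRROR="http://www.interior-dsgn.com/apache/hbase/stable"
INSTALL_DIR="/usr/local/hbase"   # assumed target directory

# Both names are derived from the same version string.
TARBALL="hbase-${HBASE_VERSION}-hadoop2-bin.tar.gz"
EXTRACTED="hbase-${HBASE_VERSION}-hadoop2"

# Print the steps instead of executing them.
echo "wget ${MIRROR}/${TARBALL}"
echo "tar -zxvf ${TARBALL}"
echo "mv ${EXTRACTED} ${INSTALL_DIR}"
```

Moving into a system directory such as the assumed /usr/local/hbase may require elevated permissions.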

Configuring HBase Pseudo-Distributed Mode

hbase-env.sh

Set JAVA_HOME, for example:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home

hbase-site.xml

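Create a data/zk folder (in the HBase home directory, for example) to hold the built-in ZooKeeper's files; hbase.zookeeper.property.dataDir must point to wherever you create it.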

<configuration>
	<!-- Path where HBase stores its files. -->
	<property>
		<name>hbase.rootdir</name>
		<value>hdfs://localhost:9000/hbase</value>
	</property>

	<!-- Path where HBase stores the files of its built-in ZooKeeper. -->
	<property>
		<name>hbase.zookeeper.property.dataDir</name>
		<value>/home/hadoop/zookeeper/data/zk</value>
	</property>

	<!-- Run in distributed mode, with each daemon in its own JVM. -->
	<property>
		<name>hbase.cluster.distributed</name>
		<value>true</value>
	</property>
</configuration>

Start HBase

Start Hadoop first, then start HBase:

bin/hbase-daemon.sh start zookeeper
bin/hbase-daemon.sh start master
bin/hbase-daemon.sh start regionserver

Check the running processes:

$ jps
77456 SecondaryNameNode
77363 DataNode
77651 NodeManager
77290 NameNode
77573 ResourceManager
78039 HRegionServer
77891 HQuorumPeer
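
The three hbase-daemon.sh commands normally appear in jps as HQuorumPeer, HMaster and HRegionServer. The listing above has no HMaster line, which may mean the master had exited or simply that the line was lost when the output was copied. A small sketch for spotting this; the function name missing_hbase_daemons is made up for this example, and in practice you would pipe real jps output into it.

```shell
# Reads jps output on stdin and prints the HBase daemons that are absent.
missing_hbase_daemons() {
    jps_output=$(cat)
    missing=""
    for daemon in HQuorumPeer HMaster HRegionServer; do
        # -w matches whole words only, -q suppresses grep's own output.
        if ! printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
            missing="$missing $daemon"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# Sample input: a few lines copied from the listing above.
sample='78039 HRegionServer
77891 HQuorumPeer
77290 NameNode'

result=$(printf '%s\n' "$sample" | missing_hbase_daemons)
echo "missing: ${result:-none}"
```

Against a live system the call would be jps | missing_hbase_daemons; if HMaster is reported missing, the master log under the HBase logs directory is the usual place to look.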

Web UI

http://localhost:60010/
At this point HBase has created its default files in HDFS:
http://localhost:50070/explorer.html#/hbase

$ hadoop dfs -ls /hbase
Found 7 items
drwxr-xr-x   - didi supergroup          0 2018-03-28 20:06 /hbase/.tmp
drwxr-xr-x   - didi supergroup          0 2018-03-28 20:07 /hbase/WALs
drwxr-xr-x   - didi supergroup          0 2018-03-28 20:07 /hbase/corrupt
drwxr-xr-x   - didi supergroup          0 2018-03-19 23:50 /hbase/data
-rw-r--r--   1 didi supergroup         42 2018-03-19 23:50 /hbase/hbase.id
-rw-r--r--   1 didi supergroup          7 2018-03-19 23:50 /hbase/hbase.version
drwxr-xr-x   - didi supergroup          0 2018-03-28 20:17 /hbase/oldWALs

Setting Java Environment

To communicate with HBase through the Java libraries without hitting "class not found" exceptions, the HBase lib directory has to be on the classpath. Add the following to ~/.bashrc:

export CLASSPATH=$CLASSPATH:/home/hadoop/hbase/lib/*
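
Because ~/.bashrc may be sourced more than once in a session, it can help to append the entry only when it is missing. A sketch under the assumption that HBase lives in /home/hadoop/hbase, as in the export line above; HBASE_LIB is a helper variable introduced here.

```shell
# Assumed install location, taken from the export line above.
HBASE_LIB="/home/hadoop/hbase/lib/*"

# Append the HBase jars only if they are not on the classpath yet.
case ":${CLASSPATH}:" in
    *":${HBASE_LIB}:"*) ;;   # already present, nothing to do
    *) CLASSPATH="${CLASSPATH:+${CLASSPATH}:}${HBASE_LIB}" ;;
esac
export CLASSPATH

echo "$CLASSPATH"
```

The quotes keep the trailing * literal; the JVM expands that wildcard itself to every jar in the directory.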

HBASE - SHELL

General Commands

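  • status - Shows the cluster status, such as the number of servers.
  • version - Shows the HBase version in use.
  • table_help - Shows help for the table-related commands.
  • whoami - Shows information about the current user.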

Data Definition Language

These commands operate on tables:

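  • create - Creates a table.
  • list - Lists all tables.
  • disable - Disables a table.
  • is_disabled - Verifies whether a table is disabled.
  • enable - Enables a table.
  • is_enabled - Verifies whether a table is enabled.
  • describe - Provides the description of a table.
  • alter - Alters a table.
  • exists - Verifies whether a table exists.
  • drop - Drops a table.
  • drop_all - Drops the tables whose names match a regular expression.
  • Java Admin API - The DDL operations are also available from Java. The two main classes are HBaseAdmin (in org.apache.hadoop.hbase.client) and HTableDescriptor (in org.apache.hadoop.hbase).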
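
A short sketch of how the shell DDL commands above fit together. The table name emp and the column families personal and professional are made-up examples. The script writes the commands to a file and hands that file to hbase shell only when the hbase launcher is on PATH; otherwise it just prints them.

```shell
#!/bin/sh
# Hypothetical table 'emp' with two column families.
DDL_SCRIPT="${TMPDIR:-/tmp}/hbase_ddl_demo.txt"

cat > "$DDL_SCRIPT" <<'EOF'
create 'emp', 'personal', 'professional'
list
describe 'emp'
exists 'emp'
disable 'emp'
is_disabled 'emp'
alter 'emp', NAME => 'personal', VERSIONS => 5
enable 'emp'
is_enabled 'emp'
disable 'emp'
drop 'emp'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
    hbase shell "$DDL_SCRIPT"    # runs the commands against a live cluster
else
    cat "$DDL_SCRIPT"            # no HBase on PATH: just show the script
fi
```

A table has to be disabled before it can be dropped, which is why disable appears again right before drop. The closing exit makes the shell quit once the file has been processed.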

Data Manipulation Language

  • put - Puts a cell value at a specified column in a specified row of a table.
  • get - Fetches the contents of a row or a single cell.
  • delete - Deletes a cell value in a table.
  • deleteall - Deletes all the cells in a given row.
  • scan - Scans and returns the table data.
  • count - Counts and returns the number of rows in a table.
  • truncate - Disables, drops, and recreates a specified table.
  • Java client API - The DML (CRUD: Create, Retrieve, Update, Delete) operations are also available from Java, mainly through the HTable, Put, and Get classes in org.apache.hadoop.hbase.client.
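
The DML commands can be tried in the same file-driven way. Again a sketch only: the table emp, the row key 1, the personal column family and the cell values are all invented for illustration, and the commands are sent to HBase only if the hbase launcher is found on PATH.

```shell
#!/bin/sh
# Hypothetical table 'emp', row key '1', column family 'personal'.
DML_SCRIPT="${TMPDIR:-/tmp}/hbase_dml_demo.txt"

cat > "$DML_SCRIPT" <<'EOF'
create 'emp', 'personal'
put 'emp', '1', 'personal:name', 'Alice'
put 'emp', '1', 'personal:city', 'Paris'
get 'emp', '1'
get 'emp', '1', {COLUMN => 'personal:name'}
scan 'emp'
count 'emp'
delete 'emp', '1', 'personal:city'
deleteall 'emp', '1'
truncate 'emp'
disable 'emp'
drop 'emp'
exit
EOF

if command -v hbase >/dev/null 2>&1; then
    hbase shell "$DML_SCRIPT"    # runs the commands against a live cluster
else
    cat "$DML_SCRIPT"            # no HBase on PATH: just show the script
fi
```

Columns are addressed as family:qualifier, so personal:name is the name column inside the personal family. The final disable and drop remove the demo table again.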

Starting the shell

./bin/hbase shell

/hbase-0.98.24-hadoop2$ bin/hbase shell
2018-03-28 20:48:09,311 INFO  [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.24-hadoop2, r9c13a1c3d8cf999014f30104d1aa9d79e74ca3d6, Thu Dec 22 02:36:05 UTC 2016

hbase(main):001:0>
hbase(main):002:0> list
TABLE
hbase(main):003:0> status
1 active master, 0 backup masters, 1 servers, 1 dead, 2.0000 average load
hbase(main):004:0> version
0.98.24-hadoop2, r9c13a1c3d8cf999014f30104d1aa9d79e74ca3d6, Thu Dec 22 02:36:05 UTC 2016

To leave the shell, type exit or press Ctrl+C.
