HBase Concept

- Data Model: a sparse, distributed, persistent, multidimensional sorted map

(row:string, column:string, time:int64) -> string //both key and value are uninterpreted bytes
  • Row 
  1. A read or update of a single row is atomic. 
  2. Design the row key so that rows accessed together by range are stored close together. Row keys sort lexicographically as bytes, so integers written as decimal strings are ordered 1, 10, 100, 2, ...
  • Column family: must be declared up front. It is the basic unit of access control. Columns in the same family usually hold the same type of data. Disk (compression and indexing) and memory accounting are done per column family.
  • Timestamp (64-bit int): a cell can hold multiple versions. Versions are ordered by decreasing timestamp, so the newest comes first. Old versions can be garbage collected by keeping only the last n versions or only new-enough versions (e.g., values written in the last several days).
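
A minimal sketch of the data model above, as a toy in-memory map. The class and method names (ToyTable, put, get, row_keys) are made up for illustration and are not the HBase API. It shows versions kept newest first, garbage collection by "last n versions", and the byte-wise ordering of row keys.

```python
class ToyTable:
    """Toy model of (row, column, timestamp) -> value. Not the HBase API."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.cells = {}  # (row, column) -> [(timestamp, value), ...], newest first

    def put(self, row, column, timestamp, value):
        versions = self.cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        # Keep versions in decreasing timestamp order.
        versions.sort(key=lambda tv: tv[0], reverse=True)
        # Garbage-collect everything beyond the last n versions.
        del versions[self.max_versions:]

    def get(self, row, column, n=1):
        """Return up to n values of the cell, newest first."""
        return [value for _, value in self.cells.get((row, column), [])[:n]]

    def row_keys(self):
        """Row keys in the order a scan would visit them (byte-wise sorted)."""
        return sorted({row for row, _ in self.cells})


table = ToyTable(max_versions=2)
for ts, value in [(1, b"old"), (2, b"mid"), (3, b"new")]:
    table.put(b"r", b"cf:q", ts, value)
kept = table.get(b"r", b"cf:q", n=5)

# Integers written as decimal strings do not sort numerically.
plain = sorted(str(n).encode() for n in (1, 2, 10, 100))
# Zero-padding to a fixed width is one common way to restore numeric order.
padded = sorted(f"{n:03d}".encode() for n in (1, 2, 10, 100))
```

Zero-padding is only one option for the row key; the point is that whatever encoding is chosen must sort byte-wise in the order the rows will be scanned.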

- API, Get, Put, Scan, Delete

HBase doesn't modify data in place. A delete is handled by writing a tombstone. Tombstones, along with the dead values they mask, are cleaned up during major compaction.
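
The tombstone idea can be sketched with an append-only list. This is a simplified illustration, not HBase code: the names (TOMBSTONE, put, delete, read, major_compact) are hypothetical, and it assumes one live value per key, with the newest entry winning.

```python
TOMBSTONE = object()  # hypothetical delete marker


def put(log, key, ts, value):
    log.append((key, ts, value))  # append only, never overwrite


def delete(log, key, ts):
    log.append((key, ts, TOMBSTONE))  # a delete is just another append


def read(log, key):
    """Newest entry wins; a tombstone hides every older value."""
    entries = [(ts, value) for k, ts, value in log if k == key]
    if not entries:
        return None
    _, value = max(entries, key=lambda e: e[0])
    return None if value is TOMBSTONE else value


def major_compact(log):
    """Rewrite the log, dropping tombstones and the values they mask."""
    newest = {}
    for key, ts, value in log:
        if key not in newest or ts > newest[key][0]:
            newest[key] = (ts, value)
    live = [(k, ts, v) for k, (ts, v) in newest.items() if v is not TOMBSTONE]
    return sorted(live, key=lambda e: e[0])


log = []
put(log, b"r1", 1, b"a")
put(log, b"r2", 1, b"b")
delete(log, b"r1", 2)
compacted = major_compact(log)
```

Until the compaction runs, the deleted value for r1 is still physically present in the log; it is only hidden from reads.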

Rows can be retrieved in three ways:

  1. by a single row key
  2. by a row key range
  3. by a scan (full table)


- Building blocks


  • Region: a table is made up of regions, each covering a contiguous row range. Regions can be distributed and load-balanced across machines (region servers). 
  • HFile: a persistent, ordered, immutable map from keys to values. The block index is loaded into memory for lookups; the data blocks it points to are then loaded into memory as needed. 


  1. Data blocks and meta blocks can be compressed with gzip or LZO
  2. FileInfo: HFile meta information and user-defined information
  3. Data block index: each entry's key is the key of the first record in that data block
  4. Trailer: stored at a fixed offset in the HFile. It is loaded first when the HFile is accessed.
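
A minimal sketch of how a data block index can be used, assuming the index is simply a sorted list holding the first key of each block. The function names and sample blocks are hypothetical; a real HFile stores offsets and sizes rather than Python lists.

```python
from bisect import bisect_right

# Three "data blocks", each a sorted run of (key, value) records.
blocks = [
    [(b"apple", b"1"), (b"banana", b"2")],
    [(b"cherry", b"3"), (b"grape", b"4")],
    [(b"melon", b"5"), (b"pear", b"6")],
]

# The in-memory index: the first key of every data block.
index_keys = [block[0][0] for block in blocks]


def find_block(index_keys, key):
    """Return the number of the only block that can hold key, or None."""
    i = bisect_right(index_keys, key) - 1  # last block whose first key <= key
    return i if i >= 0 else None


def lookup(key):
    n = find_block(index_keys, key)
    if n is None:
        return None  # key sorts before the first block
    # Only this one block has to be read (and decompressed) from disk.
    return dict(blocks[n]).get(key)
```

Because only the first key of each block is indexed, the index stays small enough to hold in memory while each lookup still touches a single block.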
  • HLog: the write-ahead log. Each region server has one HLog file, shared by all of its regions. It is a sequence file of HLogKey -> KeyValue records. 

HLogKey: table + region + sequence number + timestamp

KeyValue: the same KeyValue structure that HFile stores

cons: when a region server fails, its log must be split by region and the pieces sent to the region servers that take over those regions.

pros: writes are appended to a single log file, which avoids seeking between multiple files.
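
The log split during recovery can be sketched as a group-by on the region part of the HLogKey. This is an illustration under assumptions: the namedtuple fields mirror the HLogKey description above, while split_log and the sample entries are hypothetical.

```python
from collections import defaultdict, namedtuple

# Fields follow the HLogKey description: table + region + sequence number + timestamp.
HLogKey = namedtuple("HLogKey", "table region seq timestamp")


def split_log(entries):
    """Group one shared log into per-region edit lists, ordered by sequence number."""
    per_region = defaultdict(list)
    for key, kv in entries:
        per_region[(key.table, key.region)].append((key, kv))
    for edits in per_region.values():
        edits.sort(key=lambda e: e[0].seq)
    return dict(per_region)


# Edits from two regions interleaved in the single log of one region server.
shared_log = [
    (HLogKey("t1", "regionA", 1, 100), ("row1", "v1")),
    (HLogKey("t1", "regionB", 2, 101), ("row9", "v2")),
    (HLogKey("t1", "regionA", 3, 102), ("row2", "v3")),
]
splits = split_log(shared_log)
```

Each per-region list can then be handed to whichever region server takes over that region and replayed there.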


- Architecture view




  • client: accesses HBase through the API and caches some information, such as region locations
  • zookeeper
  1. Only one master holds the master-service lock at a time. All other masters keep trying to acquire it in case the active master fails.
  2. Holds a file pointing to the root region of the META table
  3. Tracks the status of all region servers and notifies the master when a server comes up or goes down
  4. Holds HBase schema information: tables and column families
  • Master
  1. Assigns regions to region servers when HBase starts up and when a table is created. On restart, existing regions are assigned back to the region server that hosted them before shutdown, to preserve data locality.
  2. Load-balances region servers
  3. Reassigns regions to other region servers after being notified that a region server is down
  4. Handles schema updates
  5. Reclaims HDFS files
  6. Assigns the ROOT and META tables to region servers at startup
  • region server
  1. Serves the regions assigned to it and handles client I/O for them
  2. Splits regions that grow too large

- workflow

  • locate region (a three-level structure similar to a B+ tree)

  1. The root region is never split
  2. The META table contains the locations of regions. Row key = table name + start key of the region
  3. All records of the META table are kept in memory on the region server hosting that META region
  4. Up to (128MB/1KB) * (128MB/1KB) = 2^17 * 2^17 = 2^34 regions can be addressed, taking 128MB per META region and about 1KB per META row
  5. Up to 6 round trips between client and server are needed to recover from an invalid region cache in the client
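
The capacity arithmetic and the META lookup can be checked with a short sketch. Assumptions: 128 MB META regions and 1 KB rows as in the list above, and the META row key modeled as a (table, start key) tuple instead of a concatenated byte string. The locate function and the server names are hypothetical.

```python
from bisect import bisect_right

META_REGION_BYTES = 128 * 2**20  # 128 MB per META region
META_ROW_BYTES = 2**10           # about 1 KB per META row

rows_per_meta_region = META_REGION_BYTES // META_ROW_BYTES  # 2**17
# Root region -> META regions -> user regions: two levels of fan-out.
max_regions = rows_per_meta_region ** 2

# Sorted META rows: (table name, region start key) -> hosting region server.
meta_rows = [
    ((b"t1", b""), "rs1"),   # t1, rows from the start of the table up to b"m"
    ((b"t1", b"m"), "rs2"),  # t1, rows from b"m" onward
    ((b"t2", b""), "rs3"),   # t2, single region
]


def locate(meta_rows, table, row_key):
    """Find the region whose start key is the largest one <= row_key."""
    keys = [key for key, _ in meta_rows]
    i = bisect_right(keys, (table, row_key)) - 1
    if i < 0 or keys[i][0] != table:
        return None  # no region of this table covers the row
    return meta_rows[i][1]
```

A client would cache the result of such a lookup and only repeat it when the cached location turns out to be stale.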
  • read/write
  1. write: goes first to the WAL and the memstore. The memstore is flushed to a store file once it exceeds a size threshold. A redo point is recorded in zookeeper for recovery.
  2. read: returns a merged view of the store files and the memstore. Minor and major compactions merge store files (a major compaction down to a single store file) to improve read performance. 
  • region assignment: done by the Master, which knows the region servers, which regions each one hosts, and which regions are unassigned.
  • region server up/down
  1. up: the region server creates a file under zookeeper's server folder and locks it. The Master is notified by zookeeper.
  2. down: the lock on the file is released in zookeeper, and the Master can then acquire it. The Master notices this either by checking the lock state of all files under the server folder or by failing to reach the region server after several tries. After that, the Master reassigns the regions to other region servers.
  • master up/down
  1. up: the master tries to get the master-service lock to become the primary master; gets the list of all servers from zookeeper; gets all assigned regions from the region servers; reads the META table and works out which regions are unassigned.
  2. down: tables cannot be created or deleted, the schema cannot be updated, and regions cannot be assigned or merged
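
The read/write path in the workflow above can be sketched as a tiny log-structured store. This is a toy under stated assumptions: the class ToyStore is hypothetical, the flush threshold counts entries instead of bytes, and the redo point and minor compactions are left out.

```python
class ToyStore:
    """Toy write path: WAL + memstore, flushed to immutable store files."""

    def __init__(self, flush_threshold=2):
        self.flush_threshold = flush_threshold  # entries, standing in for bytes
        self.wal = []         # append-only write-ahead log
        self.memstore = {}    # recent writes, held in memory
        self.storefiles = []  # immutable sorted files, oldest first

    def put(self, key, value):
        self.wal.append((key, value))  # 1. log the edit
        self.memstore[key] = value     # 2. apply it in memory
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memstore out as one new sorted, immutable store file.
        self.storefiles.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # Merged view: memstore first, then store files from newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for storefile in reversed(self.storefiles):
            for k, v in storefile:
                if k == key:
                    return v
        return None

    def major_compact(self):
        # Merge all store files into one; newer values overwrite older ones.
        merged = {}
        for storefile in self.storefiles:
            merged.update(storefile)
        self.storefiles = [tuple(sorted(merged.items()))]


store = ToyStore(flush_threshold=2)
store.put(b"a", b"1")
store.put(b"b", b"2")  # reaches the threshold, first flush
store.put(b"a", b"3")
store.put(b"c", b"4")  # second flush
files_before = len(store.storefiles)
store.major_compact()
```

Before the compaction a read of b"a" may have to look through two store files; afterwards there is only one, which is the read-performance gain the notes refer to.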


references: http://mvplee.iteye.com/blog/2247221; the Bigtable paper
