HBase Study Notes 1: Data Manipulation

Data manipulation

Every row in an HBase table has a unique identifier called its rowkey. Other coordinates

are used to locate a piece of data in an HBase table, but the rowkey is primary. Just like

a primary key in a table in a relational database, rowkey values are distinct across all

rows in an HBase table. Every interaction with data in a table begins with the rowkey.

Every user in TwitBase is unique, so the user’s name makes a convenient rowkey for

the users table; that’s what you’ll use.

The HBase API is broken into operations called commands. There are five primitive

commands for interacting with HBase: Get, Put, Delete, Scan, and Increment. The

command used to store data is Put. To store data in a table, you’ll need to create a Put

instance. Creating a Put instance from a rowkey looks like this:

Put p = new Put(Bytes.toBytes("Mark Twain"));

Why can’t you store the user’s name directly? All data in HBase is stored as raw data in

the form of a byte array, and that includes the rowkeys. The Java client library provides

a utility class, Bytes, for converting various Java data types to and from byte[] so you

don’t have to worry about doing it yourself. Note that this Put instance has not been

inserted into the table yet. You’re only building the object right now.
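To see what the conversion amounts to, here is a plain-Java sketch of the string round trip. The helper names are illustrative stand-ins, not HBase's actual implementation, but for strings, Bytes.toBytes is essentially a UTF-8 encoding:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BytesRoundTrip {
    // Mimics Bytes.toBytes(String): UTF-8 encode the string
    static byte[] toBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Mimics Bytes.toString(byte[]): UTF-8 decode the bytes
    static String toString(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] rowkey = toBytes("Mark Twain");
        // HBase compares rowkeys as raw byte arrays, so equality and sort
        // order are defined on the bytes, not on Java String objects.
        System.out.println(toString(rowkey));                              // Mark Twain
        System.out.println(Arrays.equals(rowkey, toBytes("Mark Twain"))); // true
    }
}
```

Because ordering is byte-wise, the encoding you choose for rowkeys directly determines the sort order of rows in the table.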


1. Storing data

Now that you’ve staged a command for adding data to HBase, you still need to provide

data to store. You can start by storing basic information about Mark, such as his email

address and password. What happens if another person comes along whose name is

also Mark Twain? They’ll conflict, and you won’t be able to store data about them in

TwitBase. Instead of using the person’s real name as the rowkey, let’s use a unique

username and store their real name in a column. Putting (no pun intended!) it

all together:

Put p = new Put(Bytes.toBytes("TheRealMT"));
p.add(Bytes.toBytes("info"),
	Bytes.toBytes("name"),
	Bytes.toBytes("Mark Twain"));
p.add(Bytes.toBytes("info"),
	Bytes.toBytes("email"),
	Bytes.toBytes("samuel@clemens.org"));
p.add(Bytes.toBytes("info"),
	Bytes.toBytes("password"),
	Bytes.toBytes("Langhorne"));


Remember, HBase uses coordinates to locate a piece of data within a table. The rowkey

is the first coordinate, followed by the column family. When used as a data coordinate,

the column family serves to group columns. The next coordinate is the column qualifier,

often called simply column, or qual, once you’re versed in HBase vernacular. The column

qualifiers in this example are name, email, and password. Because HBase is schema-less,

you never need to predefine the column qualifiers or assign them types. They’re

dynamic; all you need is a name that you give them at write time. These three coordinates

define the location of a cell. The cell is where HBase stores data as a value. A cell

is identified by its [rowkey, column family, column qualifier] coordinate within a

table. The previous code stores three values in three cells within a single row. The cell

storing Mark’s name has the coordinates [TheRealMT, info, name].

The last step in writing data to HBase is sending the command to the table. That

part is easy:

HTableInterface usersTable = pool.getTable("users");
Put p = new Put(Bytes.toBytes("TheRealMT"));
p.add(...);
usersTable.put(p);
usersTable.close();


2. Modifying data

Changing data in HBase is done the same way you store new data: create a Put object,

give it some data at the appropriate coordinates, and send it to the table. Let’s update

Mark’s password to something more secure.

Put p = new Put(Bytes.toBytes("TheRealMT"));
p.add(Bytes.toBytes("info"),
   Bytes.toBytes("password"),
   Bytes.toBytes("abc123"));
usersTable.put(p);


Under the hood: the HBase write path

Whether you use Put to record a new row in HBase or to modify an existing row, the

internal process is the same. HBase receives the command and persists the change, or

throws an exception if the write fails. When a write is made, by default, it goes into two

places: the write-ahead log (WAL), also referred to as the HLog, and the MemStore

(figure 2.1). The default behavior of HBase recording the write in both places is in

order to maintain data durability. Only after the change is written to and confirmed in

both places is the write considered complete.

The MemStore is a write buffer where HBase accumulates data in memory before a

permanent write. Its contents are flushed to disk to form an HFile when the MemStore

fills up. It doesn’t write to an existing HFile but instead forms a new file on every flush.

The HFile is the underlying storage format for HBase. HFiles belong to a column family,

and a column family can have multiple HFiles. But a single HFile can’t have data for multiple

column families. There is one MemStore per column family.1
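The buffer-then-flush behavior can be sketched in a few lines of plain Java. This is a toy model, not HBase's actual MemStore class: writes accumulate in a sorted in-memory map, and each flush snapshots the buffer into a new immutable "file" rather than appending to an existing one:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ToyMemStore {
    private final int flushThreshold;
    private TreeMap<String, String> buffer = new TreeMap<>();               // sorted, like the MemStore
    private final List<TreeMap<String, String>> hfiles = new ArrayList<>(); // immutable flush snapshots

    public ToyMemStore(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    public void put(String key, String value) {
        buffer.put(key, value);
        if (buffer.size() >= flushThreshold) {
            flush();
        }
    }

    // A flush always produces a NEW file; previously flushed files are never rewritten.
    public void flush() {
        if (buffer.isEmpty()) return;
        hfiles.add(buffer);
        buffer = new TreeMap<>();
    }

    public int fileCount() {
        return hfiles.size();
    }
}
```

Two flushes therefore leave two separate files on disk; only a compaction (described later in this section) merges them back together.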

Failures are common in large distributed systems, and HBase is no exception.

Imagine that the server hosting a MemStore that has not yet been flushed crashes.

You’ll lose the data that was in memory but not yet persisted. HBase safeguards against

that by writing to the WAL before the write completes. Every server that’s part of the

HBase cluster keeps a WAL to record changes as they happen. The WAL is a file on the

underlying file system. A write isn’t considered successful until the new WAL entry is

successfully written. This guarantee makes HBase as durable as the file system backing

it. Most of the time, HBase is backed by the Hadoop Distributed Filesystem (HDFS).

If HBase goes down, the data that was not yet flushed from the MemStore to the

HFile can be recovered by replaying the WAL. You don’t have to do this manually. It’s

all handled under the hood by HBase as a part of the recovery process. There is a single

WAL per HBase server, shared by all tables (and their column families) served from

that server.

As you can imagine, skipping the WAL during writes can help improve write performance.

There’s one less thing to do, right? We don’t recommend disabling the WAL

unless you’re willing to lose data when things fail. In case you want to experiment, you

can disable the WAL like this:

Put p = new Put(Bytes.toBytes("TheRealMT"));
p.setWriteToWAL(false);


NOTE Not writing to the WAL comes at the cost of increased risk of losing data

in case of RegionServer failure. Disable the WAL, and HBase can’t recover your

data in the face of failure. Any writes that haven’t flushed to disk will be lost.

3. Reading data

Reading data back out of HBase is as easy as writing. Make a Get command instance,

tell it what cells you’re interested in, and send it to the table:

Get g = new Get(Bytes.toBytes("TheRealMT"));
Result r = usersTable.get(g);


The table gives you back a Result instance containing your data. This instance contains

all the columns from all the column families that exist for the row. That’s potentially

far more data than you need. You can limit the amount of data returned by

placing restrictions on the Get instance. To retrieve only the password column, execute

addColumn(). The same can be done per column family using addFamily(), in

which case it’ll return all the columns in the specified column family:

Get g = new Get(Bytes.toBytes("TheRealMT"));
g.addColumn(
	Bytes.toBytes("info"),
	Bytes.toBytes("password"));
Result r = usersTable.get(g);

Retrieve the specific value and convert it back from bytes like so:

Get g = new Get(Bytes.toBytes("TheRealMT"));
g.addFamily(Bytes.toBytes("info"));
Result r = usersTable.get(g);
byte[] b = r.getValue(
	Bytes.toBytes("info"),
	Bytes.toBytes("email"));
String email = Bytes.toString(b); // "samuel@clemens.org"


Under the hood: the HBase read path

As a general rule, if you want fast access to data, keep it ordered and keep as much of

it as possible in memory. HBase accomplishes both of these goals, allowing it to serve

millisecond reads in most cases. A read against HBase must be reconciled between the

persisted HFiles and the data still in the MemStore. HBase has an LRU cache for reads.

This cache, also called the BlockCache, sits in the JVM heap alongside the MemStore.

The BlockCache is designed to keep frequently accessed data from the HFiles in memory

so as to avoid disk reads. Each column family has its own BlockCache.
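The LRU behavior of the BlockCache can be sketched with Java's LinkedHashMap in access order. This is a toy stand-in; HBase's real BlockCache is considerably more sophisticated, with multiple priority tiers:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ToyBlockCache extends LinkedHashMap<String, byte[]> {
    private final int maxBlocks;

    public ToyBlockCache(int maxBlocks) {
        // accessOrder=true: iteration order runs from least- to most-recently used
        super(16, 0.75f, true);
        this.maxBlocks = maxBlocks;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
        // Evict the least-recently-used block once the cache exceeds capacity
        return size() > maxBlocks;
    }
}
```

A get() on a cached block marks it recently used, postponing its eviction; blocks that are never touched again age out as new blocks are read from disk.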

Understanding the BlockCache is an important part of understanding how to run

HBase at optimal performance. The “Block” in BlockCache is the unit of data that

HBase reads from disk in a single pass. The HFile is physically laid out as a sequence of

blocks plus an index over those blocks. This means reading a block from HBase requires

only looking up that block’s location in the index and retrieving it from disk. The block

is the smallest indexed unit of data and is the smallest unit of data that can be read from

disk. The block size is configured per column family, and the default value is 64 KB. You

may want to tweak this value larger or smaller depending on your use case. If you primarily

perform random lookups, you likely want a more granular block index, so a

smaller block size is preferred. Having smaller blocks creates a larger index and thereby

consumes more memory. If you frequently perform sequential scans, reading many

blocks at a time, you can afford a larger block size. This allows you to save on memory

because larger blocks mean fewer index entries and thus a smaller index.
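For instance, shrinking the block size for a random-read-heavy column family can be done per family from the HBase shell; the 16 KB value here is purely illustrative, the right number depends on your workload:

```
alter 'users', {NAME => 'info', BLOCKSIZE => '16384'}
```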

Reading a row from HBase requires first checking the MemStore for any pending

modifications. Then the BlockCache is examined to see if the block containing this

row has been recently accessed. Finally, the relevant HFiles on disk are accessed. There

are more things going on under the hood, but this is the overall outline. Figure 2.2

illustrates the read path.
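That lookup order can be sketched as a toy model in plain Java (a deliberate simplification: a real read also merges versions across the MemStore and multiple HFiles rather than stopping at the first hit):

```java
import java.util.HashMap;
import java.util.Map;

public class ToyReadPath {
    final Map<String, String> memstore = new HashMap<>();   // pending, unflushed writes
    final Map<String, String> blockCache = new HashMap<>(); // recently read blocks
    final Map<String, String> hfiles = new HashMap<>();     // data persisted on disk

    // Check the MemStore first, then the BlockCache, and only then go to
    // disk. A disk read populates the cache for subsequent lookups.
    public String get(String key) {
        if (memstore.containsKey(key)) return memstore.get(key);
        if (blockCache.containsKey(key)) return blockCache.get(key);
        String v = hfiles.get(key);
        if (v != null) blockCache.put(key, v);
        return v;
    }
}
```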

Note that HFiles contain a snapshot of the MemStore at the point when it was

flushed. Data for a complete row can be stored across multiple HFiles. In order to read

a complete row, HBase must read across all HFiles that might contain information for

that row in order to compose the complete record.

4. Deleting data

Deleting data from HBase works just like storing it. You make an instance of the

Delete command, constructed with a rowkey:

Delete d = new Delete(Bytes.toBytes("TheRealMT"));
usersTable.delete(d);


You can delete only part of a row by specifying additional coordinates:

Delete d = new Delete(Bytes.toBytes("TheRealMT"));
d.deleteColumns(
	Bytes.toBytes("info"),
	Bytes.toBytes("email"));
usersTable.delete(d);


The method deleteColumns() removes a cell entirely from the row. This is a distinct

method from deleteColumn() (notice the missing s at the end of the method name),

which operates on the content of a cell.

5. Compactions: HBase housekeeping

The Delete command doesn’t delete the value immediately. Instead, it marks the

record for deletion. That is, a new “tombstone” record is written for that value, marking

it as deleted. The tombstone is used to indicate that the deleted value should no

longer be included in Get or Scan results. Because HFiles are immutable, it’s not until

a major compaction runs that these tombstone records are reconciled and space is truly

recovered from deleted records.

Compactions come in two flavors: minor and major. Both types result in a consolidation

of the data persisted in HFiles. A minor compaction folds HFiles together,

creating a larger HFile from multiple smaller HFiles, as shown in figure 2.3. Restricting

the number of HFiles is important for read performance, because all of them

must be referenced to read a complete row. During the compaction, HBase reads the

content of the existing HFiles, writing records into a new one. Then, it swaps in the

new HFile as the current active one and deletes the old ones that formed the new

one.2 HBase decides which HFiles to compact based on their number and relative

sizes. Minor compactions are designed to be minimally detrimental to HBase performance,

so there is an upper limit on the number of HFiles involved. All of these settings

are configurable.

When a compaction operates over all HFiles in a column family in a given region, it’s

called a major compaction. Upon completion of a major compaction, all HFiles in the

column family are merged into a single file. Major compactions can also be triggered

for the entire table (or a particular region) manually from the shell. This is a relatively

expensive operation and isn’t done often. Minor compactions, on the other hand, are

relatively lightweight and happen more frequently. Major compactions are the only

chance HBase has to clean up deleted records. Resolving a delete requires removing

both the deleted record and the deletion marker. There’s no guarantee that both the

record and marker are in the same HFile. A major compaction is the only time when

HBase is guaranteed to have access to both of these entries at the same time.
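The reconciliation a major compaction performs can be sketched as a merge over all of a family's files, newest first, that drops a deleted cell together with its tombstone. This is a toy model in plain Java; real compactions also handle versions, TTLs, and more:

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ToyMajorCompaction {
    // Each "HFile" maps cell coordinates to a value; TOMBSTONE marks a delete.
    static final String TOMBSTONE = "__TOMBSTONE__";

    // Merge files from newest to oldest into one, dropping deleted cells
    // AND their markers, so neither is rewritten into the new file.
    static Map<String, String> compact(List<Map<String, String>> newestFirst) {
        Map<String, String> merged = new LinkedHashMap<>();
        Set<String> deleted = new HashSet<>();
        for (Map<String, String> hfile : newestFirst) {
            for (Map.Entry<String, String> cell : hfile.entrySet()) {
                if (deleted.contains(cell.getKey()) || merged.containsKey(cell.getKey())) {
                    continue; // a newer file already decided this cell's fate
                }
                if (TOMBSTONE.equals(cell.getValue())) {
                    deleted.add(cell.getKey()); // marker is consumed, not rewritten
                } else {
                    merged.put(cell.getKey(), cell.getValue());
                }
            }
        }
        return merged;
    }
}
```

Because the record and its tombstone may sit in different files, only a compaction that sees every file in the family, that is, a major one, can safely discard both.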

The compaction process is described in greater detail, along with incremental

illustrations, in a post on the NGDATA blog.3

6. Versioned data

In addition to being a schema-less database, HBase is also versioned. For example, you

can look back in time for the original password:

Get g = new Get(Bytes.toBytes("TheRealMT"));
g.setMaxVersions();
Result r = usersTable.get(g);
List<KeyValue> passwords = r.getColumn(
	Bytes.toBytes("info"),
	Bytes.toBytes("password"));
byte[] b = passwords.get(0).getValue();
String currentPasswd = Bytes.toString(b); // "abc123"
b = passwords.get(1).getValue();
String prevPasswd = Bytes.toString(b); // "Langhorne"

Every time you perform an operation on a cell, HBase implicitly stores a new version.

Creating, modifying, and deleting a cell are all treated identically; they’re all new versions.

Get requests reconcile which version to return based on provided parameters.

The version is used as the final coordinate when accessing a specific cell value. HBase

uses the current time4 in milliseconds when a version isn’t specified, so the version

number is represented as a long. By default, HBase stores only the last three versions;

this is configurable per column family. Each version of data within a cell contributes

one KeyValue instance to the Result. You can inspect the version information in a

KeyValue instance with its getTimestamp() method:

long version = passwords.get(0).getTimestamp(); // 1329088818321

When a cell exceeds the maximum number of versions, the extra records are dropped

during the next major compaction.

Instead of deleting an entire cell, you can operate on a specific version or versions

within that cell. The deleteColumns() method (with the s) described previously operates

on all KeyValues with a version less than the provided version. If no version is provided,

the default of now is used. The deleteColumn() method (without the s) deletes

a specific version of a cell. Be careful which method you call; they have identical calling

signatures and only subtly different semantics.
