HBase Study Notes 1: Summary


残阙的歌


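Copyright notice: This is an original article by the blog author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/u010666884/article/details/51702077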


Summary

In case you missed something along the way, here is a quick overview of the material covered in this chapter.

HBase is a database designed for semistructured data and horizontal scalability. It stores data in tables. Within a table, data is organized over a four-dimensional coordinate system: rowkey, column family, column qualifier, and version. HBase is schema-less, requiring only that column families be defined ahead of time. It’s also type-less, storing all data as uninterpreted arrays of bytes. There are five basic commands for interacting with data in HBase: Get, Put, Delete, Scan, and Increment. The only way to query HBase based on non-rowkey values is by a filtered scan.
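Because HBase stores every value as an uninterpreted byte array, the client decides how to encode and decode each cell. The following is a minimal sketch of that idea using only the Java standard library. The `ByteCodec` class and its method names are hypothetical and are not part of the HBase API; HBase ships its own `Bytes` utility class for conversions of this kind.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the HBase API: the store only sees
// bytes, so the client must remember which encoding it used per column.
class ByteCodec {
    static byte[] fromString(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String toStringValue(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    // Encodes a long as 8 big-endian bytes.
    static byte[] fromLong(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    static long toLong(byte[] b) {
        return ByteBuffer.wrap(b).getLong();
    }

    public static void main(String[] args) {
        byte[] name = fromString("TheRealMT");
        byte[] count = fromLong(42L);
        System.out.println(toStringValue(name));
        System.out.println(toLong(count));
        System.out.println(Arrays.toString(count));
    }
}
```

Decoding a value with the wrong method does not fail at the storage layer; it simply yields a wrong result, which is why the encoding of each column has to be agreed on by every client that reads it.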

HBase is not an ACID-compliant database

HBase isn’t an ACID-compliant database. But HBase provides some guarantees that you can use to reason about the behavior of your application’s interaction with the system. These guarantees are as follows:

1. Operations are row-level atomic. In other words, any Put() on a given row either succeeds in its entirety or fails and leaves the row the way it was before the operation started. There will never be a case where part of the row is written and some part is left out. This property holds regardless of the number of column families across which the operation is performed.

2. Interrow operations are not atomic. There are no guarantees that all operations will complete or fail together in their entirety. All the individual operations are atomic as listed in the previous point.

3. checkAnd* and increment* operations are atomic.

4. Multiple write operations to a given row are always independent of each other in their entirety. This is an extension of the first point.

5. Any Get() operation on a given row returns the complete row as it exists at that point in time in the system.

6. A scan across a table is not a scan over a snapshot of the table at any point. If a row R is mutated after the scan has started but before R is read by the scanner object, the updated version of R is read by the scanner. But the data read by the scanner is consistent and contains the complete row at the time it’s read.

From the context of building applications with HBase, these are the important points you need to be aware of.
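To make the row-level guarantees more concrete, here is a toy in-memory model written with the Java standard library only. It is a sketch of the semantics, not of how HBase implements them: `ToyRowStore` and all of its methods are hypothetical names, columns are plain strings, and passing `null` as the expected value is assumed to mean "the column must be absent".

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of row-level atomicity; names are illustrative, not HBase API.
class ToyRowStore {
    // rowkey -> immutable snapshot of the row (column -> value).
    private final ConcurrentHashMap<String, Map<String, String>> rows =
            new ConcurrentHashMap<>();

    // All cells of one put become visible together: the row snapshot is
    // swapped in a single atomic step, so readers never see half a write.
    void put(String rowkey, Map<String, String> cells) {
        rows.compute(rowkey, (key, oldRow) -> mergedCopy(oldRow, cells));
    }

    // The check and the write happen inside one atomic step on the row.
    boolean checkAndPut(String rowkey, String column, String expected,
                        Map<String, String> cells) {
        boolean[] applied = {false};
        rows.compute(rowkey, (key, oldRow) -> {
            String current = (oldRow == null) ? null : oldRow.get(column);
            if (!Objects.equals(current, expected)) {
                return oldRow; // check failed: leave the row untouched
            }
            applied[0] = true;
            return mergedCopy(oldRow, cells);
        });
        return applied[0];
    }

    // Returns the complete row as it exists at the time of the call.
    Map<String, String> get(String rowkey) {
        Map<String, String> row = rows.get(rowkey);
        return row == null ? Collections.<String, String>emptyMap() : row;
    }

    private static Map<String, String> mergedCopy(Map<String, String> oldRow,
                                                  Map<String, String> cells) {
        Map<String, String> merged =
                (oldRow == null) ? new HashMap<>() : new HashMap<>(oldRow);
        merged.putAll(cells);
        return Collections.unmodifiableMap(merged);
    }
}
```

Note what the model does not offer: there is no method that updates two rows in one atomic step, which mirrors the second guarantee in the list.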

The data model is logically organized as either a key-value store or as a sorted map of maps. The physical data model is column-oriented along column families, and individual records are stored in a key-value style. HBase persists data records into HFiles, an immutable file format. Because records can’t be modified once written, new values are persisted to new HFiles. The data view is reconciled on the fly at read time and during compactions.
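The "sorted map of maps" view can be sketched with nested `TreeMap` instances from the Java standard library. This is an illustration of the logical model only, under the assumption that strings stand in for byte arrays; `ToyTable` and its methods are hypothetical names and say nothing about how HBase lays data out in HFiles.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.NavigableSet;
import java.util.TreeMap;

// Logical model only: rowkey -> family -> qualifier -> version -> value.
class ToyTable {
    private final NavigableMap<String,
            NavigableMap<String,
                    NavigableMap<String,
                            NavigableMap<Long, String>>>> rows = new TreeMap<>();

    void put(String row, String family, String qualifier,
             long version, String value) {
        rows.computeIfAbsent(row, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            // Versions are kept newest first, so the first entry is current.
            .computeIfAbsent(qualifier,
                    k -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(version, value);
    }

    // Returns the value with the highest version, or null if the cell is absent.
    String getLatest(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families =
                rows.get(row);
        if (families == null) return null;
        NavigableMap<String, NavigableMap<Long, String>> qualifiers = families.get(family);
        if (qualifiers == null) return null;
        NavigableMap<Long, String> versions = qualifiers.get(qualifier);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    // Rowkeys come back in sorted order, which is what makes range scans cheap.
    NavigableSet<String> rowKeys() {
        return rows.navigableKeySet();
    }
}
```

Writing a new value never overwrites an older one here; it adds another version, and the read picks the newest. That loosely echoes the idea that the current view of a cell is reconciled at read time.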

The HBase Java client API exposes tables via the HTableInterface. Table connections can be established by constructing an HTable instance directly. Instantiating an HTable instance is expensive, so the preferred method is via the HTablePool because it manages connection reuse. Tables are created and manipulated via instances of the HBaseAdmin, HTableDescriptor, and HColumnDescriptor classes. All five commands are exposed via their respective command objects: Get, Put, Delete, Scan, and Increment. Commands are sent to the HTableInterface instance for execution. A variant of Increment is also available using the HTableInterface.incrementColumnValue() method. The results of executing Get, Scan, and Increment commands are returned in instances of Result and ResultScanner objects. Each record returned is represented by a KeyValue instance. All of these operations are also available on the command line via the HBase shell.

Schema designs in HBase are heavily influenced by anticipated data-access patterns. Ideally, the tables in your schema are organized according to these patterns. The rowkey is the only fully indexed coordinate in HBase, so queries are often implemented as rowkey scans. Compound rowkeys are a common practice in support of these scans. An even distribution of rowkey values is often desirable. Hashing algorithms such as MD5 or SHA1 are commonly used to achieve even distribution.
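One way to combine the two ideas above, hashing for even distribution and compound keys for scans, is to prefix the rowkey with a fixed-width hash of an identifier and append a second component. The sketch below uses only the Java standard library; the `RowKeys` class, the choice of a user ID plus a timestamp, and the key layout are assumptions made for illustration, not a layout prescribed by the text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical rowkey helpers; the layout is one possible design.
class RowKeys {
    static final int MD5_LENGTH = 16;

    // MD5 always yields 16 bytes, so hashed prefixes have a fixed width.
    static byte[] md5(String s) {
        try {
            return MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Compound key: 16-byte MD5(userId) followed by an 8-byte timestamp.
    // All keys for one user share the same prefix, so they sort together.
    static byte[] compoundKey(String userId, long timestampMillis) {
        return ByteBuffer.allocate(MD5_LENGTH + Long.BYTES)
                .put(md5(userId))
                .putLong(timestampMillis)
                .array();
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(compoundKey("TheRealMT", 1329088818321L)));
    }
}
```

The trade-off is that hashing destroys the natural ordering of the identifier: rows for one user stay adjacent, but a range scan across users in alphabetical order is no longer possible.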
