HBase: The HBase Data Model

1. Preface

These notes are mainly for study and review; they focus on understanding the HBase data model.

Main source: the official HBase documentation.

2. Basic Description

In HBase, data is stored in tables, which have rows and columns. This is a terminology overlap with relational databases (RDBMSs), but this is not a helpful analogy. Instead, it can be helpful to think of an HBase table as a multi-dimensional map.

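In other words, HBase stores data in tables made up of rows and columns. The vocabulary is shared with relational databases, but the comparison does not help much; it is more useful to picture an HBase table as a multi-dimensional map.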
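The "multi-dimensional map" view can be sketched with nested Python dictionaries. This is only an illustration of the idea, not the HBase client API: the table name `webtable`, the row keys, the values, and the `lookup` helper are all made-up assumptions.

```python
# Hypothetical sample data, nested as:
# row key -> column family -> column qualifier -> timestamp -> value
webtable = {
    "com.cnn.www": {
        "contents": {"html": {5: "<html>old</html>", 6: "<html>new</html>"}},
        "anchor": {"cnnsi.com": {9: "CNN"}},
    },
    "com.example.www": {
        "contents": {"html": {5: "<html>example</html>"}},
        "anchor": {},  # this row stores nothing in the anchor family
    },
}


def lookup(table, row_key, family, qualifier, timestamp):
    """Walk the four dimensions in order and return the stored value."""
    return table[row_key][family][qualifier][timestamp]


value = lookup(webtable, "com.cnn.www", "contents", "html", 6)

# A plain dict is unordered by key; sorting here only imitates the fact
# that HBase keeps rows ordered by row key.
row_keys_in_storage_order = sorted(webtable)
```

Each level of nesting corresponds to one of the terms defined in the sections that follow: row, column family, column qualifier, and timestamp.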

3. Table

An HBase table consists of multiple rows.
That is, a table is simply a collection of rows.

4. Row

A row in HBase consists of a row key and one or more columns with values associated with them. Rows are sorted alphabetically by the row key as they are stored. For this reason, the design of the row key is very important. The goal is to store data in such a way that related rows are near each other. A common row key pattern is a website domain. If your row keys are domains, you should probably store them in reverse (org.apache.www, org.apache.mail, org.apache.jira). This way, all of the Apache domains are near each other in the table, rather than being spread out based on the first letter of the subdomain.

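Put another way: a row in HBase is a row key plus one or more columns with their values, and rows are stored in row-key order. That makes row key design critical, since the aim is to keep related rows next to each other. Website domains are a common row key; storing them reversed (org.apache.www, org.apache.mail, org.apache.jira) keeps all Apache domains adjacent in the table instead of scattering them by the first letter of the subdomain.

Key point: a row holds one row key and multiple column values, and rows are stored sorted by row key, so row key design matters a great deal.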
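The effect of reversing domains can be checked with a small sketch. The helper `reverse_domain` and the example.com domains are assumptions added for illustration. Python's string sort stands in for HBase's row-key ordering here, which is a fair approximation only because the keys are plain ASCII.

```python
def reverse_domain(domain):
    """Turn 'www.apache.org' into 'org.apache.www'."""
    return ".".join(reversed(domain.split(".")))


# Hypothetical mix of domains from two organizations.
domains = [
    "www.apache.org",
    "mail.apache.org",
    "jira.apache.org",
    "www.example.com",
    "mail.example.com",
]

# Sorted as-is, rows group by subdomain, so the two sites interleave.
plain_keys = sorted(domains)

# Sorted after reversal, all rows of one site sit next to each other.
reversed_keys = sorted(reverse_domain(d) for d in domains)

for key in reversed_keys:
    print(key)
```

With the plain keys, mail.apache.org and mail.example.com end up adjacent; with the reversed keys, the three org.apache.* rows form one contiguous block.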

5. Column

A column in HBase consists of a column family and a column qualifier, which are delimited by a : (colon) character.
That is, a column in HBase is a column family plus a column qualifier, separated by a colon (for example cf:a).
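As a small illustration, a full column name can be split back into its two parts. The helper `split_column` is hypothetical, and splitting at the first colon is an assumption of this sketch (it treats any later colon as part of the qualifier).

```python
def split_column(column):
    """Split 'family:qualifier' at the first colon.

    Assumption of this sketch: the family name contains no colon,
    so everything after the first colon belongs to the qualifier.
    """
    family, separator, qualifier = column.partition(":")
    if not separator:
        raise ValueError("expected 'family:qualifier', got %r" % column)
    return family, qualifier


family, qualifier = split_column("cf:a")
```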

6. Column Family

Column families physically colocate a set of columns and their values, often for performance reasons. Each column family has a set of storage properties, such as whether its values should be cached in memory, how its data is compressed or its row keys are encoded, and others. Each row in a table has the same column families, though a given row might not store anything in a given column family.

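Put another way: a column family physically stores a set of columns and their values together. Each column family has its own storage properties, such as whether its values are cached in memory, how its data is compressed, how its row keys are encoded, and so on. Every row in a table has the same column families, but a row does not have to store anything in a given column family.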

7. Column Qualifier

A column qualifier is added to a column family to provide the index for a given piece of data. Given a column family content, a column qualifier might be content:html, and another might be content:pdf. Though column families are fixed at table creation, column qualifiers are mutable and may differ greatly between rows.

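That is, a column qualifier is added to a column family to index a particular piece of data. With a column family named content, one qualifier might give content:html and another content:pdf. Column families are fixed when the table is created, whereas column qualifiers are mutable and can differ greatly from row to row.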
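The contrast between fixed column families and free-form qualifiers can be sketched as follows. `SketchTable` is a toy in-memory class invented for this note, not the HBase client API, and raising `KeyError` for an unknown family is just this sketch's choice.

```python
class SketchTable:
    """Toy table: families are fixed up front, qualifiers are not."""

    def __init__(self, families):
        self.families = frozenset(families)  # fixed when the table is created
        self.rows = {}  # row key -> family -> qualifier -> value

    def put(self, row_key, family, qualifier, value):
        if family not in self.families:
            raise KeyError("unknown column family: %s" % family)
        # Any qualifier is accepted; it is created on first write.
        self.rows.setdefault(row_key, {}).setdefault(family, {})[qualifier] = value

    def qualifiers(self, row_key, family):
        """Return the qualifiers a row actually stores in one family."""
        return sorted(self.rows.get(row_key, {}).get(family, {}))


table = SketchTable(["content"])
table.put("row1", "content", "html", "<html/>")
table.put("row2", "content", "pdf", "%PDF")
```

Here row1 and row2 share the content family but store different qualifiers, while a write to a family that was never declared is rejected.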

8. Cell

A cell is a combination of row, column family, and column qualifier, and contains a value and a timestamp, which represents the value’s version.

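That is, a cell is identified by the combination of row, column family, and column qualifier, and it holds a value together with a timestamp that represents the value's version.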

9. Timestamp

A timestamp is written alongside each value, and is the identifier for a given version of a value. By default, the timestamp represents the time on the RegionServer when the data was written, but you can specify a different timestamp value when you put data into the cell.

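That is, a timestamp is written alongside each value and identifies a particular version of that value. By default it is the time on the RegionServer when the data was written, but you can specify a different timestamp when putting a value into the cell.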
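A rough sketch of how timestamps act as versions is shown below. `VersionedCell` is a hypothetical stand-in for a single cell; it uses the local clock in milliseconds where HBase would use the RegionServer's time, and it ignores limits on how many versions are retained.

```python
import time


class VersionedCell:
    """Toy cell that keeps every written value under its timestamp."""

    def __init__(self):
        self.versions = {}  # timestamp in milliseconds -> value

    def put(self, value, timestamp=None):
        # Default: stamp the write with the current time, standing in
        # for the RegionServer clock. Callers may pass their own value.
        if timestamp is None:
            timestamp = int(time.time() * 1000)
        self.versions[timestamp] = value
        return timestamp

    def get(self, max_timestamp=None):
        """Return (timestamp, value) of the newest version, optionally
        limited to versions no newer than max_timestamp."""
        candidates = [
            ts for ts in self.versions
            if max_timestamp is None or ts <= max_timestamp
        ]
        if not candidates:
            return None
        newest = max(candidates)
        return newest, self.versions[newest]


cell = VersionedCell()
cell.put("v1", timestamp=100)  # explicit timestamps chosen by the caller
cell.put("v2", timestamp=200)
latest = cell.get()
as_of_150 = cell.get(max_timestamp=150)
```

Reading without a limit returns the newest version, while reading "as of" an earlier timestamp returns the older value that was current at that point.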

10. Diagram

[Figure: diagram of the HBase data model (image not included)]

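11. Summary

1. In HBase, data is stored in tables; a table consists of rows and columns.

2. A row consists of a row key and the values of a number of columns; a get by row key returns the values of the matching row (rows are kept sorted by row key automatically).

3. A column consists of a column family and a column qualifier separated by a colon; the qualifier is added to the column family to index a specific piece of data.

The above is purely my own understanding; if you spot any problems, please contact me!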
