Why HBase stores every value as a byte array (byte[])

20150414

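HBase stores data as byte arrays to keep storage flexible and efficient. Values of different types can be stored without declaring their types up front, which simplifies inserts. When a data type or a field changes, you only need to convert the value with Bytes.toBytes(). Byte-array storage can also save space in some cases, and the saving matters more as the data volume grows.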

Summary generated automatically by CSDN.
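To make the "any type, no declared schema" point concrete, here is a minimal sketch using only the Java standard library. CellCodec is a hypothetical helper written for this post, not an HBase class; it assumes the layout that HBase's Bytes.toBytes overloads are generally understood to use (UTF-8 for strings, 8-byte big-endian for long and double). The store only ever sees byte[], so the reader has to know which decoder to apply.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for org.apache.hadoop.hbase.util.Bytes (assumption, not the real class).
final class CellCodec {
    // Strings become their UTF-8 bytes.
    static byte[] encode(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    // Longs become 8 bytes; ByteBuffer is big-endian by default.
    static byte[] encode(long v) { return ByteBuffer.allocate(Long.BYTES).putLong(v).array(); }
    // Doubles become their 8-byte IEEE 754 bit pattern.
    static byte[] encode(double v) { return ByteBuffer.allocate(Double.BYTES).putDouble(v).array(); }

    static String decodeString(byte[] b) { return new String(b, StandardCharsets.UTF_8); }
    static long decodeLong(byte[] b) { return ByteBuffer.wrap(b).getLong(); }
    static double decodeDouble(byte[] b) { return ByteBuffer.wrap(b).getDouble(); }

    public static void main(String[] args) {
        // One product "row": every cell is an opaque byte[], whatever type it started as.
        Map<String, byte[]> row = new LinkedHashMap<>();
        row.put("id", encode(1001L));
        row.put("make", encode("Toyota"));
        row.put("country", encode("Japan"));
        row.put("price", encode(19999.5));

        // Adding a field of a new type later needs no schema change, just another byte[].
        row.put("in_stock", encode(42L));

        // The reader chooses the decoder; nothing in the stored bytes records the type.
        System.out.println(decodeLong(row.get("id")));
        System.out.println(decodeString(row.get("make")));
        System.out.println(decodeDouble(row.get("price")));
    }
}
```

The flip side of this design is that type knowledge lives in the application: decoding a cell with the wrong decoder does not fail loudly, it just yields a wrong value.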

The original thread is linked here; the explanation above is translated from its answer:

https://stackoverflow.com/questions/11834544/why-hbase-stores-value-as-byte-array-instead-of-typed-value-like-string-long

If you cannot open the link, here is the original answer:

it allows us to store any kind of data without much fuss. For example, imagine you have to store a product related data into your hbase table, say ID, make, country, price etc. To store each of these parameters you would have to take care of the individual datatypes of each of these parameters in advance which will definitely add some overhead. And unlike RDBMSs, hbase doesn't ask for all this at the time of table creation. So, even if datatypes of these parameters change tomorrow or you decide to add some parameters(with some new datatype), all you have to do is wrap the value in Bytes.ToBytes() and push it into your table. All this makes insertions faster. Also, sometimes storing a value in a serialized byte[] form saves a few bytes as compared to storing the same values in their native format. And this minor saving becomes quite significant when you deal with BigData. Long story short, Hbase does this to make things faster and to make storage more efficient, keeping the overhead of internal data structures to a minimum

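In other words:

It lets us store any kind of data with very little fuss. Suppose you need to store product data in an HBase table: ID, make, country, price, and so on. With typed columns you would have to settle the data type of each field in advance, which adds overhead. Unlike an RDBMS, HBase asks for none of this when the table is created. So even if the types of these fields change tomorrow, or you add fields with new types, all you have to do is wrap the value in Bytes.toBytes() and put it into the table. All of this makes inserts faster. In addition, storing a value in serialized byte[] form sometimes saves a few bytes compared with storing the same value in its native format, and that small saving becomes significant at big-data scale. In short, HBase does this to make writes faster and storage more efficient, keeping the overhead of internal data structures to a minimum.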
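The "saves a few bytes" claim depends on the value, which is why the answer says "sometimes". The sketch below compares a fixed 8-byte binary long with the same number written as decimal text. EncodingSize is a hypothetical class for illustration only, and the 8-byte layout is an assumption about how a long is typically serialized (it matches what Bytes.toBytes(long) is generally understood to produce).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for comparing encoded sizes; not part of HBase.
final class EncodingSize {
    // Size of the fixed-width binary form: always 8 bytes for a long.
    static int binarySize(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array().length;
    }

    // Size of the decimal text form: one byte per digit (plus one for a minus sign).
    static int textSize(long v) {
        return Long.toString(v).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // A small number, a 13-digit value (the size of a millisecond timestamp), and the largest long.
        long[] samples = {7L, 1_428_969_600_000L, Long.MAX_VALUE};
        for (long v : samples) {
            System.out.printf("%d: binary=%d bytes, text=%d bytes%n", v, binarySize(v), textSize(v));
        }
        // Text sizes are 1, 13, and 19 bytes; binary is 8 bytes every time.
    }
}
```

For large numbers the binary form is smaller (8 bytes versus 13 or 19), while for small ones the text form wins, so the saving is real only when the value distribution favors it. Multiplied across billions of cells, a few bytes per value in either direction adds up.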