HBase, Part 1

Latest recommended article published on 2024-08-15 14:09:45

iteye_8075

Category: HBase. Tags: big data, database, shell

Article link: https://blog.csdn.net/iteye_8075/article/details/82235410

Welcome to Apache HBase!

HBase is the Hadoop database. Think of it as a distributed scalable Big Data store.

In other words, it is Hadoop's database, similar to Google's Bigtable.

When Would I Use HBase?

Use HBase when you need random, realtime read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows X millions of columns -- atop clusters of commodity hardware. HBase is an open-source, distributed, versioned, column-oriented store modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
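To make "versioned, column-oriented" a little more concrete, here is a minimal toy model in Python. It is not the HBase API: the class ToyTable, its methods, and the row and column names are all hypothetical, and everything lives in one process's memory. It only sketches the Bigtable-style idea of a sorted map from row key to column to timestamped versions.

```python
import time
from collections import defaultdict


class ToyTable:
    """Toy in-memory model (hypothetical): row key -> column -> {timestamp: value}."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, timestamp=None):
        # Each write is stored as a new version keyed by its timestamp.
        ts = timestamp if timestamp is not None else time.time_ns()
        versions = self.rows[row][column]
        versions[ts] = value
        # Keep only the newest max_versions entries for this cell.
        for old_ts in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[old_ts]

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        return versions[max(candidates)] if candidates else None

    def scan(self, start_row, stop_row):
        """Yield (row, column, latest value) for start_row <= row < stop_row, in key order."""
        for row in sorted(self.rows):
            if start_row <= row < stop_row:
                for column in sorted(self.rows[row]):
                    yield row, column, self.get(row, column)


table = ToyTable(max_versions=2)
table.put("user#001", "info:name", "Ann", timestamp=1)
table.put("user#001", "info:name", "Anna", timestamp=2)
table.put("user#002", "info:name", "Bob", timestamp=1)
print(table.get("user#001", "info:name"))       # latest version
print(list(table.scan("user#001", "user#002")))  # stop row is exclusive
```

Rows only store the columns that were actually written, which is why a table with millions of possible columns can stay sparse.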

Features

HBase provides:

Linear and modular scalability.
Strictly consistent reads and writes.
Automatic and configurable sharding of tables.
Automatic failover support between RegionServers.
Convenient base classes for backing Hadoop MapReduce jobs with HBase tables.
Easy to use Java API for client access.
Block cache and Bloom Filters for real-time queries.
Query predicate push down via server side Filters.
Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options.
Extensible JRuby-based (JIRB) shell.
Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX.
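As a rough illustration of the sharding item in the list above: HBase splits a table into regions, each covering a contiguous range of row keys. The Python sketch below is a toy, not HBase code. The class ToyRegionMap and the split keys are hypothetical; it only shows how a sorted list of region start keys lets a client find the region for a row key with a binary search.

```python
import bisect


class ToyRegionMap:
    """Toy model (hypothetical): a table split into row-key ranges."""

    def __init__(self, split_keys):
        # The first region starts at the empty key; region i covers
        # [start_keys[i], start_keys[i + 1]), and the last one is open-ended.
        self.start_keys = [""] + sorted(split_keys)

    def region_for(self, row_key):
        """Return the index of the region whose key range contains row_key."""
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def range_of(self, region):
        """Return (start_key, end_key); end_key is None for the last region."""
        start = self.start_keys[region]
        last = region + 1 == len(self.start_keys)
        return start, None if last else self.start_keys[region + 1]


regions = ToyRegionMap(split_keys=["g", "p"])
for key in ["apple", "g", "zebra"]:
    idx = regions.region_for(key)
    print(key, "-> region", idx, regions.range_of(idx))
```

Because the ranges are sorted and non-overlapping, a split only adds one more start key, and lookups stay logarithmic in the number of regions.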

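When Should I Use HBase? (This is definitely the key point. After a long search, it looks like I have finally found a framework that fits my application. Nice: fast record lookups (and updates).)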

First, make sure you have enough data. HBase isn't suitable for every problem. If you have hundreds of millions or billions of rows, then HBase is a good candidate. If you only have a few thousand/million rows, then using a traditional RDBMS might be a better choice due to the fact that all of your data might wind up on a single node (or two) and the rest of the cluster may be sitting idle.

Second, make sure you have enough hardware. Even HDFS doesn't do well with anything less than 5 DataNodes (due to things such as HDFS block replication which has a default of 3), plus a NameNode.
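A back-of-the-envelope sketch of why block replication matters on small clusters. The function names and the disk sizes are hypothetical, and the arithmetic ignores real-world overhead such as reserved disk space; it only assumes that each block is stored `replication` times and that each replica sits on a different DataNode.

```python
def usable_capacity_tb(datanodes, disk_tb_per_node, replication=3):
    """Raw capacity divided by the replication factor (rough estimate only)."""
    return datanodes * disk_tb_per_node / replication


def can_restore_replication(datanodes, failed, replication=3):
    """True if enough live DataNodes remain to hold every replica on a distinct node."""
    return datanodes - failed >= replication


print(usable_capacity_tb(datanodes=5, disk_tb_per_node=6))   # 10.0
print(can_restore_replication(datanodes=3, failed=1))        # False
print(can_restore_replication(datanodes=5, failed=1))        # True
```

With only 3 DataNodes, a single failure leaves every block under-replicated until the node returns; with 5 there is room to re-replicate.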

HBase can run quite well stand-alone on a laptop - but this should be considered a development configuration only.

What Is The Difference Between HBase and Hadoop/HDFS?

HDFS is a distributed file system that is well suited for the storage of large files. Its documentation states that it is not, however, a general purpose file system, and does not provide fast individual record lookups in files. HBase, on the other hand, is built on top of HDFS and provides fast record lookups (and updates) for large tables. This can sometimes be a point of conceptual confusion. HBase internally puts your data in indexed "StoreFiles" that exist on HDFS for high-speed lookups. See Chapter 5, Data Model, and the rest of this chapter for more information on how HBase achieves its goals.
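The idea of an indexed file can be sketched in a few lines of Python. This is a simplified toy, not the real StoreFile format: the class ToyStoreFile, the block size, and the sample keys are all hypothetical. It assumes records are sorted by key and grouped into blocks, with a small index holding the first key of each block, so a lookup binary-searches the index and then reads a single block instead of scanning the whole file.

```python
import bisect


class ToyStoreFile:
    """Toy model (hypothetical): sorted key/value records grouped into indexed blocks."""

    def __init__(self, sorted_items, block_size=2):
        # sorted_items must already be sorted by key.
        self.blocks = [
            sorted_items[i:i + block_size]
            for i in range(0, len(sorted_items), block_size)
        ]
        # The index stores only the first key of each block.
        self.index = [block[0][0] for block in self.blocks]

    def get(self, key):
        """Binary-search the index, then scan just the one candidate block."""
        i = bisect.bisect_right(self.index, key) - 1
        if i < 0:
            return None  # key sorts before the first record
        for k, v in self.blocks[i]:
            if k == key:
                return v
        return None


store = ToyStoreFile([("a", 1), ("c", 2), ("e", 3), ("g", 4), ("i", 5)])
print(store.get("g"))  # 4
print(store.get("b"))  # None
```

A plain HDFS file offers no such index, which is why a record lookup there would mean reading through the file.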
