hbase
macyang
Chance is waiting for prepared people and my Status is read the fucking source code.
How-to: Use HBase Bulk Loading, and Why
.8.3. Bulk Load Architecture. The HBase bulk load process consists of two main steps. 9.8.3.1. Preparing data via a MapReduce job. The first step of a bulk load is to generate
Reposted 2014-07-23 15:08:36 · 1578 views · 0 comments
HADOOP-HBase MapReduce Examples
1. HBase MapReduce Read Example. The following is an example of using HBase as a MapReduce source in a read-only manner. Specifically, there is a Mapper instance but no Reducer, and nothing is being emi
Reposted 2013-01-19 20:34:00 · 2760 views · 0 comments
Apache HBase Internals: Locking and Multiversion Concurrency Control
The following post was originally published via blog.apache.org; we republish it here for your convenience. NOTE: This blog post describes how Apache HBase does concurrency control. This assume
Reposted 2013-02-03 22:05:34 · 895 views · 0 comments
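Querying Geographic Information Systems on HBase (HBase in Action)
In this chapter we move into a new domain for HBase: Geographic Information Systems (GIS). GIS is an interesting area of study because it poses two significant challenges: latency when processing data at scale, and modeling spatial locality. We use GIS as a lens to demonstrate how to adapt HBase to these challenges. To do so, you will need to draw on some domain-specific knowledge. 8.1 Working with geographic data. Geographic systems are often used as the foundation of an online, interactive user experience
Reposted 2013-02-02 19:27:35 · 6002 views · 0 comments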
Snapshots in HBase 0.96
Snapshots in HBase 0.96 – v0.1 (5/20/12). The initial design document and implementation for snapshots was proposed in HBASE-50. The original, overall design is still valid on the current HBase trunk,
Original 2012-11-25 19:42:50 · 2066 views · 0 comments
HBase Replication Overview
HBase Replication is a way of copying data from one HBase cluster to a different and possibly distant HBase cluster. It works on the principle that the transactions from the originating cluster are pu
Reposted 2012-11-25 18:31:26 · 955 views · 1 comment
HBase, HDFS and durable sync
HBase and HDFS go hand in hand to provide HBase's durability and consistency guarantees. One way of looking at this setup is that HDFS handles the distribution and storage of your data whereas HBas
Reposted 2012-11-13 13:29:03 · 831 views · 0 comments
Online HBase Backups with CopyTable
CopyTable is a simple Apache HBase utility that, unsurprisingly, can be used for copying individual tables within an HBase cluster or from one HBase cluster to another. In this blog post, we’ll talk a
Reposted 2012-11-25 18:36:51 · 1341 views · 0 comments
Stabilizing a Large HBase Cluster
Running a large HBase cluster smoothly with minimum downtime is a skill which requires a deep understanding of how HBase works. When a disaster strikes, you find yourself digging into HBase code and
Reposted 2012-10-21 21:41:22 · 687 views · 0 comments
HBaseAdmin
This post introduces HBaseAdmin, which turns out to be far more powerful than I expected. It supports the following kinds of operations: Just as with the client API you also have an API for administrative tasks at your disposal. Compare this to the Data Definition Language (DDL)
Original 2011-08-06 00:45:51 · 4417 views · 0 comments
HBase Client: Delete Method
Single Deletes. The variant of the delete() call that takes a single Delete instance is: void delete(Delete delete) throws IOException. This is the HBase interface for delete operations: construct a Delete instance and pass it to delete().
Original 2011-07-27 22:57:59 · 1813 views · 0 comments
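Another Way to Copy HBase Data Across Clusters
1. Copy the HBase table from the source HBase cluster to a local directory. It is best to stop HBase first; otherwise some data may be lost.
[hbase@hadoop200 ~]$ hadoop fs -get /hbase/toplist_ware_total_1009_201232 toplist_ware_total_1009_201232
Compress:
[hbase@hadoop200 ~]$ tar zcvf to
Reposted 2012-08-21 20:55:41 · 3599 views · 1 comment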
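A Summary of HBase Performance Optimization Methods
This article summarizes several commonly used performance optimization methods from the perspective of HBase application design and development. It says little about optimization at the HBase system configuration level; for that, see the blog of Ken Wu at Taobao. [Reposter's note: when using multiple threads to read a full HBase table, first divide the rowkey range into segments according to the number of threads, then hand each segment's start-key ~ end-key to a thread to execute.] 1. Table design 1.1 Pre-Cre
Reposted 2013-01-19 23:17:22 · 3877 views · 1 comment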
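The reposter's note in this excerpt suggests dividing the rowkey range into one segment per reader thread. A minimal Python sketch of that idea follows. It assumes fixed-width binary rowkeys; split_key_range, scan_segment, and the in-memory rows list are hypothetical stand-ins for illustration, not HBase APIs. A real client would issue one scan per (start-key, end-key) pair instead.

```python
from concurrent.futures import ThreadPoolExecutor


def split_key_range(start, end, n):
    """Split [start, end) into at most n contiguous (start_key, end_key) pairs."""
    if n < 1:
        raise ValueError("n must be at least 1")
    width = max(len(start), len(end))
    # Right-pad with zero bytes so both keys can be treated as same-width integers.
    lo = int.from_bytes(start.ljust(width, b"\x00"), "big")
    hi = int.from_bytes(end.ljust(width, b"\x00"), "big")
    if lo >= hi:
        raise ValueError("start must sort before end")
    # Evenly spaced boundaries, converted back to big-endian keys.
    keys = [(lo + (hi - lo) * i // n).to_bytes(width, "big") for i in range(n + 1)]
    # Drop empty segments, which appear when the range is narrower than n.
    return [(a, b) for a, b in zip(keys, keys[1:]) if a < b]


def scan_segment(rows, start_key, end_key):
    # Hypothetical stand-in for a real scan over [start_key, end_key).
    return [r for r in rows if start_key <= r < end_key]


# Fake table: 32 sorted two-byte rowkeys held in memory (an assumption for the demo).
rows = [bytes([0, i]) for i in range(0, 256, 8)]
segments = split_key_range(b"\x00\x00", b"\x01\x00", 4)

# One worker per segment; pool.map returns results in segment order.
with ThreadPoolExecutor(max_workers=len(segments)) as pool:
    parts = list(pool.map(lambda seg: scan_segment(rows, *seg), segments))
```

Because the segments are contiguous and half-open, every row falls into exactly one segment, so concatenating the per-thread results in segment order reproduces the full table in key order.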
A utility for importing/exporting between hbase and csv file
Besides the org.apache.hadoop.hbase.mapreduce.TsvImporterMapper that ships with HBase, the implementation below is basic but quite good, although it is single-threaded. HBase is a NoSQL distributed database system that runs on a cluster of machines on top of Hadoop. HBase and
Reposted 2013-01-20 19:32:45 · 6568 views · 1 comment
Guide to Using Apache HBase Ports
For those people new to Apache HBase (version 0.90 and later), the configuration of network ports used by the system can be a little overwhelming. In this blog post, you will learn all the TCP port
Reposted 2013-07-21 22:37:28 · 945 views · 0 comments
How Scaling Really Works in Apache HBase
At first glance, the Apache HBase architecture appears to follow a master/slave model where the master receives all the requests but the real work is done by the slaves. This is not actually the case,
Reposted 2013-07-21 22:36:50 · 721 views · 0 comments
HBase - Who needs a Master?
At first glance, the Apache HBase architecture appears to follow a master/slave model where the master receives all the requests but the real work is done by the slaves. This is not actually the cas
Reposted 2013-06-14 16:10:55 · 845 views · 0 comments
Migration to the New Metrics Hotness – Metrics2
Introduction. HBase is a distributed big data store modeled after Google’s Bigtable paper. As with all distributed systems, knowing what’s happening at a given time can help spot problems before
Reposted 2013-06-14 16:55:12 · 818 views · 0 comments
Introduction to HBase Mean Time to Recover (MTTR)
HBase is an always-available service and remains available in the face of machine failures and rack failures. Machines in the cluster run RegionServer daemons. When a RegionServer crashes or the mach
Reposted 2013-06-14 16:41:12 · 1077 views · 0 comments
HBase Coprocessors
The original version of the blog was posted at http://hbaseblog.com/2010/11/30/hbase-coprocessors/ in late 2010; however, the site is no longer available. Since we decided to move all blog posts to th
Original 2011-05-16 22:51:00 · 3203 views · 0 comments
Hbase Table Isolation for Multi-tenancy using Region Server Grouping
Motivation: HBase table isolation is required for scenarios that involve multiple users sharing a common HBase instance. We want to isolate the impact of usage (like read, write patterns) and maintena
Reposted 2013-02-19 22:11:37 · 973 views · 0 comments
HBase Futures @Hortonworks
As we have said here, Hortonworks has been steadily increasing our investment in HBase. HBase’s adoption has been increasing in the enterprise. To continue this trend, we feel HBase needs investment
Reposted 2013-02-17 10:31:01 · 1010 views · 0 comments
Apache HBase Region Splitting and Merging
For this post, we take a technical deep-dive into one of the core areas of HBase. Specifically, we will look at how Apache HBase distributes load through regions, and manages region splitting. HBase s
Reposted 2013-02-17 10:18:45 · 1490 views · 0 comments
Apache HBase AssignmentManager Improvements
AssignmentManager is a module in the Apache HBase Master that manages the assignment of regions to RegionServers. (See HBase architecture for more information.) It ensures that all regions are assigned and
Reposted 2013-02-16 16:03:57 · 803 views · 0 comments
How-to: Enable User Authentication and Authorization in Apache HBase
With the default Apache HBase configuration, everyone is allowed to read from and write to all tables available in the system. For many enterprise setups, this kind of policy is unacceptable. Ad
Reposted 2013-02-16 21:28:19 · 1337 views · 0 comments
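HBase Performance Tuning in the Dump System of Taobao's Main Search
HBase is now used to store the full and incremental data of Taobao's main search, which has effectively reduced the load on the database and made the business easier to scale. The Dump system must process a large volume of data in a short time and has strict latency requirements. While carrying out this project we accumulated some optimization practices, which we share here for reference. Environment: Hadoop CDH3U4 + HBase 0.92.1. 1. Use LZO wherever possible. Using LZO for data not only saves storage space but, above all, improves transfer
Reposted 2012-08-21 07:25:50 · 711 views · 0 comments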
Hbase Node Management
Node Decommission. You can stop an individual RegionServer by running the following script in the HBase directory on the particular node:
$ ./bin/hbase-daemon.sh stop regionserver
The RegionServer w
Reposted 2012-07-28 12:00:11 · 722 views · 0 comments
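A Summary of HBase Performance Optimization Methods
This article summarizes several commonly used performance optimization methods from the perspective of HBase application design and development. It says little about optimization at the HBase system configuration level; for that, see the blog of Ken Wu at Taobao. 1. Table design 1.1 Pre-Creating Regions. By default, creating an HBase table automatically creates a single region; when data is imported, all HBase clients write to this one region
Reposted 2012-03-20 21:46:14 · 4237 views · 0 comments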
Why HBase disable a large table taking very long time
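Jean-Daniel Cryans gave an answer on the Apache mailing list: http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/15699. In addition, HBase appears to be considering an async approach: the disable call returns immediately, and the user can then call a blocking command to check whether the disable has completed. Interested readers can
Original 2011-02-17 19:18:00 · 983 views · 0 comments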
Hbase Web Based UI
The HBase processes expose a web-based user interface (UI for short), which you can use to gain insight into the cluster's state, as well as the tables it hosts. The majority of the functionality is r
Reposted 2012-01-03 21:52:12 · 4493 views · 0 comments
Apache HBase 0.92.0 has been released
The good stuff has finally been released. Welcome! Today the Apache HBase community has proudly released Apache HBase 0.92.0, a major new version of the scalable distributed data store inspired by Google’s BigTable. Over 670 iss
Reposted 2012-01-30 13:26:44 · 1057 views · 0 comments
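Reading the Source Code via the HBase JIRA
I have been hooked on the HBase JIRA lately, so I am excerpting some of the more interesting issues here. The idea is to start from the problem described in an issue and then read the HBase source code. For more, see: https://issues.apache.org/jira/browse/HBASE-
HBASE-3287: Add option to cache blocks on hfile write and evict blocks on hfile close
Original 2011-12-15 18:44:31 · 2618 views · 0 comments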
My Experience with HBase Dynamic Partitioning (a.k.a. Region Splitting)
Hands-on experience with region splitting. Original post: http://chilinglam.blogspot.com/2011/12/my-experience-with-hbase-dynamic.html (readers in mainland China will need a proxy to view the original). I have been working on HBase for 9 months and below is a summary of my experi
Original 2011-12-26 15:44:36 · 1107 views · 0 comments
Storage Infrastructure Behind Facebook Messages
One of the talks that I particularly enjoyed yesterday at HPTS 2011 was Storage Infrastructure Behind Facebook Messages by Kannan Muthukkaruppan. In this talk, Kannan talked about the Facebook sto
Reposted 2011-12-13 16:13:56 · 1055 views · 0 comments
Facebook Messages & HBase
I have recently been admiring the work of Nicolas Spiegelberg (an HBase committer); here are a few of the excellent talks he has shared:
New Compaction Heuristic https://issues.apache.org/jira/browse/HBASE-3209
Facebook Messages & HBase http://www.sl
Original 2011-12-13 11:51:48 · 1244 views · 0 comments
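HBase Failures: After a RegionServer Crash
A really good, clearly organized article; read it and then go look at the code (the HBase articles at http://www.spnguru.com are recommended reading). For a distributed database, fault tolerance is a very important part. RegionServers are the most numerous nodes in an HBase system, so fault handling for RegionServers is critical to HBase. This article goes through RegionServer fault handling Step by Ste
Reposted 2011-11-25 14:36:43 · 2260 views · 0 comments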
Concurrent LRU Block Cache in HBase
Jonathan Gray uses HBASE-1460 to explain how he implemented the LRU Block Cache: The LRU-based block cache that will be committed in HBASE-1192 is thread-safe but contains a big lock on the hash map. Under high load, the bl
Original 2011-02-15 22:47:00 · 2036 views · 0 comments
Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 1
Today, rather than discussing new projects or use cases built on top of CDH, I’d like to switch gears a bit and share some details about the engineering that goes into our products. In this post, I’ll
Reposted 2012-04-16 23:38:16 · 880 views · 0 comments
Avoiding Full GCs in HBase with MemStore-Local Allocation Buffers: Part 3
This is the third and final post in a series detailing a recent improvement in Apache HBase that helps to reduce the frequency of garbage collection pauses. Be sure you’ve read part 1 and part 2 bef
Reposted 2012-04-16 23:45:56 · 962 views · 0 comments
Configuring HBase Memstore: What You Should Know
In this post we discuss what HBase users should know about one of the internal parts of HBase: the Memstore. Understanding the underlying processes related to the Memstore will help to configure HBase cluster
Reposted 2012-07-22 21:44:29 · 854 views · 0 comments