hadoop
iteye_15479
hadoop log level
Hadoop version: 0.19.1. Modify two files: 1. bin/hadoop-daemon.sh: set HADOOP_ROOT_LOGGER. 2. src/mapred/org/apache/hadoop/mapred/TaskRunner.java: set -Dhadoop.root.logger.
2009-10-29 09:28:26 · 268 reads · 0 comments
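Introduction to ZooKeeper
ZooKeeper is an official subproject of Hadoop. It is a reliable coordination system for large distributed systems, providing configuration maintenance, naming service, distributed synchronization, group services, and more. ZooKeeper's goal is to encapsulate complex, error-prone key services and offer users a simple, easy-to-use interface on top of a fast, stable system. ZooKeeper is an open-source implementation of Google's Chubby, a highly available and reliable coordination system. ZooKeeper can be used for leader election,...
Original · 2010-07-23 10:51:24 · 134 reads · 0 comments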
Implementing WebGIS on Hadoop: A Case Study of Improving Small File IO Performan
Implementing WebGIS on Hadoop: A Case Study of Improving Small File IO Performance on HDFS
Original · 2010-07-26 22:46:33 · 133 reads · 0 comments
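Will all HFiles managed by a regionserver be kept open?
I had not read the code carefully, so I asked this question on the HBase mailing list. In fact, a closer look at the Bigtable paper makes it clear that they must be kept open. My current analysis is that HBase random read performance depends on several factors: 1) HDFS seek operations: how many seeks does each random get cause on average? 2) memory copy; this matters especially when data locality is poor,...
Original · 2011-01-19 10:29:25 · 198 reads · 0 comments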
impact of total region numbers?
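Over the past few days I tuned several HBase parameters and got some interesting results. See my email below for details. For example, given a fixed amount of data, I can tune hbase.hregion.max.filesize to increase or decrease the total number of regions, right? I want ...
Original · 2011-01-19 16:31:09 · 122 reads · 0 comments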
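The relationship the excerpt asks about can be illustrated with a back-of-the-envelope sketch. It assumes every region grows to roughly hbase.hregion.max.filesize before it splits, which is a simplification: a freshly split region is about half that size, so real clusters usually hold more regions than this estimate. The function name and the sample sizes are hypothetical and are not taken from the original email.

```python
import math


def estimate_region_count(total_bytes: int, max_filesize_bytes: int) -> int:
    """Rough lower-end estimate of the number of regions.

    Assumption (not from the original post): each region holds about
    max_filesize_bytes of data. Regions that have just split are smaller,
    so the real count is typically higher.
    """
    if total_bytes < 0 or max_filesize_bytes <= 0:
        raise ValueError("sizes must be non-negative, and the limit positive")
    return max(1, math.ceil(total_bytes / max_filesize_bytes))


# Hypothetical example: 100 GiB of data with two different size limits.
GIB = 1024 ** 3
MIB = 1024 ** 2
small_limit = estimate_region_count(100 * GIB, 256 * MIB)
large_limit = estimate_region_count(100 * GIB, 1 * GIB)
```

Raising the limit lowers the estimated region count and lowering it does the opposite, which is the lever the excerpt describes.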
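[Reading notes] Data warehousing and analytics infrastructure at Facebook
This appears to be a SIGMOD 2010 paper. After reading it I noted the following points: 1. Facebook's Hadoop clusters are divided into: the Scribe Hadoop cluster: Scribe servers aggregate the logs of the web services and store them on HDFS. Bandwidth often becomes the bottleneck, in which case compression is worth considering, but a side effect is that buffering the data awaiting compression causes latency...
Original · 2011-03-18 22:03:27 · 159 reads · 0 comments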
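The Chancellor Gets Angry: the Hadoop edition
I saw this on YouTube. Anyone who can get past the firewall should take a look; it is quite funny. http://www.youtube.com/watch?v=hEqQMLSXQlY
Original · 2012-03-23 12:14:52 · 71 reads · 0 comments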
Using the libjars option with Hadoop
As I said in my last post, I was developing an HBase-based MapReduce application. But one damn thing is that the Hadoop cluster managed by our system admin has no HBase jars in its classpath... So I...
Original · 2013-05-20 15:03:23 · 119 reads · 0 comments
Question on HBase source code
I'm reading the HBase source code. In class org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher I found the private member *unassignedNodes*, but I cannot find where nodes are added i...
2013-05-22 15:05:16 · 93 reads · 0 comments
how to study hadoop?
From StackOverflow: http://stackoverflow.com/questions/6385888/what-are-some-good-resources-for-studying-hadoops-source-code
Studying Hadoop or MapReduce can be a daunting task if you ge...
Original · 2012-04-27 15:34:25 · 91 reads · 0 comments
Hadoop 2.0 code analysis: MapReduce
Hadoop version referenced in this post: hadoop-2.0.1-alpha-src. Reference: http://www.cnblogs.com/biyeymyhjob/archive/2012/08/16/2640733.html
As in the referenced article, I follow how one concrete MR job is executed. The example is as follows: [code="java"] // Create a new Job Job j...
Original · 2012-10-25 18:27:47 · 201 reads · 0 comments
Hadoop Version Graph
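The full article is here: http://cloudblog.8kmiles.com/2012/01/19/apache-hadoop-version-timeline/
Frustrated by the confusion of Hadoop versions, I was glad to find this article explaining how they evolved. Note, however, that the original was written in January 2012, so the latest versions are certainly not covered; it would be great if someone wrote an updated summary. [img]http://dl.iteye.com/upl...
Original · 2012-11-14 11:47:08 · 130 reads · 0 comments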
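Learning Hadoop: RPC based on protocol buffers
In current versions of Hadoop, the RPC protocols between the various servers and clients are implemented with Google's protocol buffers. If you are not familiar with them, reading the code is rather painful, so I spent some time learning them and then wrote a fairly simple example modeled on Hadoop's usage. It is small but complete, and understanding it may help you read the Hadoop code! :) I implement a simple server-client calculator, c...
2012-11-15 23:23:21 · 157 reads · 0 comments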
problems in building hadoop
When I modify some code in ASF hadoop-0.20.2 and compile it with the command "ant jar", it reports *no route to host* exceptions. I do this on a server with no direct internet connectio...
Original · 2010-12-22 10:28:04 · 101 reads · 0 comments
HDFS scalability: the limits to growth
Abstract: The Hadoop Distributed File System (HDFS) is an open source system currently being used in situations where massive amounts of data need to be processed. Based on experience with the l...
Original · 2010-11-30 12:52:38 · 150 reads · 0 comments
HDFS
Running hadoop fsck / gives you a summary of the current HDFS status, including some useful information: Minimally replicated blocks: 51224 (100.0 %) Over-replicated blocks: 0 (0.0 %) Under-repl...
Original · 2010-01-27 09:26:48 · 125 reads · 0 comments
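To watch these counters over time, the summary lines can be pulled out with a small parser. The sketch below handles only lines shaped like the two complete ones quoted in the excerpt ("name: count (percent %)"). The function name is hypothetical, and the real fsck output contains other lines and may be laid out differently across Hadoop versions.

```python
import re

# Matches lines such as "Minimally replicated blocks: 51224 (100.0 %)".
_COUNTER_LINE = re.compile(
    r"^\s*(?P<name>[^:]+):\s+(?P<count>\d+)\s+\((?P<pct>[\d.]+)\s*%\)\s*$"
)


def parse_fsck_counters(text):
    """Return {counter name: (count, percent)} for every matching line."""
    counters = {}
    for line in text.splitlines():
        match = _COUNTER_LINE.match(line)
        if match:
            counters[match.group("name").strip()] = (
                int(match.group("count")),
                float(match.group("pct")),
            )
    return counters


# The two complete lines quoted in the excerpt above.
sample = """\
 Minimally replicated blocks:   51224 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
"""
counters = parse_fsck_counters(sample)
```

Lines that do not fit the pattern are skipped rather than rejected, so the parser tolerates the rest of the report.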
[Repost] Hadoop at eBay
http://www.ebaytechblog.com/2010/10/29/hadoop-the-power-of-the-elephant/
Hadoop – The Power of the Elephant, by Anil Madan on 10/29/2010, in Machine Learning. In a previous post, Junling discussed data...
Original · 2011-06-11 21:09:38 · 146 reads · 0 comments
hadoop cluster at ebay
Friday, December 17, 2010. Hadoop cluster at eBay. I am always curious to know how other companies install Hadoop clusters and how they use its ecosystem. Since Hadoop is still relatively new...
Original · 2011-06-11 21:39:25 · 124 reads · 0 comments
An HDFS error
ERROR: hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink. While running a job I once got the following exception: 10/12/10 21:09:05 INFO h...
Original · 2011-06-11 21:53:45 · 189 reads · 0 comments
HDFS NotReplicatedYetException
I encountered the exception below when using copyFromLocal to copy several big files (10G) to HDFS. A Hadoop developer explained it this way: “I noticed the same recently. For me it happened since the datanodes ...
2009-07-21 16:57:49 · 2916 reads · 3 comments
Advantages of Kosmix's KFS vs. HDFS
A post about KFS vs. HDFS. October 02, 2007. Advantages of Kosmix's KFS vs. HDFS. I was excited to learn last week that my friends at Kosmix have decided to open source a project long in the works: th...
Original · 2009-07-21 17:12:52 · 122 reads · 0 comments
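HDFS write placement strategy
I measured how blocks are allocated to each node during HDFS writes on a cluster with 4 datanodes, each with 4 disks. I wrote a script to process the logs and found that, on average, each node received 25% of all blocks written, and each disk on a node in turn received 25% of that node's blocks. So the HDFS write allocation algorithm is fairly even, at least on my cluster. Looking at the code in detail, the algorithm for choosing a datanode is c...
2009-07-21 17:15:27 · 116 reads · 0 comments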
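The kind of tally described above can be sketched as follows. The original script and the log format are not shown in the excerpt, so this assumes the logs have already been reduced to (datanode, disk) pairs, one per allocated block. The function name and the records are hypothetical, and the data is synthetic and perfectly uniform by construction, not the author's measurement.

```python
from collections import Counter, defaultdict


def block_shares(records):
    """Compute percentage shares of allocated blocks.

    records: iterable of (datanode, disk) pairs, one per allocated block
    (an assumed, pre-parsed form of the datanode logs).
    Returns (node_share, disk_share): node_share[node] is the node's share
    of all blocks; disk_share[node][disk] is the disk's share of the
    blocks on that node.
    """
    per_node = Counter()
    per_disk = defaultdict(Counter)
    for node, disk in records:
        per_node[node] += 1
        per_disk[node][disk] += 1

    total = sum(per_node.values())
    node_share = {n: 100.0 * c / total for n, c in per_node.items()}
    disk_share = {
        n: {d: 100.0 * c / per_node[n] for d, c in disks.items()}
        for n, disks in per_disk.items()
    }
    return node_share, disk_share


# Synthetic input: 4 datanodes x 4 disks, 10 blocks on each disk.
records = [
    (f"dn{n}", f"disk{d}")
    for n in range(4)
    for d in range(4)
    for _ in range(10)
]
node_share, disk_share = block_shares(records)
```

On real logs the shares would only approximate 25%, and the spread around that value is what shows how even the placement is.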
How Does HDFS Delete Files?
The HDFS design document describes deletes and undeletes in HDFS. File Deletes and Undeletes: When a file is deleted by a user or an application, it is not immediately removed from...
2009-07-30 15:04:17 · 96 reads · 0 comments
Comparing the two kinds of random read in HDFS
Code version: hadoop-0.19.1. First, pread. pread explicitly passes the size to read to the datanode (when creating the new BlockReader). /** * Read bytes starting from the specified position. * * @param position start read ...
2009-08-13 20:40:31 · 877 reads · 0 comments
Exceptions in HDFS
Log them here for later analysis. 2009-08-26 01:17:37,798 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_5223350282761282817_281131 java.nio.chann...
2009-08-26 09:57:32 · 177 reads · 0 comments
The buffer size that has no effect in HDFS
/** * Create a new output stream to the given DataNode. * @see ClientProtocol#create(String, FsPermission, String, boolean, short, long) */ DFSOutputStream(String src, FsPermis...
2010-05-12 11:38:03 · 350 reads · 0 comments
hadoop-0.20.2+737 and hbase-0.20.6 not compatible?
The master log shows "0 region servers, 0 dead, average load NaN", and the regionserver hangs. Cause unknown.
Original · 2010-11-11 13:28:29 · 99 reads · 0 comments
hadoop-2.2.0 build failure due to missing dependency
The bug and its fix are at https://issues.apache.org/jira/browse/HADOOP-10110
2014-01-06 13:18:18 · 104 reads · 0 comments