cloud
iteye_15479
Benefits of Using SSD Drives in the Data Center
http://arstechnica.com/business/news/2009/10/latest-migrations-show-ssd-is-ready-for-some-datacenters.ars 1. Save floor space. 2. Cost savings in the long run: take power and cooling cost into cou... 2009-10-29 14:27:59 · 333 views · 0 comments
Exceptions in HDFS
Log them here for later analysis. 2009-08-26 01:17:37,798 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in receiveBlock for block blk_5223350282761282817_281131 java.nio.chann... 2009-08-26 09:57:32 · 187 views · 0 comments
hbase master cannot start up
2010-10-06 15:21:44,704 INFO org.apache.hadoop.hbase.master.HMaster: vmName=Java HotSpot(TM) Client VM, vmVendor=Sun Microsystems Inc., vmVersion=16.3-b01 2010-10-06 15:21:44,704 INFO org.apache.h... Original · 2010-11-11 13:24:31 · 144 views · 0 comments
hadoop-0.20.2+737 and hbase-0.20.6 not compatible?
The master log shows "0 region servers, 0 dead, average load NaN", and the regionserver hangs. Cause unknown. Original · 2010-11-11 13:28:29 · 110 views · 0 comments
HDFS scalability: the limits to growth
Abstract: The Hadoop Distributed File System (HDFS) is an open source system currently being used in situations where massive amounts of data need to be processed. Based on experience with the l... Original · 2010-11-30 12:52:38 · 157 views · 0 comments
HBase importing
Slides by HBase developer Ryan Rawson. Original · 2010-11-30 12:54:03 · 87 views · 0 comments
Cloud Security?
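I read a few articles that mainly address how users can verify the integrity and correctness of data stored in a public cloud: a set of logic, or a protocol, shared by the user and the cloud storage provider lets the user trust that data held by the vendor has not been deleted or altered, while the data stays encrypted from the vendor's point of view. Building on long-established cryptographic algorithms, these articles all give effective schemes. The open question now: if a user wants to use not only the public cloud's storage but also its compute nodes, how can the security of the whole process be guaranteed?... Original · 2011-09-02 14:23:28 · 104 views · 0 comments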
problems in building hadoop
When I modify some code in ASF hadoop-0.20.2 and compile it with the command "ant jar", it reports *no route to host* exceptions. I do this on a server with no direct internet connectio... Original · 2010-12-22 10:28:04 · 108 views · 0 comments
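Introduction to ZooKeeper
ZooKeeper is an official Hadoop subproject. It is a reliable coordination system for large distributed systems, providing configuration maintenance, naming service, distributed synchronization, group services, and more. ZooKeeper's goal is to encapsulate complex, error-prone key services and offer users a simple, easy-to-use interface on top of an efficient and stable system. ZooKeeper is an open source implementation inspired by Google's Chubby, a highly available and reliable coordination system. ZooKeeper can be used for leader election,... Original · 2010-07-23 10:51:24 · 141 views · 0 comments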
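Will all HFiles managed by a regionserver be kept open?
I had not read the code carefully, so I asked this question on the HBase mailing list. In fact, a closer look at the Bigtable paper makes it clear that they must be kept open. My current analysis is that HBase random read performance is determined by several factors: 1) HDFS seek operations: how many seeks does each random get cause on average? 2) memory copy; this problem, especially when data locality is poor,... Original · 2011-01-19 10:29:25 · 212 views · 0 comments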
impact of total region numbers?
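Over the past few days I tuned several HBase parameters and got some interesting results. See my email below for details. For example, I have some total amount of data and I can tune hbase.hregion.max.filesize to increase/decrease the total region number, right? I want ... Original · 2011-01-19 16:31:09 · 128 views · 0 comments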
cassandra example
http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/ Original · 2011-01-19 16:39:06 · 129 views · 0 comments
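[Reading Notes] Data warehousing and analytics infrastructure at Facebook
This appears to be a SIGMOD 2010 paper. After reading it I noted the following points: 1. Facebook's Hadoop clusters are divided into: the Scribe Hadoop cluster: Scribe servers aggregate the web services' logs and store them on HDFS. Bandwidth often becomes the bottleneck; compression can be considered in that case, but a side effect is that buffering the data awaiting compression causes latency... Original · 2011-03-18 22:03:27 · 172 views · 0 comments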
Comparing the Two Kinds of Random Read in HDFS
Code version: hadoop-0.19.1. First, pread. pread explicitly passes the size to be read to the datanode (when the BlockReader is constructed). /** * Read bytes starting from the specified position. * * @param position start read ... 2009-08-13 20:40:31 · 908 views · 0 comments
How Does HDFS Delete Files?
The HDFS design document describes deletes and undeletes in HDFS. File Deletes and Undeletes: When a file is deleted by a user or an application, it is not immediately removed from... 2009-07-30 15:04:17 · 100 views · 0 comments
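Cloud Computing: Definition
The concept of [cloud computing] was proposed by Google; it is a beautiful model for network applications. In the narrow sense, cloud computing refers to the delivery and usage model of IT infrastructure: obtaining the required resources over the network in an on-demand, easily scalable way. In the broad sense, it refers to the delivery and usage model of services: obtaining the required services over the network in an on-demand, easily scalable way. Such services may be related to IT, software, and the Internet, or may be any other service; it features ultra-large scale, virtualization, reliability and security, among other distinctive capabilities. There are also many books on "cloud computing", all introducing cloud computing from theory and practice... Original · 2010-09-10 17:26:16 · 136 views · 0 comments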
HDFS
Running hadoop fsck / gives you a summary of the current HDFS status, including some useful information: Minimally replicated blocks: 51224 (100.0 %) Over-replicated blocks: 0 (0.0 %) Under-repl... Original · 2010-01-27 09:26:48 · 130 views · 0 comments
When the HBase master cannot be stopped...
Sometimes you will find that the master in an HBase cluster cannot be stopped. If so, check the following two items, suggested by someone on the HBase user mailing list, though I'm not sure they help. > If zookee... Original · 2010-09-28 11:19:48 · 123 views · 0 comments
What are some tips for configuring HBase?
Jeff Hammerbacher, 8 endorsements. 2 votes by Oleksiy Kovyrin and Alex Kamil. Much of this content is taken from the HBase Overview [1] and the HBase Default Configur... Original · 2010-10-11 15:42:59 · 130 views · 0 comments
Hbase read performance with increasing number of client threads
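While running YCSB with heavy get operations, I found that the latency reported by YCSB was large, close to 100 ms with 100 threads, whereas the HBase "get_avg_time" metric in Ganglia showed only about 20~30. After reading the code, I found that the 100 threads share a single connection and the request data of every Call goes through that one connection, so a large number of concurrent requests causes congestion and latency grows. For details, see... Original · 2010-10-12 23:19:32 · 135 views · 0 comments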
A Detailed Look at How HBase Completes a Get Operation (client side)
This source analysis is based on HBase-0.20.6. First, the code of the get() method in the HTable class: HTable.java /** * Extracts certain cells from a given row. * @param get The object that specifies what data to fetch and from which ... Original · 2010-10-14 14:37:33 · 236 views · 0 comments
HDFS NotReplicatedYetException
I encountered the exception below when using copyFromLocal to copy several big files (10G) to HDFS. Someone from the Hadoop community explained it this way: “I noticed the same recently. For me it happened since the datanodes ... 2009-07-21 16:57:49 · 2944 views · 3 comments
Advantages of Kosmix's KFS vs. HDFS
A post about KFS vs. HDFS. October 02, 2007. Advantages of Kosmix's KFS vs. HDFS. I was excited to learn last week that my friends at Kosmix have decided to open source a project long in the works: th... Original · 2009-07-21 17:12:52 · 129 views · 0 comments
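HDFS Write Strategy
I measured how HDFS allocates blocks to each node during writes with 4 datanodes, each with 4 disks. After writing a script to process the logs, I found that on average each node received 25% of the total blocks written, and each disk on a node in turn received 25% of that node's blocks. So the HDFS write allocation algorithm is fairly even, at least on this cluster of mine. Looking at the code in detail, the algorithm for choosing a datanode is c... 2009-07-21 17:15:27 · 123 views · 0 comments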
So This Is How Massive Data Gets Moved to the Cloud...
Looking at Amazon's S3, I found that supplying a large volume of data means physically shipping it, heh. Using AWS Import/Export: To use the AWS Import/Export beta you simply: Copy your data to a portable storage device (see below for supported devices). Email... 2009-07-21 17:22:20 · 104 views · 0 comments
EMC's Cloud Storage System
EMC this week took the wraps off its long-awaited cloud storage infrastructure solution, moving the vendor long known for proprietary hardware offerings into the market for commodity hardware.... Original · 2009-07-29 13:24:34 · 148 views · 0 comments
EMC Atmos and Atmos onLine —The Yin and Yang of Unstructured Data Storage
Atmos: In fall 2008, EMC launched Atmos as its solution to the Dilemma of Unstructured Data, especially static and distributed data. The Atmos storage platform is very different from EMC’s other stora... Original · 2009-07-30 09:32:49 · 152 views · 0 comments
What's Xen?
An introduction to Xen. Original · 2012-12-23 17:19:53 · 153 views · 0 comments