转载 [转] KFS,一个克隆GFS的文件系统收藏

KFS(KOSMOS DISTRIBUTED FILE SYSTEM),一个类似GFS、Hadoop中HDFS 的一个开源的分布式文件系统。

PS: google的三大基石 gfs,bigtable,map-reduce 相对应的开源产品 gfs:kfs(据传google创史人的同窗所创),hdfs(hadoop的子项目) bigtable:hbase(hadoop的子项目),Hypertable(从hbase项目组分离出去的,用c++实现) map-reduce:hadoop(apache的项目,java实现,目前创史人在yahoo全力打造,已有2000个以上的节点并行计算的规模)

Google两个共同创始人的两个大学同窗(印度人)Anand Rajaraman和Venky Harinarayan,创立的一个新的搜索引擎Kosmix最近捐献了一个克隆GFS的文件系统KFS项目HadoopHypertable这两个项目也开始支持KFS来做底层的存储。KFS是用C++写的,但是其client支持C++,Java和Python。那么KFS到底有什么特性呢?

  1. 支持存储扩充(添加新的chunckserver,系统自动感知)
  2. 有效性(复制机制保证文件有效性)
  3. 负载平衡(系统周期地检查chunkservers的磁盘利用,并重新平衡chunkservers的磁盘利用,HDFS现在还没有支持)
  4. 数据完整性(当要读取数据时检查数据的完整性,如果检验出错使用另外的备份覆盖当前的数据)
  5. 支持FUSE(HDFS也有工具支持FUSE)
  6. 使用契约(保证Client缓存的数据和文件系统中的文件保持一致性)

HDFS未支持的高级特性:

  1. 支持同一文件多次写入和Append,不像HDFS支持一次写入多次读取和不支持Append(最近要增加Append,但是遇到许多问题)。
  2. 文件及时有效,当应用程序创建一个文件时,文件名在系统马上有效。不像HDFS文件只当输入流关闭时才在系统中有效,因此,如果应用程序在关闭前出现异常导致没有关闭输入流,数据将会丢失。

 

 

官方网站: http://kosmosfs.sourceforge.net/

来自startup的垂直搜索引擎http://www.kosmix.com/的开源项目,又一个开源的类似google mapreduce 的分布式文件系统,可以应用在诸如图片存储、搜索引擎、网格计算、数据挖掘这样需要处理大数据量的网络应用中。与hadoop集成得也比较好,这样可以充分利用了hadoop一些现成的功能,基于C++。

Introduction

Applications that process large volumes of data (such as, search engines, grid computing applications, data mining applications, etc.) require a backend infrastructure for storing data. Such infrastructure is required to support applications whose workload could be characterized as:

  • Primarily write-once/read-many workloads
  • Few millions of large files, where each file is on the order of a few tens of MB to a few tens of GB in size
  • Mostly sequential access

We have developed the Kosmos Distributed File System (KFS), a high performance distributed file system to meet this infrastructure need.

The system consists of 3 components:

  1. Meta-data server : a single meta-data server that provides a global namespace
  2. Block server: Files are split into blocks or chunks and stored on block servers. Blocks are also known as chunk servers. Chunkserver store the chunks as files in the underlying file system (such as, XFS on Linux)
  3. Client library: that provides the file system API to allow applications to interface with KFS. To integrate applications to use KFS, applications will need to be modified and relinked with the KFS client library.

KFS is implemented in C++. It is built using standard system components such as, TCP sockets, aio (for disk I/O), STL, and boost libraries. It has been tested on 64-bit x86 architectures running Linux FC5.

While KFS can be accessed natively from C++ applications, support is also provided for Java applications. JNI glue code is included in the release to allow Java applications to access the KFS client library APIs.

Features
  • Incremental scalability: New chunkserver nodes can be added as storage needs increase; the system automatically adapts to the new nodes.
  • Availability: Replication is used to provide availability due to chunk server failures. Typically, files are replicated 3-way.
  • Per file degree of replication: The degree of replication is configurable on a per file basis, with a max. limit of 64.
  • Re-replication: Whenever the degree of replication for a file drops below the configured amount (such as, due to an extended chunkserver outage), the metaserver forces the block to be re-replicated on the remaining chunk servers. Re-replication is done in the background without overwhelming the system.
  • Re-balancing: Periodically, the meta-server may rebalance the chunks amongst chunkservers. This is done to help with balancing disk space utilization amongst nodes.
  • Data integrity: To handle disk corruptions to data blocks, data blocks are checksummed. Checksum verification is done on each read; whenever there is a checksum mismatch, re-replication is used to recover the corrupted chunk.
  • File writes: The system follows the standard model. When an application creates a file, the filename becomes part of the filesystem namespace. For performance, writes are cached at the KFS client library. Periodically, the cache is flushed and data is pushed out to the chunkservers. Also, applications can force data to be flushed to the chunkservers. In either case, once data is flushed to the server, it is available for reading.
  • Leases: KFS client library uses caching to improve performance. Leases are used to support cache consistency.
  • Chunk versioning: Versioning is used to detect stale chunks.
  • Client side fail-over: The client library is resilient to chunksever failures. During reads, if the client library determines that the chunkserver it is communicating with is unreachable, the client library will fail-over to another chunkserver and continue the read. This fail-over is transparent to the application.
  • Language support: KFS client library can be accessed from C++, Java, and Python.
  • FUSE support on Linux: By mounting KFS via FUSE, this support allows existing linux utilities (such as, ls) to interface with KFS.
  • Tools: A shell binary is included in the set of tools. This allows users to navigate the filesystem tree using utilities such as, cp, ls, mkdir, rmdir, rm, mv. Tools to also monitor the chunk/meta-servers are provided.
  • Deploy scripts: To simplify launching KFS servers, a set of scripts to (1) install KFS binaries on a set of nodes, (2) start/stop KFS servers on a set of nodes are also provided.
  • Job placement support: The KFS client library exports an API to determine the location of a byte range of a file. Job placement systems built on top of KFS can leverage this API to schedule jobs appropriately.
  • Local read optimization: When applications are run on the same nodes as chunkservers, the KFS client library contains an optimization for reading data locally. That is, if the chunk is stored on the same node as the one on which the application is executing, data is read from the local node.
KFS with Hadoop

KFS has been integrated with Hadoop using Hadoop’s filesystem interfaces. This allows existing Hadoop applications to use KFS seamlessly. The integration code has been submitted as a patch to Hadoop-JIRA-1963 (this will enable distribution of the integration code with Hadoop). In addition, the code as well as instructions will also be available for download from the KFS project page shortly. As part of the integration, there is job placement support for Hadoop. That is, the Hadoop Map/Reduce job placement system can schedule jobs on the nodes where the chunks are stored.

参考资料:

  • distribute file system

http://lucene.apache.org/hadoop/

http://www.danga.com/mogilefs/

http://www.lustre.org/

http://oss.sgi.com/projects/xfs/

 

http://www.megite.com/discover/filesystem

http://swik.net/distributed+cluster

  • cluster&high availability

http://www.gluster.org/index.php

http://www.linux-ha.org/

http://openssi.org

http://kerrighed.org/

http://openmosix.sourceforge.net/

 

http://www.linux.com/article.pl?sid=06/09/12/1459204

http://labs.google.com/papers/mapreduce.html

 

 

文章来源:

 

 

 

发表于 @ 2008年06月27日 00:08:00|评论(loading...)|举报|收藏

新一篇: [转] 使用GDB和Valgrind调试C程序 | 旧一篇: [原创]自己动手写 HTTP Server

用户操作
[即时聊天] [发私信] [加为好友]
heiyeluren
订阅我的博客
XML聚合  FeedSky
订阅到鲜果
订阅到Google
订阅到抓虾
heiyeluren的公告

联系方式:


访问统计: free hit counter code
FeedSky订阅:
FeedSky订阅
文章分类
收藏
    ::eYou::
    kevin world
    lewis - 老吕
    qyb - BT的花
    Realzay的blog
    叶金荣
    天堂地狱鬼-dulao5's Blog
    沙漠之周
    狐狸糊涂
    老韩
    與子觀化
    鞋门
    ::Yahoo::
    glemir’s blog
    happy_fish - 分布式文件系统FastDFS
    LinZi's Blog
    Rainx
    stauren
    互联网,请记住我 - 162同学的技术博客
    冰的河
    北冥之鱼-张彪同学
    天韵之星
    小蚂蚁同学滴测试博客
    随网之舞 - kaven的DHTML博客
    风雪之隅
    ::朋友::
    【推荐】中文分类网
    DDR的博客
    kevin world
    miky
    PHPCup.cn论坛
    俺兄弟的blog
    冰河的技术博客:心随风动
    好旅网
    小少的技术博客
    无尘居
    晋陵路人的Blog
    李天华同学滴技术博客
    沙狐部落
    轻量级的editor
    ::网友::
    blankyao同学
    Code & Stock.
    Hello, Willko
    LionD8的Blog
    magiclab.cn
    MooPHP - 轻量级PHP框架
    Phzzy
    Xhttpd.cn
    张贺同学的博客
    技术大牛老余的博客
    抚琴居
    旋木木同学滴博客
    流水孟春
    矛盾网
    程序人生
    邢红瑞的blog
    阿健的博客
    :PHP博客:
    .: Easy style :.
    [琴剑楼]
    CoolCode.cn
    Haohappy的Blog
    Hightman
    iwind的blog
    Javascript开发站
    JD Space
    Nio's Weblog
    Open Source PHP
    PHP面对对象
    SourceForge.net
    trip的专栏
    UGIA.cn
    windix's blog
    Windix's Weblog
    一个藏袍
    俊麟 Michael`s blog
    偶然的blog
    刘敏的blog
    大龄青年的Blog
    廖宇雷的blog
    懒猫开始新生活blog
    某人的栖息地
    王春生的博客
    神仙
    :牛人blog:
    DBA notes
    http://blog.csdn.net/tingya/
    侯捷网站
    孟岩
    搜索引擎研究
    方舟
    王咏刚的BLOG
    竹笋炒肉
    荣耀
    车东[Blog^2]
    透明思考
    陈硕的Blog
    DHTML
    DHTMLGoodies
    FCKEditor
    Google Code
    Google Web Toolkit
    HTML Goodies
    HTML.it
    HTMLAre
    HTMLdog
    JavaScript Kit
    jQuery
    KindEditor
    Prototype
    TinyMCE
    W3 Schools
    Yahoo JavaScript Developer Center
    Yahoo! Developer Network
    Yahoo! UI Library (YUI)
    网页设计师Web标准
    Java国内站
    ChinaJavaWorld.com技术论坛
    IBM developerWorks 中国: Java
    Java中文站
    Java开源大全
    Java爱好者
    JR - Java翻译站
    J道-JDON
    Matrix: 与Java共舞
    中国Java开发网
    中文java技术网
    PHP国内站点
    CSDN PHP论坛
    Discuz!
    FleaPHP
    Google--PHP用户组
    IBM DeveloperWorks
    JavsScript技术讨论
    Nirvana Studio
    OpenPHP.cn
    PHPChina
    TiM Club
    中文 PFC 1.0 手册--PHP5的开发包
    中文 PFC 1.0 手册--PHP5的开发包
    中文PHP网
    太平洋--PHP开发区
    爱MySQL
    超越PHP
    PHP国外站点
    ADOdb
    Agavi Framework
    Cake PHP
    MySQL Performance Blog
    MySQL Performance Blog
    Nonaweb
    PEAR
    PECL
    PECL Windows
    PHP Builder
    PHP Classes
    PHP Classes
    PHP New Download
    PHP Security Consortium
    php.MVC
    php.MVC
    PHPkitchen(OO & MVC)
    phpPatterns
    PHP国外图书下载
    smart template
    Smarty
    SourceForge.net
    Symfony Framework
    Zend
    Zend Framework
    Unix C/C++
    Free Gentux
    周立发的blog(Linux C)
    Unix/Linux
    BSD智库
    ChinaUnix
    FreeBSDChina
    FreeLAMP
    IBM开发者Linux专区
    Linux Byte
    LinuxKit
    LinuxTS
    Linux伊甸园
    Linux技术中坚站
    Linux非常空间
    Love Unix
    NetBSD&OpenBSD中文用户组
    NetBSD中国社区
    Oracle中国用户讨论组
    OurLinux
    Unix中文
    Unix中文
    Unix中文宝库
    中国Linux公社
    中国Unix用户技术论坛
    中文FreeBSD用户组
    永远的Unix
    炎黄角马
    程序设计
    CSDN
    IBM开发者中心
    Microsoft TechNet: 主页
    MSDN 中文网站
    PHP中文站
    Sun技术社区
    中国IT认证实验室--企业应用技术
    中国协议分析网
    喜悦国际村
    太平洋电脑网---开发特区
    实用网站
    veBook(国外大量免费图书下载网站)
    Whois.net
    中国Web信息博物馆
    中国互联网络信息中心whois查询
    服务器系统信息查看
    网络安全
    AnySide.com
    CGI Secutiry
    K-OTik Security Monitoring
    Linux Security
    Packet Storm Security
    PHP Secure
    RFC中文文档索引
    Safemode.org
    SecuriTeam.com
    Security Corporation
    SecurityFocus
    SecurityTracker
    Zone-h (区域黑客,每天公布各国被黑的网站)
    中华安全网
    中国信息安全组织
    国家计算机网络应急处理中心
    安全天使
    安全焦点
    幻影旅团
    绿盟科技
    网络安全评估中心(cnns )
    在线手册
    Apache2.0中文文档
    Beyond Linux From Scratch
    Debian参考手册
    FreeBSD Porter 手册
    FreeBSD使用手册
    Linux C函数中文参考手册
    MySQL 4.1.0 中文参考手册
    NetBSD在线手册
    OpenBSD在线FAQ
    PHP ADODB 1.99版手册中文翻译(Tripc)
    PHP中文手册(国内)
    PHP中文手册(国外)
    PostgreSQL中文文档
    Red Hat Linux 9入门指南
    Red Hat Linux 9安装指南
    Red Hat Linux 9定制手册
    中国OSS技术手册中心
    技术文档手册中心-ChinaUnix
    存档
    Csdn Blog version 3.1a
    Copyright © heiyeluren