HighLevelOverview

coofucoo: Because the Chinese government blocks this site, my access to it is very slow, so I have copied the content from the official wiki here for my convenience and yours.

 

The following is a high-level overview of how MogileFS works, and what all the moving pieces are.

 

The involved parties are:

  • Application: the thing that wants to store/load files
  • Tracker (the mogilefsd process): an event-based parent process/message bus that manages all client communication from applications (requesting operations to be performed), including load-balancing those requests across "query workers", and handles all communication between the mogilefsd child processes. You should run two trackers on different hosts for HA, or more for load balancing (should you need more than two). The child processes under mogilefsd include:
    • Replication -- replicating files around
    • Deletion -- deletes from the namespace are immediate; deletes from filesystems are async
    • Query -- answering requests from clients
    • Reaper -- re-enqueuing files for replication after a disk fails
    • Monitor -- monitor health and status of hosts and devices
    • ...
  • Database -- the database that stores the MogileFS metadata (the namespace, and which files are where). This should be set up in an HA configuration so you don't have a single point of failure.
  • Storage Nodes -- where files are stored. The storage nodes are just HTTP servers that do DELETE, PUT, etc. Any WebDAV server is fine, but mogstored is recommended. mogilefsd can be configured to use two servers on different ports... mogstored for all DAV operations (and sideband monitoring), and your fast/light HTTP server of choice for GET operations. Typically people have one fat SATA disk per mountpoint, each mounted at /var/mogdata/devNN. (A minimal configuration sketch follows this list.)
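
To make the pieces above concrete, here is a minimal configuration sketch for one tracker and one storage node. The file locations and ports are the conventional ones, but the database host, credentials, and docroot are illustrative; a real deployment would also register its hosts, devices, domains, and classes (e.g. with the mogadm utility) and run a second tracker on another host for HA.

    # /etc/mogilefs/mogilefsd.conf -- the tracker
    db_dsn  = DBI:mysql:mogilefs:host=db.example.com
    db_user = mogile
    db_pass = sekrit
    listen  = 0.0.0.0:7001

    # /etc/mogilefs/mogstored.conf -- a storage node
    # httplisten serves the DAV operations, mgmtlisten is the sideband
    # monitoring port the trackers talk to, and the devNN directories
    # (one per disk) live under docroot
    httplisten = 0.0.0.0:7500
    mgmtlisten = 0.0.0.0:7501
    docroot    = /var/mogdata
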
High-level flow:
  • app requests to open a file (does RPC via the client library to a tracker, finding whichever one is up); this is the "create_open" request. (The full store/fetch flow is sketched in code after this list.)
  • tracker makes some load-balancing decisions about where the file could go, and gives the app a few possible locations
  • app writes to one of the locations (if it fails writing to one midway, it can retry and write elsewhere).
  • app (client) tells tracker where it wrote to in the "create_close" API.
  • tracker then links that name into the domain's namespace (via the database)
  • tracker, in the background, starts replicating that file around until it's in compliance with that file class's replication policy
  • later, app issues a "get_paths" request for that domain+key (key == "filename"), and the tracker replies (after consulting the database/memcache/etc.) with all the URLs the file is available at, weighted by I/O utilization at each location.
  • app then tries the URLs in order. (The tracker continually monitors all hosts/devices, so it won't return dead locations, and by default it will double-check the existence of the first item in the returned list, unless you ask it not to...)
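
The same flow, written out as a short Python sketch against the tracker's plain-text protocol (one "command key=value&key=value" line per request, answered with an "OK ..." or "ERR ..." line). The tracker host names, the port 7001, the "default" class, and the exact argument names follow common MogileFS client conventions but should be treated as assumptions; a real application would normally use an existing MogileFS client library rather than raw sockets.

    import socket
    import urllib.parse
    import urllib.request

    TRACKERS = [("tracker1.example.com", 7001), ("tracker2.example.com", 7001)]

    def tracker_command(cmd, **args):
        """Send one command to whichever tracker answers and parse its reply."""
        payload = ("%s %s\r\n" % (cmd, urllib.parse.urlencode(args))).encode()
        for host, port in TRACKERS:
            try:
                with socket.create_connection((host, port), timeout=3) as sock:
                    sock.sendall(payload)
                    reply = sock.makefile("rb").readline().decode().rstrip("\r\n")
            except OSError:
                continue                  # this tracker is down; try the next one
            status, _, rest = reply.partition(" ")
            if status != "OK":
                raise RuntimeError("tracker error: " + reply)
            return dict(urllib.parse.parse_qsl(rest))
        raise RuntimeError("no tracker reachable")

    def store(domain, key, data, file_class="default"):
        # 1. create_open: ask a tracker where the bytes may be written.
        dest = tracker_command("create_open", domain=domain, key=key,
                               **{"class": file_class})
        # 2. PUT the bytes straight to the storage node the tracker suggested.
        req = urllib.request.Request(dest["path"], data=data, method="PUT")
        urllib.request.urlopen(req)
        # 3. create_close: report where we wrote, so the tracker can link the
        #    key into the domain's namespace and schedule replication.
        tracker_command("create_close", domain=domain, key=key, fid=dest["fid"],
                        devid=dest["devid"], path=dest["path"], size=len(data))

    def fetch(domain, key):
        # get_paths returns the URLs the file is currently available at,
        # best choice first; try them in order until one works.
        res = tracker_command("get_paths", domain=domain, key=key)
        for i in range(1, int(res.get("paths", 0)) + 1):
            try:
                return urllib.request.urlopen(res["path%d" % i]).read()
            except OSError:
                continue                  # dead path; fall through to the next
        raise RuntimeError("no usable path for %s/%s" % (domain, key))

A round trip such as store("testdomain", "mykey", b"hello world") followed by fetch("testdomain", "mykey") exercises the whole sequence above, assuming the domain and class already exist and the trackers and storage nodes are reachable.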