Summary of Some GFS Questions

1. Data write flow

The GFS paper describes the write flow as follows:

1. The client asks the master which chunkserver holds the current lease for the chunk and the locations of the other replicas. If no one has a lease, the master grants one to a replica it chooses (not shown).

2. The master replies with the identity of the primary and the locations of the other (secondary) replicas. The client caches this data for future mutations. It needs to contact the master again only when the primary becomes unreachable or replies that it no longer holds a lease.

3. The client pushes the data to all the replicas. A client can do so in any order. Each chunkserver will store the data in an internal LRU buffer cache until the data is used or aged out. By decoupling the data flow from the control flow, we can improve performance by scheduling the expensive data flow based on the network topology regardless of which chunkserver is the primary. Section 3.2 discusses this further.

4. Once all the replicas have acknowledged receiving the data, the client sends a write request to the primary. The request identifies the data pushed earlier to all of the replicas. The primary assigns consecutive serial numbers to all the mutations it receives, possibly from multiple clients, which provides the necessary serialization. It applies the mutation to its own local state in serial number order.

5. The primary forwards the write request to all secondary replicas. Each secondary replica applies mutations in the same serial number order assigned by the primary.

6. The secondaries all reply to the primary indicating that they have completed the operation.

7. The primary replies to the client. Any errors encountered at any of the replicas are reported to the client. In case of errors, the write may have succeeded at the primary and an arbitrary subset of the secondary replicas. (If it had failed at the primary, it would not have been assigned a serial number and forwarded.) The client request is considered to have failed, and the modified region is left in an inconsistent state. Our client code handles such errors by retrying the failed mutation. It will make a few attempts at steps (3) through (7) before falling back to a retry from the beginning of the write.
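The retry behaviour in step 7 can be sketched roughly as below. This is only an illustration, not GFS client code: `lookup_replicas` and `push_and_commit` are hypothetical callables standing in for steps 1-2 and steps 3-7, and the attempt counts are arbitrary, since the paper only says "a few attempts".

```python
def write_with_retry(lookup_replicas, push_and_commit,
                     inner_attempts=3, outer_attempts=2):
    """Retry steps 3-7 a few times, then restart from step 1.

    lookup_replicas: hypothetical stand-in for steps 1-2 (ask the master).
    push_and_commit: hypothetical stand-in for steps 3-7; returns True on success.
    """
    for _ in range(outer_attempts):
        # Steps 1-2: (re)discover the primary and the secondary replicas.
        replicas = lookup_replicas()
        for _ in range(inner_attempts):
            # Steps 3-7: push the data, then ask the primary to commit it.
            if push_and_commit(replicas):
                return True
    # Every attempt failed; the caller sees the write as failed.
    return False
```

Restarting from step 1 matters because the cached primary may no longer hold the lease, so repeating only steps 3-7 could keep failing.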

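The key point in this flow is that the client first pushes the data to all the replicas and waits for each replica to acknowledge that it has received the data. Only then does the client send a WRITE command to the primary chunkserver (the primary is designated by the master). The primary decides the write order and sends WRITE commands to the other chunkservers. It reports success to the client only after it has received ACKs from those chunkservers.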

In other words, in GFS, pushing the data and issuing the write command are separate steps.
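To make the separation concrete, here is a minimal in-memory sketch of the two phases. All class and function names (`ChunkServer`, `Primary`, `client_write`) are made up for illustration; it ignores networking, leases, failures, and the LRU eviction the paper mentions.

```python
from dataclasses import dataclass, field


@dataclass
class ChunkServer:
    """Toy replica; a hypothetical structure, not the real GFS chunkserver."""
    name: str
    buffer: dict = field(default_factory=dict)   # data_id -> bytes pushed, not yet applied
    applied: list = field(default_factory=list)  # (serial, data) in application order

    def push_data(self, data_id, data):
        # Data flow: stage the bytes and acknowledge receipt.
        self.buffer[data_id] = data
        return True

    def apply(self, serial, data_id):
        # Control flow: apply previously pushed data under the given serial number.
        data = self.buffer.pop(data_id)
        self.applied.append((serial, data))
        return True


@dataclass
class Primary(ChunkServer):
    secondaries: list = field(default_factory=list)
    next_serial: int = 0

    def write(self, data_id):
        # Assign the next serial number, apply locally, then forward to secondaries.
        serial = self.next_serial
        self.next_serial += 1
        self.apply(serial, data_id)
        acks = [s.apply(serial, data_id) for s in self.secondaries]
        # Report success only if every secondary acknowledged.
        return all(acks)


def client_write(primary, data_id, data):
    replicas = [primary] + primary.secondaries
    # Phase 1 (data flow): push the data to every replica and wait for all acks.
    if not all(r.push_data(data_id, data) for r in replicas):
        return False
    # Phase 2 (control flow): only now send the write request to the primary.
    return primary.write(data_id)
```

For example, after `p = Primary("p", secondaries=[ChunkServer("s1"), ChunkServer("s2")])`, two calls to `client_write` leave the same `applied` list on all three replicas, because every replica applies mutations in the serial order chosen by the primary.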

For details, see: http://www.slideshare.net/xuqianghitsoft/gfs-andmapreduce






2. GFS fault tolerance


3. Reading data

