2020 MIT 6.824 GFS FAQ

米兰的小耳朵

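Copyright notice: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/khn64/article/details/108040316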

Why is atomic record append at-least-once rather than exactly once?

Section 3.1, Step 7 of the paper says that if a write fails at one of the
secondaries, the client retries the write. That will cause the data
to be appended more than once at the non-failed replicas.
How does an application know what sections of a chunk consist of
padding and duplicate records?

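To detect padding, an application can put a predictable magic number at the start of each valid record, or include a checksum that is likely to be valid only if the record is valid. To detect duplicates, it can include a unique identifier in each record and discard any record whose identifier it has already seen (the paper describes this in Section 2.7.2).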
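As a rough illustration of the answer above, here is a minimal sketch of a self-validating record format. The header layout, the `MAGIC` value, and the function names are assumptions made up for this example; the paper does not specify the format used by Google's record library.

```python
import struct
import zlib

# Hypothetical record layout (not from the paper):
# 4-byte magic, 8-byte record id, 4-byte payload length, 4-byte CRC32 of payload.
MAGIC = b"\xfa\xce\xb0\x0c"
HEADER = struct.Struct(">4sQII")


def encode_record(record_id: int, payload: bytes) -> bytes:
    """Frame a payload so a reader can later validate and deduplicate it."""
    return HEADER.pack(MAGIC, record_id, len(payload), zlib.crc32(payload)) + payload


def scan_records(chunk: bytes):
    """Yield (record_id, payload) for each valid record, skipping padding and duplicates."""
    seen = set()
    pos = 0
    while pos + HEADER.size <= len(chunk):
        magic, record_id, length, crc = HEADER.unpack_from(chunk, pos)
        start = pos + HEADER.size
        end = start + length
        valid = (
            magic == MAGIC
            and end <= len(chunk)
            and zlib.crc32(chunk[start:end]) == crc
        )
        if valid:
            if record_id not in seen:  # a retried append shows up as a repeated id
                seen.add(record_id)
                yield record_id, chunk[start:end]
            pos = end
        else:
            pos += 1  # padding or garbage: resynchronize one byte at a time
```

For example, a chunk that holds record 1, some zero padding, a retried copy of record 1, and then record 2 scans to just records 1 and 2. Because the reader walks the whole chunk looking for valid headers, it never needs to know the offsets at which the appends landed.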
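If atomic record append writes data at an unpredictable offset in the file, how do clients find the data?
Record append is intended for applications that read the entire file. Such applications scan the file looking for valid records (see the previous question), so they do not need to know the record locations in advance.
What is a checksum?
A short value computed from a block of data and stored with it, so that a reader can recompute it and detect corruption. It is not covered further here.
What are the reference counts mentioned in the GFS paper?
They are part of the copy-on-write implementation of snapshots. When GFS creates a snapshot, it does not copy the chunks; instead it increments each chunk's reference counter. If a client then writes to a chunk and the master notices that the reference count is greater than one, the master first makes a copy so that the client can update the copy (rather than the chunk that is part of the snapshot). In other words, the copy is deferred until it is needed.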
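The reference-count logic can be sketched as follows. This is a toy, single-process model under simplifying assumptions: the `Master` class and its method names are invented here, chunk contents live in a dictionary instead of on chunkservers, and leases, replication, and logging are ignored.

```python
class Master:
    """Toy model of snapshot copy-on-write driven by per-chunk reference counts."""

    def __init__(self):
        self.files = {}     # file name -> list of chunk handles
        self.refcount = {}  # chunk handle -> number of files referencing it
        self.data = {}      # chunk handle -> bytes (stands in for chunkserver storage)
        self._next_handle = 0

    def _new_chunk(self, content=b""):
        handle = self._next_handle
        self._next_handle += 1
        self.data[handle] = content
        self.refcount[handle] = 1
        return handle

    def create(self, name, n_chunks):
        self.files[name] = [self._new_chunk() for _ in range(n_chunks)]

    def snapshot(self, src, dst):
        # No chunk data is copied; the snapshot shares every chunk with the source.
        self.files[dst] = list(self.files[src])
        for handle in self.files[dst]:
            self.refcount[handle] += 1

    def write(self, name, index, content):
        handle = self.files[name][index]
        if self.refcount[handle] > 1:
            # The chunk is shared with a snapshot: copy it first, then write the copy.
            self.refcount[handle] -= 1
            handle = self._new_chunk(self.data[handle])
            self.files[name][index] = handle
        self.data[handle] = content
        return handle
```

After `snapshot("f", "snap")`, a write to chunk 0 of `f` gives `f` a fresh chunk handle while `snap` keeps the old one; chunks that are never written stay shared, which is what makes the snapshot cheap.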
If an application uses the standard POSIX file APIs, would it need
to be modified in order to use GFS?
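Yes, it would need to be modified. GFS is mainly intended for newly written applications such as MapReduce.
How does GFS find the nearest replica?
GFS does this based on the IP addresses of the servers that store the available replicas. In 2003, Google must have assigned IP addresses in such a way that if two IP addresses are close to each other in IP address space, the machines are also close to each other in the machine room.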
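One way to picture "close in IP address space" is the number of leading bits two addresses share. The sketch below uses that as the distance measure; this metric and the function names are assumptions for illustration, since the paper only says that distances can be estimated from IP addresses and does not say how.

```python
import ipaddress


def shared_prefix_bits(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common (0 to 32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def closest_replica(client_ip: str, replica_ips: list[str]) -> str:
    """Pick the replica whose address shares the longest prefix with the client's."""
    return max(replica_ips, key=lambda ip: shared_prefix_bits(client_ip, ip))
```

With a client at `10.1.2.7` and replicas at `10.9.0.3`, `10.1.2.40`, and `10.1.7.1`, this picks `10.1.2.40`. The heuristic is only as good as the address assignment: it works if addresses were handed out to mirror the rack and switch layout, as the answer above supposes.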
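Suppose s1 is the primary for a chunk and the network between s1 and the master fails, so the master appoints s2 as the new primary. How is the two-primaries (split-brain) problem avoided?
Through the lease mechanism. The master grants the primary a lease that expires after a fixed time, and it does not grant a lease to a new primary until the old lease has expired. s1 stops acting as primary once its lease runs out, so the two primaries never overlap.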
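The rule can be sketched in a few lines. The 60-second value is the initial lease timeout given in the paper; the `LeaseTable` class, its method names, and the explicit `now` argument are assumptions for this example, and clock skew and lease extensions are ignored.

```python
LEASE_SECONDS = 60  # initial lease timeout from the paper


class LeaseTable:
    """Master-side view: at most one unexpired lease per chunk."""

    def __init__(self):
        self.leases = {}  # chunk handle -> (primary server, expiry time)

    def grant(self, chunk, server, now):
        """Return the new expiry time, or None if another server's lease is still live."""
        holder = self.leases.get(chunk)
        if holder is not None and holder[0] != server and holder[1] > now:
            return None  # must wait for the old lease to expire first
        expiry = now + LEASE_SECONDS
        self.leases[chunk] = (server, expiry)
        return expiry


def may_act_as_primary(expiry, now):
    """Chunkserver-side check: only order mutations while the lease is unexpired."""
    return now < expiry
```

If s1 receives a lease at time 0, a request to make s2 primary at time 30 is refused, and one at time 60 succeeds; by then `may_act_as_primary(60, 60)` is already false on s1, even though s1 cannot hear from the master.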
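Isn't a 64 MB chunk size awkwardly large?
The 64 MB chunk size is the unit of bookkeeping in the master and the granularity at which files are sharded over chunkservers. Clients can issue smaller reads and writes; they are not forced to deal in whole 64 MB chunks. The point of using such a large chunk size is to reduce the size of the metadata tables in the master, and to avoid limiting clients that want to do huge transfers to reduce overhead. On the other hand, files smaller than 64 MB do not get much parallelism.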
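A quick back-of-the-envelope calculation shows why chunk size matters for the master's memory. The figure of 64 bytes of metadata per chunk is the upper bound stated in Section 2.6.1 of the paper, not something from this FAQ; the function name and the 1 PiB data size are arbitrary choices for the example, and namespace metadata is left out.

```python
def chunk_metadata_bytes(total_bytes, chunk_bytes, per_chunk_metadata=64):
    """Upper-bound estimate of the master's per-chunk metadata for a given data size."""
    n_chunks = -(-total_bytes // chunk_bytes)  # ceiling division
    return n_chunks * per_chunk_metadata


PIB = 2**50
MIB = 2**20
GIB = 2**30

with_64mb_chunks = chunk_metadata_bytes(PIB, 64 * MIB) // GIB  # 1 GiB
with_1mb_chunks = chunk_metadata_bytes(PIB, 1 * MIB) // GIB    # 64 GiB
```

Under these assumptions, 1 PiB of data needs about 1 GiB of chunk metadata with 64 MB chunks, but about 64 GiB with 1 MB chunks, which would not fit comfortably in the RAM of a single master.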
Does Google still use GFS?
Rumor has it that GFS has been replaced by something called
Colossus, with the same overall goals, but improvements in master
performance and fault-tolerance.
How acceptable is it that GFS trades correctness for performance
and simplicity?
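This is a recurring theme in distributed systems. Strong consistency usually requires protocols that are complex and involve a lot of chit-chat between machines (as we will see in the next few lectures). By exploiting the ways in which specific classes of applications can tolerate relaxed consistency, one can design systems that have good performance and sufficient consistency.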
What if the master fails?
There are replica masters with a full copy of the master state; the
paper’s design requires human intervention to switch to one of the
replicas after a master failure (Section 5.1.3). We will see later how
to build replicated services with automatic cut-over to a backup,
using Raft.
Did having a single master turn out to be a good idea?
It simplified the initial deployment, but did not work out well in the long run.
This article (https://queue.acm.org/detail.cfm?id=1594206)
says that as the years went by and GFS use grew, a few things went
wrong. The number of files grew enough that it wasn’t reasonable to
store all files’ metadata in the RAM of a single master. The number of
clients grew enough that a single master didn’t have enough CPU power
to serve them. The fact that switching from a failed master to one of
its backups required human intervention made recovery slow. Apparently
Google’s replacement for GFS, Colossus, splits the master over
multiple servers, and has more automated master failure recovery.
