Cluster computing needs a file system that differs from the one used on a conventional single machine, so a new kind of system called a DFS (distributed file system) was introduced.
However, there are a few things to keep in mind:
1. A DFS only pays off when the amount of data is huge.
2. The data should rarely be updated once it is written.
Otherwise a DFS is not necessary.
*** ***
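A MapReduce job has to be restarted only when the compute node running its master process crashes; a failure on any other node does not require restarting the whole job.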
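
The failure rule can be pictured with a toy simulation. This is only a sketch under simplifying assumptions, not how any real MapReduce implementation is written: the names `run_job`, `MasterFailure`, and `failed_nodes` are hypothetical, a node in `failed_nodes` is assumed to crash the first time it is used, and tasks are handed out round-robin.

```python
from collections import deque


class MasterFailure(Exception):
    """Raised when the node running the master process is lost."""


def run_job(tasks, workers, failed_nodes, master="master"):
    """Toy model: run every task, rescheduling around crashed workers.

    tasks        -- list of task names
    workers      -- list of worker node names
    failed_nodes -- set of node names assumed to crash when first used
    Returns a dict mapping each task to the worker that completed it.
    """
    if master in failed_nodes:
        # The only case that forces the whole job to start over.
        raise MasterFailure("master node lost; restart the whole job")

    alive = list(workers)
    pending = deque(tasks)
    done = {}
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("no workers left for the remaining tasks")
        task = pending.popleft()
        worker = alive[turn % len(alive)]
        if worker in failed_nodes:
            # Worker crashed: drop it and requeue just this task.
            alive.remove(worker)
            pending.append(task)
            continue
        done[task] = worker
        turn += 1
    return done


if __name__ == "__main__":
    result = run_job(["map-0", "map-1", "map-2"], ["w1", "w2", "w3"], {"w2"})
    print(result)
```

In this run the task first given to `w2` is simply handed to another worker and the job still finishes; passing `{"master"}` as `failed_nodes` raises `MasterFailure` instead, which models the one case where everything must be redone.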
*** To be continued ***