My naive mapreduce notes

dlpkh42648

Article tags: big data

Original link: http://www.cnblogs.com/RuiYan/p/5548733.html

Challenges of parallel computing

1. Node failures on huge clusters are frequent (~1000 nodes fail per day), and any data stored only on a failed node is lost.

2. Network bottlenecks: with a network bandwidth of 1 Gbps, moving 10 TB of data takes approximately 1 day.

3. Distributed programming is hard!
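The "approximately 1 day" figure in point 2 can be checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 Gbps = 10^9 bits per second) and a link that runs at its full nominal rate with no protocol overhead; the function name is made up for this note.

```python
def transfer_time_seconds(data_bytes: float, bandwidth_bits_per_sec: float) -> float:
    """Time to push data_bytes through a link, ignoring overhead and contention."""
    return data_bytes * 8 / bandwidth_bits_per_sec


TEN_TB = 10 * 10**12   # assumption: 1 TB = 10^12 bytes
ONE_GBPS = 10**9       # assumption: 1 Gbps = 10^9 bits per second

seconds = transfer_time_seconds(TEN_TB, ONE_GBPS)
hours = seconds / 3600

print(f"{seconds:.0f} s = {hours:.1f} hours")  # 80000 s = 22.2 hours
```

About 22 hours, so "roughly a day" holds even in the best case, and a real shared network would be slower.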

MapReduce addresses all three challenges above:

1. MapReduce addresses the data-loss-on-node-failure problem by storing data redundantly. The same data is stored on multiple nodes, so even if you lose one of those nodes, the data is still available on the others.

2. MapReduce addresses the network bottleneck by moving the computation close to the data, which avoids copying data around the network.

3. MapReduce provides a very simple programming model that hides the complexity of parallel computing.
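To give a feel for how simple the programming model in point 3 is, here is a minimal single-process sketch of the classic word-count example. It is only an illustration, not how a real framework such as Hadoop is implemented: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are invented for this note, and everything runs in one Python process instead of across a cluster. The user writes only the map and reduce functions; the grouping step in the middle stands in for what the framework does.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_fn(document: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def reduce_fn(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce: combine all values that share one key into a single total."""
    return word, sum(counts)


def run_mapreduce(documents: List[str]) -> Dict[str, int]:
    """Toy driver: run map, group values by key, then run reduce per key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for doc in documents:                 # on a cluster, each doc goes to a map task
        for key, value in map_fn(doc):
            groups[key].append(value)     # stands in for the shuffle/group-by-key step
    return dict(reduce_fn(key, values) for key, values in groups.items())


result = run_mapreduce(["the quick fox", "the lazy dog", "the fox"])
print(result)
```

In a real deployment the driver part (splitting input, scheduling tasks near the data, regrouping by key, and rerunning failed tasks) is handled by the framework, which is why the user-facing model stays this small.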

Reposted from: https://www.cnblogs.com/RuiYan/p/5548733.html
