Google’s MapReduce Programming Model-Revisited

 

      Google's MapReduce programming model serves for processing large data sets in a massively parallel manner. We deliver the first rigorous description of the model including its advancement as Google's domain-specific language Sawzall. To this end, we reverse-engineer the seminal papers on MapReduce and Sawzall, and we capture our findings as an executable specification. We also identify and resolve some obscurities in the informal presentation given in the seminal papers. We use typed functional programming (specifically Haskell) as a tool for design recovery and executable specification. Our development comprises three components: (i) the basic program skeleton that underlies MapReduce computations; (ii) the opportunities for parallelism in executing MapReduce computations; (iii) the fundamental characteristics of Sawzall's aggregators as an advancement of the MapReduce approach. Our development does not formalize the more implementational aspects of an actual, distributed execution of MapReduce computations.
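To make component (i) of the abstract concrete, here is a minimal sequential sketch of such a program skeleton in Haskell. It is written in the spirit of the paper and is not a copy of its code: the names `mapReduce`, `mapPerKey`, `groupByKey`, `reducePerKey` and `wordCount` are chosen for illustration, and the sketch ignores parallelism and distribution entirely.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- | Sequential skeleton (illustrative): apply the user-supplied map
-- function to every input pair, group the intermediate values by key,
-- then apply the user-supplied reduce function to each group.
mapReduce :: Ord k2
          => (k1 -> v1 -> [(k2, v2)])   -- map function (assumed shape)
          -> (k2 -> [v2] -> Maybe v3)   -- reduce function (assumed shape)
          -> Map k1 v1                  -- input key/value pairs
          -> Map k2 v3                  -- output key/value pairs
mapReduce mapper reducer input =
    reducePerKey (groupByKey (mapPerKey input))
  where
    -- Produce the flat list of intermediate key/value pairs.
    mapPerKey m = concatMap (uncurry mapper) (Map.toList m)
    -- Collect all values of the same intermediate key, keeping their order.
    groupByKey pairs =
      Map.fromListWith (flip (++)) [ (k, [v]) | (k, v) <- pairs ]
    -- Reduce each group; a Nothing result drops the key from the output.
    reducePerKey groups = Map.mapMaybeWithKey reducer groups

-- | Word-occurrence count as an example instance: keys are document
-- names, values are document contents.
wordCount :: Map String String -> Map String Int
wordCount = mapReduce mapper reducer
  where
    mapper _docName contents = [ (w, 1) | w <- words contents ]
    reducer _word counts     = Just (sum counts)
```

For instance, `wordCount (Map.fromList [("a", "x y x"), ("b", "y z")])` yields a map that assigns 2 to "x", 2 to "y" and 1 to "z". Because the grouping step only needs an ordering on the intermediate keys, the per-key reductions are independent of each other, which is where the opportunities for parallelism named in component (ii) come from.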

 

Keywords: Data processing; Parallel programming; Distributed programming; Software design; Executable specification; Typed functional programming; MapReduce; Sawzall; Map; Reduce; List homomorphism; Haskell

 

http://portal.acm.org/citation.cfm?id=1290812

 

Google’s MapReduce programming model — Revisited

Ralf Lämmel

Data Programmability Team, Microsoft Corp., Redmond, WA, USA

Received 9 February 2006;  
revised 10 July 2007;  
accepted 10 July 2007.  
Available online 18 July 2007. 



Corresponding address: Universität Koblenz-Landau, Institut für Informatik B 128, Universitätsstrasse 1, D-56070 Koblenz, Germany.

 

http://www.sciencedirect.com/science/article/pii/S0167642307001281

 

[PDF] citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.104...

File format: PDF/Adobe Acrobat. Author: R Lämmel. Cited by: 111.

 

Author: Ralf Lämmel, Data Programmability Team, Microsoft Corp., Redmond, WA, USA

Published in: Science of Computer Programming, Volume 68, Issue 3, October 2007

Publisher: Elsevier North-Holland, Inc., Amsterdam, The Netherlands

DOI: 10.1016/j.scico.2007.07.001

 

