Implementing MapReduce in Erlang

Xiao_Qiang_

http://www.cnblogs.com/orez88/articles/1787119.html

Reference: http://weblambdazero.blogspot.com/2008/08/mapreduce-in-erlang.html

The core idea of MapReduce is to distribute a computation over a data set across many separate processes (map) and then collect their results (reduce).
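To make the two phases concrete before any processes are involved, here is a minimal sequential sketch. The module name `mr_seq` and the sum-of-squares task are illustrative assumptions, not part of the original post; only `lists:map/2` and `lists:foldl/3` from the standard library are used.

```erlang
-module(mr_seq).
-export([sum_of_squares/1]).

%% Hypothetical example: everything runs in the calling process.
%% Map phase: apply a function to every element independently.
%% Reduce phase: fold the per-element results into a single value.
sum_of_squares(L) ->
    Mapped = lists:map(fun(X) -> X * X end, L),           % map
    lists:foldl(fun(X, Acc) -> X + Acc end, 0, Mapped).   % reduce
```

Calling `mr_seq:sum_of_squares([1, 2, 3])` returns `14`. Because each element is mapped independently of the others, the map phase is the part that can be handed to separate processes.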

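Implementing MapReduce in Erlang is straightforward. For example, Joe Armstrong, the creator of Erlang, published the following code, a MapReduce-style (parallel) version of the standard lists:map/2 function: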
pmap.erl

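    -module(pmap).
    -export([pmap/2]).

    %% Apply F to every element of L, each in its own process.
    pmap(F, L) ->
        S = self(),
        Pids = lists:map(fun(I) ->
            spawn(fun() -> do_fun(S, F, I) end)
        end, L),
        gather(Pids).

    %% Collect one result per pid, in the order the processes were spawned.
    gather([H|T]) ->
        receive
            {H, Result} -> [Result|gather(T)]
        end;
    gather([]) ->
        [].

    %% Runs in the worker: send F(I), tagged with the worker's pid, to Parent.
    %% catch turns a crash into an {'EXIT', Reason} term, so Parent always gets a reply.
    do_fun(Parent, F, I) ->
        Parent ! {self(), (catch F(I))}.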

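pmap works in a simple way: for each element of the list it spawns a process that applies Fun to that element, then it calls gather to collect the results. Because gather waits for the message from each pid in turn, the results come back in the same order as the input list, regardless of which process finishes first.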
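The collection step can be checked with a small experiment. The sketch below is an illustration, not part of the original post: the module name `pmap_order` and the `demo/0` function are made up, and `pmap/2` and `gather/1` are repeated (in condensed form) only so the module compiles on its own.

```erlang
-module(pmap_order).
-export([demo/0]).

%% Same structure as pmap:pmap/2, repeated so this module is self-contained.
pmap(F, L) ->
    Parent = self(),
    %% Inside the spawned fun, self() is the worker's own pid.
    Pids = [spawn(fun() -> Parent ! {self(), (catch F(I))} end) || I <- L],
    gather(Pids).

%% Selective receive: wait for the reply from Pid specifically, leaving
%% replies from other workers in the mailbox until their turn comes.
gather([Pid | Rest]) ->
    receive
        {Pid, Result} -> [Result | gather(Rest)]
    end;
gather([]) ->
    [].

demo() ->
    %% Earlier elements sleep longer, so the workers finish in reverse order.
    Slow = fun(X) -> timer:sleep((4 - X) * 50), X * 10 end,
    pmap(Slow, [1, 2, 3]).
```

`pmap_order:demo()` returns `[10, 20, 30]` even though the worker for `3` replies first: its message simply waits in the mailbox until gather reaches that pid.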

That such concise code is enough for a basic MapReduce is a real credit to Erlang!

Below is a fib function to use as the test workload:
fib.erl

    -module(fib).
    -export([fib/1]).

    %% Naive doubly recursive Fibonacci: deliberately slow, which makes it a CPU-bound workload.
    fib(0) -> 0;
    fib(1) -> 1;
    fib(N) when N > 1 -> fib(N-1) + fib(N-2).

After compiling both modules, compare the running time of lists:map/2 and pmap:pmap/2:

    1> L = lists:seq(0,35).
    2> lists:map(fun(X) -> fib:fib(X) end, L).
    3> pmap:pmap(fun(X) -> fib:fib(X) end, L).

In my test, lists:map took about 4 s and pmap:pmap about 2 s, cutting the running time in half.
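The post does not say how the times were taken. One way to measure them is `timer:tc/1` from the standard library, which reports elapsed real time in microseconds. The helper below is a hedged sketch: the module name `map_bench` and the function `time_ms/3` are made up for illustration.

```erlang
-module(map_bench).
-export([time_ms/3]).

%% Hypothetical helper: times MapFun(F, L) with timer:tc/1 and returns
%% {ElapsedMilliseconds, Result}. MapFun is any map-like function of
%% arity 2, e.g. fun lists:map/2 or fun pmap:pmap/2.
time_ms(MapFun, F, L) ->
    {Micros, Result} = timer:tc(fun() -> MapFun(F, L) end),
    {Micros div 1000, Result}.
```

With the pmap and fib modules compiled, `map_bench:time_ms(fun lists:map/2, fun fib:fib/1, L)` and `map_bench:time_ms(fun pmap:pmap/2, fun fib:fib/1, L)` give the two timings along with the result lists, which should be identical. The numbers will vary with the machine, in particular with how many cores the Erlang VM can use.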
