php 使用hadoop,php能用hadoop吗

最新推荐文章于 2022-04-07 10:54:42 发布

旺家威

最新推荐文章于 2022-04-07 10:54:42 发布

阅读量247

点赞数

文章标签： php 使用hadoop

Hadoop是一个由Apache基金会所开发的分布式系统基础架构，用户可以在不了解分布式底层细节的情况下，开发分布式程序。充分利用集群的威力进行高速运算和存储。

虽然Hadoop是用java写的，但是Hadoop提供了Hadoop流，Hadoop流提供一个API,，允许用户使用任何语言编写map函数和reduce函数。 (推荐学习：PHP视频教程)

Hadoop流动关键是，它使用UNIX标准流作为程序与Hadoop之间的接口。因此，任何程序只要可以从标准输入流中读取数据，并且可以把数据写入标准输出流中，那么就可以通过Hadoop流使用任何语言编写MapReduce程序的map函数和reduce函数。

例如：bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar -mapper /usr/local/hadoop/mapper.php -reducer /usr/local/hadoop/reducer.php -input test/* -output out4

Hadoop流引入的包：hadoop-streaming-0.20.203.0.jar,Hadoop根目录下是没有hadoop-streaming.jar的，因为streaming是一个contrib，所以要去contrib下面找，以hadoop-0.20.2为例，它在这里：

-input：指明输入hdfs文件的路径

-output：指明输出hdfs文件的路径

-mapper：指明map函数

-reducer：指明reduce函数

mapper函数

mapper.php文件，写入如下代码：#!/usr/local/php/bin/php

$word2count = array();

// input comes from STDIN (standard input)

// You can this code :$stdin = fopen(“php://stdin”, “r”);

while (($line = fgets(STDIN)) !== false) {

// remove leading and trailing whitespace and lowercase

$line = strtolower(trim($line));

// split the line into words while removing any empty string

$words = preg_split('/\W/', $line, 0, PREG_SPLIT_NO_EMPTY);

// increase counters

foreach ($words as $word) {

$word2count[$word] += 1;

}

}

// write the results to STDOUT (standard output)

// what we output here will be the input for the

// Reduce step, i.e. the input for reducer.py

foreach ($word2count as $word => $count) {

// tab-delimited

echo $word, chr(9), $count, PHP_EOL;

}

?>

评论

被折叠的条评论为什么被折叠?

到【灌水乐园】发言

查看更多评论

添加红包

成就一亿技术人!

hope_wisdom

发出的红包

实付元

使用余额支付

点击重新获取

扫码支付

钱包余额 0

抵扣说明：

1.余额是钱包充值的虚拟货币，按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载，可以购买VIP、付费专栏及课程。