This uses the same Hadoop installation as last time.
cd /home/hadoop
mkdir test
cd test
vi user.txt:
1,张三,23,beijing,10086,
2,李四,34,shanghai,10000,
3,王五,20,beijing,10010,
vi mapper.php:
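#!/usr/bin/php
<?php
// Mapper: read CSV records from STDIN and emit "<city> 1" for each record.
while (($line = fgets(STDIN)) !== false) {
    $user = explode(',', trim($line));
    // Skip blank or malformed lines that lack the city field (4th column).
    if (count($user) < 4 || $user[3] === '') {
        continue;
    }
    echo $user[3] . " 1\n";
}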
vi reducer.php:
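#!/usr/bin/php
<?php
// Reducer: sum the counts per city and print "<city> <total>".
$result = array();
while (($line = fgets(STDIN)) !== false) {
    // Split on any whitespace so both "city 1" and "city<TAB>1" are accepted.
    $parts = preg_split('/\s+/', trim($line));
    if (count($parts) < 2) {
        continue;
    }
    list($city, $count) = $parts;
    if (!isset($result[$city])) $result[$city] = 0;
    $result[$city] += (int)$count;
}
foreach ($result as $key => $value) {
    echo "$key $value\n";
}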
chmod +x mapper.php reducer.php
Sync this test directory to the same location on every node.
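One way to do the sync is a small loop over the worker hosts. The sketch below is only an illustration: the host names node1 and node2 are hypothetical placeholders (the nodes are not listed in these notes), and it assumes passwordless SSH for the hadoop user and rsync installed on every node. By default it runs in dry-run mode and only prints the commands.

```shell
#!/bin/sh
# Hypothetical worker host names; replace with the nodes of your cluster.
NODES="${NODES:-node1 node2}"
SRC_DIR="${SRC_DIR:-/home/hadoop/test}"
# DRY_RUN=1 (default in this sketch) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

sync_test_dir() {
    for node in $NODES; do
        # Trailing slashes copy the directory contents to the same path remotely.
        cmd="rsync -a $SRC_DIR/ hadoop@$node:$SRC_DIR/"
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd
        fi
    done
}

sync_test_dir
```

Set DRY_RUN=0 once the printed commands look right for your cluster.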
Debug locally:
cat user.txt | ./mapper.php
cat user.txt | ./mapper.php | ./reducer.php
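To know what the local pipeline should print, the same count can be cross-checked with standard Unix tools. This is a sketch independent of the PHP scripts; the function name count_by_city is made up for illustration, and it assumes the city is always the 4th comma-separated field. The sort step plays the role of Hadoop's shuffle, which groups identical keys before the reducer sees them.

```shell
#!/bin/sh
# count_by_city (hypothetical helper): read CSV records on stdin and
# print "<city> <count>" for each distinct value of the 4th field.
count_by_city() {
    cut -d, -f4 | sort | uniq -c | awk '{print $2 " " $1}'
}

# Sample records in the same layout as user.txt (names shortened to ASCII).
printf '%s\n' \
    '1,a,23,beijing,10086,' \
    '2,b,34,shanghai,10000,' \
    '3,c,20,beijing,10010,' | count_by_city
# prints:
# beijing 2
# shanghai 1
```

Running count_by_city < user.txt should give the same counts as the mapper and reducer pipeline, possibly in a different line order.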
Prepare the input on HDFS:
/usr/local/hadoop/bin/hdfs dfs -mkdir /user
/usr/local/hadoop/bin/hdfs dfs -mkdir /user/hadoop
/usr/local/hadoop/bin/hdfs dfs -mkdir /user/hadoop/input
/usr/local/hadoop/bin/hdfs dfs -put /home/hadoop/test/user.txt /user/hadoop/input
Run the job:
/usr/local/hadoop/bin/hadoop jar /usr/local/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.6.2.jar -input input/user.txt -output output2 -mapper /home/hadoop/test/mapper.php -reducer /home/hadoop/test/reducer.php
View the result:
/usr/local/hadoop/bin/hdfs dfs -cat output2/*
If the output directory already exists, remove it first:
/usr/local/hadoop/bin/hdfs dfs -rm -r -f output2
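Because a MapReduce job will not start when its output directory already exists, the cleanup, the run and the result check are often combined in one script. The following is a minimal sketch that uses only the commands shown above; the variable names, the run helper and the DRY_RUN switch are introduced here for illustration, and the paths assume the same installation under /usr/local/hadoop.

```shell
#!/bin/sh
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
STREAMING_JAR="$HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.6.2.jar"
SCRIPT_DIR="/home/hadoop/test"
OUTPUT="${OUTPUT:-output2}"
# DRY_RUN=1 (default in this sketch) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rerun_job() {
    # Remove any previous output, run the streaming job, then show the result.
    # Each step runs only if the previous one succeeded.
    run "$HADOOP_HOME/bin/hdfs" dfs -rm -r -f "$OUTPUT" &&
    run "$HADOOP_HOME/bin/hadoop" jar "$STREAMING_JAR" \
        -input input/user.txt -output "$OUTPUT" \
        -mapper "$SCRIPT_DIR/mapper.php" -reducer "$SCRIPT_DIR/reducer.php" &&
    run "$HADOOP_HOME/bin/hdfs" dfs -cat "$OUTPUT/*"
}

rerun_job
```

Set DRY_RUN=0 to execute the three commands for real.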