Common member functions of context in Hadoop C++ Pipes

getJobConf

Get the JobConf for the current task

getInputKey

Get the current key

getInputValue

Get the current value

In the reducer, context.getInputValue() is not available until context.nextValue() has been called!

progress

This method simply phones home to the framework (the TaskTracker), letting it know that the mapper or reducer is still working and has not died or hung. (Note: progress reports go to the MapReduce framework, not the NameNode, which only manages HDFS metadata.)

setStatus

The status message can be found in the hadoop*tasktracker*.log and in the web interface as "Status".

context.setStatus("Teke-lili");

getCounter

The counter will be displayed in the web interface. You should obtain it once when the class is initialized (typically in the constructor).


nextValue

Iterates over the values for the current key. Important: the key stays the same throughout the iteration!

context.getInputValue() is not available until context.nextValue() has been called.




Example:


Suppose the input file is hello.txt

with the contents:

hello world 

hello bupt

The program is:

#include "hadoop/Pipes.hh" 
#include "hadoop/TemplateFactory.hh" 
#include "hadoop/StringUtils.hh" 

const std::string WORDCOUNT = "WORDCOUNT"; 
const std::string INPUT_WORDS = "INPUT_WORDS"; 
const std::string OUTPUT_WORDS = "OUTPUT_WORDS"; 

class WordCountMap: public HadoopPipes::Mapper { 
// Mapper class 
public: 
	HadoopPipes::TaskContext::Counter* inputWords; 
	WordCountMap(HadoopPipes::TaskContext& context) 
	{ 
		inputWords = context.getCounter(WORDCOUNT, INPUT_WORDS); 
	} 
	void map(HadoopPipes::MapContext& context) 
	{ 
		std::vector<std::string> words = HadoopUtils::splitString(context.getInputValue(), " "); // split the line into words on spaces 
		for(unsigned int i=0; i < words.size(); ++i) 
		{ 
			context.emit(words[i], "1"); // emit (word, "1") 
		} 
		context.incrementCounter(inputWords, words.size()); // count the input words 
	} 
}; 

class WordCountReduce: public HadoopPipes::Reducer 
{ // Reducer class 
public: 
	HadoopPipes::TaskContext::Counter* outputWords; 
	WordCountReduce(HadoopPipes::TaskContext& context) 
	{ 
		outputWords = context.getCounter(WORDCOUNT, OUTPUT_WORDS); 
	} 
	void reduce(HadoopPipes::ReduceContext& context) 
	{ 
		int sum = 0; 
		while (context.nextValue()) 
		{ 
			sum += HadoopUtils::toInt(context.getInputValue()); // sum the occurrences of the word 
		} 
		context.emit(context.getInputKey(), HadoopUtils::toString(sum)); // emit (word, count) 
		context.incrementCounter(outputWords, 1); 
	} 
}; 

int main(int argc, char *argv[]) 
{ 
	return HadoopPipes::runTask(HadoopPipes::TemplateFactory<WordCountMap, WordCountReduce>()); // run the task 
}
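To build and submit this program, the classic Pipes workflow looks roughly like the sketch below. All paths here are assumptions ($HADOOP_HOME, the Linux-amd64-64 platform directory, the HDFS locations); adjust them to your installation:

```shell
# Compile against the Pipes C++ libraries (include/lib paths are assumptions).
g++ wordcount.cpp -o wordcount \
    -I"$HADOOP_HOME/c++/Linux-amd64-64/include" \
    -L"$HADOOP_HOME/c++/Linux-amd64-64/lib" \
    -lhadooppipes -lhadooputils -lpthread

# Ship the binary and the input to HDFS, then submit the Pipes job.
hadoop fs -put wordcount bin/wordcount
hadoop fs -put hello.txt input/hello.txt
hadoop pipes \
    -D hadoop.pipes.java.recordreader=true \
    -D hadoop.pipes.java.recordwriter=true \
    -input input -output output \
    -program bin/wordcount
```

The two -D options tell the framework to use the built-in Java record reader and writer, so the C++ program only handles map and reduce logic.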



1. MapContext:

The input is:

key->value

(1,hello world)  Note: the key 1 here is the line's byte offset; the actual value is not necessarily this.

(2,hello bupt)

getInputValue() returns the value of one line; for example, the first call returns: hello world

emit()

writes the following pairs to the intermediate output:

(hello,1)

(world,1)

(hello,1)

(bupt,1)


2. ReduceContext:

Taking the previous step's output as input, after the MapReduce framework's shuffle and sort the reducer receives:

(hello,[1,1])  Note: values with the same key have already been grouped together.

(world,[1])

(bupt,[1])


context.nextValue() advances to the next value for the current key.