Reading the source: how a reduce task copies map output


weixin_33775572


Original link: http://blog.51cto.com/xigan/1184001


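Copying the output of a map task is implemented by the method long copyOutput(MapOutputLocation loc) in the ReduceTask class. It proceeds in the following steps:

1. Check whether the output has already been copied or the map attempt is obsolete. If so, return CopyResult.OBSOLETE (-2), meaning the data to copy is out of date.

    // check if we still need to copy the output from this location
    if (copiedMapOutputs.contains(loc.getTaskId()) ||
        obsoleteMapIds.contains(loc.getTaskAttemptId())) {
      return CopyResult.OBSOLETE;
    }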

2. Build the path and file name of the map output, and the path of the local temporary file that will hold the remote data

    // Map output file name: output/map_<taskId>.out
    Path filename =
      new Path(String.format(
        MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING,
        TaskTracker.OUTPUT, loc.getTaskId().getId()));

    // Copy the map output to a temp file whose name is unique to this attempt
    // Name of the local temporary file that receives the copy
    Path tmpMapOutput = new Path(filename+"-"+id);
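The naming scheme in step 2 can be tried out without Hadoop. The sketch below is only an illustration: the format string "%s/map_%d.out" and the directory name "output" are assumed values for MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING and TaskTracker.OUTPUT (inferred from the comment in the excerpt, not read from the Hadoop source), and the class MapOutputNames is hypothetical.

```java
class MapOutputNames {
    // Assumed value of MapOutputFile.REDUCE_INPUT_FILE_FORMAT_STRING (not verified here).
    static final String REDUCE_INPUT_FILE_FORMAT = "%s/map_%d.out";
    // Assumed value of TaskTracker.OUTPUT (not verified here).
    static final String OUTPUT_DIR = "output";

    // Final name of the copied map output for a given map task number.
    static String finalName(int mapTaskId) {
        return String.format(REDUCE_INPUT_FILE_FORMAT, OUTPUT_DIR, mapTaskId);
    }

    // Temporary name: the final name plus "-" and an id suffix, as in the excerpt.
    static String tempName(int mapTaskId, int id) {
        return finalName(mapTaskId) + "-" + id;
    }

    public static void main(String[] args) {
        System.out.println(finalName(7));   // output/map_7.out
        System.out.println(tempName(7, 0)); // output/map_7.out-0
    }
}
```

Writing to a suffixed temporary name first means a partially copied file never appears under the final name.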

3. Copy the data

This step is mainly carried out by getMapOutput(), which is described in detail further below.

    // Copy the map output
    MapOutput mapOutput = getMapOutput(loc, tmpMapOutput,
                                       reduceId.getTaskID().getId());

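4. Inside a block synchronized on the ReduceTask instance, perform the following sub-steps

    synchronized (ReduceTask.this) { /* sub-steps 1) to 4) */ }

1) Check once more whether this map output has already been copied; if it has, discard the data just fetched

    if (copiedMapOutputs.contains(loc.getTaskId())) {
      mapOutput.discard();
      return CopyResult.OBSOLETE;
    }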

2) Check whether the original map output is empty (0 bytes). If it is, discard the copied output, still record the map task as copied, and return

    // Special case: discard empty map-outputs
    if (bytes == 0) {
      try {
        mapOutput.discard();
      } catch (IOException ioe) {
        LOG.info("Couldn't discard output of " + loc.getTaskId());
      }

      // Note that we successfully copied the map-output
      noteCopiedMapOutput(loc.getTaskId());

      return bytes;
    }

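3) Handle the copied data according to where it is stored: in memory or in a local file

a. If the data was copied into memory, add the in-memory map output handle to the collection

    // Process map-output
    if (mapOutput.inMemory) {
      // Save it in the synchronized list of map-outputs
      mapOutputsFilesInMemory.add(mapOutput);
    }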

b. If the data was stored in a local file, rename the temporary file to the final file

    // Rename the temporary file to the final file;
    // ensure it is on the same partition
    // (the temporary file produced by the copy gets its final name)
    tmpMapOutput = mapOutput.file;
    // e.g. a temporary file such as output/output/map_<taskId>.out-0
    // is renamed to output/output/map_<taskId>.out
    filename = new Path(tmpMapOutput.getParent(), filename.getName());
    if (!localFileSys.rename(tmpMapOutput, filename)) {
      localFileSys.delete(tmpMapOutput, true);
      bytes = -1;
      throw new IOException("Failed to rename map output " +
                            tmpMapOutput + " to " + filename);
    }

4) Add this map task to the set of copied tasks and update the number of map outputs still to be copied

    // Note that we successfully copied the map-output
    // Adds this task id to copiedMapOutputs and sets the number of map
    // outputs still to be copied to (total - number already copied)
    noteCopiedMapOutput(loc.getTaskId());

The body of this method is:

    /**
     * Save the map taskid whose output we just copied.
     * This function assumes that it has been synchronized on ReduceTask.this.
     * @param taskId map taskid
     */
    private void noteCopiedMapOutput(TaskID taskId) {
      copiedMapOutputs.add(taskId);
      ramManager.setNumCopiedMapOutputs(numMaps - copiedMapOutputs.size());
    }
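The pattern in step 4 (re-check under the lock, then record the copy) can be sketched in isolation. This is a minimal illustration rather than the Hadoop implementation: the class CopyBookkeeping and its methods are hypothetical, map tasks are identified by plain integers, and -2 stands in for CopyResult.OBSOLETE as stated in step 1.

```java
import java.util.HashSet;
import java.util.Set;

class CopyBookkeeping {
    static final long OBSOLETE = -2; // stands in for CopyResult.OBSOLETE

    private final Set<Integer> copiedMapOutputs = new HashSet<>();
    private final int numMaps;

    CopyBookkeeping(int numMaps) {
        this.numMaps = numMaps;
    }

    // Called after the data has been fetched. The check is repeated under the
    // lock because another copier may have finished the same map task meanwhile.
    synchronized long commit(int mapTaskId, long bytes) {
        if (copiedMapOutputs.contains(mapTaskId)) {
            return OBSOLETE; // the caller would discard the fetched data here
        }
        copiedMapOutputs.add(mapTaskId);
        return bytes;
    }

    // Number of map outputs that still have to be copied.
    synchronized int remaining() {
        return numMaps - copiedMapOutputs.size();
    }
}
```

The first check in step 1 avoids a pointless network transfer; the second check under the lock is what actually guarantees that each map output is counted once.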

getMapOutput is the main method that implements the data copy. The rest of this article walks through its source. Its signature is

    private MapOutput getMapOutput(MapOutputLocation mapOutputLoc,
                                   Path filename, int reduce)
    throws IOException, InterruptedException

Implementation steps:

1. Open a connection to the location of the map output and obtain its input stream

    // Connect
    URL url = mapOutputLoc.getOutputLocation();
    URLConnection connection = url.openConnection();

    InputStream input = setupSecureConnection(mapOutputLoc, connection);

2. Check that the map output served at this location is the one we asked for

    // Validate header from map output
    TaskAttemptID mapId = null;
    try {
      mapId =
        TaskAttemptID.forName(connection.getHeaderField(FROM_MAP_TASK));
    } catch (IllegalArgumentException ia) {
      LOG.warn("Invalid map id ", ia);
      return null;
    }
    TaskAttemptID expectedMapId = mapOutputLoc.getTaskAttemptId();
    if (!mapId.equals(expectedMapId)) {
      LOG.warn("data from wrong map:" + mapId +
               " arrived to reduce task " + reduce +
               ", where as expected map output should be from " + expectedMapId);
      return null;
    }

If the IDs match, execution continues. If they do not, the location returned the wrong data and the method returns null.

3. Check that the lengths reported for the map output are valid, that is, not negative, for both the compressed and the decompressed size

    // Decompressed (raw) data length
    long decompressedLength =
      Long.parseLong(connection.getHeaderField(RAW_MAP_OUTPUT_LENGTH));
    // Compressed data length
    long compressedLength =
      Long.parseLong(connection.getHeaderField(MAP_OUTPUT_LENGTH));

    if (compressedLength < 0 || decompressedLength < 0) {
      LOG.warn(getName() + " invalid lengths in map output header: id: " +
               mapId + " compressed len: " + compressedLength +
               ", decompressed len: " + decompressedLength);
      return null;
    }

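4. Check that the map output partition belongs to this reduce task

    // Check that this output is meant for this reduce task. My understanding is
    // that each partition of the map-side output records the reduce task id;
    // this still needs to be confirmed against the map-side output code.
    // A guess: when the job initializes its tasks, all map task IDs and
    // reduce task IDs have already been created.
    int forReduce =
      (int)Integer.parseInt(connection.getHeaderField(FOR_REDUCE_TASK));

    // reduce holds the id of the current reduce task
    if (forReduce != reduce) {
      LOG.warn("data for the wrong reduce: " + forReduce +
               " with compressed len: " + compressedLength +
               ", decompressed len: " + decompressedLength +
               " arrived to reduce task " + reduce);
      return null;
    }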
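Steps 2 to 4 all validate response headers before any payload is read. The sketch below condenses them into one predicate over a plain map of header values. It is only an illustration: the class ShuffleHeaderCheck is hypothetical, and the header name strings are assumed values for the constants FROM_MAP_TASK, RAW_MAP_OUTPUT_LENGTH, MAP_OUTPUT_LENGTH and FOR_REDUCE_TASK, which the excerpt does not show.

```java
import java.util.HashMap;
import java.util.Map;

class ShuffleHeaderCheck {
    // Assumed header names; the excerpt only shows the constant identifiers.
    static final String FROM_MAP_TASK = "from-map-task";
    static final String RAW_MAP_OUTPUT_LENGTH = "Raw-Map-Output-Length";
    static final String MAP_OUTPUT_LENGTH = "Map-Output-Length";
    static final String FOR_REDUCE_TASK = "for-reduce-task";

    // Returns true only if the response is from the expected map attempt,
    // reports non-negative lengths, and is addressed to this reduce task.
    static boolean isAcceptable(Map<String, String> headers,
                                String expectedMapId, int reduce) {
        if (!expectedMapId.equals(headers.get(FROM_MAP_TASK))) {
            return false; // data from the wrong map (or header missing)
        }
        try {
            long raw = Long.parseLong(headers.get(RAW_MAP_OUTPUT_LENGTH));
            long compressed = Long.parseLong(headers.get(MAP_OUTPUT_LENGTH));
            if (raw < 0 || compressed < 0) {
                return false; // invalid lengths
            }
            return Integer.parseInt(headers.get(FOR_REDUCE_TASK)) == reduce;
        } catch (NumberFormatException e) {
            return false; // missing or malformed numeric header
        }
    }

    // Convenience builder for trying the check out.
    static Map<String, String> headers(String mapId, long raw,
                                       long compressed, int forReduce) {
        Map<String, String> h = new HashMap<>();
        h.put(FROM_MAP_TASK, mapId);
        h.put(RAW_MAP_OUTPUT_LENGTH, Long.toString(raw));
        h.put(MAP_OUTPUT_LENGTH, Long.toString(compressed));
        h.put(FOR_REDUCE_TASK, Integer.toString(forReduce));
        return h;
    }
}
```

Unlike the excerpt, this sketch also treats a missing or non-numeric header as a rejection instead of letting the parse exception propagate; that is a choice made for the example, not a claim about Hadoop's behavior.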

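5. Copy the data

This step breaks down into the following sub-steps:

1) Check whether the map output qualifies to be held in memory

    // We will put a file in memory if it meets certain criteria:
    // 1. The size of the (decompressed) file should be less than 25% of
    //    the total inmem fs
    // 2. There is space available in the inmem fs

    // Check if this map-output can be saved in-memory
    // The decompressed size of the output is compared with the largest size
    // allowed in memory: if it is smaller, the output may go to memory;
    // otherwise it may not.
    // The limit is 25% of the in-memory shuffle buffer, whose size is
    // derived from mapred.job.reduce.total.mem.bytes
    boolean shuffleInMemory = ramManager.canFitInMemory(decompressedLength);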
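The size test described in the comments above can be sketched as follows. This is a simplified model under stated assumptions, not the code of the Hadoop RAM manager: the class RamBudget is hypothetical, maxSize represents the total in-memory shuffle buffer, and the 0.25 fraction is taken from the "less than 25%" rule quoted in the excerpt.

```java
class RamBudget {
    // Fraction of the buffer a single map output may occupy (from the 25% rule).
    static final float MAX_SINGLE_SEGMENT_FRACTION = 0.25f;

    private final long maxSingleShuffleLimit;

    // maxSize: total size in bytes of the in-memory shuffle buffer (assumed input).
    RamBudget(long maxSize) {
        this.maxSingleShuffleLimit = (long) (maxSize * MAX_SINGLE_SEGMENT_FRACTION);
    }

    // True if a map output of the given decompressed size may be kept in memory.
    // The size must also fit in an int because it is later used as an array length.
    boolean canFitInMemory(long requestedSize) {
        return requestedSize < Integer.MAX_VALUE
            && requestedSize < maxSingleShuffleLimit;
    }
}
```

With a 1000-byte buffer the limit is 250 bytes, so an output of 249 bytes qualifies and one of 250 bytes goes to disk. Note that this test only looks at the size of a single output; whether memory is free right now is handled separately when the space is reserved.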

2) Copy the data into memory

    if (shuffleInMemory) {
      if (LOG.isDebugEnabled()) {
        LOG.debug("Shuffling " + decompressedLength + " bytes (" +
                  compressedLength + " raw bytes) " +
                  "into RAM from " + mapOutputLoc.getTaskAttemptId());
      }

      mapOutput = shuffleInMemory(mapOutputLoc, connection, input,
                                  (int)decompressedLength,
                                  (int)compressedLength);
    }

A detailed walk-through of the shuffleInMemory function follows:

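a) Check whether there is enough memory to hold the data. If there is not, the thread waits until enough memory is available, is then notified, and continues

    /**
     * If there is not enough memory, reserve() calls wait(). Once space has
     * been released and the thread is woken up, the method returns.
     * It returns true if no wait was needed, and false if the thread had to
     * wait and was woken up before returning.
     */
    // Reserve ram for the map-output
    boolean createdNow = ramManager.reserve(mapOutputLength, input);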
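The wait-until-space-is-free behavior of reserve() is a standard monitor pattern. The sketch below shows that pattern only; it is not the Hadoop RAM manager. The class RamReservation, its fields, and the demo() helper are hypothetical, and unlike the real method it does not take or close an input stream.

```java
class RamReservation {
    private final long maxSize;
    private long used = 0;
    private int waiters = 0;

    RamReservation(long maxSize) {
        this.maxSize = maxSize;
    }

    // Returns true if the space was available immediately,
    // false if the caller had to wait for another thread to release space.
    synchronized boolean reserve(long requested) throws InterruptedException {
        boolean createdNow = true;
        while (used + requested > maxSize) {
            createdNow = false;
            waiters++;
            try {
                wait(); // releases the monitor until notifyAll() is called
            } finally {
                waiters--;
            }
        }
        used += requested;
        return createdNow;
    }

    // Gives space back and wakes up every waiting reservation.
    synchronized void unreserve(long released) {
        used -= released;
        notifyAll();
    }

    synchronized int waiters() {
        return waiters;
    }

    // Small demonstration: the second reservation cannot fit until the
    // helper thread releases the first one.
    static boolean demo() {
        RamReservation ram = new RamReservation(100);
        try {
            boolean first = ram.reserve(80); // fits immediately
            Thread releaser = new Thread(() -> {
                while (ram.waiters() == 0) {
                    Thread.yield(); // wait until the main thread is blocked
                }
                ram.unreserve(80);
            });
            releaser.start();
            boolean second = ram.reserve(50); // blocks until space is released
            releaser.join();
            return first && !second;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The loop around wait() matters: a woken thread re-checks the condition, because another thread may have taken the freed space first.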

b) Reconnect

If createdNow is true, there was enough memory and the thread never had to wait, so no reconnect is needed. If it is false, the thread waited and was woken up again, and the original connection has been closed in the meantime.

    // Reconnect if we need to
    // Because memory was short, the thread waited and the connection to the
    // node serving the map output was closed, so it has to be reopened
    if (!createdNow) {
      // Reconnect
      try {
        connection = mapOutputLoc.getOutputLocation().openConnection();
        input = setupSecureConnection(mapOutputLoc, connection);
      } catch (IOException ioe) {
        LOG.info("Failed reopen connection to fetch map-output from " +
                 mapOutputLoc.getHost());

        // Inform the ram-manager
        ramManager.closeInMemoryFile(mapOutputLength);
        ramManager.unreserve(mapOutputLength);

        throw ioe;
      }
    }

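c) Set up the data length. The stream carries checksum information in addition to the actual data, so the checksum has to be excluded

    // Extract the actual data: the input stream contains both the checksum
    // information and the real data
    IFileInputStream checksumIn =
      new IFileInputStream(input,compressedLength);

    input = checksumIn;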

d) If the data is compressed, wrap the input stream in a decompressing stream

    // Are map-outputs compressed?
    if (codec != null) {
      decompressor.reset();
      input = codec.createInputStream(input, decompressor);
    }
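Step d) is an instance of stream decoration: the reading code stays the same, and only the stream it reads from changes. The sketch below illustrates the idea with the JDK's gzip classes standing in for a Hadoop compression codec. The class StreamWrapping and its methods are hypothetical, and gzip is an assumption made for the example, not the codec Hadoop necessarily uses.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class StreamWrapping {
    // Wrap the raw stream in a decompressing stream only when compression is on.
    static InputStream open(InputStream raw, boolean compressed) throws IOException {
        return compressed ? new GZIPInputStream(raw) : raw;
    }

    // Helper that produces compressed bytes to read back.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(plain);
        }
        return bytes.toByteArray();
    }

    // Compresses the text, then reads it back through the wrapped stream.
    static String roundTrip(String text) {
        try {
            byte[] wire = gzip(text.getBytes(StandardCharsets.UTF_8));
            ByteArrayOutputStream result = new ByteArrayOutputStream();
            try (InputStream in = open(new ByteArrayInputStream(wire), true)) {
                byte[] buf = new byte[256];
                int n;
                while ((n = in.read(buf, 0, buf.length)) > 0) {
                    result.write(buf, 0, n);
                }
            }
            return new String(result.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the wrapped stream yields decompressed bytes, the number of bytes read afterwards corresponds to the decompressed length, which is why the in-memory buffer is sized with that value.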

e) Copy the data

    // Copy map-output into an in-memory buffer
    byte[] shuffleData = new byte[mapOutputLength];
    MapOutput mapOutput =
      new MapOutput(mapOutputLoc.getTaskId(),
                    mapOutputLoc.getTaskAttemptId(), shuffleData, compressedLength);

    int bytesRead = 0;
    try {
      // n is the number of bytes actually read. A single read may return
      // fewer bytes than requested, so the loop below keeps reading. The
      // destination array keeps its full initial length throughout; only
      // the offset and the remaining length change.
      int n = input.read(shuffleData, 0, shuffleData.length);
      while (n > 0) {
        bytesRead += n;
        shuffleClientMetrics.inputBytes(n);

        // indicate we're making progress
        reporter.progress();
        n = input.read(shuffleData, bytesRead,
                       (shuffleData.length-bytesRead));
      }

      if (LOG.isDebugEnabled()) {
        LOG.debug("Read " + bytesRead + " bytes from map-output for " +
                  mapOutputLoc.getTaskAttemptId());
      }

      input.close();
    } catch (IOException ioe) {
      LOG.info("Failed to shuffle from " + mapOutputLoc.getTaskAttemptId(),
               ioe);

      // Inform the ram-manager
      ramManager.closeInMemoryFile(mapOutputLength);
      ramManager.unreserve(mapOutputLength);

      // Discard the map-output
      try {
        mapOutput.discard();
      } catch (IOException ignored) {
        LOG.info("Failed to discard map-output from " +
                 mapOutputLoc.getTaskAttemptId(), ignored);
      }
      mapOutput = null;

      // Close the streams
      IOUtils.cleanup(LOG, input);

      // Re-throw
      readError = true;
      throw ioe;
    }

    // Close the in-memory file
    ramManager.closeInMemoryFile(mapOutputLength);

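f) Check that the number of bytes copied equals the expected length of the map output. If it does not, discard the copied data

3) Copy the data to disk

This code is comparatively simple and is not discussed in detail. As with the in-memory copy, it has two steps:

First, copy the data.

Second, check that the copied length matches the expected length.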
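The article shows no code for the length check in f). The sketch below combines the read loop from e) with such a check, as an illustration only: the class InMemoryShuffle and its methods are hypothetical, and throwing an IOException on a mismatch is an assumption modeled on the sanity check that appears in the disk variant.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class InMemoryShuffle {
    // Reads until the buffer is full or the stream ends, then verifies the length.
    static byte[] readFully(InputStream input, int expectedLength) throws IOException {
        byte[] data = new byte[expectedLength];
        int bytesRead = 0;
        // A single read may return fewer bytes than requested, so keep
        // reading into the remaining part of the buffer.
        int n = input.read(data, 0, data.length);
        while (n > 0) {
            bytesRead += n;
            n = input.read(data, bytesRead, data.length - bytesRead);
        }
        if (bytesRead != expectedLength) {
            throw new IOException("Incomplete map output received (" +
                                  bytesRead + " instead of " + expectedLength + ")");
        }
        return data;
    }

    // Helper for trying it out: returns the number of bytes read, or -1 on failure.
    static int demo(int sourceSize, int expectedLength) {
        try {
            return readFully(new ByteArrayInputStream(new byte[sourceSize]),
                             expectedLength).length;
        } catch (IOException e) {
            return -1;
        }
    }
}
```

The loop ends either when the stream is exhausted (read returns -1) or when the buffer is full (the requested length becomes 0 and read returns 0), so the length comparison afterwards detects a short read.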

    private MapOutput shuffleToDisk(MapOutputLocation mapOutputLoc,
                                    InputStream input,
                                    Path filename,
                                    long mapOutputLength)
    throws IOException {
      // Find out a suitable location for the output on local-filesystem
      Path localFilename =
        lDirAlloc.getLocalPathForWrite(filename.toUri().getPath(),
                                       mapOutputLength, conf);

      MapOutput mapOutput =
        new MapOutput(mapOutputLoc.getTaskId(), mapOutputLoc.getTaskAttemptId(),
                      conf, localFileSys.makeQualified(localFilename),
                      mapOutputLength);

      // Copy data to local-disk
      OutputStream output = null;
      long bytesRead = 0;
      try {
        output = rfs.create(localFilename);

        byte[] buf = new byte[64 * 1024];
        int n = -1;
        try {
          n = input.read(buf, 0, buf.length);
        } catch (IOException ioe) {
          readError = true;
          throw ioe;
        }
        while (n > 0) {
          bytesRead += n;
          shuffleClientMetrics.inputBytes(n);
          output.write(buf, 0, n);

          // indicate we're making progress
          reporter.progress();
          try {
            n = input.read(buf, 0, buf.length);
          } catch (IOException ioe) {
            readError = true;
            throw ioe;
          }
        }

        LOG.info("Read " + bytesRead + " bytes from map-output for " +
                 mapOutputLoc.getTaskAttemptId());

        output.close();
        input.close();
      } catch (IOException ioe) {
        LOG.info("Failed to shuffle from " + mapOutputLoc.getTaskAttemptId(),
                 ioe);

        // Discard the map-output
        try {
          mapOutput.discard();
        } catch (IOException ignored) {
          LOG.info("Failed to discard map-output from " +
                   mapOutputLoc.getTaskAttemptId(), ignored);
        }
        mapOutput = null;

        // Close the streams
        IOUtils.cleanup(LOG, input, output);

        // Re-throw
        throw ioe;
      }

      // Sanity check
      if (bytesRead != mapOutputLength) {
        try {
          mapOutput.discard();
        } catch (Exception ioe) {
          // IGNORED because we are cleaning up
          LOG.info("Failed to discard map-output from " +
                   mapOutputLoc.getTaskAttemptId(), ioe);
        } catch (Throwable t) {
          String msg = getTaskID() + " : Failed in shuffle to disk :"
                       + StringUtils.stringifyException(t);
          reportFatalError(getTaskID(), t, msg);
        }
        mapOutput = null;

        throw new IOException("Incomplete map output received for " +
                              mapOutputLoc.getTaskAttemptId() + " from " +
                              mapOutputLoc.getOutputLocation() + " (" +
                              bytesRead + " instead of " +
                              mapOutputLength + ")"
                              );
      }

      return mapOutput;
    }
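The two steps of the disk path (copy in 64 KB chunks, then compare lengths) can be reproduced with the JDK alone. The sketch below is an illustration under stated assumptions, not the Hadoop method: the class DiskShuffle is hypothetical, java.nio.file replaces Hadoop's local file system and directory allocator, and progress reporting and metrics are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class DiskShuffle {
    // Streams the input to the target file and verifies the number of bytes copied.
    static long copyToDisk(InputStream input, Path target, long expectedLength)
            throws IOException {
        long bytesRead = 0;
        byte[] buf = new byte[64 * 1024]; // same chunk size as in the excerpt
        try (InputStream in = input;
             OutputStream out = Files.newOutputStream(target)) {
            int n;
            while ((n = in.read(buf, 0, buf.length)) > 0) {
                out.write(buf, 0, n);
                bytesRead += n;
            }
        }
        // Sanity check: an incomplete file is removed instead of being kept.
        if (bytesRead != expectedLength) {
            Files.deleteIfExists(target);
            throw new IOException("Incomplete map output received (" +
                                  bytesRead + " instead of " + expectedLength + ")");
        }
        return bytesRead;
    }

    // Helper for trying it out: returns the bytes copied, or -1 on failure.
    static long demo(int sourceSize, long expectedLength) {
        try {
            Path tmp = Files.createTempFile("map_", ".out");
            try {
                return copyToDisk(new ByteArrayInputStream(new byte[sourceSize]),
                                  tmp, expectedLength);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            return -1;
        }
    }
}
```

Using a fixed-size buffer keeps memory use constant regardless of the size of the map output, which is the point of sending large outputs to disk instead of to memory.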

Reposted from: https://blog.51cto.com/xigan/1184001
