BlockReader objects are created by the build function of the BlockReaderFactory class. The code is as follows:
/**
* Build a BlockReader with the given options.
*
* This function will do the best it can to create a block reader that meets
* all of our requirements. We prefer short-circuit block readers
* (BlockReaderLocal and BlockReaderLocalLegacy) over remote ones, since the
* former avoid the overhead of socket communication. If short-circuit is
* unavailable, our next fallback is data transfer over UNIX domain sockets,
* if dfs.client.domain.socket.data.traffic has been enabled. If that doesn't
* work, we will try to create a remote block reader that operates over TCP
* sockets.
*
* There are a few caches that are important here.
*
* The ShortCircuitCache stores file descriptor objects which have been passed
* from the DataNode.
*
* The DomainSocketFactory stores information about UNIX domain socket paths
* that we have not been able to use in the past, so that we don't waste time
* retrying them over and over. (Like all the caches, it does have a timeout,
* though.)
*
* The PeerCache stores peers that we have used in the past. If we can reuse
* one of these peers, we avoid the overhead of re-opening a socket. However,
* if the socket has been timed out on the remote end, our attempt to reuse
* the socket may end with an IOException. For that reason, we limit our
* attempts at socket reuse to dfs.client.cached.conn.retry times. After
* that, we create new sockets. This avoids the problem where a thread tries
* to talk to a peer that it hasn't talked to in a while, and has to clean out
* every entry in a socket cache full of stale entries.
*
* @return The new BlockReader. We will not return null.
*
* @throws InvalidToken
* If the block token was invalid.
* InvalidEncryptionKeyException
* If the encryption key was invalid.
* Other IOException
* If there was another problem.
*/
public BlockReader build() throws IOException {
BlockReader reader = null;
Preconditions.checkNotNull(configuration);
// If short-circuit local reads are allowed
if (conf.shortCircuitLocalReads && allowShortCircuitLocalReads) {
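// Check whether the legacy (HDFS-2246) short-circuit read is in use. In this mode the client
// gets the block file path from the datanode via RPC and then reads the data directly from that path. Because the client can then read all of the file data itself, this mode is less secure.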
if (clientContext.getUseLegacyBlockReaderLocal()) {
// Get a BlockReaderLocalLegacy object
reader = getLegacyBlockReaderLocal();
if (reader != null) {
if (LOG.isTraceEnabled()) {
LOG.trace(this + ": returning new legacy block reader local.");
}
return reader;
}
} else {// If the legacy short-circuit read is not in use, try the newer (HDFS-347) short-circuit read
reader = getBlockReaderLocal();
if (reader != null) {
if (LOG.isTraceEnabled()) {
LOG.trace(this + ": returning new block reader local.");
}
return reader;
}
}
}
if (conf.domainSocketDataTraffic) {
reader = getRemoteBlockReaderFromDomain();
if (reader != null) {
if (LOG.isTraceEnabled()) {
LOG.trace(this + ": returning new remote block reader using " +
"UNIX domain socket on " + pathInfo.getPath());
}
return reader;
}
}
Preconditions.checkState(!DFSInputStream.tcpReadsDisabledForTesting,
"TCP reads were disabled for testing, but we failed to " +
"do a non-TCP read.");
return getRemoteBlockReaderFromTcp();
}
From the code we can see that there are four main ways to create a BlockReader object:
1. getLegacyBlockReaderLocal()
2. getBlockReaderLocal()
3. getRemoteBlockReaderFromDomain()
4. getRemoteBlockReaderFromTcp()
Next we analyze each of these four creation functions in turn.
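The fallback order in build() can be pictured as a list of attempts tried in sequence, where the first attempt that yields a reader wins and TCP is the last resort. The following is a minimal sketch of that idea only, not the Hadoop implementation: the class ReaderFallbackSketch and the use of String in place of BlockReader are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical illustration; a String stands in for a BlockReader.
class ReaderFallbackSketch {
    /** Try each attempt in order; fall back to lastResort if all are empty. */
    static String build(List<Supplier<Optional<String>>> attempts,
                        Supplier<String> lastResort) {
        for (Supplier<Optional<String>> attempt : attempts) {
            Optional<String> reader = attempt.get();
            if (reader.isPresent()) {
                return reader.get();   // first successful attempt wins
            }
        }
        return lastResort.get();       // plays the role of the TCP reader
    }

    public static void main(String[] args) {
        List<Supplier<Optional<String>>> attempts = new ArrayList<>();
        attempts.add(() -> Optional.empty());                      // short-circuit unavailable
        attempts.add(() -> Optional.of("domain-socket reader"));   // domain socket works
        System.out.println(build(attempts, () -> "tcp reader"));   // prints "domain-socket reader"
    }
}
```

An empty Optional here plays the role of the null that each get...() function returns when its kind of reader cannot be built.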
The getLegacyBlockReaderLocal function
Its code is as follows:
/**
* Get {@link BlockReaderLocalLegacy} for short circuited local reads.
* This block reader implements the path-based style of local reads
* first introduced in HDFS-2246.
*/
private BlockReader getLegacyBlockReaderLocal() throws IOException {
if (LOG.isTraceEnabled()) {
LOG.trace(this + ": trying to construct BlockReaderLocalLegacy");
}
// Check whether the client and the datanode are on the same machine
if (!DFSClient.isLocalAddress(inetSocketAddress)) {
if (LOG.isTraceEnabled()) {
LOG.trace(this + ": can't construct BlockReaderLocalLegacy because " +
"the address " + inetSocketAddress + " is not local");
}
return null;
}
// Check whether creating a BlockReaderLocalLegacy object has been disabled
if (clientContext.getDisableLegacyBlockReaderLocal()) {
PerformanceAdvisory.LOG.debug(this + ": can't construct " +
"BlockReaderLocalLegacy because " +
"disableLegacyBlockReaderLocal is set.");
return null;
}
IOException ioe = null;
try {
// Create the BlockReaderLocalLegacy object
return BlockReaderLocalLegacy.newBlockReader(conf,
userGroupInformation, configuration, fileName, block, token,
datanode, startOffset, length, storageType);
} catch (RemoteException remoteException) {
ioe = remoteException.unwrapRemoteException(
InvalidToken.class, AccessControlException.class);
} catch (IOException e) {
ioe = e;
}
if ((!(ioe instanceof AccessControlException)) &&
isSecurityException(ioe)) {
// Handle security exceptions.
// We do not handle AccessControlException here, since
// BlockReaderLocalLegacy#newBlockReader uses that exception to indicate
// that the user is not in dfs.block.local-path-access.user, a condition
// which requires us to disable legacy SCR.
throw ioe;
}
LOG.warn(this + ": error creating legacy BlockReaderLocal. " +
"Disabling legacy local reads.", ioe);
clientContext.setDisableLegacyBlockReaderLocal();
return null;
}
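The error handling above follows a "try, and on a non-security failure disable the path for later calls" pattern: security exceptions are rethrown to the caller, while other failures set a flag so that later calls return null immediately and fall through to the next reader type. A minimal sketch of that pattern follows. The names LegacyReaderGate, Opener and InvalidTokenLike are hypothetical, and a String stands in for the reader.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration of the disable-on-failure pattern.
class LegacyReaderGate {
    /** Stand-in for a security-related failure that must reach the caller. */
    static class InvalidTokenLike extends IOException {
        InvalidTokenLike(String msg) { super(msg); }
    }

    interface Opener { String open() throws IOException; }

    private final AtomicBoolean disabled = new AtomicBoolean(false);

    boolean isDisabled() { return disabled.get(); }

    /** Returns a reader, or null when the caller should fall back. */
    String tryOpen(Opener opener) throws IOException {
        if (disabled.get()) {
            return null;               // a previous failure switched this path off
        }
        try {
            return opener.open();
        } catch (InvalidTokenLike e) {
            throw e;                   // security problems are not swallowed
        } catch (IOException e) {
            disabled.set(true);        // stop retrying this path on later calls
            return null;
        }
    }
}
```

After the first ordinary IOException, every later tryOpen call returns null without invoking the opener, which mirrors the effect of clientContext.setDisableLegacyBlockReaderLocal().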
Now let's step into the function BlockReaderLocalLegacy.newBlockReader(conf,
userGroupInformation, configuration, fileName, block, token,
datanode, startOffset, length, storageType); its code is as follows:
/**
* The only way this object can be instantiated.
*/
static BlockReaderLocalLegacy newBlockReader(DFSClient.Conf conf,
UserGroupInformation userGroupInformation,
Configuration configuration, String file, ExtendedBlock blk,
Token<BlockTokenIdentifier> token, DatanodeInfo node,
long startOffset, long length, StorageType storageType)
throws IOException {
LocalDatanodeInfo localDatanodeInfo = getLocalDatanodeInfo(node
.getIpcPort());
// check the cache first
// Look in the local cache first; on a hit there is no need to go to the datanode via RPC
BlockLocalPathInfo pathinfo = localDatanodeInfo.getBlockLocalPathInfo(blk);
if (pathinfo == null) {
if (userGroupInformation == null) {
userGroupInformation = UserGroupInformation.getCurrentUser();
}
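// Use a proxy object to fetch, via RPC from the datanode, the path of the block file, the path of its checksum (meta) file, and related information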
pathinfo = getBlockPathInfo(userGroupInformation, blk, node,
configuration, conf.socketTimeout, token,
conf.connectToDnViaHostname, storageType);
}
// check to see if the file exists. It may so happen that the
// HDFS file has been deleted and this block-lookup is occurring
// on behalf of a new HDFS file. This time, the block file could
// be residing in a different portion of the fs.data.dir directory.
// In this case, we remove this entry from the cache. The next
// call to this method will re-populate the cache.
FileInputStream dataIn = null;
FileInputStream checksumIn = null;
BlockReaderLocalLegacy localBlockReader = null;
boolean skipChecksumCheck = conf.skipShortCircuitChecksums ||
storageType.isTransient();
try {
// get a local file system
// With the local file path in hand, open the local file directly
File blkfile = new File(pathinfo.getBlockPath());
dataIn = new FileInputStream(blkfile);
if (LOG.isDebugEnabled()) {
LOG.debug("New BlockReaderLocalLegacy for file " + blkfile + " of size "
+ blkfile.length() + " startOffset " + startOffset + " length "
+ length + " short circuit checksum " + !skipChecksumCheck);
}
// Whether to skip checksum verification
if (!skipChecksumCheck) {
// get the metadata file
// Get the path of the local checksum (meta) file
File metafile = new File(pathinfo.getMetaPath());
checksumIn = new FileInputStream(metafile);
// Build the DataChecksum object from the meta file header
final DataChecksum checksum = BlockMetadataHeader.readDataChecksum(
new DataInputStream(checksumIn), blk);
// getBytesPerChecksum returns how many data bytes (512 by default) correspond to one checksum; firstChunkOffset is the chunk-aligned position in the block where verification starts
long firstChunkOffset = startOffset
- (startOffset % checksum.getBytesPerChecksum());
// Create the BlockReaderLocalLegacy object from the information above
localBlockReader = new BlockReaderLocalLegacy(conf, file, blk, token,
startOffset, length, pathinfo, checksum, true, dataIn,
firstChunkOffset, checksumIn);
} else {
// If checksum verification is not needed, a BlockReaderLocalLegacy object is still created
localBlockReader = new BlockReaderLocalLegacy(conf, file, blk, token,
startOffset, length, pathinfo, dataIn);
}
} catch (IOException e) {
// Remove the given block from the cache
localDatanodeInfo.removeBlockLocalPathInfo(blk);
DFSClient.LOG.warn("BlockReaderLocalLegacy: Removing " + blk
+ " from cache because local file " + pathinfo.getBlockPath()
+ " could not be opened.");
throw e;
} finally {
if (localBlockReader == null) {
if (dataIn != null) {
dataIn.close();
}
if (checksumIn != null) {
checksumIn.close();
}
}
}
return localBlockReader;
}
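One line of code here deserves a closer look:
long firstChunkOffset = startOffset- (startOffset % checksum.getBytesPerChecksum());
HDFS computes one checksum for every fixed-length run of data. That length is typically 512 bytes, so one checksum is computed per 512 bytes of HDFS file data, and each checksum occupies 4 bytes. Here checksum.getBytesPerChecksum() is therefore 512, meaning each checksum covers 512 bytes of file data. Subtracting startOffset % 512 rounds startOffset down to the start of the chunk that contains it, because verification has to begin at a chunk boundary. Checksums are discussed in more detail in a separate article.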
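As a worked example of this arithmetic, the sketch below computes the chunk-aligned offset, the distance from that boundary, and the matching position among the checksums for a start offset of 1300. The class name ChunkOffsetSketch and the sample offset are hypothetical; the 512-byte chunk and 4-byte checksum sizes are the typical values mentioned above. The checksum offset is counted from the first checksum, on the assumption that the meta file header has already been consumed from the stream.

```java
// Hypothetical illustration of the offset arithmetic.
class ChunkOffsetSketch {
    /** Round startOffset down to the start of the chunk that contains it. */
    static long firstChunkOffset(long startOffset, int bytesPerChecksum) {
        return startOffset - (startOffset % bytesPerChecksum);
    }

    /** Byte position of that chunk's checksum, counted from the first checksum. */
    static long checksumOffset(long firstChunkOffset, int bytesPerChecksum,
                               int checksumSize) {
        return (firstChunkOffset / bytesPerChecksum) * checksumSize;
    }

    public static void main(String[] args) {
        long startOffset = 1300;
        int bytesPerChecksum = 512;
        int checksumSize = 4;
        long first = firstChunkOffset(startOffset, bytesPerChecksum);
        System.out.println(first);                       // 1024
        System.out.println(startOffset - first);         // 276 (offsetFromChunkBoundary)
        System.out.println(checksumOffset(first, bytesPerChecksum, checksumSize)); // 8
    }
}
```

Offset 1300 falls in the third chunk (bytes 1024 to 1535), so the reader has to start verifying at byte 1024 and discard the first 276 bytes before returning data to the caller.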
Now let's step into the BlockReaderLocalLegacy constructor that is used when checksum verification is required. Its code is as follows:
private BlockReaderLocalLegacy(DFSClient.Conf conf, String hdfsfile,
ExtendedBlock block, Token<BlockTokenIdentifier> token, long startOffset,
long length, BlockLocalPathInfo pathinfo, DataChecksum checksum,
boolean verifyChecksum, FileInputStream dataIn, long firstChunkOffset,
FileInputStream checksumIn) throws IOException {
this.filename = hdfsfile;
this.checksum = checksum;
this.verifyChecksum = verifyChecksum;
this.startOffset = Math.max(startOffset, 0);
// Number of data bytes covered by each checksum
bytesPerChecksum = this.checksum.getBytesPerChecksum();
// Size of a single checksum in bytes
checksumSize = this.checksum.getChecksumSize();
this.dataIn = dataIn;
this.checksumIn = checksumIn;
this.offsetFromChunkBoundary = (int) (startOffset-firstChunkOffset);
int chunksPerChecksumRead = getSlowReadBufferNumChunks(
conf.shortCircuitBufferSize, bytesPerChecksum);
slowReadBuff = bufferPool.getBuffer(bytesPerChecksum * chunksPerChecksumRead);
checksumBuff = bufferPool.getBuffer(checksumSize * chunksPerChecksumRead);
// Initially the buffers have nothing to read.
slowReadBuff.flip();
checksumBuff.flip();
boolean success = false;
try {
// Skip both input streams to beginning of the chunk containing startOffset
IOUtils.skipFully(dataIn, firstChunkOffset);
if (checksumIn != null) {
long checkSumOffset = (firstChunkOffset / bytesPerChecksum) * checksumSize;
IOUtils.skipFully(checksumIn, checkSumOffset);
}
success = true;
} finally {
if (!success) {
bufferPool.returnBuffer(slowReadBuff);
bufferPool.returnBuffer(checksumBuff);
}
}
}
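The constructor skips both streams so that the data stream sits at the start of a chunk and the checksum stream sits at the checksum for that same chunk. The sketch below shows why the two skips have to line up, using in-memory streams and java.util.zip.CRC32. It is an illustration under stated assumptions, not the HDFS reader: the class ChunkVerifySketch is hypothetical, there is no meta file header, and plain CRC32 is assumed as the checksum type.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical illustration; sizes follow the typical values in the text.
class ChunkVerifySketch {
    static final int BYTES_PER_CHECKSUM = 512;
    static final int CHECKSUM_SIZE = 4;

    /** One 4-byte CRC32 per 512-byte chunk; the last chunk may be shorter. */
    static byte[] checksums(byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        for (int off = 0; off < data.length; off += BYTES_PER_CHECKSUM) {
            CRC32 crc = new CRC32();
            crc.update(data, off, Math.min(BYTES_PER_CHECKSUM, data.length - off));
            out.writeInt((int) crc.getValue());
        }
        return bytes.toByteArray();
    }

    /** Verify the chunk containing startOffset (must be less than data.length). */
    static boolean verifyChunkAt(byte[] data, byte[] sums, long startOffset)
            throws IOException {
        long firstChunkOffset = startOffset - (startOffset % BYTES_PER_CHECKSUM);
        long checksumOffset = (firstChunkOffset / BYTES_PER_CHECKSUM) * CHECKSUM_SIZE;
        DataInputStream dataIn = new DataInputStream(new ByteArrayInputStream(data));
        DataInputStream sumIn = new DataInputStream(new ByteArrayInputStream(sums));
        // Skip both streams to the chunk boundary, as the constructor does.
        dataIn.skipBytes((int) firstChunkOffset);
        sumIn.skipBytes((int) checksumOffset);
        int chunkLen = (int) Math.min(BYTES_PER_CHECKSUM, data.length - firstChunkOffset);
        byte[] chunk = new byte[chunkLen];
        dataIn.readFully(chunk);
        CRC32 crc = new CRC32();
        crc.update(chunk, 0, chunk.length);
        return (int) crc.getValue() == sumIn.readInt();
    }
}
```

If the checksum stream were skipped by a different number of chunks than the data stream, the comparison would pair a chunk with the wrong checksum and fail even on intact data.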
The other constructor is as follows:
private BlockReaderLocalLegacy(DFSClient.Conf conf, String hdfsfile,
ExtendedBlock block, Token<BlockTokenIdentifier> token, long startOffset,
long length, BlockLocalPathInfo pathinfo, FileInputStream dataIn)
throws IOException {
this(conf, hdfsfile, block, token, startOffset, length, pathinfo,
DataChecksum.newDataChecksum(DataChecksum.Type.NULL, 4), false,
dataIn, startOffset, null);
}
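This constructor ultimately delegates to the first one, passing a NULL-type checksum and verifyChecksum set to false.
With that, a BlockReaderLocalLegacy object has been created, which completes the analysis of the first function, getLegacyBlockReaderLocal.
To summarize the flow:
1. Get the path information for the block file and its checksum file from the cache; if it is not cached, fetch it from the datanode via RPC.
2. Create the corresponding BlockReaderLocalLegacy object from the block file and checksum file information.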
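The two steps above, together with the cache eviction in the catch block of newBlockReader, amount to a "check cache, fall back to RPC, evict on open failure" pattern. The following is a minimal sketch of that pattern. The class PathInfoCacheSketch and its interfaces are hypothetical, strings stand in for block IDs, paths and readers, and the sketch assumes the lookup result is cached as soon as it is fetched.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of cache lookup with fallback and eviction.
class PathInfoCacheSketch {
    interface PathLookup { String lookup(String blockId) throws IOException; }
    interface FileOpener { String open(String path) throws IOException; }

    private final Map<String, String> cache = new ConcurrentHashMap<>();

    boolean isCached(String blockId) { return cache.containsKey(blockId); }

    String openBlock(String blockId, PathLookup rpc, FileOpener opener)
            throws IOException {
        String path = cache.get(blockId);
        if (path == null) {
            path = rpc.lookup(blockId);   // stand-in for the RPC to the datanode
            cache.put(blockId, path);
        }
        try {
            return opener.open(path);     // stand-in for opening the local files
        } catch (IOException e) {
            cache.remove(blockId);        // stale entry: the next call looks it up again
            throw e;
        }
    }
}
```

Evicting on failure matters because a cached path can go stale, for example when the block file has moved to a different directory since the path was cached.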