Hadoop Core Source Code Analysis, Part 5: Reading Data

Author | 吴邪 (Wu Xie). Four years in big data; currently at an internet company in Guangzhou, working on an in-house big data platform and on offline & real-time computing.

Editor | auroral-L

In this article we walk through how HDFS reads data. Compared with the write path, the read path is quite a bit simpler. With this piece, the analysis of the HDFS core code comes to a close: the series has covered five core areas, namely NameNode initialization, DataNode initialization, metadata management, the HDFS write path, and the HDFS read path. HDFS is a codebase on the order of a million lines and there is far too much to cover, so this series only picks out its key, core features for analysis.

 

The HDFS read data flow

Figure 1: The HDFS read data flow

 

  1. The HDFS client calls the open() method of DistributedFileSystem (the implementation class of FileSystem).

  2. DistributedFileSystem talks to the NameNode over RPC and calls getBlockLocations() to ask for the storage locations of the file's blocks.

  3. DistributedFileSystem returns an FSDataInputStream to the client, which carries the metadata of the blocks. The client calls read() on the FSDataInputStream, which connects to the nearest DataNode holding the target block.

  4. Data is streamed from that DataNode to the client, so the client can call read() repeatedly, always reading the needed block from the nearest DataNode. If a DataNode fails during the read, the client falls back to the next DataNode holding that block, until the last block has been read.

  5. When the whole file has been read, the client calls close() to tear down the connections to the DataNodes.

 

In practice, the overall read flow is quite similar to the write flow, just simpler. Before diving into the source, the sketch below shows what these five steps look like from the client side.
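Purely for orientation, here is a minimal client-side sketch of what drives the five steps above; the cluster address and file path are illustrative assumptions, not values taken from the source being analysed.

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // Step 1: for an hdfs:// URI this returns a DistributedFileSystem
    FileSystem fs = FileSystem.get(URI.create("hdfs://nn-host:8020"), conf);
    // Steps 2-3: open() asks the NameNode for block locations and hands back
    // an FSDataInputStream wrapping a DFSInputStream
    try (FSDataInputStream in = fs.open(new Path("/user/test/demo.txt"))) {
      byte[] buf = new byte[4096];
      int n;
      // Step 4: each read() pulls bytes from the nearest DataNode, block by block
      while ((n = in.read(buf)) != -1) {
        System.out.write(buf, 0, n);
      }
      System.out.flush();
    } // Step 5: close() tears down the streams to the DataNodes
  }
}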

 

Reading the data

As before, we follow the flow chart step by step, which keeps the whole process clear.

Start from the open() method of the FileSystem class, which takes the file path and returns an FSDataInputStream.

/**
 * Opens an FSDataInputStream at the indicated Path.
 * @param f the file to open
 */
public FSDataInputStream open(Path f) throws IOException {
  return open(f, getConf().getInt("io.file.buffer.size", 4096));
}
/**
 * Opens an FSDataInputStream at the indicated Path.
 * @param f the file name to open
 * @param bufferSize the size of the buffer to be used.
 */
public abstract FSDataInputStream open(Path f, int bufferSize)
  throws IOException;

Clearly, open(...) is called with the file path plus the buffer size from the configuration, io.file.buffer.size, which defaults to 4096 bytes (4 KB). Stepping into open(f, bufferSize) we find it is abstract, so the next step is to locate the open(...) method of the implementation class. From the flow chart we know that the FileSystem implementation here is DistributedFileSystem, so going straight to the open() method of DistributedFileSystem is the right move.
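As a side note, a client that wants a larger buffer can simply set io.file.buffer.size on its Configuration before calling open(); a small sketch, where the 64 KB value is only an illustrative choice:

Configuration conf = new Configuration();
conf.setInt("io.file.buffer.size", 64 * 1024);  // default is 4096 bytes (4 KB)
FileSystem fs = FileSystem.get(conf);
// open(Path) then forwards getConf().getInt("io.file.buffer.size", 4096) as bufferSize
FSDataInputStream in = fs.open(new Path("/user/test/demo.txt"));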

Figure 2: Implementation classes and methods of FileSystem

@Override
public FSDataInputStream open(Path f, final int bufferSize)
    throws IOException {
  // increment and record the count of read operations
  statistics.incrementReadOps(1);
  // normalise the path (relative vs. absolute); not important here
  Path absF = fixRelativePart(f);
  return new FileSystemLinkResolver<FSDataInputStream>() {
    @Override
    public FSDataInputStream doCall(final Path p)
        throws IOException, UnresolvedLinkException {
      // the important part, same pattern as the write path
      final DFSInputStream dfsis =
        dfs.open(getPathName(p), bufferSize, verifyChecksum);
      return dfs.createWrappedInputStream(dfsis);
    }
    @Override
    public FSDataInputStream next(final FileSystem fs, final Path p)
        throws IOException {
      return fs.open(p, bufferSize);
    }
  }.resolve(this, absF);
}

Focus on the dfs.open(xxx) call above; before it is made, the resolver uses the file path to check whether the file belongs to the current file system.

/**
 * Create an input stream that obtains a nodelist from the namenode, and then
 * reads from all the right places. Creates inner subclass of InputStream that
 * does the right out-of-band work.
 */
public DFSInputStream open(String src, int buffersize, boolean verifyChecksum)
      throws IOException, UnresolvedLinkException {
   // checkOpen() just verifies the client is still running; nothing important
   checkOpen();
   // Get block info from namenode
   TraceScope scope = getPathTraceScope("newDFSInputStream", src);
   try {
      return new DFSInputStream(this, src, verifyChecksum);
   } finally {
      scope.close();
   }
}


The method ends by returning a new DFSInputStream built with the DFSInputStream(xxx) constructor, and the constructor in turn calls openInfo().

DFSInputStream(DFSClient dfsClient, String src, boolean verifyChecksum
               ) throws IOException, UnresolvedLinkException {
  this.dfsClient = dfsClient;
  this.verifyChecksum = verifyChecksum;
  this.src = src;
  synchronized (infoLock) {
    this.cachingStrategy = dfsClient.getDefaultReadCachingStrategy();
  }
  openInfo();
}
/**
 * Grab the open-file info from namenode
 * i.e. the block info of the file being opened
 */
void openInfo() throws IOException, UnresolvedLinkException {
  synchronized(infoLock) {
    // key point: this is step 2 of the flow chart, fetching block info from the NameNode
    lastBlockBeingWrittenLength = fetchLocatedBlocksAndGetLastBlockLength();
    int retriesForLastBlockLength = dfsClient.getConf().retryTimesForGetLastBlockLength;
    // retry loop: keep calling fetchLocatedBlocksAndGetLastBlockLength() until the length is known
    while (retriesForLastBlockLength > 0) {
      // Getting last block length as -1 is a special case. When cluster
      // restarts, DNs may not report immediately. At this time partial block
      // locations will not be available with NN for getting the length. Lets
      // retry for 3 times to get the length.
      if (lastBlockBeingWrittenLength == -1) {
        DFSClient.LOG.warn("Last block locations not available. "
            + "Datanodes might not have reported blocks completely."
            + " Will retry for " + retriesForLastBlockLength + " times");
        waitFor(dfsClient.getConf().retryIntervalForGetLastBlockLength);
        lastBlockBeingWrittenLength = fetchLocatedBlocksAndGetLastBlockLength();
      } else {
        break;
      }
      retriesForLastBlockLength--;
    }
    if (retriesForLastBlockLength == 0) {
      throw new IOException("Could not obtain the last block locations.");
    }
  }
}

The getBlockLocations call of step 2 in the flow chart starts here; the details are in fetchLocatedBlocksAndGetLastBlockLength().

private long fetchLocatedBlocksAndGetLastBlockLength() throws IOException {
  // call DFSClient#getLocatedBlocks to fetch the block locations for this file path
  final LocatedBlocks newInfo = dfsClient.getLocatedBlocks(src, 0);
  if (DFSClient.LOG.isDebugEnabled()) {
    DFSClient.LOG.debug("newInfo = " + newInfo);
  }
  if (newInfo == null) {
    throw new IOException("Cannot open filename " + src);
  }
  if (locatedBlocks != null) {
    Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
    Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
    while (oldIter.hasNext() && newIter.hasNext()) {
      if (! oldIter.next().getBlock().equals(newIter.next().getBlock())) {
        throw new IOException("Blocklist for " + src + " has changed!");
      }
    }
  }
  locatedBlocks = newInfo;
  long lastBlockBeingWrittenLength = 0;
  if (!locatedBlocks.isLastBlockComplete()) {
    final LocatedBlock last = locatedBlocks.getLastLocatedBlock();
    if (last != null) {
      if (last.getLocations().length == 0) {
        if (last.getBlockSize() == 0) {
          // if the length is zero, then no data has been written to
          // datanode. So no need to wait for the locations.
          return 0;
        }
        return -1;
      }
      final long len = readBlockLength(last);
      last.getBlock().setNumBytes(len);
      lastBlockBeingWrittenLength = len; 
    }
  }


  fileEncryptionInfo = locatedBlocks.getFileEncryptionInfo();


  return lastBlockBeingWrittenLength;
}

Now focus on DFSClient's getLocatedBlocks() method, which ends up asking the NameNode for the block locations.

public LocatedBlocks getLocatedBlocks(String src, long start) throws IOException {
   return getLocatedBlocks(src, start, dfsClientConf.prefetchSize);
}


/*
 * This is just a wrapper around callGetBlockLocations, but non-static so that
 * we can stub it out for tests.
 */
@VisibleForTesting
public LocatedBlocks getLocatedBlocks(String src, long start, long length) throws IOException {
   TraceScope scope = getPathTraceScope("getBlockLocations", src);
   try {
      return callGetBlockLocations(namenode, src, start, length);
   } finally {
      scope.close();
   }
}


/**
 * @see ClientProtocol#getBlockLocations(String, long, long)
 */
static LocatedBlocks callGetBlockLocations(ClientProtocol namenode, String src, long start, long length)
      throws IOException {
   try {
      // remote RPC call into NameNodeRpcServer
      return namenode.getBlockLocations(src, start, length);
   } catch (RemoteException re) {
      throw re.unwrapRemoteException(AccessControlException.class, FileNotFoundException.class,
            UnresolvedPathException.class);
   }
}


@Idempotent
public LocatedBlocks getBlockLocations(String src,
                                       long offset,
                                       long length) 
    throws AccessControlException, FileNotFoundException,
    UnresolvedLinkException, IOException;

As you can see, the return type is LocatedBlocks, which holds a List<LocatedBlock>; each LocatedBlock encapsulates the block itself, its offset within the file, and the locations of the DataNodes holding it. Under the hood this is an RPC call into the getBlockLocations() method of NameNodeRpcServer.
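To make the shape of that return value concrete, here is a hedged sketch of walking through a LocatedBlocks instance. It assumes a DFSClient named dfsClient and a path string src are already in scope (as they are inside DFSInputStream) and only uses the accessors visible in the code above.

// Sketch: inspect the block layout the NameNode just returned.
LocatedBlocks blocks = dfsClient.getLocatedBlocks(src, 0);
for (LocatedBlock lb : blocks.getLocatedBlocks()) {
  ExtendedBlock b = lb.getBlock();               // block id, generation stamp, block pool id
  long offsetInFile = lb.getStartOffset();       // where this block starts inside the file
  DatanodeInfo[] replicas = lb.getLocations();   // DataNodes holding replicas, sorted by distance to the client
  System.out.println("block " + b.getBlockId() + " @ " + offsetInFile
      + " -> " + java.util.Arrays.toString(replicas));
}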

@Override // ClientProtocol
public LocatedBlocks getBlockLocations(String src, 
                                        long offset, 
                                        long length) 
    throws IOException {
  // check that the NameNode has started
  checkNNStartup();
  // bump the getBlockLocations metrics counter
  metrics.incrGetBlockLocations();
  return namesystem.getBlockLocations(getClientMachine(), 
                                      src, offset, length);
}


/**
 * Get block locations within the specified range.
 * @see ClientProtocol#getBlockLocations(String, long, long)
 */
LocatedBlocks getBlockLocations(String clientMachine, String src,
    long offset, long length) throws IOException {
  checkOperation(OperationCategory.READ);
  // result object that will hold the block locations
  GetBlockLocationsResult res = null;
  readLock();
  try {
    checkOperation(OperationCategory.READ);
    // fetch the block locations
    res = getBlockLocations(src, offset, length, true, true);
  } catch (AccessControlException e) {
    logAuditEvent(false, "open", src);
    throw e;
  } finally {
    readUnlock();
  }
  logAuditEvent(true, "open", src);
  if (res.updateAccessTime()) {
    writeLock();
    final long now = now();
    try {
      checkOperation(OperationCategory.WRITE);
      INode inode = res.iip.getLastINode();
      boolean updateAccessTime = now > inode.getAccessTime() +
          getAccessTimePrecision();
      if (!isInSafeMode() && updateAccessTime) {
        boolean changed = FSDirAttrOp.setTimes(dir,
            inode, -1, now, false, res.iip.getLatestSnapshotId());
        if (changed) {
          getEditLog().logTimes(src, -1, now);
        }
      }
    } catch (Throwable e) {
      LOG.warn("Failed to update the access time of " + src, e);
    } finally {
      writeUnlock();
    }
  }
  // the fetched block info, exposed as a LocatedBlocks object
  LocatedBlocks blocks = res.blocks;
  if (blocks != null) {
    blockManager.getDatanodeManager().sortLocatedBlocks(
        clientMachine, blocks.getLocatedBlocks());
    // lastBlock is not part of getLocatedBlocks(), might need to sort it too
    // get the location info of the last block
    LocatedBlock lastBlock = blocks.getLastLocatedBlock();
    if (lastBlock != null) {
      ArrayList<LocatedBlock> lastBlockList = Lists.newArrayList(lastBlock);
      blockManager.getDatanodeManager().sortLocatedBlocks(
          clientMachine, lastBlockList);
    }
  }
  // return the LocatedBlocks object holding the locations of every block of the target file
  return blocks;
}


That completes the analysis of step 2, fetching the block locations. As on the write path, the returned DFSInputStream is passed to createWrappedInputStream(...) to be wrapped once more. Next, with the LocatedBlocks information returned by the NameNode in hand, the client calls the read() method of FSDataInputStream.

/**
 * Read bytes from the given position in the stream to the given buffer.
 *
 * @param position  position in the input stream to seek
 * @param buffer    buffer into which data is read
 * @param offset    offset into the buffer in which data is written
 * @param length    maximum number of bytes to read
 * @return total number of bytes read into the buffer, or <code>-1</code>
 *         if there is no more data because the end of the stream has been
 *         reached
 */
@Override
public int read(long position, byte[] buffer, int offset, int length)
  throws IOException {
  return ((PositionedReadable)in).read(position, buffer, offset, length);
}

FSDataInputStream delegates to the read(xxx) method of the DFSInputStream it wraps.

/**
 * Read bytes starting from the specified position.
 * 
 * @param position start read from this position
 * @param buffer read buffer
 * @param offset offset into buffer
 * @param length number of bytes to read
 * 
 * @return actual number of bytes read
 */
@Override
public int read(long position, byte[] buffer, int offset, int length)
    throws IOException {
  TraceScope scope =
      dfsClient.getPathTraceScope("DFSInputStream#byteArrayPread", src);
  try {
    return pread(position, buffer, offset, length);
  } finally {
    scope.close();
  }
}




private int pread(long position, byte[] buffer, int offset, int length)
    throws IOException {
  // sanity checks: make sure the client is still open
  dfsClient.checkOpen();
  if (closed.get()) {
    throw new IOException("Stream closed");
  }
  failures = 0;
  // total file length, taken from locatedBlocks
  long filelen = getFileLength();
  if ((position < 0) || (position >= filelen)) {
    return -1;
  }
  int realLen = length;
  if ((position + length) > filelen) {
    realLen = (int)(filelen - position);
  }
  
  // determine the block and byte range within the block
  // corresponding to position and realLen
  // get the list of blocks covering [position, position + realLen)
  List<LocatedBlock> blockRange = getBlockRange(position, realLen);
  int remaining = realLen;
  Map<ExtendedBlock,Set<DatanodeInfo>> corruptedBlockMap 
    = new HashMap<ExtendedBlock, Set<DatanodeInfo>>();
  // iterate over the block list and read what is needed; the requested range may span several blocks
  for (LocatedBlock blk : blockRange) {
    long targetStart = position - blk.getStartOffset();
    long bytesToRead = Math.min(remaining, blk.getBlockSize() - targetStart);
    try {
      if (dfsClient.isHedgedReadsEnabled()) {
        hedgedFetchBlockByteRange(blk, targetStart, targetStart + bytesToRead
            - 1, buffer, offset, corruptedBlockMap);
      } else {
        fetchBlockByteRange(blk, targetStart, targetStart + bytesToRead - 1,
            buffer, offset, corruptedBlockMap);
      }
    } finally {
      // Check and report if any block replicas are corrupted.
      // BlockMissingException may be caught if all block replicas are
      // corrupted.
      reportCheckSumFailure(corruptedBlockMap, blk.getLocations().length);
    }
    remaining -= bytesToRead;
    position += bytesToRead;
    offset += bytesToRead;
  }
  assert remaining == 0 : "Wrong number of bytes read.";
  if (dfsClient.stats != null) {
    dfsClient.stats.incrementBytesRead(realLen);
  }
  return realLen;
}
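Seen from the caller's side, this whole pread path is driven by a single positional read on the FSDataInputStream; a minimal sketch, reusing the fs handle from the earlier example and with an assumed path and offset:

try (FSDataInputStream in = fs.open(new Path("/user/test/demo.txt"))) {
  byte[] buf = new byte[1024];
  // read(position, ...) goes through DFSInputStream#pread: it never moves the
  // stream position and may touch several blocks if the range spans a boundary
  int nread = in.read(128L * 1024 * 1024, buf, 0, buf.length);
  System.out.println("read " + nread + " bytes starting at offset 128 MB");
}

Note that the hedgedFetchBlockByteRange branch above is only taken when hedged reads are enabled on the client (dfs.client.hedged.read.threadpool.size set to a value greater than zero); otherwise each block range is fetched from a single DataNode at a time by fetchBlockByteRange.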

Now look at the getBlockRange(xxx) method: it resolves the requested range to a list of blocks, preferring the cached LocatedBlocks and only going to the NameNode when the range is not cached.

/**
 * Get blocks in the specified range.
 * Fetch them from the namenode if not cached. This function
 * will not get a read request beyond the EOF.
 * @param offset starting offset in file
 * @param length length of data
 * @return consequent segment of located blocks
 * @throws IOException
 */
private List<LocatedBlock> getBlockRange(long offset,
    long length)  throws IOException {
  // getFileLength(): returns total file length
  // locatedBlocks.getFileLength(): returns length of completed blocks
  // the offset must be smaller than the file length
  if (offset >= getFileLength()) {
    throw new IOException("Offset: " + offset +
      " exceeds file length: " + getFileLength());
  }
  // as mentioned before, a block is either complete (fully written) or still under construction
  synchronized(infoLock) {
    final List<LocatedBlock> blocks;
    // length covered by the completed blocks
    final long lengthOfCompleteBlk = locatedBlocks.getFileLength();
    final boolean readOffsetWithinCompleteBlk = offset < lengthOfCompleteBlk;
    final boolean readLengthPastCompleteBlk = offset + length > lengthOfCompleteBlk;


    if (readOffsetWithinCompleteBlk) {
      //get the blocks of finalized (completed) block range,
      blocks = getFinalizedBlockRange(offset,
        Math.min(length, lengthOfCompleteBlk - offset));
    } else {
      blocks = new ArrayList<LocatedBlock>(1);
    }


    // get the blocks from incomplete block range
    if (readLengthPastCompleteBlk) {
       blocks.add(locatedBlocks.getLastLocatedBlock());
    }


    return blocks;
  }
}


/**
 * Get blocks in the specified range.
 * Includes only the complete blocks.
 * Fetch them from the namenode if not cached.
 */
private List<LocatedBlock> getFinalizedBlockRange(
    long offset, long length) throws IOException {
  synchronized(infoLock) {
    assert (locatedBlocks != null) : "locatedBlocks is null";
    List<LocatedBlock> blockRange = new ArrayList<LocatedBlock>();
    // search cached blocks first
    // first look up, in the cached locatedBlocks list, the index of the block containing offset
    int blockIdx = locatedBlocks.findBlock(offset);
    if (blockIdx < 0) { // block is not cached
      blockIdx = LocatedBlocks.getInsertIndex(blockIdx);
    }
    long remaining = length;
    long curOff = offset;
    while(remaining > 0) {
      LocatedBlock blk = null;
      if(blockIdx < locatedBlocks.locatedBlockCount())
        // look up the block by blockIdx
        blk = locatedBlocks.get(blockIdx);
      // not cached (or offset falls before it): fetch the block from the NameNode and add it to the cache
      if (blk == null || curOff < blk.getStartOffset()) {
        LocatedBlocks newBlocks;
        newBlocks = dfsClient.getLocatedBlocks(src, curOff, remaining);
        locatedBlocks.insertRange(blockIdx, newBlocks.getLocatedBlocks());
        continue;
      }
      assert curOff >= blk.getStartOffset() : "Block not found";
      blockRange.add(blk);
      long bytesRead = blk.getStartOffset() + blk.getBlockSize() - curOff;
      remaining -= bytesRead;
      curOff += bytesRead;
      // move on to the next block
      blockIdx++;
    }
    return blockRange;
  }
}

Next, let's look at fetchBlockByteRange(...), which pread(xxx) calls for each block in the range.

private void fetchBlockByteRange(LocatedBlock block, long start, long end,
    byte[] buf, int offset,
    Map<ExtendedBlock, Set<DatanodeInfo>> corruptedBlockMap)
    throws IOException {
  // refresh the LocatedBlock for this start offset
  block = getBlockAt(block.getStartOffset());
  // the familiar retry loop, to maximise the chance of a successful read
  while (true) {
    // pick the nearest DataNode to read from
    DNAddrPair addressPair = chooseDataNode(block, null);
    try {
      // read the byte range from the chosen DataNode; on success, return and exit the loop
      actualGetFromOneDataNode(addressPair, block, start, end, buf, offset,
          corruptedBlockMap);
      return;
    } catch (IOException e) {
      // Ignore. Already processed inside the function.
      // Loop through to try the next node.
    }
  }
}

After actualGetFromOneDataNode(...) has fetched the data, the reader's close() method is called in the finally block, ending the connection to the DataNode.

private void actualGetFromOneDataNode(final DNAddrPair datanode,
    LocatedBlock block, final long start, final long end, byte[] buf,
    int offset, Map<ExtendedBlock, Set<DatanodeInfo>> corruptedBlockMap)
    throws IOException {
  DFSClientFaultInjector.get().startFetchFromDatanode();
  int refetchToken = 1; // only need to get a new access token once
  int refetchEncryptionKey = 1; // only need to get a new encryption key once


  while (true) {
    // cached block locations may have been updated by chooseDataNode()
    // or fetchBlockAt(). Always get the latest list of locations at the
    // start of the loop.
    CachingStrategy curCachingStrategy;
    boolean allowShortCircuitLocalReads;
    block = getBlockAt(block.getStartOffset());
    synchronized(infoLock) {
      curCachingStrategy = cachingStrategy;
      allowShortCircuitLocalReads = !shortCircuitForbidden();
    }
    DatanodeInfo chosenNode = datanode.info;
    InetSocketAddress targetAddr = datanode.addr;
    StorageType storageType = datanode.storageType;
    // the BlockReader that will be built below
    BlockReader reader = null;


    try {
      DFSClientFaultInjector.get().fetchFromDatanodeException();
      Token<BlockTokenIdentifier> blockToken = block.getBlockToken();
      int len = (int) (end - start + 1);
      // the reader pulls the data from the DataNode over a socket connection
      reader = new BlockReaderFactory(dfsClient.getConf()).
          setInetSocketAddress(targetAddr).
          setRemotePeerFactory(dfsClient).
          setDatanodeInfo(chosenNode).
          setStorageType(storageType).
          setFileName(src).
          setBlock(block.getBlock()).
          setBlockToken(blockToken).
          setStartOffset(start).
          setVerifyChecksum(verifyChecksum).
          setClientName(dfsClient.clientName).
          setLength(len).
          setCachingStrategy(curCachingStrategy).
          setAllowShortCircuitLocalReads(allowShortCircuitLocalReads).
          setClientCacheContext(dfsClient.getClientContext()).
          setUserGroupInformation(dfsClient.ugi).
          setConfiguration(dfsClient.getConfiguration()).
          build();
      // read the data
      int nread = reader.readAll(buf, offset, len);
      updateReadStatistics(readStatistics, nread, reader);


      if (nread != len) {
        throw new IOException("truncated return from reader.read(): " +
                              "excpected " + len + ", got " + nread);
      }
      DFSClientFaultInjector.get().readFromDatanodeDelay();
      return;
    } catch (ChecksumException e) {
      String msg = "fetchBlockByteRange(). Got a checksum exception for "
          + src + " at " + block.getBlock() + ":" + e.getPos() + " from "
          + chosenNode;
      DFSClient.LOG.warn(msg);
      // we want to remember what we have tried
      
      addIntoCorruptedBlockMap(block.getBlock(), chosenNode, corruptedBlockMap);
      // mark this DataNode as dead so it is skipped on the next attempt
      addToDeadNodes(chosenNode);
      throw new IOException(msg);
    } catch (IOException e) {
      if (e instanceof InvalidEncryptionKeyException && refetchEncryptionKey > 0) {
        DFSClient.LOG.info("Will fetch a new encryption key and retry, " 
            + "encryption key was invalid when connecting to " + targetAddr
            + " : " + e);
        // The encryption key used is invalid.
        refetchEncryptionKey--;
        dfsClient.clearDataEncryptionKey();
        continue;
      } else if (refetchToken > 0 && tokenRefetchNeeded(e, targetAddr)) {
        refetchToken--;
        try {
          fetchBlockAt(block.getStartOffset());
        } catch (IOException fbae) {
          // ignore IOE, since we can retry it later in a loop
        }
        continue;
      } else {
        String msg = "Failed to connect to " + targetAddr + " for file "
            + src + " for block " + block.getBlock() + ":" + e;
        DFSClient.LOG.warn("Connection failure: " + msg, e);
        addToDeadNodes(chosenNode);
        throw new IOException(msg);
      }
    } finally {
      if (reader != null) {
        reader.close();
      }
    }
  }
}

This article analysed the HDFS read path: the HDFS client calls into the two core classes DistributedFileSystem and FSDataInputStream, which go through the NameNode RPC interface to the corresponding NameNode implementation to obtain the metadata of the target blocks; with that metadata, the client reads the data from the corresponding DataNodes until the last block is done, then closes the input stream and its DataNode connections. That wraps up the core source code of the HDFS read path. Reading it together with the previous article, 《Hadoop核心源码剖析(写数据)》, may give you a fuller picture; the finer details are worth digging into yourself.

Summary

This is also the final instalment of the Hadoop HDFS core source code series; it has covered essentially all of the main functional modules of HDFS. Given the limited space, some parts are not written in great detail; the aim was mainly to share my own reading of the HDFS core source code and my source-reading approach, and hopefully it offers some inspiration to fellow learners. Next I will start a new round of articles on other components. If you are interested, follow this account to keep up with the source code series, and feel free to leave a comment at the end of this article with your thoughts and expectations.
