HDFS读文件的源码(hadoop2.6.0)详解--read函数(2)

接下来我们开始分析第二个函数。

针对public int read(ByteBuffer buf)函数

代码如下:

@Override
  public int read(ByteBuffer buf) throws IOException {
    if (in instanceof ByteBufferReadable) {
      return ((ByteBufferReadable)in).read(buf);
    }

    throw new UnsupportedOperationException("Byte-buffer read unsupported by input stream");
  }

这个函数会调用DFSInputStream类中的相应read函数,代码如下:

@Override
  public synchronized int read(final ByteBuffer buf) throws IOException {
    ReaderStrategy byteBufferReader = new ByteBufferStrategy(buf);

    return readWithStrategy(byteBufferReader, 0, buf.remaining());
  }

这里首先创建了一个ByteBufferStrategy类对象,我们看看这个类的代码:

/**
   * Used to read bytes into a user-supplied ByteBuffer
   */
  private static class ByteBufferStrategy implements ReaderStrategy {
    final ByteBuffer buf;
    ByteBufferStrategy(ByteBuffer buf) {
       //将ByteBuffer类对象buf的引用赋值给相应的成员变量
       this.buf = buf;
    }

    //这个函数是一个重载函数,后面读取数据的时候会调用到
    @Override
    public int doRead(BlockReader blockReader, int off, int len,
        ReadStatistics readStatistics) throws ChecksumException, IOException {
      int oldpos = buf.position();
      int oldlimit = buf.limit();
      boolean success = false;
      try {
    	//读取数据到buf变量中
        int ret = blockReader.read(buf);
        success = true;
        //更新相应的统计数据
        updateReadStatistics(readStatistics, ret, blockReader);
        return ret;
      } finally {
        if (!success) {
          // Reset to original state so that retries work correctly.
          //如果读取失败了,那么就将存储数据的buf变量重置
          buf.position(oldpos);
          buf.limit(oldlimit);
        }
      } 
    }
  }

这个类中的函数doRead后面会被调用,用来读取数据。

我们继续看readWithStrategy函数,代码如下:

private int readWithStrategy(ReaderStrategy strategy, int off, int len) throws IOException {
	dfsClient.checkOpen();
    if (closed) {
      throw new IOException("Stream closed");
    }
    Map<ExtendedBlock,Set<DatanodeInfo>> corruptedBlockMap 
      = new HashMap<ExtendedBlock, Set<DatanodeInfo>>();
    failures = 0;
    //如果当前读取文件的位置小于当前总数据的大小
    if (pos < getFileLength()) {
      //重试次数
      int retries = 2;
      while (retries > 0) {
        try {
          // currentNode can be left as null if previous read had a checksum
          // error on the same block. See HDFS-3067
          //如果pos超过数据块边界或者当前currentNode为null,需要先更新
          if (pos > blockEnd || currentNode == null) {
        	//获取该位置所在的块
            currentNode = blockSeekTo(pos);
          }
          /*获取当前读取读取的数据大小和当前块从pos开始能够读取数据大小两者之间的最小值
           * 1、如果len小,那么就从当前块中读取len大小的数据
           * 2、如果(blockEnd - pos + 1L)小,那么就读取从pos开始当前块所有数据
           * 综上就是为了保证不至于读多数据,也不至于读越界
          */
          int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
          //如果最后一个数据块完成写了
          if (locatedBlocks.isLastBlockComplete()) {
            realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
          }
          //开始读取数据
          int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
          //如果读取数据大于=0
          if (result >= 0) {
        	//更新当前读取数据的位置
            pos += result;
          } else {
            // got a EOS from reader though we expect more data on it.
            throw new IOException("Unexpected EOS from the reader");
          }
          if (dfsClient.stats != null) {
        	//统计读取数据大小
            dfsClient.stats.incrementBytesRead(result);
          }
          return result;
        } catch (ChecksumException ce) {
          throw ce;            
        } catch (IOException e) {
          if (retries == 1) {
            DFSClient.LOG.warn("DFS Read", e);
          }
          blockEnd = -1;
          if (currentNode != null) { addToDeadNodes(currentNode); }
          //重试次数超过了2次,那么就抛出异常
          if (--retries == 0) {
            throw e;
          }
        } finally {
          // Check if need to report block replicas corruption either read
          // was successful or ChecksumException occured.
          //上报校验失败的datanode信息到namenode
          reportCheckSumFailure(corruptedBlockMap, 
              currentLocatedBlock.getLocations().length);
        }
      }
    }
    return -1;
  }

我们来看看blockSeekTo函数,这个函数用来根据pos获取相应的datanode信息,代码如下:

/**
   * Open a DataInputStream to a DataNode so that it can be read from.
   * We get block ID and the IDs of the destinations at startup, from the namenode.
   */
  private synchronized DatanodeInfo blockSeekTo(long target) throws IOException {
    if (target >= getFileLength()) {
      throw new IOException("Attempted to read past end of file");
    }

    // Will be getting a new BlockReader.
    if (blockReader != null) {
      blockReader.close();
      blockReader = null;
    }

    //
    // Connect to best DataNode for desired Block, with potential offset
    //
    DatanodeInfo chosenNode = null;
    int refetchToken = 1; // only need to get a new access token once
    int refetchEncryptionKey = 1; // only need to get a new encryption key once
    
    boolean connectFailedOnce = false;

    while (true) {
      //
      // Compute desired block
      //根据offset获取相应的块信息
      LocatedBlock targetBlock = getBlockAt(target, true);
      assert (target==pos) : "Wrong postion " + pos + " expect " + target;
      long offsetIntoBlock = target - targetBlock.getStartOffset();
      //根据targetBlock的块信息,获取到指定块所在的datanode信息,然后通过chooseDataNode选择符合要求的datanode,并将信息返回
      DNAddrPair retval = chooseDataNode(targetBlock, null);
      chosenNode = retval.info;
      InetSocketAddress targetAddr = retval.addr;
      StorageType storageType = retval.storageType;

      try {
        ExtendedBlock blk = targetBlock.getBlock();
        Token<BlockTokenIdentifier> accessToken = targetBlock.getBlockToken();
        //创建BlockReader类对象
        blockReader = new BlockReaderFactory(dfsClient.getConf()).
            setInetSocketAddress(targetAddr).
            setRemotePeerFactory(dfsClient).
            setDatanodeInfo(chosenNode).
            setStorageType(storageType).
            setFileName(src).
            setBlock(blk).
            setBlockToken(accessToken).
            setStartOffset(offsetIntoBlock).
            setVerifyChecksum(verifyChecksum).
            setClientName(dfsClient.clientName).
            setLength(blk.getNumBytes() - offsetIntoBlock).
            setCachingStrategy(cachingStrategy).
            setAllowShortCircuitLocalReads(!shortCircuitForbidden()).
            setClientCacheContext(dfsClient.getClientContext()).
            setUserGroupInformation(dfsClient.ugi).
            setConfiguration(dfsClient.getConfiguration()).
            build();
        //如果之前失败过,那么这里成功了那么会打印日志
        if(connectFailedOnce) {
          DFSClient.LOG.info("Successfully connected to " + targetAddr +
                             " for " + blk);
        }
        return chosenNode;
      } catch (IOException ex) {
        if (ex instanceof InvalidEncryptionKeyException && refetchEncryptionKey > 0) {
          DFSClient.LOG.info("Will fetch a new encryption key and retry, " 
              + "encryption key was invalid when connecting to " + targetAddr
              + " : " + ex);
          // The encryption key used is invalid.
          refetchEncryptionKey--;
          dfsClient.clearDataEncryptionKey();
        } else if (refetchToken > 0 && tokenRefetchNeeded(ex, targetAddr)) {
          refetchToken--;
          fetchBlockAt(target);
        } else {
          connectFailedOnce = true;
          DFSClient.LOG.warn("Failed to connect to " + targetAddr + " for block"
            + ", add to deadNodes and continue. " + ex, ex);
          // Put chosen node into dead list, continue
          addToDeadNodes(chosenNode);
        }
      }
    }
  }

这里函数chooseDataNode我们在read函数(1)中已经讲过,这里不再赘述,跳转到相应的网页查看

同样这里拿到了datanode信息后就会去创建BlockReader类对象,这个类对象用来读取相应的块数据,后面会详细讲到。

getBlockAt函数用来获取数据块信息,优先到缓存中查找,找不到再到namenode去获取同时更新缓存中datanode的信息数据,代码如下:

/**
   * Get block at the specified position.
   * Fetch it from the namenode if not cached.
   * 
   * @param offset block corresponding to this offset in file is returned
   * @param updatePosition whether to update current position
   * @return located block
   * @throws IOException
   */
  private synchronized LocatedBlock getBlockAt(long offset,
      boolean updatePosition) throws IOException {
    assert (locatedBlocks != null) : "locatedBlocks is null";

    final LocatedBlock blk;

    //check offset
    if (offset < 0 || offset >= getFileLength()) {
      throw new IOException("offset < 0 || offset >= getFileLength(), offset="
          + offset
          + ", updatePosition=" + updatePosition
          + ", locatedBlocks=" + locatedBlocks);
    }
    else if (offset >= locatedBlocks.getFileLength()) {
      // offset to the portion of the last block,
      // which is not known to the name-node yet;
      // getting the last block 
      blk = locatedBlocks.getLastLocatedBlock();
    }
    else {
      // search cached blocks first
      // 先到缓存中找,采用二分查找法
      int targetBlockIdx = locatedBlocks.findBlock(offset);
      if (targetBlockIdx < 0) { // block is not cached
    	//如果小于0,说明没有找到,此时将targetBlockIdx取反,得到正数,也就是后面紧随的id号
        targetBlockIdx = LocatedBlocks.getInsertIndex(targetBlockIdx);
        // fetch more blocks 查找更多的数据块
        final LocatedBlocks newBlocks = dfsClient.getLocatedBlocks(src, offset);
        assert (newBlocks != null) : "Could not find target position " + offset;//如果assert为false,那么就抛出java.lang.AssertionError异常
        //将新查到的块插入到原有的块队列中
        locatedBlocks.insertRange(targetBlockIdx, newBlocks.getLocatedBlocks());
      }
      //设置指定的块
      blk = locatedBlocks.get(targetBlockIdx);
    }

    // update current position
    if (updatePosition) {
      pos = offset;
      blockEnd = blk.getStartOffset() + blk.getBlockSize() - 1;
      currentLocatedBlock = blk;
    }
    return blk;
  }

至此blockSeekTo函数的流程就走完了,我们开始读数据,调用readBuffer函数,代码如下:

/* This is a used by regular read() and handles ChecksumExceptions.
   * name readBuffer() is chosen to imply similarity to readBuffer() in
   * ChecksumFileSystem
   */ 
  private synchronized int readBuffer(ReaderStrategy reader, int off, int len,
      Map<ExtendedBlock, Set<DatanodeInfo>> corruptedBlockMap)
      throws IOException {
    IOException ioe;
    
    /* we retry current node only once. So this is set to true only here.
     * Intention is to handle one common case of an error that is not a
     * failure on datanode or client : when DataNode closes the connection
     * since client is idle. If there are other cases of "non-errors" then
     * then a datanode might be retried by setting this to true again.
     */
    boolean retryCurrentNode = true;

    while (true) {
      // retry as many times as seekToNewSource allows.
      try {
        return reader.doRead(blockReader, off, len, readStatistics);
      } catch ( ChecksumException ce ) {
        DFSClient.LOG.warn("Found Checksum error for "
            + getCurrentBlock() + " from " + currentNode
            + " at " + ce.getPos());        
        ioe = ce;
        retryCurrentNode = false;
        // we want to remember which block replicas we have tried
        addIntoCorruptedBlockMap(getCurrentBlock(), currentNode,
            corruptedBlockMap);
      } catch ( IOException e ) {
        if (!retryCurrentNode) {
          DFSClient.LOG.warn("Exception while reading from "
              + getCurrentBlock() + " of " + src + " from "
              + currentNode, e);
        }
        ioe = e;
      }
      boolean sourceFound = false;
      if (retryCurrentNode) {
        /* possibly retry the same node so that transient errors don't
         * result in application level failures (e.g. Datanode could have
         * closed the connection because the client is idle for too long).
         * 获取当前pos所在的datanode并更新
         */ 
        sourceFound = seekToBlockSource(pos);
      } else {//如果之前已经重试过了,那么就将当前currentNode加入死亡队列
        addToDeadNodes(currentNode);
        sourceFound = seekToNewSource(pos);
      }
      //如果没有发现新的datanode,那么就抛出异常
      if (!sourceFound) {
        throw ioe;
      }
      retryCurrentNode = false;
    }
  }

通过调用doRead函数来获取块数据,doRead函数属于类ByteBufferStrategy中,该类代码如下:

/**
   * Used to read bytes into a user-supplied ByteBuffer
   */
  private static class ByteBufferStrategy implements ReaderStrategy {
    final ByteBuffer buf;
    ByteBufferStrategy(ByteBuffer buf) {
      this.buf = buf;
    }

    @Override
    public int doRead(BlockReader blockReader, int off, int len,
        ReadStatistics readStatistics) throws ChecksumException, IOException {
      int oldpos = buf.position();
      int oldlimit = buf.limit();
      boolean success = false;
      try {
    	//读取数据到buf变量中
        int ret = blockReader.read(buf);
        success = true;
        //更新相应的统计数据
        updateReadStatistics(readStatistics, ret, blockReader);
        return ret;
      } finally {
        if (!success) {
          // Reset to original state so that retries work correctly.
          //如果读取失败了,那么就将存储数据的buf变量重置
          buf.position(oldpos);
          buf.limit(oldlimit);
        }
      } 
    }
  }

可以看到,最终doRead函数中会执行BlockReader类对象blockReader中的read函数。

read函数整个过程大致就分析完毕了,我们总结一下,如图1

read调用流程图
图1

接下来我们开始分析第三个函数"HDFS读文件的源码(hadoop2.6.0)详解--read函数(3)"

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值