Next, let's analyze the third function:
public ByteBuffer read(ByteBufferPool bufferPool, int maxLength, EnumSet<ReadOption> opts)
The code is as follows:
@Override
public ByteBuffer read(ByteBufferPool bufferPool, int maxLength,
    EnumSet<ReadOption> opts)
    throws IOException, UnsupportedOperationException {
  try {
    return ((HasEnhancedByteBufferAccess)in).read(bufferPool,
        maxLength, opts);
  } catch (ClassCastException e) {
    ByteBuffer buffer = ByteBufferUtil.
        fallbackRead(this, bufferPool, maxLength);
    if (buffer != null) {
      extendedReadBuffers.put(buffer, bufferPool);
    }
    return buffer;
  }
}
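The try/catch above uses a failed cast as a capability probe: if the wrapped stream does not implement HasEnhancedByteBufferAccess, the cast throws ClassCastException and the code degrades to the ordinary read path. A minimal, self-contained sketch of this pattern (the EnhancedRead interface and CapabilityProbe class here are hypothetical stand-ins, not Hadoop types):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class CapabilityProbe {
    // Hypothetical capability interface, standing in for HasEnhancedByteBufferAccess.
    interface EnhancedRead {
        ByteBuffer enhancedRead(int maxLength);
    }

    // Try the enhanced path; a ClassCastException means the stream
    // lacks the capability, so we take the fallback path instead.
    static String read(InputStream in) {
        try {
            ((EnhancedRead) in).enhancedRead(16);
            return "enhanced";
        } catch (ClassCastException e) {
            return "fallback"; // plain InputStream: use the ordinary read path
        }
    }

    public static void main(String[] args) {
        // A plain stream does not implement EnhancedRead, so the cast fails.
        System.out.println(read(new ByteArrayInputStream(new byte[4])));
    }
}
```

The cast compiles because InputStream is a non-final class, so the compiler cannot rule out a subclass that implements the interface; the check happens only at runtime.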
This code delegates to the function read(ByteBufferPool bufferPool, int maxLength, EnumSet<ReadOption> opts) in the DFSInputStream class, whose code is as follows:
@Override
/* bufferPool supplies the buffers that hold the data read out,
   maxLength is the maximum number of bytes to read,
   opts gives the read options (e.g. SKIP_CHECKSUMS, which skips checksum verification)
*/
public synchronized ByteBuffer read(ByteBufferPool bufferPool,
    int maxLength, EnumSet<ReadOption> opts)
    throws IOException, UnsupportedOperationException {
  if (maxLength == 0) {
    return EMPTY_BUFFER;
  } else if (maxLength < 0) {
    throw new IllegalArgumentException("can't read a negative " +
        "number of bytes.");
  }
  // blockReader == null or blockEnd == -1 means the current block is
  // invalid or the block object has not been initialized yet
  if ((blockReader == null) || (blockEnd == -1)) {
    // if there is no data left to read, return null
    if (pos >= getFileLength()) {
      return null;
    }
    /*
     * If we don't have a blockReader, or the one we have has no more bytes
     * left to read, we call seekToBlockSource to get a new blockReader and
     * recalculate blockEnd. Note that we assume we're not at EOF here
     * (we check this above).
     */
    /* Seek to the block containing pos. The ! here is redundant:
       seekToBlockSource either throws an exception or returns true, so
       !seekToBlockSource(pos) is always false. If it did return false,
       or blockReader is still null, an exception is thrown.
    */
    if ((!seekToBlockSource(pos)) || (blockReader == null)) {
      throw new IOException("failed to allocate new BlockReader " +
          "at position " + pos);
    }
  }
  ByteBuffer buffer = null;
  // check whether the zero-copy read path is enabled
  if (dfsClient.getConf().shortCircuitMmapEnabled) {
    buffer = tryReadZeroCopy(maxLength, opts);
  }
  if (buffer != null) {
    return buffer;
  }
  // if the zero-copy read did not succeed, degrade to an ordinary read
  buffer = ByteBufferUtil.fallbackRead(this, bufferPool, maxLength);
  if (buffer != null) {
    // record the buffer in extendedReadBuffers
    extendedReadBuffers.put(buffer, bufferPool);
  }
  return buffer;
}
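Notice that every buffer handed out by the fallback path is recorded in extendedReadBuffers, pairing each buffer with the pool that issued it; this bookkeeping is what later lets releaseBuffer return a buffer to the right pool. A self-contained sketch of that idea (using an IdentityHashMap and a String in place of Hadoop's IdentityHashStore and ByteBufferPool, which are assumptions for illustration):

```java
import java.nio.ByteBuffer;
import java.util.IdentityHashMap;
import java.util.Map;

public class BufferTracking {
    // Maps each handed-out buffer to the pool that issued it.
    // IdentityHashMap keys on buffer identity, not buffer contents.
    static final Map<ByteBuffer, String> extendedReadBuffers = new IdentityHashMap<>();

    static ByteBuffer read(int maxLength) {
        ByteBuffer buffer = ByteBuffer.allocate(maxLength); // stand-in for bufferPool.getBuffer()
        extendedReadBuffers.put(buffer, "pool");            // remember which pool owns it
        return buffer;
    }

    static void releaseBuffer(ByteBuffer buffer) {
        String pool = extendedReadBuffers.remove(buffer);   // look up the owning pool
        if (pool == null) {
            throw new IllegalArgumentException("buffer was not issued by read()");
        }
        // real code would now call bufferPool.putBuffer(buffer)
    }

    public static void main(String[] args) {
        ByteBuffer b = read(128);
        releaseBuffer(b);
        System.out.println(extendedReadBuffers.size()); // 0: buffer returned to its pool
    }
}
```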
This function first checks whether zero-copy reads are supported; if not, it falls back to an ordinary read. We will postpone the zero-copy function tryReadZeroCopy until after we have analyzed the BlockReader class. For now, let's look at the fallbackRead function:
/**
 * Perform a fallback read.
 */
public static ByteBuffer fallbackRead(
    InputStream stream, ByteBufferPool bufferPool, int maxLength)
    throws IOException {
  if (bufferPool == null) {
    throw new UnsupportedOperationException("zero-copy reads " +
        "were not available, and you did not provide a fallback " +
        "ByteBufferPool.");
  }
  // check whether the stream supports reading directly into a ByteBuffer
  boolean useDirect = streamHasByteBufferRead(stream);
  ByteBuffer buffer = bufferPool.getBuffer(useDirect, maxLength);
  if (buffer == null) {
    throw new UnsupportedOperationException("zero-copy reads " +
        "were not available, and the ByteBufferPool did not provide " +
        "us with " + (useDirect ? "a direct" : "an indirect") +
        "buffer.");
  }
  Preconditions.checkState(buffer.capacity() > 0);
  Preconditions.checkState(buffer.isDirect() == useDirect);
  maxLength = Math.min(maxLength, buffer.capacity());
  boolean success = false;
  try {
    // the stream implements the ByteBufferReadable interface
    if (useDirect) {
      // reset the buffer for writing
      buffer.clear();
      // cap the number of bytes at maxLength
      buffer.limit(maxLength);
      ByteBufferReadable readable = (ByteBufferReadable)stream;
      int totalRead = 0;
      // read the data in a loop
      while (true) {
        // stop once we have read as much as requested
        if (totalRead >= maxLength) {
          success = true;
          break;
        }
        /* Read the data. This call eventually reaches the
           read(final ByteBuffer buf) function of the DFSInputStream class,
           which is analyzed in detail in "HDFS读文件的源码(hadoop2.6.0)详解--read函数(2)",
           so we will not repeat it here.
        */
        int nRead = readable.read(buffer);
        if (nRead < 0) {
          if (totalRead > 0) {
            success = true;
          }
          break;
        }
        totalRead += nRead;
      }
      buffer.flip();
    } else {
      buffer.clear();
      /* Call InputStream.read, which eventually reaches the
         readWithStrategy(ReaderStrategy strategy, int off, int len) function
         of the DFSInputStream class, analyzed in detail in
         "HDFS读文件的源码(hadoop2.6.0)详解--read函数(2)", so we will not repeat it here.
      */
      int nRead = stream.read(buffer.array(),
          buffer.arrayOffset(), maxLength);
      if (nRead >= 0) {
        buffer.limit(nRead);
        success = true;
      }
    }
  } finally {
    // the read failed or there was no data left to read
    if (!success) {
      // If we got an error while reading, or if we are at EOF, we
      // don't need the buffer any more. We can give it back to the
      // bufferPool.
      bufferPool.putBuffer(buffer);
      buffer = null;
    }
  }
  return buffer;
}
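The direct-read branch relies on the standard ByteBuffer write-then-read discipline: clear() resets the buffer for writing, limit(maxLength) caps how much the reader may fill, and flip() after the loop exposes exactly the bytes that were read to the caller. A minimal JDK-only illustration of that sequence (with a put() standing in for readable.read(buffer)):

```java
import java.nio.ByteBuffer;

public class FlipDemo {
    // Mirrors the clear() / limit() / flip() sequence used in the
    // direct-read branch of fallbackRead.
    static String writeThenFlip() {
        int maxLength = 4;
        ByteBuffer buffer = ByteBuffer.allocate(8);
        buffer.clear();            // position = 0, limit = capacity: ready for writing
        buffer.limit(maxLength);   // accept at most maxLength bytes
        buffer.put(new byte[]{'H', 'D', 'F', 'S'}); // stands in for readable.read(buffer)
        buffer.flip();             // position = 0, limit = 4: ready for reading
        byte[] out = new byte[buffer.remaining()];
        buffer.get(out);
        return new String(out);
    }

    public static void main(String[] args) {
        System.out.println(writeThenFlip()); // prints HDFS
    }
}
```

Without the flip(), position would still sit at the end of the written data and the caller would see an empty region; this is why the loop's exit paths all flow into buffer.flip() before returning.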
This completes the analysis of this read function. Figure 1 summarizes the overall flow:
![read(3)流程图](https://img-blog.csdnimg.cn/2018123000290528.png?x-oss-process=image/watermark,type_ZmFuZ3poZW5naGVpdGk,shadow_10,text_aHR0cHM6Ly9ibG9nLmNzZG4ubmV0L3dlaXhpbl8zOTkzNTg4Nw==,size_16,color_FFFFFF,t_70)
Next, we will analyze the last function in "HDFS读文件的源码(hadoop2.6.0)详解--read函数(4)".