Overview of the HDFS File Read Path

We previously covered the read functions in the FSDataInputStream class (four posts in total: read(1), read(2), read(3), read(4)). Those functions call the corresponding read functions in the DFSInputStream class, which are:

// First read function
public synchronized int read()

// Second read function
public synchronized int read(final byte buf[], int off, int len)

// Third read function
public synchronized int read(final ByteBuffer buf)

// Fourth read function
public int read(long position, byte[] buffer, int offset, int length)

// Fifth read function
public synchronized ByteBuffer read(ByteBufferPool bufferPool, int maxLength, EnumSet<ReadOption> opts)
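Of the five signatures, only the fourth takes an explicit position and is not declared synchronized. One reasonable reading is that a positional read does not move the stream's shared offset, so it does not have to serialize with the other reads. The sketch below illustrates that distinction with the JDK's FileChannel instead of the HDFS classes; PositionalReadDemo and positionsAfterReads are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: an analogy built on java.nio, not on DFSInputStream.
class PositionalReadDemo {

    // Returns {channel position after a sequential read,
    //          channel position after a following positional read}.
    static long[] positionsAfterReads() {
        try {
            Path tmp = Files.createTempFile("pread-demo", ".bin");
            try {
                Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
                try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                    ByteBuffer buf = ByteBuffer.allocate(4);

                    // Sequential read: consumes bytes from the current position
                    // and advances it, like read(buf, off, len) on a stream.
                    ch.read(buf);
                    long afterSequential = ch.position();

                    // Positional read: reads at an explicit offset and leaves
                    // the channel's own position untouched.
                    buf.clear();
                    ch.read(buf, 6L);
                    long afterPositional = ch.position();

                    return new long[] {afterSequential, afterPositional};
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] p = positionsAfterReads();
        System.out.println("position after sequential read: " + p[0]);
        System.out.println("position after positional read: " + p[1]);
    }
}
```

The two printed positions are equal: the positional read returned data from a different offset without disturbing the sequential cursor.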

 

All five read functions create a BlockReaderFactory object and call its build function. The code of that function is:

/**
   * Build a BlockReader with the given options.
   *
   * This function will do the best it can to create a block reader that meets
   * all of our requirements.  We prefer short-circuit block readers
   * (BlockReaderLocal and BlockReaderLocalLegacy) over remote ones, since the
   * former avoid the overhead of socket communication.  If short-circuit is
   * unavailable, our next fallback is data transfer over UNIX domain sockets,
   * if dfs.client.domain.socket.data.traffic has been enabled.  If that doesn't
   * work, we will try to create a remote block reader that operates over TCP
   * sockets.
   *
   * There are a few caches that are important here.
   *
   * The ShortCircuitCache stores file descriptor objects which have been passed
   * from the DataNode. 
   *
   * The DomainSocketFactory stores information about UNIX domain socket paths
   * that we have not been able to use in the past, so that we don't waste time
   * retrying them over and over.  (Like all the caches, it does have a timeout,
   * though.)
   *
   * The PeerCache stores peers that we have used in the past.  If we can reuse
   * one of these peers, we avoid the overhead of re-opening a socket.  However,
   * if the socket has been timed out on the remote end, our attempt to reuse
   * the socket may end with an IOException.  For that reason, we limit our
   * attempts at socket reuse to dfs.client.cached.conn.retry times.  After
   * that, we create new sockets.  This avoids the problem where a thread tries
   * to talk to a peer that it hasn't talked to in a while, and has to clean out
   * every entry in a socket cache full of stale entries.
   *
   * @return The new BlockReader.  We will not return null.
   *
   * @throws InvalidToken
   *             If the block token was invalid.
   *         InvalidEncryptionKeyException
   *             If the encryption key was invalid.
   *         Other IOException
   *             If there was another problem.
   */
  public BlockReader build() throws IOException {
    BlockReader reader = null;

    Preconditions.checkNotNull(configuration);
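    // If short-circuit reads are allowed
    if (conf.shortCircuitLocalReads && allowShortCircuitLocalReads) {
      // Check whether the legacy (HDFS-2246) short-circuit read is in use. In that
      // mode the client gets the file path from the DataNode over RPC and then
      // reads the data directly through that path. Because the client can browse
      // all of the file data this way, it is not very safe.
      if (clientContext.getUseLegacyBlockReaderLocal()) {
        // Get a BlockReaderLocalLegacy object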
        reader = getLegacyBlockReaderLocal();
        if (reader != null) {
          if (LOG.isTraceEnabled()) {
            LOG.trace(this + ": returning new legacy block reader local.");
          }
          return reader;
        }
      } else { // The legacy mode is not in use, so try the new (HDFS-347) short-circuit read
        reader = getBlockReaderLocal();
        if (reader != null) {
          if (LOG.isTraceEnabled()) {
            LOG.trace(this + ": returning new block reader local.");
          }
          return reader;
        }
      }
    }
    if (conf.domainSocketDataTraffic) {
      reader = getRemoteBlockReaderFromDomain();
      if (reader != null) {
        if (LOG.isTraceEnabled()) {
          LOG.trace(this + ": returning new remote block reader using " +
              "UNIX domain socket on " + pathInfo.getPath());
        }
        return reader;
      }
    }
    Preconditions.checkState(!DFSInputStream.tcpReadsDisabledForTesting,
        "TCP reads were disabled for testing, but we failed to " +
        "do a non-TCP read.");
    return getRemoteBlockReaderFromTcp();
  }
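The fallback order in build() can be summarized in a small, self-contained sketch. It does not use the Hadoop classes: ReaderKind, attemptOrder, and the three boolean parameters are hypothetical stand-ins for `conf.shortCircuitLocalReads && allowShortCircuitLocalReads`, `clientContext.getUseLegacyBlockReaderLocal()`, and `conf.domainSocketDataTraffic`. It only lists which reader kinds are attempted and in what order; in the real code each attempt before TCP may return null, in which case the next one is tried.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the decision order in build().
class BlockReaderFallbackSketch {

    enum ReaderKind { LOCAL_LEGACY, LOCAL, REMOTE_DOMAIN_SOCKET, REMOTE_TCP }

    // Returns the reader kinds build() would attempt, in order.
    static List<ReaderKind> attemptOrder(boolean shortCircuitAllowed,
                                         boolean useLegacyLocal,
                                         boolean domainSocketDataTraffic) {
        List<ReaderKind> order = new ArrayList<>();
        if (shortCircuitAllowed) {
            // Exactly one of the two short-circuit readers is attempted;
            // a failed legacy attempt does not fall back to the new one.
            order.add(useLegacyLocal ? ReaderKind.LOCAL_LEGACY : ReaderKind.LOCAL);
        }
        if (domainSocketDataTraffic) {
            // Regular data transfer, but over a UNIX domain socket.
            order.add(ReaderKind.REMOTE_DOMAIN_SOCKET);
        }
        // TCP is always the last resort.
        order.add(ReaderKind.REMOTE_TCP);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(attemptOrder(true, false, true));
        System.out.println(attemptOrder(false, false, false));
    }
}
```

With every option enabled and the legacy mode off, the order is LOCAL, REMOTE_DOMAIN_SOCKET, REMOTE_TCP; with everything disabled, only REMOTE_TCP remains.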

The build function creates an object of a class that implements the BlockReader interface. The implementation hierarchy is as follows:

[Figure: BlockReader class diagram]

Here:

1. getLegacyBlockReaderLocal returns a BlockReaderLocalLegacy object;

2. getBlockReaderLocal returns a BlockReaderLocal object;

3. getRemoteBlockReaderFromDomain and getRemoteBlockReaderFromTcp both return a RemoteBlockReader2 object.

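        The objects in items 1 and 2 serve short-circuit reads; item 1 is the older version and is now deprecated. A short-circuit read applies when the client and the DataNode run on the same server: the data then has no need to travel over the network, and the client reads it directly. The two versions work as follows.

The old version (HDFS-2246)

The client obtains, over RPC, the absolute paths of the data file and its checksum file on the DataNode, and then reads both files directly through those paths.

The new version (HDFS-347)

This version uses UNIX domain sockets, an inter-process communication mechanism that lets two processes on the same machine talk through a socket interface and also pass file descriptors between them. The DataNode process passes the file descriptors of the data file and its checksum file to the client process over the domain socket, and the client then reads the data through those descriptors.

The drawback of the old version is that it hands absolute file paths to the client, so an outside process gets direct access to the files on the DataNode, including the ability to operate on them outside HDFS itself, which is a security problem. The new version avoids this: the domain socket carries read-only file descriptors to the client, so outside processes cannot modify the files.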
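The effect of handing out a read-only descriptor instead of a path can be tried with the JDK alone. The sketch below is only an analogy: it stays inside one process, whereas HDFS passes the descriptors between processes over the domain socket, which the classes used here cannot do. ReadOnlyDescriptorSketch and its method names are hypothetical, and the temporary file merely stands in for a block file.

```java
import java.io.FileDescriptor;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical single-process analogy for reading through a read-only descriptor.
class ReadOnlyDescriptorSketch {

    // Reads up to max bytes through the descriptor; the caller never sees a path.
    static String readThrough(FileDescriptor fd, int max) throws IOException {
        // Not closed here: the descriptor's owner is responsible for closing it.
        FileInputStream in = new FileInputStream(fd);
        byte[] buf = new byte[max];
        int n = in.read(buf);
        return n <= 0 ? "" : new String(buf, 0, n, StandardCharsets.US_ASCII);
    }

    // Tries to write through the same descriptor and reports whether it worked.
    static boolean writeSucceeds(FileDescriptor fd) {
        try {
            new FileOutputStream(fd).write('x');
            return true;
        } catch (IOException e) {
            // The descriptor was opened for reading only, so the write is refused.
            return false;
        }
    }

    // Returns {data read through the descriptor, "true"/"false" for the write attempt}.
    static String[] demo() {
        try {
            Path blockFile = Files.createTempFile("blk_demo", ".data");
            try {
                Files.write(blockFile, "block-data".getBytes(StandardCharsets.US_ASCII));
                // The "owner" opens the file read-only and shares only the descriptor.
                try (FileInputStream owner = new FileInputStream(blockFile.toFile())) {
                    FileDescriptor fd = owner.getFD();
                    String data = readThrough(fd, 5);
                    boolean wrote = writeSucceeds(fd);
                    return new String[] {data, String.valueOf(wrote)};
                }
            } finally {
                Files.deleteIfExists(blockFile);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("read: " + r[0] + ", write succeeded: " + r[1]);
    }
}
```

The holder of the descriptor can read the data but has neither the path nor the ability to write, which is the property the HDFS-347 design relies on.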

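       The two functions in item 3 read the data file over a domain socket and over TCP, respectively.

In the end, every file read lands in the read functions implemented by BlockReaderLocalLegacy, BlockReaderLocal, and RemoteBlockReader2. We will analyze each of these read functions in later posts.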
