HDFS Source Code: The Write Data Flow

Hadoop version: 2.7.0

Preface

This post analyzes the HDFS write data flow, covering the client, the output-stream wrapper objects, chunks, packets, the pipeline, the dataQueue and the ackQueue. There is a lot of ground to cover, so we will work through it step by step. I use the copyFromLocalFile() method as the entry point and trace the whole write path through the source code.

Key Concepts

  • DFSOutputStream overview
    Data written by the client application is buffered internally by this stream. The data is broken up into packets, each typically 64 KB in size; each packet consists of several chunks, and each chunk is typically 512 bytes and carries its own checksum. When the client fills up the current packet, the packet is appended to the dataQueue. The DataStreamer thread takes packets off the dataQueue, sends each one to the first DataNode in the pipeline, and moves it from the dataQueue to the ackQueue. The ResponseProcessor receives the acks sent back by the DataNodes; once a successful ack for a packet has arrived from every DataNode, the packet is removed from the ackQueue, and only then does it count as written. On error, every packet still in the ackQueue is moved back to the dataQueue (the packets in the ackQueue are exactly the ones not yet confirmed as written); a new pipeline is set up with the bad DataNode removed, and the packets in the dataQueue are resent through it. A minimal sketch of these mechanics appears right after this list.
    /****************************************************************
     * DFSOutputStream creates files from a stream of bytes.
     *
     * The client application writes data that is cached internally by
     * this stream. Data is broken up into packets, each packet is
     * typically 64K in size. A packet comprises of chunks. Each chunk
     * is typically 512 bytes and has an associated checksum with it.
     *
     * When a client application fills up the currentPacket, it is
     * enqueued into dataQueue.  The DataStreamer thread picks up
     * packets from the dataQueue, sends it to the first datanode in
     * the pipeline and moves it from the dataQueue to the ackQueue.
     * The ResponseProcessor receives acks from the datanodes. When an
     * successful ack for a packet is received from all datanodes, the
     * ResponseProcessor removes the corresponding packet from the
     * ackQueue.
     *
     * In case of error, all outstanding packets are moved from
     * ackQueue. A new pipeline is setup by eliminating the bad
     * datanode from the original pipeline. The DataStreamer now
     * starts sending packets from the dataQueue.
    ****************************************************************/
    
  • DFSPacket & chunk
    [Figure: DFSPacket layout, 512-byte chunks each followed by its checksum, packed into one packet]
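To make the chunk / packet / queue relationship concrete, here is a minimal, self-contained sketch of the same mechanics. It is not the Hadoop code: class names such as WritePathSketch and DemoPacket are made up, and the 512-byte chunk size, CRC32 checksum and 64 KB packet budget are simply the typical values quoted above.

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.zip.CRC32;

/**
 * Toy model of how DFSOutputStream buffers data: bytes are split into
 * 512-byte chunks, each chunk gets a checksum, chunks are packed into
 * roughly 64 KB packets, and full packets are pushed onto a dataQueue.
 * A streamer thread moves each packet to the ackQueue after sending it
 * and drops it once every DataNode has acknowledged it.
 */
public class WritePathSketch {
  static final int CHUNK_SIZE = 512;              // bytes of user data per chunk
  static final int PACKET_DATA_SIZE = 64 * 1024;  // rough packet payload budget

  /** Stand-in for DFSPacket: a batch of chunks plus their checksums. */
  static class DemoPacket {
    final List<byte[]> chunks = new ArrayList<>();
    final List<Long> checksums = new ArrayList<>();
    int dataLen = 0;
    boolean isFull() { return dataLen + CHUNK_SIZE > PACKET_DATA_SIZE; }
  }

  // dataQueue: packets waiting to be sent; ackQueue: sent but not yet acked.
  final LinkedBlockingDeque<DemoPacket> dataQueue = new LinkedBlockingDeque<>();
  final LinkedBlockingDeque<DemoPacket> ackQueue = new LinkedBlockingDeque<>();
  DemoPacket current = new DemoPacket();

  /** Called for every 512-byte slice of user data (cf. writeChunk()). */
  void writeChunk(byte[] chunk) {
    CRC32 crc = new CRC32();
    crc.update(chunk);
    current.chunks.add(chunk);
    current.checksums.add(crc.getValue());
    current.dataLen += chunk.length;
    if (current.isFull()) {            // packet full: hand it to the streamer
      dataQueue.addLast(current);
      current = new DemoPacket();
    }
  }

  /** What a DataStreamer-like thread does for one packet. */
  void sendOnePacket() throws InterruptedException {
    DemoPacket p = dataQueue.takeFirst();  // blocks until a packet is ready
    // ... write p to the first DataNode of the pipeline here ...
    ackQueue.addLast(p);                   // keep it until every DataNode acks
  }

  /** What a ResponseProcessor-like thread does on a fully successful ack. */
  void onAllDatanodesAcked() {
    ackQueue.pollFirst();                  // the packet is now durable
  }

  /** On failure: unacked packets go back to the front of the dataQueue. */
  void onPipelineError() {
    while (!ackQueue.isEmpty()) {
      dataQueue.addFirst(ackQueue.pollLast());
    }
    // then rebuild the pipeline without the bad DataNode and resend
  }
}

Keeping the unacked packets in a separate ackQueue is what lets the client replay exactly the unconfirmed data when a DataNode in the pipeline fails.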

I. Overall Write Flow

1. Establishing the pipeline
[Figure: establishing the write pipeline]
1.1 Client entry-point code (the paths in the snippet below are placeholders)

FileSystem fileSystem = FileSystem.get(new Configuration());
// copyFromLocalFile() needs a source and a destination; these paths are illustrative
fileSystem.copyFromLocalFile(new Path("/local/data.txt"), new Path("/user/demo/data.txt"));

1.2 The copy itself is done recursively. Before any data is written, the destination is checked against the NameNode, and only then does the actual stream I/O happen. The streams involved are built with the decorator pattern; the main output-stream object handed to the caller is FSDataOutputStream.

/** Copy files between FileSystems. */
  public static boolean copy(FileSystem srcFS, FileStatus srcStatus,
                             FileSystem dstFS, Path dst,
                             boolean deleteSource,
                             boolean overwrite,
                             Configuration conf) throws IOException {
    Path src = srcStatus.getPath();
    // check whether the destination already exists
    dst = checkDest(src.getName(), dstFS, dst, overwrite);
    // directory: create it recursively through the NameNode
    if (srcStatus.isDirectory()) {
      // dependency check: cannot copy a directory onto itself or into one of its subdirectories
      checkDependencies(srcFS, src, dstFS, dst);
      // create the directory
      // TODO: the NameNode EditLog mechanics are covered in a later post
      if (!dstFS.mkdirs(dst)) {
        return false;
      }
      FileStatus contents[] = srcFS.listStatus(src);
      // recursively copy every entry under the directory
      for (int i = 0; i < contents.length; i++) {
        copy(srcFS, contents[i], dstFS,
             new Path(dst, contents[i].getPath().getName()),
             deleteSource, overwrite, conf);
      }
    } else {
      // plain file: open the source and write it to the destination stream
      InputStream in = null;
      OutputStream out = null;
      try {
        in = srcFS.open(src);
        // dstFS.create() returns the wrapped DFSOutputStream
        out = dstFS.create(dst, overwrite);
        IOUtils.copyBytes(in, out, conf, true);
      } catch (IOException e) {
        IOUtils.closeStream(out);
        IOUtils.closeStream(in);
        throw e;
      }
    }
    if (deleteSource) {
      return srcFS.delete(src, true);
    } else {
      return true;
    }
  }
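For reference, the recursive helper above can also be driven directly. A minimal sketch, assuming a local source file and an HDFS destination (both paths are made up); as far as I can tell from the 2.7.0 code, copyFromLocalFile() ultimately lands in this same FileUtil.copy():

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class FileUtilCopyDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem local = FileSystem.getLocal(conf);
    FileSystem hdfs = FileSystem.get(conf);
    // Drives the recursive copy() shown above.
    boolean ok = FileUtil.copy(local, new Path("/local/data.txt"),      // illustrative source
                               hdfs, new Path("/user/demo/data.txt"),   // illustrative destination
                               false /* deleteSource */, conf);
    System.out.println("copy succeeded: " + ok);
  }
}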

1.3 Creating the FSDataOutputStream object, which is done by DistributedFileSystem.

  @Override
  public FSDataOutputStream create(final Path f, final FsPermission permission,
    final EnumSet<CreateFlag> cflags, final int bufferSize,
    final short replication, final long blockSize, final Progressable progress,
    final ChecksumOpt checksumOpt) throws IOException {
    statistics.incrementWriteOps(1);
    Path absF = fixRelativePart(f);
    return new FileSystemLinkResolver<FSDataOutputStream>() {
      @Override
      public FSDataOutputStream doCall(final Path p)
          throws IOException, UnresolvedLinkException {
        final DFSOutputStream dfsos = dfs.create(getPathName(p), permission,
                cflags, replication, blockSize, progress, bufferSize,
                checksumOpt);
        return dfs.createWrappedOutputStream(dfsos, statistics);
      }
      @Override
      public FSDataOutputStream next(final FileSystem fs, final Path p)
          throws IOException {
        return fs.create(p, permission, cflags, bufferSize,
            replication, blockSize, progress, checksumOpt);
      }
    }.resolve(this, absF);
  }
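The doCall() branch above builds a DFSOutputStream and then hands it to dfs.createWrappedOutputStream(), so what the caller gets back is a decorator. A minimal sketch of what that layering looks like from the application side (the path is made up, and the class names in the comments reflect my reading of the 2.7.0 code):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WrapperDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // On HDFS, create() returns an HdfsDataOutputStream (a subclass of
    // FSDataOutputStream) that decorates the DFSOutputStream doing the
    // real work: chunking, packet building and pipeline writes.
    FSDataOutputStream out = fs.create(new Path("/tmp/demo.txt"));  // illustrative path
    System.out.println(out.getClass().getSimpleName());                    // expected: HdfsDataOutputStream
    System.out.println(out.getWrappedStream().getClass().getSimpleName()); // expected: DFSOutputStream
    out.close();
    fs.close();
  }
}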

1.4 Establishing the pipeline
① Client side: DFSOutputStream#createBlockOutputStream()

          assert null == s : "Previous socket unclosed";
          assert null == blockReplyStream : "Previous blockReplyStream unclosed";
          // open the socket to the first DataNode of the pipeline
          s = createSocketForPipeline(nodes[0], nodes.length, dfsClient);
          long writeTimeout = dfsClient.getDatanodeWriteTimeout(nodes.length);

          OutputStream unbufOut = NetUtils.getOutputStream(s, writeTimeout);
          InputStream unbufIn = NetUtils.getInputStream(s);
          IOStreamPair saslStreams = dfsClient.saslClient.socketSend(s,
            unbufOut, unbufIn, dfsClient, accessToken, nodes[0]);
          unbufOut = saslStreams.out;
          unbufIn = saslStreams.in;
          out = new DataOutputStream(new BufferedOutputStream(unbufOut,
              HdfsConstants.SMALL_BUFFER_SIZE));
          blockReplyStream = new DataInputStream(unbufIn);

          //
          // Xmit header info to datanode
          //

          BlockConstructionStage bcs = recoveryFlag? stage.getRecoveryStage(): stage;

          // We cannot change the block length in 'block' as it counts the number
          // of bytes ack'ed.
          ExtendedBlock blockCopy = new ExtendedBlock(block);
          blockCopy.setNumBytes(blockSize);

          boolean[] targetPinnings = getPinnings(nodes, true);
          // send the write-block request describing the block to be written
          new Sender(out).writeBlock(blockCopy, nodeStorageTypes[0], accessToken,
              dfsClient.clientName, nodes, nodeStorageTypes, null, bcs,
              nodes.length, block.getNumBytes(), bytesSent, newGS,
              checksum4WriteBlock, cachingStrategy.get(), isLazyPersistFile,
            (targetPinnings == null ? false : targetPinnings[0]), targetPinnings);

② DataNode side: DataXceiver#run()

  1. readOp(): reads the op header from the stream, a 2-byte short carrying the DataTransferProtocol version followed by the 1-byte op code
  2. processOp(): dispatches on the op code
  3. writeBlock(): handles the WRITE_BLOCK op and sets up a BlockReceiver for the incoming block
  4. Sender#writeBlock(): forwards the same request to the next DataNode, which is how the pipeline extends downstream (a simplified sketch of this read-and-dispatch loop follows)
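To visualize what readOp()/processOp() do on the DataNode side, here is a simplified, self-contained sketch of the same read-version-then-dispatch pattern. It is not the Hadoop code: the constant values and the handler method below are placeholders.

import java.io.DataInputStream;
import java.io.IOException;

/** Toy version of the DataXceiver read-and-dispatch loop. */
public class OpDispatchSketch {
  // Placeholder values; the real ones live in DataTransferProtocol / Op.
  static final short DATA_TRANSFER_VERSION = 28;
  static final byte OP_WRITE_BLOCK = 80;

  static void readAndProcessOp(DataInputStream in) throws IOException {
    // readOp(): a 2-byte protocol version followed by a 1-byte op code
    short version = in.readShort();
    if (version != DATA_TRANSFER_VERSION) {
      throw new IOException("Version mismatch: " + version);
    }
    byte op = in.readByte();

    // processOp(): dispatch on the op code
    switch (op) {
      case OP_WRITE_BLOCK:
        // parse the request body, set up a BlockReceiver, and forward the
        // request to the next DataNode (new Sender(mirrorOut).writeBlock(...))
        handleWriteBlock(in);
        break;
      default:
        throw new IOException("Unknown op " + op);
    }
  }

  static void handleWriteBlock(DataInputStream in) {
    // placeholder for the real writeBlock() handling
  }
}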

2. The write data flow
[Figure: the write data flow through the DataNode pipeline]
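Since this section only has the diagram, here is a short sketch of the client-side calls that feed this machinery, with the mapping to the concepts above in the comments. The path and data are made up, and the mapping reflects my reading of the 2.7.0 code rather than anything specific to this example.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteFlowDemo {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    FSDataOutputStream out = fs.create(new Path("/tmp/write-flow-demo.bin")); // illustrative path

    byte[] buffer = new byte[4096];
    for (int i = 0; i < 1024; i++) {        // write 4 MB in 4 KB slices
      // Each write() is checksummed in 512-byte chunks by the stream,
      // the chunks are packed into roughly 64 KB packets, and full packets
      // are appended to the dataQueue for the DataStreamer to push down the pipeline.
      out.write(buffer);
    }

    // close() flushes the last (possibly partial) packet, waits for the
    // ackQueue to drain, and asks the NameNode to complete the file.
    out.close();
    fs.close();
  }
}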
