Reading the Hadoop 2.6 Source: the Write Path, DataStreamer


DataStreamer is initialized when the file output stream is created. Its main job is to take packets off the data queue and send them to the DataNodes in the write pipeline.

In DFSClient's create() method:

{
  final DFSOutputStream result = DFSOutputStream.newStreamForCreate(this,
      src, masked, flag, createParent, replication, blockSize, progress,
      buffersize, dfsClientConf.createChecksum(checksumOpt),
      favoredNodeStrs);
  beginFileLease(result.getFileId(), result);
  return result;
}

In newStreamForCreate(), the DFSOutputStream object is constructed and its streamer thread is started:

{
  final DFSOutputStream out = new DFSOutputStream(dfsClient, src, stat,
      flag, progress, checksum, favoredNodes);
  // start the streamer thread
  out.start();
}

The DFSOutputStream constructor is where the DataStreamer is created:

/** Construct a new output stream for creating a file. */
  private DFSOutputStream(DFSClient dfsClient, String src, HdfsFileStatus stat,
      EnumSet<CreateFlag> flag, Progressable progress,
      DataChecksum checksum, String[] favoredNodes) throws IOException {
    this(dfsClient, src, progress, stat, checksum);
    this.shouldSyncBlock = flag.contains(CreateFlag.SYNC_BLOCK);

    computePacketChunkSize(dfsClient.getConf().writePacketSize, bytesPerChecksum);

    Span traceSpan = null;
    if (Trace.isTracing()) {
      traceSpan = Trace.startSpan(this.getClass().getSimpleName()).detach();
    }
    streamer = new DataStreamer(stat, traceSpan);
    if (favoredNodes != null && favoredNodes.length != 0) {
      streamer.setFavoredNodes(favoredNodes);
    }
  }
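The computePacketChunkSize() call in the constructor decides how many checksum chunks fit into one packet. The arithmetic can be restated in plain Java as a sketch (this is a toy re-derivation, not the Hadoop source; the 33-byte maximum packet header length and the 4-byte CRC32 checksum size are assumptions based on the 2.x defaults):

```java
// Toy re-derivation of computePacketChunkSize(), not the Hadoop code itself.
public class PacketChunkMath {
    // Assumed constants: PacketHeader.PKT_MAX_HEADER_LEN and CRC32 size in 2.x.
    static final int PKT_MAX_HEADER_LEN = 33;
    static final int CHECKSUM_SIZE = 4;

    // psize = dfs.client-write-packet-size (64 KB by default),
    // csize = bytesPerChecksum (512 by default).
    public static int[] compute(int psize, int csize) {
        int chunkSize = csize + CHECKSUM_SIZE;            // one data chunk + its checksum
        int bodySize = psize - PKT_MAX_HEADER_LEN;        // room left after the header
        int chunksPerPacket = Math.max(bodySize / chunkSize, 1);
        int packetSize = chunkSize * chunksPerPacket;     // actual payload bytes per packet
        return new int[] { chunkSize, chunksPerPacket, packetSize };
    }
}
```

With the default 64 KB packet size and 512-byte chunks, each packet carries 126 chunks of 516 bytes.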

Now the key part: DataStreamer's run() method. It is quite involved, so only the important excerpts are quoted here.

public void run() {
...
//the main loop
while (!streamerClosed && dfsClient.clientRunning)
{

//First deal with an error in the ResponseProcessor (the responder thread).
//ResponseProcessor processes responses from the datanodes: a packet is
//removed from the ackQueue when its response arrives.
// if the Responder encountered an error, shutdown Responder
        if (hasError && response != null) {
          try {
            response.close();
            response.join();
            response = null;
          } catch (InterruptedException  e) {
            DFSClient.LOG.warn("Caught exception ", e);
          }
        }
}

...
//handle datanode errors
//processDatanodeError moves the packets on ackQueue back onto dataQueue
if (hasError && (errorIndex >= 0 || restartingNodeIndex >= 0)) {
            doSleep = processDatanodeError();
          }
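The effect of that requeueing can be sketched with plain Java collections (a toy model, not the Hadoop code): every packet on the ackQueue was sent but never acknowledged, so it is prepended to the dataQueue and resent first, in its original order, ahead of packets that were never sent at all.

```java
import java.util.LinkedList;

// Toy model of processDatanodeError()'s requeue step: unacked packets go
// back to the FRONT of the dataQueue so they are retransmitted in order.
public class RequeueModel {
    public static LinkedList<Integer> requeue(LinkedList<Integer> dataQueue,
                                              LinkedList<Integer> ackQueue) {
        dataQueue.addAll(0, ackQueue);  // sent-but-unacked packets come first
        ackQueue.clear();               // nothing is awaiting an ack any more
        return dataQueue;
    }
}
```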

...

// the wait has several conditions;
// mainly dataQueue.wait(), waiting for packets to arrive (or for the heartbeat deadline)
while ((!streamerClosed && !hasError && dfsClient.clientRunning 
                && dataQueue.size() == 0 && 
                (stage != BlockConstructionStage.DATA_STREAMING || 
                 stage == BlockConstructionStage.DATA_STREAMING && 
                 now - lastPacket < dfsClient.getConf().socketTimeout/2)) || doSleep ) {
              long timeout = dfsClient.getConf().socketTimeout/2 - (now-lastPacket);
              timeout = timeout <= 0 ? 1000 : timeout;
              timeout = (stage == BlockConstructionStage.DATA_STREAMING)?
                 timeout : 1000;
              try {
                dataQueue.wait(timeout);
              } catch (InterruptedException  e) {
                DFSClient.LOG.warn("Caught exception ", e);
              }
              doSleep = false;
              now = Time.now();
            }
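The timeout arithmetic in that wait implements the heartbeat: while in DATA_STREAMING with nothing to send, the streamer sleeps at most until half the socket timeout has elapsed since the last packet, then wakes to send a heartbeat. Just that computation, restated in plain Java (a sketch, not the Hadoop code):

```java
// Reproduces the timeout chosen before dataQueue.wait(timeout) in run().
public class HeartbeatTimeout {
    public static long waitTimeout(boolean dataStreaming, long socketTimeoutMs,
                                   long now, long lastPacket) {
        long timeout = socketTimeoutMs / 2 - (now - lastPacket);
        timeout = timeout <= 0 ? 1000 : timeout;  // deadline passed: short 1 s wait
        // outside DATA_STREAMING there is no heartbeat deadline: poll every second
        return dataStreaming ? timeout : 1000;
    }
}
```

For example, with a 60 s socket timeout and 1 s since the last packet, the streamer waits the remaining 29 s before the heartbeat is due.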
...
//nextBlockOutputStream() asks the NameNode for a new block and returns a LocatedBlock,
//which carries the DatanodeInfo of the nodes that will form the write pipeline

// get new block from namenode.
          if (stage == BlockConstructionStage.PIPELINE_SETUP_CREATE) {
            if(DFSClient.LOG.isDebugEnabled()) {
              DFSClient.LOG.debug("Allocating new block");
            }
//set up the data pipeline
            setPipeline(nextBlockOutputStream());
            initDataStreaming();
          } 
...

//send the packet
try {
            one.writeTo(blockStream);
            blockStream.flush();
          } catch (IOException e) {
            // error handling elided: the failure is recorded and dealt with
            // at the top of the loop
            ...
          }



...

}
Summary
  • Besides the client's main thread, a write involves two more threads: DataStreamer, which sends the packets, and ResponseProcessor, which handles the datanodes' acks.
  • At block granularity HDFS is strongly consistent: once a write has been acknowledged, readers see it.
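The three threads involved in a write (client, DataStreamer, ResponseProcessor) can be sketched as a toy producer/consumer pipeline in plain Java (a model of the roles only, not Hadoop code): the client enqueues packets on the data queue, the streamer "sends" each one and parks it on the ack queue, and the responder retires packets as their acks arrive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Toy model of the write-path threads: client -> dataQueue -> streamer
// -> ackQueue -> responder. LAST marks the final packet of the block.
public class WritePipelineModel {
    static final int LAST = -1;

    public static List<Integer> run(int numPackets) {
        BlockingQueue<Integer> dataQueue = new LinkedBlockingQueue<>();
        BlockingQueue<Integer> ackQueue = new LinkedBlockingQueue<>();
        List<Integer> acked = new ArrayList<>();

        // DataStreamer: waits for a packet, "sends" it, then moves it to
        // the ackQueue to await the pipeline's acknowledgement.
        Thread streamer = new Thread(() -> {
            try {
                int p;
                do { p = dataQueue.take(); ackQueue.put(p); } while (p != LAST);
            } catch (InterruptedException ignored) { }
        });
        // ResponseProcessor: removes a packet from the ackQueue once "acked".
        Thread responder = new Thread(() -> {
            try {
                int p;
                while ((p = ackQueue.take()) != LAST) acked.add(p);
            } catch (InterruptedException ignored) { }
        });
        try {
            streamer.start();
            responder.start();
            for (int i = 0; i < numPackets; i++) dataQueue.put(i);  // client thread
            dataQueue.put(LAST);
            streamer.join();
            responder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acked;
    }
}
```

Because both queues are FIFO and each stage has a single thread, packets are acknowledged in exactly the order the client wrote them.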