HDFS Source Code, Part 4: The ls Command

This article digs into the execution flow of the FsShell command in Hadoop, explaining at the source level how FsShell performs file system operations. It covers environment setup, a detailed step-by-step trace analysis, and how the key class FsShell works.



Running hadoop fs -ls / lists the paths under the given directory, for example:

hadoop fs -ls /
Found 1 items
drwxr-xr-x   - didi supergroup          0 2019-06-09 13:57 /user

This article analyzes how that command is executed.

1. Environment

  1. Add logging or other debug statements to the source, e.g. at the main entry point in FsShell.java:
     public static void main(String argv[]) throws Exception {
     	System.out.println("Entering the FsShell console..."); // added a debug print
     	FsShell shell = newShellInstance();
     	...
    
  2. Package the build:
    mvn clean package -DskipTests -Pdist,native -Dtar -Dmaven.javadoc.skip=true
    
    
  3. The packaged output is under the hadoop-dist directory.
  4. Copy the configuration files (core-site.xml, etc.) into the corresponding etc/hadoop directory.
  5. Enter the packaged directory and run the command:
    cd hadoop-dist/target/hadoop-2.7.2-2323
    sh -x ./bin/hadoop fs -ls /
    

After these steps, the detailed trace printed by sh -x gives us everything we need for the analysis.
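Since -x writes its trace to stderr, the output can also be captured to a file for later inspection (a small convenience, not part of the original steps):

sh -x ./bin/hadoop fs -ls / 2> hadoop-ls-trace.log    # -x prints each command to stderr before it runs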

2. Detailed Analysis

The hadoop script

We start with the hadoop script itself:

bin=`which $0`
bin=`dirname ${bin}`
bin=`cd "$bin"; pwd`

The trace output is:

+ set -v
bin=`which $0`
which $0
++ which ./bin/hadoop
+ bin=./bin/hadoop
bin=`dirname ${bin}`
dirname ${bin}
++ dirname ./bin/hadoop
+ bin=./bin
bin=`cd "$bin"; pwd`
cd "$bin"; pwd
++ cd ./bin
++ pwd
+ bin=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/bin

This resolves the directory of the hadoop script being executed and changes into it, yielding an absolute path.
Next:

DEFAULT_LIBEXEC_DIR="$bin"/../libexec

HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh

sh and ./ behave the same way: they start a subshell, and the subshell sees only the variables the parent shell has exported.
source and . behave the same way: the script's text is effectively pasted into, and executed in, the current shell.
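A minimal demonstration of the difference (the throwaway script name below is made up for illustration):

echo 'MY_VAR=hello' > /tmp/setvar.sh    # hypothetical one-line script that sets a variable

sh /tmp/setvar.sh                       # runs in a subshell
echo "after sh:     '$MY_VAR'"          # empty: the subshell's variable does not survive

. /tmp/setvar.sh                        # runs in the current shell
echo "after source: '$MY_VAR'"          # hello: the variable persists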

The purpose is to load the configuration script hadoop-config.sh from the libexec directory.

Next is the content of hadoop-config.sh:

this="${BASH_SOURCE-$0}" # resolve the script's own path and name
common_bin=$(cd -P -- "$(dirname -- "$this")" && pwd -P)
script="$(basename -- "$this")"
this="$common_bin/$script"

[ -f "$common_bin/hadoop-layout.sh" ] && . "$common_bin/hadoop-layout.sh"

The trace shows:

++ script=hadoop-config.sh
this="$common_bin/$script"
++ this=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/libexec/hadoop-config.sh

[ -f "$common_bin/hadoop-layout.sh" ] && . "$common_bin/hadoop-layout.sh"
++ '[' -f /Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/libexec/hadoop-layout.sh ']'

hadoop-layout.sh does not exist here, so it is not sourced.

Next, the share directories are set up:

HADOOP_COMMON_DIR=${HADOOP_COMMON_DIR:-"share/hadoop/common"}
HADOOP_COMMON_LIB_JARS_DIR=${HADOOP_COMMON_LIB_JARS_DIR:-"share/hadoop/common/lib"}
HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_COMMON_LIB_NATIVE_DIR:-"lib/native"}
HDFS_DIR=${HDFS_DIR:-"share/hadoop/hdfs"}
HDFS_LIB_JARS_DIR=${HDFS_LIB_JARS_DIR:-"share/hadoop/hdfs/lib"}
YARN_DIR=${YARN_DIR:-"share/hadoop/yarn"}
YARN_LIB_JARS_DIR=${YARN_LIB_JARS_DIR:-"share/hadoop/yarn/lib"}
MAPRED_DIR=${MAPRED_DIR:-"share/hadoop/mapreduce"}
MAPRED_LIB_JARS_DIR=${MAPRED_LIB_JARS_DIR:-"share/hadoop/mapreduce/lib"}

# the root of the Hadoop installation
# See HADOOP-6255 for directory structure layout
HADOOP_DEFAULT_PREFIX=$(cd -P -- "$common_bin"/.. && pwd -P)
HADOOP_PREFIX=${HADOOP_PREFIX:-$HADOOP_DEFAULT_PREFIX}
export HADOOP_PREFIX
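All of these assignments use the ${VAR:-default} form of parameter expansion: if the variable is already set to a non-empty value, that value wins; otherwise the default after :- is used. A quick sketch of the behavior:

unset HADOOP_COMMON_DIR
echo "${HADOOP_COMMON_DIR:-share/hadoop/common}"    # prints share/hadoop/common (the default)

HADOOP_COMMON_DIR=/custom/common
echo "${HADOOP_COMMON_DIR:-share/hadoop/common}"    # prints /custom/common (the preset value wins)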

Next:

if [ $# -gt 1 ]
then
    if [ "--config" = "$1" ]
	  then
	      shift
	      confdir=$1
	      if [ ! -d "$confdir" ]; then
                echo "Error: Cannot find configuration directory: $confdir"
                exit 1
             fi
	      shift
	      HADOOP_CONF_DIR=$confdir
    fi
fi

$# is the number of arguments; the trace shows:

++ '[' 3 -gt 1 ']'
++ '[' --config = fs ']'

We passed 3 arguments, and the first one is fs, not --config, so this branch is not taken. It also shows that hadoop accepts a --config argument to override the HADOOP_CONF_DIR directory.
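For example, an alternative configuration directory could be supplied like this (the path is a placeholder):

hadoop --config /path/to/alternate/conf fs -ls /    # the directory must exist, otherwise the script exits with an error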

We skip the rest of hadoop-config.sh for now and return to the hadoop script.

function print_usage(){
  echo "Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]"
  echo "  CLASSNAME            run the class named CLASSNAME"
  echo " or"
  echo "  where COMMAND is one of:"
  echo "  fs                   run a generic filesystem user client"
  echo "  version              print the version"
  echo "  jar <jar>            run a jar file"
  echo "                       note: please use \"yarn jar\" to launch"
  echo "                             YARN applications, not this command."
  echo "  checknative [-a|-h]  check native hadoop and compression libraries availability"
  echo "  distcp <srcurl> <desturl> copy file or directories recursively"
  echo "  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive"
  echo "  classpath            prints the class path needed to get the"
  echo "  credential           interact with credential providers"
  echo "                       Hadoop jar and the required libraries"
  echo "  daemonlog            get/set the log level for each daemon"
  echo "  trace                view and modify Hadoop tracing settings"
  echo "  externaltrash        run a external trash tool"
  echo ""
  echo "Most commands print help when invoked w/o parameters."
}

This is the usage message.

if [ $# = 0 ]; then
  print_usage
  exit
fi

If no arguments are given, the usage message is printed and the script exits.
The case statement below dispatches on the first argument and covers every command the hadoop script accepts:

COMMAND=$1
case $COMMAND in
  # usage flags
  --help|-help|-h)
    print_usage
    exit
    ;;

  #hdfs commands
  namenode|secondarynamenode|datanode|dfs|dfsadmin|fsck|balancer|fetchdt|oiv|dfsgroups|portmap|nfs3)
    echo "DEPRECATED: Use of this script to execute hdfs command is deprecated." 1>&2
    echo "Instead use the hdfs command for it." 1>&2
    echo "" 1>&2
    #try to locate hdfs and if present, delegate to it.  
    shift
    if [ -f "${HADOOP_HDFS_HOME}"/bin/hdfs ]; then
      exec "${HADOOP_HDFS_HOME}"/bin/hdfs ${COMMAND/dfsgroups/groups}  "$@"
    elif [ -f "${HADOOP_PREFIX}"/bin/hdfs ]; then
      exec "${HADOOP_PREFIX}"/bin/hdfs ${COMMAND/dfsgroups/groups} "$@"
    else
      echo "HADOOP_HDFS_HOME not found!"
      exit 1
    fi
    ;;

  #mapred commands for backwards compatibility
  pipes|job|queue|mrgroups|mradmin|jobtracker|tasktracker)
    echo "DEPRECATED: Use of this script to execute mapred command is deprecated." 1>&2
    echo "Instead use the mapred command for it." 1>&2
    echo "" 1>&2
    #try to locate mapred and if present, delegate to it.
    shift
    if [ -f "${HADOOP_MAPRED_HOME}"/bin/mapred ]; then
      exec "${HADOOP_MAPRED_HOME}"/bin/mapred ${COMMAND/mrgroups/groups} "$@"
    elif [ -f "${HADOOP_PREFIX}"/bin/mapred ]; then
      exec "${HADOOP_PREFIX}"/bin/mapred ${COMMAND/mrgroups/groups} "$@"
    else
      echo "HADOOP_MAPRED_HOME not found!"
      exit 1
    fi
    ;;

  #core commands  
  *)
    # the core commands
    if [ "$COMMAND" = "fs" ] ; then
      CLASS=org.apache.hadoop.fs.FsShell
    elif [ "$COMMAND" = "version" ] ; then
      CLASS=org.apache.hadoop.util.VersionInfo
    elif [ "$COMMAND" = "jar" ] ; then
      CLASS=org.apache.hadoop.util.RunJar
      if [[ -n "${YARN_OPTS}" ]] || [[ -n "${YARN_CLIENT_OPTS}" ]]; then
        echo "WARNING: Use \"yarn jar\" to launch YARN applications." 1>&2
      fi
    elif [ "$COMMAND" = "key" ] ; then
      CLASS=org.apache.hadoop.crypto.key.KeyShell
    elif [ "$COMMAND" = "checknative" ] ; then
      CLASS=org.apache.hadoop.util.NativeLibraryChecker
    elif [ "$COMMAND" = "distcp" ] ; then
      CLASS=org.apache.hadoop.tools.DistCp
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [ "$COMMAND" = "daemonlog" ] ; then
      CLASS=org.apache.hadoop.log.LogLevel
    elif [ "$COMMAND" = "archive" ] ; then
      CLASS=org.apache.hadoop.tools.HadoopArchives
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [ "$COMMAND" = "externaltrash" ]; then
      CLASS=org.apache.hadoop.externaltrash.ExternalTrash
      CLASSPATH=${CLASSPATH}:${TOOL_PATH}
    elif [ "$COMMAND" = "credential" ] ; then
      CLASS=org.apache.hadoop.security.alias.CredentialShell
    elif [ "$COMMAND" = "trace" ] ; then
      CLASS=org.apache.hadoop.tracing.TraceAdmin
    elif [ "$COMMAND" = "classpath" ] ; then
      if [ "$#" -gt 1 ]; then
        CLASS=org.apache.hadoop.util.Classpath
      else
        # No need to bother starting up a JVM for this simple case.
        if $cygwin; then
          CLASSPATH=$(cygpath -p -w "$CLASSPATH" 2>/dev/null)
        fi
        echo $CLASSPATH
        exit
      fi
    elif [[ "$COMMAND" = -*  ]] ; then
        # class and package names cannot begin with a -
        echo "Error: No command named \`$COMMAND' was found. Perhaps you meant \`hadoop ${COMMAND#-}'"
        exit 1
    else
      CLASS=$COMMAND
    fi

    # cygwin path translation
    if $cygwin; then
      CLASSPATH=$(cygpath -p -w "$CLASSPATH" 2>/dev/null)
      HADOOP_LOG_DIR=$(cygpath -w "$HADOOP_LOG_DIR" 2>/dev/null)
      HADOOP_PREFIX=$(cygpath -w "$HADOOP_PREFIX" 2>/dev/null)
      HADOOP_CONF_DIR=$(cygpath -w "$HADOOP_CONF_DIR" 2>/dev/null)
      HADOOP_COMMON_HOME=$(cygpath -w "$HADOOP_COMMON_HOME" 2>/dev/null)
      HADOOP_HDFS_HOME=$(cygpath -w "$HADOOP_HDFS_HOME" 2>/dev/null)
      HADOOP_YARN_HOME=$(cygpath -w "$HADOOP_YARN_HOME" 2>/dev/null)
      HADOOP_MAPRED_HOME=$(cygpath -w "$HADOOP_MAPRED_HOME" 2>/dev/null)
    fi

    shift
    
    # Always respect HADOOP_OPTS and HADOOP_CLIENT_OPTS
    HADOOP_OPTS="$HADOOP_OPTS $HADOOP_CLIENT_OPTS"

    #make sure security appender is turned off
    HADOOP_OPTS="$HADOOP_OPTS -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,NullAppender}"

    export CLASSPATH=$CLASSPATH
    exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"
    ;;

esac
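Note that the final else branch simply sets CLASS=$COMMAND, so any class on the CLASSPATH can be launched by name. In particular, passing the FsShell class explicitly should be equivalent to the fs command:

hadoop org.apache.hadoop.fs.FsShell -ls /    # same effect as "hadoop fs -ls /"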

Let's follow the fs branch:

+ case $COMMAND in
+ '[' fs = fs ']'
+ CLASS=org.apache.hadoop.fs.FsShell
+ false
+ shift
+ HADOOP_OPTS=' -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323 -Dhadoop.id.str=didi -Dhadoop.root.logger=INFO,console -Djava.library.path=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx2048m '
+ HADOOP_OPTS=' -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323 -Dhadoop.id.str=didi -Dhadoop.root.logger=INFO,console -Djava.library.path=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx2048m  -Dhadoop.security.logger=INFO,NullAppender'
+ export 'CLASSPATH=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/etc/hadoop:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/common/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/common/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/yarn/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/yarn/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/mapreduce/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar'
+ CLASSPATH='/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/etc/hadoop:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/common/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/common/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/yarn/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/yarn/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/mapreduce/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar'
+ exec /Library/Java/JavaVirtualMachines/jdk1.8.0_171.jdk/Contents/Home/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323 -Dhadoop.id.str=didi -Dhadoop.root.logger=INFO,console -Djava.library.path=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx2048m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.fs.FsShell -ls /

The command that is actually executed is:

export CLASSPATH=$CLASSPATH
exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS $CLASS "$@"

Copying the export and exec lines above and running them directly in a shell:

$ export CLASSPATH='/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/etc/hadoop:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/common/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/common/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/hdfs/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/yarn/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/yarn/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/mapreduce/lib/*:/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/share/hadoop/mapreduce/*:/contrib/capacity-scheduler/*.jar'
$ /Library/Java/JavaVirtualMachines/jdk1.8.0_171.jdk/Contents/Home/bin/java -Xmx1000m -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323 -Dhadoop.id.str=didi -Dhadoop.root.logger=INFO,console -Djava.library.path=/Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Xmx2048m -Dhadoop.security.logger=INFO,NullAppender org.apache.hadoop.fs.FsShell -ls /
Entering the FsShell console...
Found 1 items
drwxr-xr-x   - didi supergroup          0 2019-06-09 13:57 /user

As you can see, it runs successfully this way too. Let's break down the CLASSPATH:

1. /Users/didi/CodeFile/xx_hadoop/hadoop-dist/target/hadoop-2.7.2-2323/etc/hadoop: the configuration directory.
2. .../share/hadoop/common/lib/* and .../share/hadoop/common/*: the common jars and their library dependencies.
3. The remaining entries repeat the same pattern for the hdfs, yarn, and mapreduce modules (each share/hadoop directory for the module plus its lib subdirectory), ending with the contrib/capacity-scheduler jars.
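As the classpath branch in the case statement above shows, the script can also print this CLASSPATH directly, without starting a JVM:

./bin/hadoop classpath    # prints the CLASSPATH the script would hand to java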

The FsShell class
  public static void main(String argv[]) throws Exception {
    System.out.println("Entering the FsShell console...");
    FsShell shell = newShellInstance(); // create the FsShell instance
    Configuration conf = new Configuration(); // the configuration object
    conf.setQuietMode(false); // switch off "quiet mode" (the default); in quiet mode, error and informational messages are not logged
    shell.setConf(conf);
    int res;
    try {
      res = ToolRunner.run(shell, argv); // ToolRunner is a utility class that runs classes implementing the Tool interface
    } finally {
      shell.close();
    }
    System.exit(res);
  }
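Because FsShell runs through ToolRunner, the generic Hadoop options handled by GenericOptionsParser (such as -D) are consumed before FsShell ever sees -ls /. A usage sketch, with a placeholder NameNode address:

hadoop fs -D fs.defaultFS=hdfs://namenode.example.com:8020 -ls /    # -D overrides a configuration property for this one invocation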

Reference: https://blog.csdn.net/strongyoung88/article/details/68952248
