What does it actually take to read a file from HDFS? Let's start with a simple example:
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {

    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
            in = fs.open(new Path(uri));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
This program is a minimal HDFS client: we give it a file path and it prints the file's contents. So what happens underneath? At a high level it is very simple: open the file, read it, close it. As is well known, HDFS stores files as blocks, so to read a file we first have to find out which blocks make it up. That lookup involves the namenode, because the namenode holds the file-to-block-sequence mapping and persists it on disk; for efficiency, this metadata is loaded into memory when HDFS starts, which to some degree explains why the namenode's memory size bounds the number of files the whole HDFS cluster can store.
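This file-to-block mapping is also visible from the client side through the public FileSystem API. Here is a minimal sketch (the class name ListBlocks is ours, but getFileBlockLocations and BlockLocation are standard Hadoop API): it asks the namenode for the block list of a file and prints where each block's replicas live.

import java.net.URI;
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlocks {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        FileStatus status = fs.getFileStatus(new Path(uri));
        // Ask the namenode which blocks make up the file and where
        // each block's replicas live
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());
        for (BlockLocation block : blocks) {
            System.out.println("offset=" + block.getOffset()
                + " length=" + block.getLength()
                + " hosts=" + Arrays.toString(block.getHosts()));
        }
    }
}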
In this little client program, the most important part is really just one line of code:
in = fs.open(new Path(uri));
The open method returns an InputStream, which I think everyone is familiar with: it is Java's built-in input stream type. But producing this stream takes some effort, so let's see how DistributedFileSystem implements it, tracing the code step by step:
    public FSDataInputStream open(Path f) throws IOException {
        return open(f, getConf().getInt("io.file.buffer.size", 4096));
    }
which in turn leads to:

    public FSDataInputStream open(Path f, int bufferSize) throws IOException {
        statistics.incrementReadOps(1);
        return new DFSClient.DFSDataInputStream(
            dfs.open(getPathName(f), bufferSize, verifyChecksum, statistics));
    }
Clearly, dfs.open(getPathName(f), bufferSize, verifyChecksum, statistics) returns an input stream, and the outer wrapper decorates it with another layer that adds extra functionality. Focusing on the main path: what does dfs.open actually do? First we need to know the type of dfs, so let's look at its declaration: DFSClient dfs;
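That decorating wrapper is worth a brief look before we go deeper. The DFSClient.DFSDataInputStream constructed above is handed to callers as an FSDataInputStream, which layers random access (the Seekable and PositionedReadable interfaces) on top of a plain stream. A small illustrative fragment, reusing fs and uri from FileSystemCat above (the offsets are arbitrary and assume the file is long enough):

FSDataInputStream in = fs.open(new Path(uri));
try {
    // Positioned read: fetch 100 bytes starting at byte 1024 without
    // moving the stream's current position
    byte[] buf = new byte[100];
    in.readFully(1024, buf, 0, buf.length);

    // Seekable: jump to an absolute offset and keep streaming from there
    in.seek(2048);
    int firstByteAfterSeek = in.read();
} finally {
    IOUtils.closeStream(in);
}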
So dfs is a DFSClient object. Tracing into the implementation of open in the DFSClient class:

    DFSInputStream open(String src, int buffersize, boolean verifyChecksum,
                        FileSystem.Statistics stats) throws IOException {
        checkOpen();
        // Get block info from namenode
        return new DFSInputStream(src, buffersize, verifyChecksum);
    }
Here we see a very telling comment: Get block info from namenode. So how does HDFS obtain the block information for this file?

    private static LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
            String src, long start, long length) throws IOException {
        try {
            return namenode.getBlockLocations(src, start, length);
        } catch (RemoteException re) {
            throw re.unwrapRemoteException(AccessControlException.class,
                                           FileNotFoundException.class);
        }
    }
This is the method in DFSClient that fetches block information. Its first parameter is an object implementing the client protocol, the second is the file path, followed by the offset into the file and the length to read. Again there is only one key line inside: return namenode.getBlockLocations(src, start, length). Before following this call into the namenode, let's pause to see what a LocatedBlocks actually carries.
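Every accessor in the following snippet appears elsewhere in the source quoted in this article (getLocatedBlocks, getBlock, getNumBytes, getLocations); only the combination is our hypothetical sketch. It shows the shape of the namenode's answer: one LocatedBlock per block of the file, each pairing the block with the datanodes currently holding a replica of it.

// Hypothetical inspection of the namenode's answer: one entry per block,
// carrying the block's size and its replica locations
LocatedBlocks blocks = callGetBlockLocations(namenode, src, 0, Long.MAX_VALUE);
for (LocatedBlock b : blocks.getLocatedBlocks()) {
    System.out.println("block " + b.getBlock()
        + " bytes=" + b.getBlock().getNumBytes()
        + " replicas=" + b.getLocations().length);
}

Back to the trace: getBlockLocations here is a method on NameNode, so we locate it: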
    public LocatedBlocks getBlockLocations(String src, long offset,
                                           long length) throws IOException {
        myMetrics.incrNumGetBlockLocations();
        return namesystem.getBlockLocations(getClientMachine(),
                                            src, offset, length);
    }
Here namesystem is an instance of FSNamesystem; locating the method called above:

    /**
     * Get block locations within the specified range.
     *
     * @see #getBlockLocations(String, long, long)
     */
    LocatedBlocks getBlockLocations(String clientMachine, String src,
                                    long offset, long length) throws IOException {
        LocatedBlocks blocks = getBlockLocations(src, offset, length, true, true);
        if (blocks != null) {
            // sort the blocks
            DatanodeDescriptor client = host2DataNodeMap.getDatanodeByHost(
                clientMachine);
            for (LocatedBlock b : blocks.getLocatedBlocks()) {
                clusterMap.pseudoSortByDistance(client, b.getLocations());
            }
        }
        return blocks;
    }
The first parameter of this method is the client machine, which is used when computing the distance between each datanode and the client. Every block of a file is stored redundantly on several datanodes (three by default, as configured), and to maximize performance the client should read each block directly from the nearest replica. The remaining three parameters are straightforward: the file path, the offset, and the length. The method fetches the block information for the file, sorts each block's replica locations by their distance to the client machine, and returns the result.
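The sorting step done by clusterMap.pseudoSortByDistance deserves a moment. The real NetworkTopology logic is more involved, but the idea can be sketched with a simplified, entirely hypothetical distance function (the class, method, and host names below are all ours): replicas on the client's own host sort first, then replicas in the same rack, then everything else.

import java.util.Arrays;
import java.util.Comparator;

public class ReplicaSortSketch {
    // Coarse distance to the client: same host < same rack < remote rack
    static int distance(String clientHost, String clientRack,
                        String host, String rack) {
        if (host.equals(clientHost)) return 0;  // node-local replica
        if (rack.equals(clientRack)) return 2;  // rack-local replica
        return 4;                               // remote rack
    }

    public static void main(String[] args) {
        String clientHost = "h1", clientRack = "/r1";
        // Each replica location as {host, rack}
        String[][] replicas = { {"h7", "/r3"}, {"h2", "/r1"}, {"h1", "/r1"} };
        Arrays.sort(replicas, Comparator.comparingInt(
                r -> distance(clientHost, clientRack, r[0], r[1])));
        System.out.println(Arrays.deepToString(replicas));
        // -> [[h1, /r1], [h2, /r1], [h7, /r3]]: read from the local node first
    }
}

To this point, however, we still have not seen how the block information itself is obtained, so we trace the code once more: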
    /**
     * Get block locations within the specified range.
     * @see ClientProtocol#getBlockLocations(String, long, long)
     */
    public LocatedBlocks getBlockLocations(String src, long offset, long length,
            boolean doAccessTime, boolean needBlockToken) throws IOException {
        if (isPermissionEnabled) {
            checkPathAccess(src, FsAction.READ);
        }
        if (offset < 0) {
            throw new IOException("Negative offset is not supported. File: " + src);
        }
        if (length < 0) {
            throw new IOException("Negative length is not supported. File: " + src);
        }
        final LocatedBlocks ret = getBlockLocationsInternal(src, offset, length,
            Integer.MAX_VALUE, doAccessTime, needBlockToken);
        if (auditLog.isInfoEnabled() && isExternalInvocation()) {
            logAuditEvent(UserGroupInformation.getCurrentUser(),
                          Server.getRemoteIp(),
                          "open", src, null, null);
        }
        return ret;
    }
This method first performs a permission check, then validates the offset and length parameters, then delegates to a private synchronized helper to fetch the block information, records an audit log entry, and finally returns the result. Continuing into the helper that fetches the block information:

    private synchronized LocatedBlocks getBlockLocationsInternal(String src,
            long offset, long length, int nrBlocksToReturn,
            boolean doAccessTime, boolean needBlockToken) throws IOException {
        INodeFile inode = dir.getFileINode(src);
        if (inode == null) {
            return null;
        }
        if (doAccessTime && isAccessTimeSupported()) {
            dir.setTimes(src, inode, -1, now(), false);
        }
        Block[] blocks = inode.getBlocks();
        if (blocks == null) {
            return null;
        }
        if (blocks.length == 0) {
            return inode.createLocatedBlocks(new ArrayList<LocatedBlock>(blocks.length));
        }
        List<LocatedBlock> results;
        results = new ArrayList<LocatedBlock>(blocks.length);

        int curBlk = 0;
        long curPos = 0, blkSize = 0;
        int nrBlocks = (blocks[0].getNumBytes() == 0) ? 0 : blocks.length;
        for (curBlk = 0; curBlk < nrBlocks; curBlk++) {
            blkSize = blocks[curBlk].getNumBytes();
            assert blkSize > 0 : "Block of size 0";
            if (curPos + blkSize > offset) {
                break;
            }
            curPos += blkSize;
        }

        if (nrBlocks > 0 && curBlk == nrBlocks)   // offset >= end of file
            return null;

        long endOff = offset + length;

        do {
            // get block locations
            int numNodes = blocksMap.numNodes(blocks[curBlk]);
            int numCorruptNodes = countNodes(blocks[curBlk]).corruptReplicas();
            int numCorruptReplicas = corruptReplicas.numCorruptReplicas(blocks[curBlk]);
            if (numCorruptNodes != numCorruptReplicas) {
                LOG.warn("Inconsistent number of corrupt replicas for "
                    + blocks[curBlk] + "blockMap has " + numCorruptNodes
                    + " but corrupt replicas map has " + numCorruptReplicas);
            }
            boolean blockCorrupt = (numCorruptNodes == numNodes);
            int numMachineSet = blockCorrupt ? numNodes :
                                (numNodes - numCorruptNodes);
            DatanodeDescriptor[] machineSet = new DatanodeDescriptor[numMachineSet];
            if (numMachineSet > 0) {
                numNodes = 0;
                for (Iterator<DatanodeDescriptor> it =
                        blocksMap.nodeIterator(blocks[curBlk]); it.hasNext();) {
                    DatanodeDescriptor dn = it.next();
                    boolean replicaCorrupt =
                        corruptReplicas.isReplicaCorrupt(blocks[curBlk], dn);
                    if (blockCorrupt || (!blockCorrupt && !replicaCorrupt))
                        machineSet[numNodes++] = dn;
                }
            }
            LocatedBlock b = new LocatedBlock(blocks[curBlk], machineSet,
                                              curPos, blockCorrupt);
            if (isAccessTokenEnabled && needBlockToken) {
                b.setBlockToken(accessTokenHandler.generateToken(b.getBlock(),
                    EnumSet.of(BlockTokenSecretManager.AccessMode.READ)));
            }

            results.add(b);
            curPos += blocks[curBlk].getNumBytes();
            curBlk++;
        } while (curPos < endOff
              && curBlk < blocks.length
              && results.size() < nrBlocksToReturn);

        return inode.createLocatedBlocks(results);
    }
In this internal method the work of gathering block information finally becomes concrete. It first looks up the inode for the given file path; if the inode is null the file does not exist, and the method returns null. It then updates the access time if required, and fetches the file's block list (Block[] blocks = inode.getBlocks()). If the block list is null it returns null; if the list is empty it returns an empty result. Otherwise it walks forward through the block list, accumulating block sizes and counting block indices until it reaches the block containing the requested offset. If the scan runs off the end of the block list, the offset is at or beyond the end of the file and the method returns null. Otherwise, for each block in the requested range it builds a LocatedBlock (the block plus the datanodes holding non-corrupt replicas of it) and collects them into an ArrayList, finally wrapping that list in a LocatedBlocks object and returning it.
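The skeleton of that scan - skip whole blocks until the requested offset lands inside one, then collect blocks until the range is covered - can be distilled into a small standalone sketch. This is entirely our simplification: plain block sizes stand in for the real Block objects, and the corrupt-replica bookkeeping and block tokens are omitted.

import java.util.ArrayList;
import java.util.List;

public class BlockRangeSketch {
    // Return the indices of the blocks that cover [offset, offset + length)
    static List<Integer> blocksForRange(long[] blockSizes,
                                        long offset, long length) {
        List<Integer> result = new ArrayList<Integer>();
        long curPos = 0;
        int curBlk = 0;
        // Skip blocks that end at or before the requested offset
        while (curBlk < blockSizes.length
                && curPos + blockSizes[curBlk] <= offset) {
            curPos += blockSizes[curBlk];
            curBlk++;
        }
        if (curBlk == blockSizes.length) {
            return result; // offset >= end of file
        }
        long endOff = offset + length;
        do { // Collect blocks until the range is covered
            result.add(curBlk);
            curPos += blockSizes[curBlk];
            curBlk++;
        } while (curPos < endOff && curBlk < blockSizes.length);
        return result;
    }

    public static void main(String[] args) {
        long[] sizes = {64, 64, 64};  // three blocks of 64 bytes each
        // Bytes 70..149 fall inside blocks 1 and 2
        System.out.println(blocksForRange(sizes, 70, 80)); // -> [1, 2]
    }
}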