Although Hadoop does not offer the full POSIX file interface, the basic operations it does provide (open, create, delete, write, seek, read) make it easy to work with files. Here is a typical snippet that opens an HDFS file, seeks to an offset, and reads from it:
FileSystem hdfs = hdfsPath.getFileSystem(conf);
FSDataInputStream inFsData = hdfs.open(p);
inFsData.seek(place);
long value = inFsData.readLong();
Here hdfs is a FileSystem instance. FileSystem is an abstract class: depending on the URI scheme in conf, the object returned may be an instance of the local file system or of the distributed file system. For HDFS, the concrete class that actually performs file operations is DistributedFileSystem.
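A minimal sketch of this dispatch, assuming the Hadoop 1.x API (the namenode address below is a placeholder, not from the original post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://namenode:9000"); // hypothetical address
FileSystem fs = FileSystem.get(conf);                // -> DistributedFileSystem

Configuration localConf = new Configuration();
localConf.set("fs.default.name", "file:///");
FileSystem local = FileSystem.get(localConf);        // -> LocalFileSystem

The same client code then works unchanged against either implementation, because it only ever talks to the FileSystem interface.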
Let's start with DistributedFileSystem's open operation:
public FSDataInputStream open(Path f, int bufferSize) throws IOException {
  statistics.incrementReadOps(1);
  return new DFSClient.DFSDataInputStream(
      dfs.open(getPathName(f), bufferSize, verifyChecksum, statistics));
}
As you can see, open returns an FSDataInputStream. It constructs a DFSClient.DFSDataInputStream (an inner class of DFSClient), passing in the value returned by DFSClient's own open method. Here is DFSClient's open:
public DFSInputStream open(String src, int buffersize, boolean verifyChecksum,
                           FileSystem.Statistics stats) throws IOException {
  checkOpen();
  // Get block info from namenode
  return new DFSInputStream(src, buffersize, verifyChecksum);
}
This open returns a DFSInputStream object. Here is DFSInputStream's constructor:
DFSInputStream(String src, int buffersize, boolean verifyChecksum)
    throws IOException {
  this.verifyChecksum = verifyChecksum;
  this.buffersize = buffersize;
  this.src = src;
  prefetchSize = conf.getLong("dfs.read.prefetch.size", prefetchSize);
  openInfo();
}
Next is DFSInputStream's openInfo, which is the core of the whole open sequence:
synchronized void openInfo() throws IOException {
  LocatedBlocks newInfo = callGetBlockLocations(namenode, src, 0, prefetchSize);
  if (newInfo == null) {
    throw new FileNotFoundException("File does not exist: " + src);
  }

  // I think this check is not correct. A file could have been appended to
  // between two calls to openInfo().
  if (locatedBlocks != null && !locatedBlocks.isUnderConstruction() &&
      !newInfo.isUnderConstruction()) {
    Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
    Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
    while (oldIter.hasNext() && newIter.hasNext()) {
      if (! oldIter.next().getBlock().equals(newIter.next().getBlock())) {
        throw new IOException("Blocklist for " + src + " has changed!");
      }
    }
  }
  updateBlockInfo(newInfo);
  this.locatedBlocks = newInfo;
  this.currentNode = null;
}
callGetBlockLocations contacts the namenode over RPC to fetch the locations of the file's first prefetchSize bytes' worth of blocks (dfs.read.prefetch.size in the configuration; by default enough for ten blocks), and these block locations are cached in the stream. updateBlockInfo then compares the last block's information as reported by its datanode with the namenode's record; if they disagree, the datanode's information wins, because the namenode's and the datanodes' views of a block can diverge.
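For reference, callGetBlockLocations is essentially a thin wrapper around a single ClientProtocol RPC. Roughly, simplified from the Hadoop 1.x source:

static LocatedBlocks callGetBlockLocations(ClientProtocol namenode,
    String src, long start, long length) throws IOException {
  try {
    // One RPC: ask the namenode which blocks cover [start, start+length)
    // and on which datanodes each replica lives.
    return namenode.getBlockLocations(src, start, length);
  } catch (RemoteException re) {
    throw re.unwrapRemoteException(AccessControlException.class,
                                   FileNotFoundException.class);
  }
}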
The subsequent seek and read calls then operate on this stream. Here is DFSInputStream's seek:
public synchronized void seek(long targetPos) throws IOException {
  if (targetPos > getFileLength()) {
    throw new IOException("Cannot seek after EOF");
  }
  boolean done = false;
  if (pos <= targetPos && targetPos <= blockEnd) {
    //
    // If this seek is to a positive position in the current
    // block, and this piece of data might already be lying in
    // the TCP buffer, then just eat up the intervening data.
    //
    int diff = (int)(targetPos - pos);
    if (diff <= TCP_WINDOW_SIZE) {
      try {
        pos += blockReader.skip(diff);
        if (pos == targetPos) {
          done = true;
        }
      } catch (IOException e) { // make following read to retry
        LOG.debug("Exception while seek to " + targetPos + " from "
            + currentBlock + " of " + src + " from " + currentNode +
            ": " + StringUtils.stringifyException(e));
      }
    }
  }
  if (!done) {
    pos = targetPos;
    blockEnd = -1;
  }
}
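In other words, seek has a fast path and a slow path: a short forward seek within the current block simply skips over data that may already be sitting in the TCP buffer of the existing datanode connection, while any other seek just records the new position and sets blockEnd to -1 so that the next read re-selects a block and datanode. A hypothetical usage sketch (the stream and offsets are illustrative, not from the original post):

// 'in' is a DFSInputStream positioned at byte 0 of a large file.
in.seek(100);                 // fast path: skips 100 buffered bytes on the
                              // current blockReader connection
in.seek(500L * 1024 * 1024);  // slow path: beyond the current block (or more
                              // than TCP_WINDOW_SIZE ahead), so only pos is
                              // updated and blockEnd = -1 forces the next
                              // read() to reconnect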