Reposted from: http://greatwqs.iteye.com/blog/1845897
I. HLog Location on HDFS and the Mapping to RegionServers
The HLog is persisted on HDFS. To see where HLogs are stored:
- hadoop fs -ls /hbase/.logs
As the HBase architecture shows, HLogs and HRegionServers map one-to-one:
- Found 5 items
- drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS02,61020,1365661380729
- drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS03,61020,1365661378638
- drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS04,61020,1365661379200
- drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:22 /hbase/.logs/HADOOPCLUS05,61020,1365661378053
- drwxr-xr-x - hadoop cug-admin 0 2013-04-11 14:23 /hbase/.logs/HADOOPCLUS06,61020,1365661378832
HADOOPCLUS02 through HADOOPCLUS06 are the RegionServers.
The directories listed above hold the HLogs. Once an HLog has expired (everything previously written to the MemStore has been persisted to HDFS), its files are moved from /hbase/.logs to /hbase/.oldlogs on HDFS; the oldlogs are deleted later, and the HLog's life cycle ends.
II. The HBase Write Path and Where the HLog Is Written
When putting data into HBase, a request travels: HBase client --> ZooKeeper --> -ROOT- --> .META. --> RegionServer --> Region:
Before writing, the Region first checks the MemStore:
1. If the data being written is already cached in this Region's MemStore, it returns directly;
2. Otherwise it appends to the HLog (WAL) first, then writes to the MemStore, and returns only after both succeed.
When the MemStore grows past a configured threshold, flush is invoked and its contents become a StoreFile persisted on HDFS.
Because inserts land in the in-memory MemStore, HBase writes are fast; for applications with low durability requirements the HLog can be disabled to gain even higher write performance, as the sketch below shows.
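A minimal sketch of turning the WAL off per Put, using the 0.90-era client API that matches the source quoted in this article; the table, row, and column names are made-up examples:
- import org.apache.hadoop.conf.Configuration;
- import org.apache.hadoop.hbase.HBaseConfiguration;
- import org.apache.hadoop.hbase.client.HTable;
- import org.apache.hadoop.hbase.client.Put;
- import org.apache.hadoop.hbase.util.Bytes;
-
- Configuration conf = HBaseConfiguration.create();
- HTable table = new HTable(conf, "test_table");   // hypothetical table name
- Put p = new Put(Bytes.toBytes("row1"));          // hypothetical row key
- p.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
- p.setWriteToWAL(false); // skip the HLog: faster, but unflushed edits are lost on a crash
- table.put(p);
- table.close();
The trade-off is exactly the one described above: without the HLog there is nothing to replay, so a RegionServer crash loses whatever has not yet been flushed to a StoreFile.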
III. HLog-Related Source Code
1. Overview
Writes to the HLog go through the HLog object's doWrite(HRegionInfo info, HLogKey logKey, WALEdit logEdit)
or completeCacheFlush(final byte [] encodedRegionName, final byte [] tableName, final long logSeqId, final boolean isMetaRegion);
both methods call this.writer.append(new HLog.Entry(logKey, logEdit)) to perform the actual write.
Inside these methods an HLog.Entry is constructed and appended using the previously constructed writer.
The concrete implementation class is org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogWriter;
the HLog method createWriterInstance(fs, newPath, conf) creates the Writer object.
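For orientation, a condensed sketch of that append path, abridged from the 0.90-era HLog (sequence-number bookkeeping, locking, and sync handling omitted):
- // Simplified sketch: both doWrite() and completeCacheFlush() funnel an
- // HLogKey/WALEdit pair into the current writer as a single HLog.Entry.
- protected void doWrite(HRegionInfo info, HLogKey logKey, WALEdit logEdit)
- throws IOException {
-   // ... bookkeeping elided ...
-   this.writer.append(new HLog.Entry(logKey, logEdit));
- }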
2. SequenceFileLogWriter and SequenceFileLogReader
As the SequenceFileLogWriter class below shows, writing to the file system is delegated to a Hadoop SequenceFile.Writer; SequenceFile is the format in which the HLog is stored on Hadoop.
HLog.Entry is the smallest unit stored in the HLog.
- public class SequenceFileLogWriter implements HLog.Writer {
- private final Log LOG = LogFactory.getLog(this.getClass());
- // The hadoop sequence file we delegate to.
- private SequenceFile.Writer writer;
- // The dfsclient out stream gotten made accessible or null if not available.
- private OutputStream dfsClient_out;
- // The syncFs method from hdfs-200 or null if not available.
- private Method syncFs;
- // The key class, needed to initialize the writer.
- private Class<? extends HLogKey> keyClass;
- @Override
- public void init(FileSystem fs, Path path, Configuration conf)
- throws IOException {
- // 1. Create the Hadoop SequenceFile.Writer this class delegates to (initializes writer).
- // 2. Get at the private FSDataOutputStream inside SequenceFile so we can call sync on it (initializes dfsClient_out).
- }
- @Override
- public void append(HLog.Entry entry) throws IOException {
- this.writer.append(entry.getKey(), entry.getEdit());
- }
- @Override
- public void sync() throws IOException {
- if (this.syncFs != null) {
- try {
- this.syncFs.invoke(this.writer, HLog.NO_ARGS);
- } catch (Exception e) {
- throw new IOException("Reflection", e);
- }
- }
- }
- }
SequenceFileLogReader is the counterpart used to read HLog.Entry objects back.
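As a usage sketch, entries can be read back like this (assuming the 0.90-era static factory HLog.getReader; the trailing log file name in the path is a made-up example):
- // Print the entries of one HLog file; the file name is hypothetical.
- Configuration conf = HBaseConfiguration.create();
- FileSystem fs = FileSystem.get(conf);
- Path logPath = new Path("/hbase/.logs/HADOOPCLUS02,61020,1365661380729/example-log");
- HLog.Reader reader = HLog.getReader(fs, logPath, conf);
- try {
-   HLog.Entry entry;
-   while ((entry = reader.next()) != null) {
-     HLogKey key = entry.getKey();
-     System.out.println("seq=" + key.getLogSeqNum()
-         + " table=" + Bytes.toString(key.getTablename())
-         + " edits=" + entry.getEdit().size());
-   }
- } finally {
-   reader.close();
- }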
3. HLog.Entry and the logSeqNum Field
Each Entry consists of an HLogKey and a WALEdit.
HLogKey carries the basic information:
- private byte [] encodedRegionName;
- private byte [] tablename;
- private long logSeqNum;
- // Time at which this edit was written.
- private long writeTime;
- private byte clusterId;
logSeqNum is an important field: the sequence number is also recorded as a metadata field inside each StoreFile, so the max sequence id can be obtained directly from a StoreFile:
- public class StoreFile {
- static final String HFILE_BLOCK_CACHE_SIZE_KEY = "hfile.block.cache.size";
- private static BlockCache hfileBlockCache = null;
- // Is this from an in-memory store
- private boolean inMemory;
- // Keys for metadata stored in backing HFile.
- // Set when we obtain a Reader.
- private long sequenceid = -1;
- /**
- * @return This file's maximum edit sequence id.
- */
- public long getMaxSequenceId() {
- return this.sequenceid;
- }
- /**
- * Return the highest sequence ID found across all storefiles in
- * the given list. Store files that were created by a mapreduce
- * bulk load are ignored, as they do not correspond to any edit
- * log items.
- * @return 0 if no non-bulk-load files are provided or this Store does
- * not yet have any store files.
- */
- public static long getMaxSequenceIdInList(List<StoreFile> sfs) {
- long max = 0;
- for (StoreFile sf : sfs) {
- if (!sf.isBulkLoadResult()) {
- max = Math.max(max, sf.getMaxSequenceId());
- }
- }
- return max;
- }
- /**
- * Writes meta data; this is where maxSequenceId gets written.
- * Call before {@link #close()} since its written as meta data to this file.
- * @param maxSequenceId Maximum sequence id.
- * @param majorCompaction True if this file is product of a major compaction
- * @throws IOException problem writing to FS
- */
- public void appendMetadata(final long maxSequenceId, final boolean majorCompaction)
- throws IOException {
- writer.appendFileInfo(MAX_SEQ_ID_KEY, Bytes.toBytes(maxSequenceId));
- writer.appendFileInfo(MAJOR_COMPACTION_KEY,
- Bytes.toBytes(majorCompaction));
- appendTimeRangeMetadata();
- }
- /**
- * Opens reader on this store file. Called by Constructor.
- * @return Reader for the store file.
- * @throws IOException
- * @see #closeReader()
- */
- private Reader open() throws IOException {
- // ........ (elided: reads the MAX_SEQ_ID_KEY file-info entry into byte [] b)
- this.sequenceid = Bytes.toLong(b);
- if (isReference()) {
- if (Reference.isTopFileRegion(this.reference.getFileRegion())) {
- this.sequenceid += 1;
- }
- }
- this.reader.setSequenceID(this.sequenceid);
- return this.reader;
- }
- }
The Store class manages StoreFiles, e.g. for compaction: when several StoreFiles are merged, the largest sequence id among them is carried over:
- public class Store implements HeapSize {
- /**
- * Compact the StoreFiles. This method may take some time, so the calling
- * thread must be able to block for long periods.
- * <p>During this time, the Store can work as usual, getting values from
- * StoreFiles and writing new StoreFiles from the memstore.
- * Existing StoreFiles are not destroyed until the new compacted StoreFile is
- * completely written-out to disk.
- * <p>The compactLock prevents multiple simultaneous compactions.
- * The structureLock prevents us from interfering with other write operations.
- * <p>We don't want to hold the structureLock for the whole time, as a compact()
- * can be lengthy and we want to allow cache-flushes during this period.
- * @param forceMajor True to force a major compaction regardless of thresholds
- * @return row to split around if a split is needed, null otherwise
- * @throws IOException
- */
- StoreSize compact(final boolean forceMajor) throws IOException {
- boolean forceSplit = this.region.shouldForceSplit();
- boolean majorcompaction = forceMajor;
- synchronized (compactLock) {
- /* get store file sizes for incremental compacting selection.
- * normal skew:
- *
- *         older ----> newer
- *     _
- *    | |   _
- *    | |  | |   _
- *  --|-|- |-|- |-|---_-------_-------  minCompactSize
- *    | |  | |  | |  | |  _  | |
- *    | |  | |  | |  | | | | | |
- *    | |  | |  | |  | | | | | |
- */
- // .............
- this.lastCompactSize = totalSize;
- // Max-sequenceID is the last key in the files we're compacting
- long maxId = StoreFile.getMaxSequenceIdInList(filesToCompact);
- // Ready to go. Have list of files to compact.
- LOG.info("Started compaction of " + filesToCompact.size() + " file(s) in cf=" +
- this.storeNameStr +
- (references? ", hasReferences=true,": " ") + " into " +
- region.getTmpDir() + ", seqid=" + maxId +
- ", totalSize=" + StringUtils.humanReadableInt(totalSize));
- StoreFile.Writer writer = compact(filesToCompact, majorcompaction, maxId);
- // Move the compaction into place.
- StoreFile sf = completeCompaction(filesToCompact, writer);
- }
- return checkSplit(forceSplit);
- }
- /**
- * Do a minor/major compaction. Uses the scan infrastructure to make it easy.
- *
- * @param filesToCompact which files to compact
- * @param majorCompaction true to major compact (prune all deletes, max versions, etc)
- * @param maxId Readers maximum sequence id.
- * @return Product of compaction or null if all cells expired or deleted and
- * nothing made it through the compaction.
- * @throws IOException
- */
- private StoreFile.Writer compact(final List<StoreFile> filesToCompact,
- final boolean majorCompaction, final long maxId)
- throws IOException {
- // Make the instantiation lazy in case compaction produces no product; i.e.
- // where all source cells are expired or deleted.
- StoreFile.Writer writer = null;
- try {
- // ......
- } finally {
- if (writer != null) {
- // StoreFile.Writer writes the metadata for maxId here.
- writer.appendMetadata(maxId, majorCompaction);
- writer.close();
- }
- }
- return writer;
- }
- }
During a compaction, the outer compact(final boolean forceMajor) calls
compact(final List<StoreFile> filesToCompact, final boolean majorCompaction, final long maxId).
This inner method finishes by calling writer.appendMetadata(maxId, majorCompaction), i.e. the StoreFile appendMetadata method shown above.
Note that the maximum sequence id is written in the finally block; every StoreFile thus carries its sequence id, which open() can read back.
As for clusterId, it stores the ID of the Hadoop cluster.
4. The HLog Life Cycle
This brings us to the HLog life cycle. Once the edits covered by an HLog have all been persisted as HFiles on HDFS (determined by checking whether the HLog's sequence numbers are smaller than the max sequence id of the corresponding tables' StoreFiles on HDFS), the HLog is no longer needed: it is moved to the .oldlogs directory and eventually deleted.
Conversely, if the system goes down first, the data can be read back out of the HLog on HDFS: the original Put edits are recovered and replayed into HBase.
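The gist of that staleness check can be sketched as follows; this is illustrative pseudologic, not the actual log-cleaning code, and the helper name is invented:
- // Hypothetical helper: an HLog file may be archived once its highest
- // sequence number is covered by what the StoreFiles already persist.
- static boolean isHLogObsolete(long hlogMaxSeqNum, List<StoreFile> storeFiles) {
-   long persistedUpTo = StoreFile.getMaxSequenceIdInList(storeFiles);
-   return hlogMaxSeqNum <= persistedUpTo;
- }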
Further reading:
HBase Architecture 101 – Write-Ahead Log (WAL)
http://cloudera.iteye.com/blog/911700
HLog structure and life cycle
http://www.spnguru.com/2011/03/hlog%e7%9a%84%e7%bb%93%e6%9e%84%e5%92%8c%e7%94%9f%e5%91%