Here we analyze the Kafka LogSegment source code.
Stepping through the LogManager and Log source makes it clear that every log operation ultimately lands in LogSegment: the reading, writing, recovery, flushing, and deletion of a segment are all implemented here. The LogSegment code also lives in the log directory of the source tree.
LogSegment is the smallest unit of operation on a log segment. It acts directly on the messages, handling the reading, writing, appending, and so on of the actual message data.
In effect, LogSegment is a facade over the FileMessageSet class: all of LogSegment's final processing is implemented in FileMessageSet, whose operations in turn build on the ByteBufferMessageSet message container and go through a FileChannel object for the actual reads and writes.
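To make that chain concrete, here is a minimal, self-contained sketch (toy code, not the actual Kafka classes) of what its bottom layer amounts to: a buffer of message bytes, standing in for a ByteBufferMessageSet, written through a FileChannel, with force() doing the sync that the flush() method below relies on.

import java.io.File
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.StandardOpenOption.{CREATE, READ, WRITE}

object ChannelAppendSketch {
  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("segment-sketch", ".log")
    val channel = FileChannel.open(file.toPath, CREATE, READ, WRITE)

    // stand-in for a ByteBufferMessageSet's backing buffer
    val buffer = ByteBuffer.wrap("a message payload".getBytes("UTF-8"))
    while (buffer.hasRemaining)
      channel.write(buffer)   // the write path behind append()

    channel.force(true)       // the sync behind flush()
    println("wrote %d bytes to %s".format(channel.size, file.getAbsolutePath))
    channel.close()
  }
}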
Let's walk through the main methods.
Initialization
class LogSegment(val log: FileMessageSet, // this is the actual primary constructor
                 val index: OffsetIndex,
                 val baseOffset: Long,
                 val indexIntervalBytes: Int,
                 val rollJitterMs: Long,
                 time: Time) extends Logging {

  var created = time.milliseconds

  /* the number of bytes since we last added an entry in the offset index */
  private var bytesSinceLastIndexEntry = 0

  // This is the constructor invoked from Log. As you can see, the index and
  // log file are created from the topicAndPartition directory and startOffset.
  def this(dir: File, startOffset: Long, indexIntervalBytes: Int, maxIndexSize: Int, rollJitterMs: Long, time: Time) =
    this(new FileMessageSet(file = Log.logFilename(dir, startOffset)),
         new OffsetIndex(file = Log.indexFilename(dir, startOffset), baseOffset = startOffset, maxIndexSize = maxIndexSize),
         startOffset,
         indexIntervalBytes,
         rollJitterMs,
         time)
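As a side note on those file names: Log.logFilename and Log.indexFilename zero-pad the base offset to 20 digits, so segment files sort lexicographically in offset order. The sketch below reconstructs that scheme from the 0.8-era source (treat the details as illustrative; the directory path is hypothetical).

import java.io.File
import java.text.NumberFormat

object FilenameSketch {
  def filenamePrefixFromOffset(offset: Long): String = {
    val nf = NumberFormat.getInstance()
    nf.setMinimumIntegerDigits(20)
    nf.setMaximumFractionDigits(0)
    nf.setGroupingUsed(false)
    nf.format(offset)
  }

  def main(args: Array[String]): Unit = {
    val dir = new File("/tmp/my-topic-0") // hypothetical topic-partition directory
    println(new File(dir, filenamePrefixFromOffset(42) + ".log"))   // /tmp/my-topic-0/00000000000000000042.log
    println(new File(dir, filenamePrefixFromOffset(42) + ".index")) // /tmp/my-topic-0/00000000000000000042.index
  }
}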
The append method, which adds messages

def append(offset: Long, messages: ByteBufferMessageSet) {
  if (messages.sizeInBytes > 0) { // only append non-empty message sets
    trace("Inserting %d bytes at offset %d at position %d".format(messages.sizeInBytes, offset, log.sizeInBytes()))
    // append an entry to the index (if needed)
    if(bytesSinceLastIndexEntry > indexIntervalBytes) {
      index.append(offset, log.sizeInBytes())
      this.bytesSinceLastIndexEntry = 0
    }
    // append the messages; FileMessageSet.append writes the message data,
    // ultimately operating on the ByteBufferMessageSet message container
    log.append(messages)
    this.bytesSinceLastIndexEntry += messages.sizeInBytes
  }
}
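The interplay of bytesSinceLastIndexEntry and indexIntervalBytes is what makes the offset index sparse: an entry is written only after more than indexIntervalBytes of message data have accumulated since the previous entry. A toy simulation of that bookkeeping (hypothetical sizes, not Kafka code):

object SparseIndexSketch {
  def main(args: Array[String]): Unit = {
    val indexIntervalBytes = 4096
    val messageBytes = 1500
    var bytesSinceLastIndexEntry = 0
    var logSizeInBytes = 0

    for (offset <- 0L to 4L) {
      if (bytesSinceLastIndexEntry > indexIntervalBytes) {
        // corresponds to index.append(offset, log.sizeInBytes())
        println("index entry: offset %d -> file position %d".format(offset, logSizeInBytes))
        bytesSinceLastIndexEntry = 0
      }
      logSizeInBytes += messageBytes           // corresponds to log.append(messages)
      bytesSinceLastIndexEntry += messageBytes
    }
    // prints a single entry: "index entry: offset 3 -> file position 4500"
  }
}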
The flush method, which forces messages to disk

def flush() {
  LogFlushStats.logFlushTimer.time {
    log.flush()   // FileMessageSet.flush ultimately calls channel.force to sync to the storage device
    index.flush() // likewise for the index
  }
}
The read method, which fetches messages

def read(startOffset: Long, maxOffset: Option[Long], maxSize: Int): FetchDataInfo = {
  if(maxSize < 0)
    throw new IllegalArgumentException("Invalid max size for log read (%d)".format(maxSize))

  val logSize = log.sizeInBytes // this may change, need to save a consistent copy
  val startPosition = translateOffset(startOffset) // find the file position corresponding to startOffset

  // if the start position is already off the end of the log, return null
  if(startPosition == null)
    return null

  val offsetMetadata = new LogOffsetMetadata(startOffset, this.baseOffset, startPosition.position)

  // if the size is zero, still return a log segment but with zero size
  if(maxSize == 0)
    return FetchDataInfo(offsetMetadata, MessageSet.Empty)

  // calculate the length of the message set to read based on whether or not they gave us a maxOffset
  val length =
    maxOffset match {
      case None =>
        // no max offset, just use the max size they gave unmolested
        maxSize
      case Some(offset) => {
        // there is a max offset, translate it to a file position and use that to calculate the max read size
        if(offset < startOffset)
          throw new IllegalArgumentException("Attempt to read with a maximum offset (%d) less than the start offset (%d).".format(offset, startOffset))
        val mapping = translateOffset(offset, startPosition.position) // find the file position corresponding to maxOffset
        val endPosition =
          if(mapping == null)
            logSize // the max offset is off the end of the log, use the end of the file
          else
            mapping.position
        // the read length is the span between the end and start positions, capped at maxSize
        min(endPosition - startPosition.position, maxSize)
      }
    }
  // read that many bytes via FileMessageSet.read and wrap them in a FetchDataInfo
  FetchDataInfo(offsetMetadata, log.read(startPosition.position, length))
}
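The length arithmetic is easy to check by hand. A worked example with hypothetical file positions:

object ReadLengthSketch {
  // the fetch size is the byte span between the two file positions, capped at maxSize
  def length(startPosition: Int, endPosition: Int, maxSize: Int): Int =
    math.min(endPosition - startPosition, maxSize)

  def main(args: Array[String]): Unit = {
    // maxOffset maps to position 9000, startOffset to 1000; the 8000-byte span is capped at maxSize
    println(length(startPosition = 1000, endPosition = 9000, maxSize = 4096)) // 4096
    // when the span is smaller than maxSize, the span wins
    println(length(startPosition = 1000, endPosition = 3000, maxSize = 4096)) // 2000
  }
}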
The read method works by mapping offsets to file positions and a read length, which is what lets a single call return messages spanning many offsets. The mapping itself is done by translateOffset:

private[log] def translateOffset(offset: Long, startingFilePosition: Int = 0): OffsetPosition = {
  // map a logical offset to a physical file position
  val mapping = index.lookup(offset) // consult the offset index for the nearest preceding entry
  log.searchFor(offset, max(mapping.position, startingFilePosition)) // then scan forward in the FileMessageSet for the exact position
}
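So resolving an offset is a two-step lookup: the sparse index narrows the search to the nearest preceding indexed position, and a forward scan finds the exact entry. A toy model of the two steps (simplified stand-ins for OffsetIndex.lookup and FileMessageSet.searchFor, not the real implementations):

object TranslateOffsetSketch {
  case class OffsetPosition(offset: Long, position: Int)

  // sparse index: only some offsets have entries
  val index = Seq(OffsetPosition(0, 0), OffsetPosition(3, 4500))
  // every message's (offset, file position), as the scan would see them on disk
  val entries = Seq(OffsetPosition(0, 0), OffsetPosition(1, 1500), OffsetPosition(2, 3000),
                    OffsetPosition(3, 4500), OffsetPosition(4, 6000))

  def lookup(offset: Long): OffsetPosition = // like OffsetIndex.lookup
    index.takeWhile(_.offset <= offset).last

  def searchFor(offset: Long, startingPosition: Int): OffsetPosition = // like FileMessageSet.searchFor
    entries.filter(_.position >= startingPosition).find(_.offset >= offset).orNull

  def main(args: Array[String]): Unit = {
    val mapping = lookup(4)                 // OffsetPosition(3, 4500): nearest indexed offset <= 4
    println(searchFor(4, mapping.position)) // OffsetPosition(4, 6000): exact position after the scan
  }
}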
The recover method: the final delegate in the chain of calls made when Kafka checks segment integrity at startup.

def recover(maxMessageSize: Int): Int = {
  index.truncate()
  index.resize(index.maxIndexSize)
  var validBytes = 0
  var lastIndexEntry = 0
  val iter = log.iterator(maxMessageSize)
  try {
    while(iter.hasNext) {
      val entry = iter.next
      entry.message.ensureValid() // throws InvalidMessageException on a bad checksum
      if(validBytes - lastIndexEntry > indexIntervalBytes) {
        // we need to decompress the message, if required, to get the offset of the first uncompressed message
        val startOffset =
          entry.message.compressionCodec match {
            case NoCompressionCodec =>
              entry.offset
            case _ =>
              ByteBufferMessageSet.decompress(entry.message).head.offset
          }
        index.append(startOffset, validBytes) // rebuild the index as we go
        lastIndexEntry = validBytes
      }
      validBytes += MessageSet.entrySize(entry.message)
    }
  } catch {
    case e: InvalidMessageException =>
      logger.warn("Found invalid messages in log segment %s at byte offset %d: %s.".format(log.file.getAbsolutePath, validBytes, e.getMessage))
  }
  val truncated = log.sizeInBytes - validBytes
  log.truncateTo(validBytes) // drop everything after the last valid byte
  index.trimToValidSize()
  truncated // return the number of bytes truncated
}
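The key invariant is that validBytes only counts messages whose checksums verified, so everything past the first corrupt entry is discarded. A toy illustration of that valid-prefix truncation (made-up entry sizes, not Kafka code):

object RecoverSketch {
  case class Entry(sizeInBytes: Int, valid: Boolean)

  def main(args: Array[String]): Unit = {
    val fileSize = 5300
    val entries = Seq(Entry(1500, valid = true), Entry(1500, valid = true),
                      Entry(800, valid = false), Entry(1500, valid = true))

    // ensureValid() would throw on the corrupt entry, so only the prefix counts
    val validBytes = entries.takeWhile(_.valid).map(_.sizeInBytes).sum
    val truncated = fileSize - validBytes

    println("keeping %d bytes, truncating %d bytes".format(validBytes, truncated))
    // keeping 3000 bytes, truncating 2300 bytes
  }
}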
The delete method, which removes the segment

def delete() {
  val deletedLog = log.delete()     // deletes the file and closes the mapped buffer; implemented in FileMessageSet
  val deletedIndex = index.delete() // likewise, in OffsetIndex
  if(!deletedLog && log.file.exists)
    throw new KafkaStorageException("Delete of log " + log.file.getName + " failed.")
  if(!deletedIndex && index.file.exists)
    throw new KafkaStorageException("Delete of index " + index.file.getName + " failed.")
}
That covers the main methods of LogSegment.