背景:
线上出现复现率很小的问题,log4j2带SizeBasedTriggeringPolicy和CronTriggeringPolicy两种策略的自定义log,出现覆盖现象
经分析,不是触发size 2MB的问题,是cron发生的问题
log文件 分析
最重要的是获取
1.被压缩文件的创建时间
2.zip文件的lastmodify时间
因为是用创建时间+5分来命名zip文件的 ★
源码分析
每次log.info 最后都会调入RollingFileAppender 就是log.info都会判断是否要rollover,crontrigger是到时间了判断,size是每打印一句就会判断
public void append(LogEvent event) {
((RollingFileManager)this.getManager()).checkRollover(event);
super.append(event);
}
然后经过这一步根据上面的两个策略判断是否需要rollover
this.triggeringPolicy = {CompositeTriggeringPolicy@4082}
"CompositeTriggeringPolicy(policies=[SizeBasedTriggeringPolicy(size=1024), CronTriggeringPolicy(schedule=0 0/1 * * * ?)])"
triggeringPolicies = {TriggeringPolicy[2]@4312}
0 = {SizeBasedTriggeringPolicy@4316} "SizeBasedTriggeringPolicy(size=1024)"
1 = {CronTriggeringPolicy@4317} "CronTriggeringPolicy(schedule=0 0/1 * * * ?)"
state = {LifeCycle$State@4119} "STARTED"
name = "STARTED"
ordinal = 3
public synchronized void checkRollover(LogEvent event) { //event 将所有的log的信息都封装进去
if (this.triggeringPolicy.isTriggeringEvent(event)) {
this.rollover();
}
}
在这里的判断中 cronPolicy直接返回false,因为cron是直接在异步线程池里操作的
public boolean isTriggeringEvent(LogEvent event) {
return false;
}
/*
sizebasedPolicy 会根据文件大小来判断是否要更新manager的时间 然后在rollover()//这个策略没有触发,不用管
public boolean isTriggeringEvent(LogEvent event) {
boolean triggered = this.manager.getFileSize() > this.maxFileSize;
if (triggered) {
this.manager.getPatternProcessor().updateTime();
}
return triggered;
}
*/
然后执行manager的rollover()逻辑 ★这里是主要逻辑
public synchronized void rollover() {
if (this.hasOutputStream()) {
if (this.rollover(this.rolloverStrategy)) { ★这里move()
try {
this.size = 0L;
this.initialTime = System.currentTimeMillis();
this.createFileAfterRollover(); ★这里create()
} catch (IOException var2) {
this.logError("Failed to create file after rollover", var2);
}
}
}
}
//这里封装了一个带俩action的description rename和zipAction 都是cron衍生出来的
public RolloverDescription rollover(final RollingFileManager manager) throws SecurityException {
rename:
writeFooter(); 不知道写的啥,暂时先不管...
success = descriptor.getSynchronous().execute();//同步的rename 直接改 aaa(文件名) move到对应的位置
zip:
if (success && descriptor.getAsynchronous() != null) {//异步的zip coreSize为0 maxeSize为INTEGER_MAX的线程池,执行将上步remove过来的文件zip的操作
LOGGER.debug("RollingFileManager executing async {}", descriptor.getAsynchronous());
asyncExecutor.execute(new AsyncAction(descriptor.getAsynchronous(), this));
releaseRequired = false;
}
aaalog是源文件
aaa是生成的zip文件和中间文件
synchronous = {FileRenameAction@16811} "FileRenameAction[D:\app1\files3\aaalog.txt to D:\app1\files3\20191102\aaa, renameEmptyFiles=false]"
asynchronous = {ZipCompressAction@16179} "ZipCompressAction[D:\app1\files3\20191102\aaa to D:\app1\files3\20191102\aaa.zip, level=9, deleteSource=true]"
source = {File@19877} "D:\app1\files3\aaalog.txt"
destination = {File@19878} "D:\app1\files3\20191102\aaa"
重命名其实就是move
public static boolean execute(final File source, final File destination, final boolean renameEmptyFiles) {
Files.move(Paths.get(source.getAbsolutePath()), Paths.get(destination.getAbsolutePath()),//★这里是真实的move操作
StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
LOGGER.trace("Renamed file {} to {} with Files.move", source.getAbsolutePath(),
destination.getAbsolutePath());
return true;
try {
source.delete(); //定时跑的时候,如果aaalog为空,就直接删除
} catch (final Exception exDelete) {
LOGGER.error("Unable to delete empty file {}: {} {}", source.getAbsolutePath(),
exDelete.getClass().getName(), exDelete.getMessage());
}
进入cron
RollingFileManager.initialize();
里面有初始化cron线程池的操作
public void initialize(final RollingFileManager aManager) {
this.manager = aManager;
final Date now = new Date();
final Date lastRollForFile = cronExpression.getPrevFireTime(new Date(this.manager.getFileTime()));
final Date lastRegularRoll = cronExpression.getPrevFireTime(new Date());
aManager.getPatternProcessor().setCurrentFileTime(lastRegularRoll.getTime());
LOGGER.debug("LastRollForFile {}, LastRegularRole {}", lastRollForFile, lastRegularRoll);
aManager.getPatternProcessor().setPrevFileTime(lastRegularRoll.getTime());
aManager.getPatternProcessor().setTimeBased(true);
if (checkOnStartup && lastRollForFile != null && lastRegularRoll != null &&
lastRollForFile.before(lastRegularRoll)) {
lastRollDate = lastRollForFile;
rollover();
}
final ConfigurationScheduler scheduler = configuration.getScheduler();
if (!scheduler.isExecutorServiceSet()) {
// make sure we have a thread pool
scheduler.incrementScheduledItems();
}
if (!scheduler.isStarted()) {
scheduler.start();
}
lastRollDate = lastRegularRoll;
future = scheduler.scheduleWithCron(cronExpression, now, new CronTrigger());
LOGGER.debug(scheduler.toString());
}
crontriggerstage 会在初始化的时候 执行这一句
future = scheduler.scheduleWithCron(cronExpression, now, new CronTrigger());
每次都会跑 跑的就是
private void rollover() {
manager.rollover(cronExpression.getPrevFireTime(new Date()), lastRollDate);
if (future != null) {
lastRollDate = future.getFireTime();
}
}
核心代码 ★ 为啥跟上面一样呢?因为有个size策略 所以每次log.info都会执行,又有个cron策略,所以线程池也会执行
if (rollover(rolloverStrategy)) { 这里有个同步的action和异步的action
try {
size = 0;
initialTime = System.currentTimeMillis();
createFileAfterRollover(); //★这里直接同步创建新的源文件
} catch (final IOException e) {
logError("Failed to create file after rollover", e);
}
}
//★ 同上,这里创建文件
@Override
protected OutputStream createOutputStream() throws IOException {
final String filename = getFileName();
LOGGER.debug("Now writing to {} at {}", filename, new Date());
final File file = new File(filename);
final FileOutputStream fos = new FileOutputStream(file, isAppend);
if (file.exists() && file.length() == 0) {
try {
// 而且重命名的时间和当前时间一致 把源文件的创建时间设置成当前时间
FileTime now = FileTime.fromMillis(System.currentTimeMillis());
Files.setAttribute(file.toPath(), "creationTime", now);
} catch (Exception ex) {
LOGGER.warn("Unable to set current file tiem for {}", filename);
}
}
defineAttributeView(Paths.get(filename));
return fos;
}
cronpolicy
manager.rollover(cronExpression.getPrevFireTime(new Date()), lastRollDate);
if (future != null) {
lastRollDate = future.getFireTime();
}
分析文件创建,删除,zip时间代码如下
获取descriptor 里面俩操作
一个同步的rename 一个异步的zip
rename是直接执行的 下面的代码 这步骤是同步执行的,将 source move 到指定位置
Files.move(Paths.get(source.getAbsolutePath()), Paths.get(destination.getAbsolutePath()),
StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
LOGGER.trace("Renamed file {} to {} with Files.move", source.getAbsolutePath(),
destination.getAbsolutePath());
return true;
source是怎么重新出现的 因为是定时任务 所以就是定时的时间 createFileAfterRollover() 创建的就是当时的时间
protected OutputStream createOutputStream() throws IOException {
final String filename = getFileName();
LOGGER.debug("Now writing to {} at {}", filename, new Date());
final File file = new File(filename);
final FileOutputStream fos = new FileOutputStream(file, isAppend);
if (file.exists() && file.length() == 0) {
try {
FileTime now = FileTime.fromMillis(System.currentTimeMillis());
Files.setAttribute(file.toPath(), "creationTime", now);
} catch (Exception ex) {
LOGGER.warn("Unable to set current file tiem for {}", filename);
}
}
defineAttributeView(Paths.get(filename));
return fos;
}
又怎么被删除 只要为空 每次定时任务扫到就会删掉
又出现 同上
/* This executor pool will create a new Thread for every work async action to be performed. Using it allows
us to make sure all the Threads are completed when the Manager is stopped. */
private final ExecutorService asyncExecutor = new ThreadPoolExecutor(0, Integer.MAX_VALUE, 0, TimeUnit.MILLISECONDS,
new EmptyQueue(), threadFactory);
一种合理的解释:线程池里新建了15分钟 的zip任务,由于某种原因卡住了,卡到20才执行完,
此时20的已经创建好,且zip完了,就会覆盖,至于是哪个覆盖哪个就不知道了
zip的创建时间
public static boolean execute(final File source, final File destination, final boolean deleteSource,
final int level) throws IOException {
if (source.exists()) {
try (final FileInputStream fis = new FileInputStream(source);
final ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(destination))) {
zos.setLevel(level);
final ZipEntry zipEntry = new ZipEntry(source.getName());
zos.putNextEntry(zipEntry);
final byte[] inbuf = new byte[BUF_SIZE];
int n;
while ((n = fis.read(inbuf)) != -1) {//创建时间
zos.write(inbuf, 0, n);
}
}
if (deleteSource && !source.delete()) {
LOGGER.warn("Unable to delete " + source.toString() + '.');
}
return true;
}
return false;
}
总结为 定时move,同步create 异步 zip(modifytime)
我们的逻辑为,如果 zip modifytime-createtime>5分 重命名文件+5分 文件本来名就是move createtime 的名
如果<5分,就不重命名
正常情况下
move和createtime 是一致的 都是0/5这样的分钟数
modifytime 是异步的zip创建时间
异常情况见图片吧,概率很低