多线程进程,logging写入日志到同一文件的处理方法
python logging系统切分问题
由于python logging系统本身并不是线程安全的,所以在多线程多进程情况下,切分日志,写入日志到同一文件会导致日志错乱。
- 多线程执行日志切分 ,会造成只有一个线程切换成功;
- 日志丢失问题,根据日志切分源码,多线程同时切分日志时,会导致后面的线程删除前面线程切分出来的日志,导致日志丢失;
TimedRotatingFileHandler切分逻辑源码
def doRollover(self):
"""
do a rollover; in this case, a date/time stamp is appended to the filename
when the rollover happens. However, you want the file to be named for the
start of the interval, not the current time. If there is a backup count,
then we have to get a list of matching filenames, sort them and remove
the one with the oldest suffix.
"""
if self.stream:
self.stream.close()
self.stream = None
# get the time that this sequence started at and make it a TimeTuple
currentTime = int(time.time())
dstNow = time.localtime(currentTime)[-1]
t = self.rolloverAt - self.interval
if self.utc:
timeTuple = time.gmtime(t)
else:
timeTuple = time.localtime(t)
dstThen = timeTuple[-1]
if dstNow != dstThen:
if dstNow:
addend = 3600
else:
addend = -3600
timeTuple = time.localtime(t + addend)
dfn = self.rotation_filename(self.baseFilename + "." +
time.strftime(self.suffix, timeTuple))
# ***注意这一步,逻辑是先判断是否有切分文件,如果有则删除这个文件
# ***即若当前需要切分出来的日志文件是 09-10.log,则会检查是否有这个文件,有就先删除!
if os.path.exists(dfn):
os.remove(dfn)
# ***rotate这个就是重命名最新日志文件进行真正的日志切分!即将当天的日志重命名为切分日期。
self.rotate(self.baseFilename, dfn)
if self.backupCount > 0:
for s in self.getFilesToDelete():
os.remove(s)
if not self.delay:
self.stream = self._open()
newRolloverAt = self.computeRollover(currentTime)
while newRolloverAt <= currentTime:
newRolloverAt = newRolloverAt + self.interval
#If DST changes and midnight or weekly rollover, adjust for this.
if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:
dstAtRollover = time.localtime(newRolloverAt)[-1]
if dstNow != dstAtRollover:
if not dstNow: # DST kicks in before next rollover, so we need to deduct an hour
addend = -3600
else: # DST bows out before next rollover, so we need to add an hour
addend = 3600
newRolloverAt += addend
self.rolloverAt = newRolloverAt
注意看上述源码带***的注释,当只有一个线程的情况下,日志切分无疑是ok的。
当多线程执行日志切分,场面就变成不可控了:
例如有线程A,B;
倘若线程A先切换日志完成,那么线程B则会先判断有无已存在的切分日志,有就删除。那么线程A切分出来的日志就会遭到线程B的删除,导致日志丢失;
接着B就会将当前最新日志重命名切分日志,再生成新的日志文件进行日志写入。
这里面又会导致另一个问题,已切分出的日志,跟最新的日志,都会有日志写入!
因为线程A切换日志完成后,获取的句柄是最新的日志文件,而后这个最新的日志文件遭到了线程B的重命名,但是线程A获取的句柄是不会变的!所以会导致线程B写入最新日志,线程A写入切分日志的情况!
还有多线程的种种情况,反正python logging就是一个线程不安全的日志系统。
解决方案
重写TimedRotatingFileHandler中的切分逻辑,生成一个线程安全的Handler
直接上码
# 线程安全的Handler
class SafeRotatingFileHandler(TimedRotatingFileHandler):
def __init__(self, filename, when='h', interval=1, backupCount=0, encoding=None, delay=False, utc=False):
TimedRotatingFileHandler.__init__(self, filename, when, interval, backupCount, encoding, delay, utc)
"""
重写 doRollover 切分逻辑
"""
def doRollover(self):
"""
do a rollover; in this case, a date/time stamp is appended to the filename
when the rollover happens. However, you want the file to be named for the
start of the interval, not the current time. If there is a backup count,
then we have to get a list of matching filenames, sort them and remove
the one with the oldest suffix.
"""
if self.stream:
self.stream.close()
self.stream = None
# get the time that this sequence started at and make it a TimeTuple
currentTime = int(time.time())
dstNow = time.localtime(currentTime)[-1]
t = self.rolloverAt - self.interval
if self.utc:
timeTuple = time.gmtime(t)
else:
timeTuple = time.localtime(t)
dstThen = timeTuple[-1]
if dstNow != dstThen:
if dstNow:
addend = 3600
else:
addend = -3600
timeTuple = time.localtime(t + addend)
dfn = self.rotation_filename(self.baseFilename + "." +
time.strftime(self.suffix, timeTuple))
# 当判断有需要切分的文件则删除
# if os.path.exists(dfn):
# os.remove(dfn)
# Issue 18940: A file may not have been created if delay is True.
## if os.path.exists(self.baseFilename):
# 重写切分逻辑
# linux 文件锁,同os系统下 多进程锁
if platform.system() != 'Windows':
# fcntl锁的粒度是整个文件,别的进程不能对这个文件加锁
fcntl = __import__("fcntl")
f = open('logrename.lock', 'wb')
try:
fcntl.flock(f, fcntl.LOCK_EX) # 互斥锁
print('切分文件日志加锁')
# 如果不存在切分文件并且存在currentlog,重新命名
if not os.path.exists(dfn) and os.path.exists(self.baseFilename):
os.rename(self.baseFilename, dfn)
print('切分文件日志解锁')
# 解锁
fcntl.flock(f, fcntl.LOCK_UN)
except:
pass
else:
with LOCK:
# 如果不存在切分文件并且存在currentlog,重新命名
if not os.path.exists(dfn) and os.path.exists(self.baseFilename):
os.rename(self.baseFilename, dfn)
# 重写完成
if self.backupCount > 0:
for s in self.getFilesToDelete():
os.remove(s)
if not self.delay:
self.stream = self._open()
newRolloverAt = self.computeRollover(currentTime)
while newRolloverAt <= currentTime:
newRolloverAt = newRolloverAt + self.interval
#If DST changes and midnight or weekly rollover, adjust for this.
if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:
dstAtRollover = time.localtime(newRolloverAt)[-1]
if dstNow != dstAtRollover:
if not dstNow: # DST kicks in before next rollover, so we need to deduct an hour
addend = -3600
else: # DST bows out before next rollover, so we need to add an hour
addend = 3600
newRolloverAt += addend
self.rolloverAt = newRolloverAt
别看代码好像很多,其实是继承TimedRotatingFileHandler,将doRollover这个切分方法里面的几行主逻辑注释掉,然后重新写一遍新的切分逻辑:
如果不存在切分文件并且存在currentlog,则进行切分。在这个逻辑前加上锁,就能解决多线程多进程同时切分问题。因为服务器是linux,所以用的是linux 上的fcntl文件锁。
至此,用新的handler进行日志写入,可以解决多线程多进程写入日志混乱问题。
#设置日志回滚,一天一个文件,保留15天
file_handler = SafeRotatingFileHandler(filename=file_name,when=‘MIDNIGHT’,interval=1,backupCount=15)
作者:十八