Our production ATS 6.2.3 servers all have diags.log rolling enabled, using the following configuration:
CONFIG proxy.config.diags.logfile.rolling_enabled INT 3
CONFIG proxy.config.diags.logfile.rolling_interval_sec INT 86400
CONFIG proxy.config.diags.logfile.rolling_size_mb INT 1000
However, I found that one machine hits the same assertion failure at the same time every day, which produces a coredump. The output is as follows:
FATAL: switching to user root, failed to set group ID 0
traffic_server: received signal 6 (Aborted)
traffic_server - STACK TRACE:
/usr/local/bin/traffic_server(crash_logger_invoke(int, siginfo_t*, void*)+0x8e)[0x4c938e]
/lib64/libpthread.so.0(+0x12dd0)[0x7f106d1b9dd0]
/lib64/libc.so.6(gsignal+0x10f)[0x7f106c2e899f]
/lib64/libc.so.6(abort+0x127)[0x7f106c2d2cf5]
/usr/local/trafficserver/lib/libtsutil.so.6(+0x6bb0d)[0x7f106e9ecb0d]
/usr/local/trafficserver/lib/libtsutil.so.6(+0x6eb83)[0x7f106e9efb83]
/usr/local/bin/traffic_server(Diags::error(DiagsLevel, char const*, char const*, int, char const*, ...) const+0x78)[0x4badc8]
/usr/local/trafficserver/lib/libtsutil.so.6(+0x805b7)[0x7f106ea015b7]
/usr/local/trafficserver/lib/libtsutil.so.6(ImpersonateUserID(unsigned int, ImpersonationLevel)+0x60)[0x7f106ea01660]
/usr/local/trafficserver/lib/libtsutil.so.6(ElevateAccess::elevate(unsigned int)+0x29)[0x7f106ea01899]
/usr/local/trafficserver/lib/libtsutil.so.6(ElevateAccess::ElevateAccess(unsigned int)+0x25)[0x7f106ea018c5]
/usr/local/trafficserver/lib/libtsutil.so.6(elevating_open(char const*, unsigned int, unsigned int)+0x5b)[0x7f106ea01cfb]
/usr/local/trafficserver/lib/libtsutil.so.6(BaseMetaInfo::_write_to_file()+0x19)[0x7f106e9edbd9]
/usr/local/trafficserver/lib/libtsutil.so.6(BaseLogFile::open_file(int)+0x177)[0x7f106e9eddd7]
/usr/local/trafficserver/lib/libtsutil.so.6(Diags::should_roll_diagslog()+0xe6)[0x7f106e9efcb6]
/usr/local/bin/traffic_server(DiagsLogContinuation::periodic(int, Event*)+0x55)[0x5012d5]
/usr/local/bin/traffic_server(EThread::process_event(Event*, int)+0x85)[0x7e72e5]
/usr/local/bin/traffic_server(EThread::execute()+0x482)[0x7e7e52]
/usr/local/bin/traffic_server[0x7e6dd5]
/lib64/libpthread.so.0(+0x82de)[0x7f106d1af2de]
/lib64/libc.so.6(clone+0x43)[0x7f106c3ad4b3]
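The stack trace shows the abort happens while the roll check (Diags::should_roll_diagslog) writes the log's .meta file through elevating_open. Comparing this machine with others in the same data center, I found that the owner and group of the .meta file differ: on the faulty machine both are root, while on the other machines both are nobody.
Solution: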
chown nobody:nobody .diags.log.meta
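
To check other machines for the same problem before it triggers a crash, a small helper like the one below may be useful. It is only a sketch: the function name find_misowned_meta is made up for illustration, the log directory must be supplied by you, and it assumes a find that supports -maxdepth (GNU and BSD find do).

```shell
#!/bin/sh
# Hypothetical helper (not part of ATS): list hidden *.meta files in a log
# directory whose owner differs from the expected user.
# Usage: find_misowned_meta <log_dir> <expected_user_or_uid>
find_misowned_meta() {
    log_dir=$1
    expected_user=$2
    # The metadata files are hidden (e.g. .diags.log.meta), so match a leading dot.
    # "! -user" selects files NOT owned by the expected user.
    find "$log_dir" -maxdepth 1 -type f -name '.*.meta' ! -user "$expected_user" -print
}
```

For example, running `find_misowned_meta <your ATS log directory> nobody` should print nothing on a healthy machine and print the path of .diags.log.meta on a machine like the faulty one above. Any file it prints can then be fixed with the chown command shown earlier.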