- 2021-09-19 14:27:55:SingleMonitor handlerTime:20210918
19-09-2021 14:27:55 CST signal_monitor INFO - Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.
19-09-2021 14:27:56 CST signal_monitor INFO - Sun Sep 19 14:27:56 CST 2021 WARN: Establishing SSL connection without server's identity verification is not recommended. According to MySQL 5.5.45+, 5.6.26+ and 5.7.6+ requirements SSL connection must be established by default if explicit option isn't set. For compliance with existing applications not using SSL the verifyServerCertificate property is set to 'false'. You need either to explicitly disable SSL by setting useSSL=false, or set useSSL=true and provide truststore for server certificate verification.
19-09-2021 14:28:29 CST signal_monitor INFO - Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007fd24719a000, 12288, 0) failed; error='Cannot allocate memory' (errno=12)
19-09-2021 14:28:29 CST signal_monitor INFO - #
19-09-2021 14:28:29 CST signal_monitor INFO - # There is insufficient memory for the Java Runtime Environment to continue.
19-09-2021 14:28:29 CST signal_monitor INFO - # Native memory allocation (mmap) failed to map 12288 bytes for committing reserved memory.
19-09-2021 14:28:29 CST signal_monitor INFO - # An error report file with more information is saved as:
19-09-2021 14:28:29 CST signal_monitor INFO - # /opt/azkaban/azkaban-exec-server-0.1.0-SNAPSHOT/bin/executions/5991/gbhc/hs_err_pid132046.log
19-09-2021 14:28:46 CST signal_monitor INFO - Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00007fde77b20000, 65536, 1) failed; error='Cannot allocate memory' (errno=12)
19-09-2021 14:28:46 CST signal_monitor INFO - [thread 140591398864640 also had an error]
19-09-2021 14:30:29 CST signal_monitor INFO - # [ timer expired, abort... ]
19-09-2021 14:40:49 CST signal_monitor INFO - /usr/local/saic_jar/saic_rvm_coll/bin/signal_monitor.sh: line 13: 132046 Aborted (core dumped) /usr/java/jdk1.8.0_161/bin/java -Xmx10g -Xms10g -Xmn5g -cp /usr/local/saic_jar/saic_rvm_coll/lib/saic-rvm3.jar incar.SingleMonitor1211 $date 10
19-09-2021 14:40:49 CST signal_monitor INFO - Process completed unsuccessfully in 6664 seconds.
19-09-2021 14:40:49 CST signal_monitor ERROR - Job run failed!
java.lang.RuntimeException: azkaban.jobExecutor.utils.process.ProcessFailureException
at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:304)
at azkaban.execapp.JobRunner.runJob(JobRunner.java:787)
at azkaban.execapp.JobRunner.doRun(JobRunner.java:602)
at azkaban.execapp.JobRunner.run(JobRunner.java:563)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: azkaban.jobExecutor.utils.process.ProcessFailureException
at azkaban.jobExecutor.utils.process.AzkabanProcess.run(AzkabanProcess.java:130)
at azkaban.jobExecutor.ProcessJob.run(ProcessJob.java:296)
... 8 more
19-09-2021 14:40:49 CST signal_monitor ERROR - azkaban.jobExecutor.utils.process.ProcessFailureException cause: azkaban.jobExecutor.utils.process.ProcessFailureException
19-09-2021 14:40:49 CST signal_monitor INFO - Finishing job signal_monitor at 1632033649511 with status FAILED
有个定时任务每天都会跑两个小时,之前一直执行的没有问题,19号时报了一次红打开看了,发现任务应该是执行完了,错略一看应该是内存溢出,因为是定时任务,觉得偶尔一次的内存溢出应该没有关系。
可今天检查任务时发现昨天跑的任务也报红了,打开看了是一模一样的报错。检查了代码其实都已经执行完毕了。上面日志有打印mysql的警告,其实最后一行代码已经执行完了,mysql里也插入成功数据了。
如图 14:27:55 执行完毕
dmap.clear();
dmap1.clear();
beformap.clear();
beformap1.clear();
maxMileageMap.clear();
} catch (Exception e) {
e.printStackTrace();
taskJob.setStatus(3);
taskJob.setMessage(e.getMessage());
} finally {
// executorService.shutdownNow();
executorService1.shutdown();
}
taskJob.setTaskEndTime((new SimpleDateFormat("yyyyMMddHHmmss")).format(new Date()));
ScriptUtil.updateStatus(taskJob);
}
于是猜测很可能不是java的问题,可能是linux的问题。
于是查到一篇文章
正好今天的任务也正在运行我去试了下
[appuser@schedule-02 bin]$ sudo cat /proc/328928/maps | wc -l
32193
[appuser@schedule-02 bin]$ cat /proc/sys/vm/max_map_count
65530
[appuser@schedule-02 bin]$ sudo cat /proc/328928/maps | wc -l
30206
65530/2=32765
而根据pid打印了两次值分别是 32193和30206,说明这个值是一直在波动的,距离上限32765也不是很远了,超过限制被linux强制关闭很有可能
优化:
sudo vim /etc/sysctl.conf
#加入
vm.max_map_count=262144
#生效
sysctl -p /etc/sysctl.conf
#检查
[appuser@schedule-02 bin]$ sudo sysctl -a|grep vm.max_map_count
sysctl: reading key "net.ipv6.conf.all.stable_secret"
sysctl: reading key "net.ipv6.conf.bond0.stable_secret"
sysctl: reading key "net.ipv6.conf.default.stable_secret"
sysctl: reading key "net.ipv6.conf.enp61s0f0.stable_secret"
sysctl: reading key "net.ipv6.conf.enp61s0f1.stable_secret"
sysctl: reading key "net.ipv6.conf.ens1f0.stable_secret"
sysctl: reading key "net.ipv6.conf.ens1f1.stable_secret"
sysctl: reading key "net.ipv6.conf.ens1f2.stable_secret"
sysctl: reading key "net.ipv6.conf.ens1f3.stable_secret"
sysctl: reading key "net.ipv6.conf.lo.stable_secret"
vm.max_map_count = 262144
over