NodeJS 内存溢出排查分析: JavaScript heap out of memory

一、起因

之前内网部分服务不稳定,请求偶尔会挂掉,需要频繁重启。本着实事求是的思想,决定排查并解决这个顽疾。

二、日志分析

通过登录机器查找日志发现具体的错误信息,“JavaScript heap out of memory”,可以看出是由于内存不足导致的服务异常。

[外链图片转存失败,源站可能有防盗链机制,建议将图片保存下来直接上传(img-UHGUwNls-1599037880598)(http://km.oa.com/files/photos/pictures/202009/1599016963_55_w2900_h1810.png)]

三、内存溢出分析

1.分析

通过执行top指令,发现该进程([TSW/worker/0]) CPU占有率和 memery 占比一直居高不下。
在这里插入图片描述
在这里插入图片描述

可以发现一些问题:

  • 内存占用暴涨,大约600M
  • CPU占用率极高

猜测: 可能是 kubeapi 服务,读取数据库或者业务自身写法错误引起的内存泄漏(例如:闭包、mysql连接未释放等)。

在这里插入图片描述

通过分析代码逻辑+日志追踪,真相逐渐清晰:

在这里插入图片描述

从这里就基本可以排除,数据库连接超时导致的问题。

2.定位

那么,异常问题应该就是在返回的数据中。

通过查看接口的调用名,findAll 全查询,那么很有可能是因为数据量返回过大,导致内存以及CPU消耗过大。这里也符合一开始的错误定位。

await models.M.ConfBat.findAll({
  where: {
    jsonid,
    type
  }
})

3.优化

为了验证这一个猜想,我们首先加一个分页逻辑,减小返回的数据量。(PS:此处操作数据库使用的是 sequelie API)

格式如下:

 await models.M.ConfBat.findAll({
   offset: offset || 0, // 偏移量
   limit: limit || 20, // 分页数
   where: {
     jsonid,
     type
   }
 })

为了验证猜想,我们继续重放原来的请求:

在这里插入图片描述

可以看到,数据成功返回。

四、思考

通过上面的问题定位分析,我们可以看到异常的原因是由数据库返回的数据过大导致的。但是,深入思考一下,为什么会出现内存溢出呢?

1. 为什么会出现内存溢出?

Node.js 做密集型运算,或者所操作的数组、对象本身较大时,容易出现内存溢出的问题,这是由于 Node.js 的运行环境依赖 V8 引擎导致的。如果经常有较大数据量运算等操作,需要对 Node.js 运行环境限制有充分的了解。

2. 常见的异常场景

2.1 密集型计算

for (var i = 0; i < 10000000; i++) {
    ((i) => {
        var user = {};
        user.id = i;
        user.name = 'KUBE真苦逼';
        setTimeout(()=>console.log(i,user),0)
    })(i)
}

CPU和内存占用情况

在这里插入图片描述

错误堆栈信息
在这里插入图片描述

2.2 海量数据量操作

let users = []

for (let i =0; i< 10000;i++) {
     let user = {}
     for(let j =0; j< 10000; j++) {
	user[j] = `KUBE${j}`
	users.push(user[j])
     }
}

CPU和内存占用情况

在这里插入图片描述

错误堆栈信息

在这里插入图片描述

上述两种操作,都会出现下面的错误信息:

<--- Last few GCs --->

[3126:0x2ca9be0]    34735 ms: Mark-sweep 1280.6 (1331.5) -> 1280.6 (1300.5) MB, 1763.9 / 0.0 ms  (average mu = 0.223, current mu = 0.000) last resort GC in old space requested

<--- JS stacktrace --->

3. 内存溢出的原因

NodeJS 程序出现内存溢出问题,可以分为三个方面:

  • V8内存限制
  • 使用不规范 (闭包,对象内存及时未释放)

4. 解决方案

4.1 使用 async/await

对于第一种异常情况,可以通过 async/await 将异步操作改为同步操作。防止异步事件堆积过多,操作内存溢出。

4.2 使用 Buffer 系统内存

对于第二种异常情况,将数据保存到Buffer中,而不是转换成字符串等JS对象的形式,这样也可以避免V8内存占用过多。

五、附录

参考资料

完整错误日志:

<--- Last few GCs --->

[8072:0x3d72dc0] 140630 ms: Mark-sweep 1102.9 (1259.7) -> 1102.8 (1227.2) MB, 48.0 / 0.0 ms (average mu = 0.942, current mu = 0.000) last resort GC in old space requested
[8072:0x3d72dc0] 140677 ms: Mark-sweep 1102.8 (1227.2) -> 1102.8 (1226.7) MB, 47.1 / 0.0 ms (average mu = 0.881, current mu = 0.000) last resort GC in old space requested

<— JS stacktrace —>

==== JS stack trace =========================================

0: ExitFrame [pc: 0x258199dd4eab]

Security context: 0x15dc384aede9
1: indexOf [0x15dc384b40d9](this=0x15c00320a309 <Very long string[265008978]>,0x117693fec241 <String[1]: <>)
2: /* anonymous */ [0x1add5005a291] [/usr/local/services/TSW-1.0/bin/tsw/plug.js:38] [bytecode=0x136dff410d9 offset=36](this=0x15dc384bc881 )
3: arguments adaptor frame: 1->0
4: response [0x30db838c6c19] [/…

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: 0x92b820 node::Abort() [TSW/worker/1]
2: 0x92b86c [TSW/worker/1]
3: 0xb36dfe v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [TSW/worker/1]
4: 0xb37031 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [TSW/worker/1]
5: 0xf45c02 [TSW/worker/1]
6: 0xf5627e v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [TSW/worker/1]
7: 0xf235b1 v8::internal::Factory::NewRawTwoByteString(int, v8::internal::PretenureFlag) [TSW/worker/1]
8: 0x1070d82 v8::internal::String::SlowFlatten(v8::internal::Handlev8::internal::ConsString, v8::internal::PretenureFlag) [TSW/worker/1]
9: 0xb33e9c v8::internal::String::Flatten(v8::internal::Handlev8::internal::String, v8::internal::PretenureFlag) [TSW/worker/1]
10: 0x1071f9c v8::internal::String::IndexOf(v8::internal::Isolate*, v8::internal::Handlev8::internal::String, v8::internal::Handlev8::internal::String, int) [TSW/worker/1]
11: 0x1224c27 v8::internal::Runtime_StringIndexOfUnchecked(int, v8::internal::Object**, v8::internal::Isolate*) [TSW/worker/1]
12: 0x258199dd4eab
2020-09-02 11:19:13.465 [INFO] [11732 cpu 0] [null] [master.js:438] worker1 pid=8072 disconnect event fired.
2020-09-02 11:19:13.465 [INFO] [11732 cpu 0] [null] [master.js:242] restart new worker instead of worker1 pid=8072
2020-09-02 11:19:13.468 [INFO] [11732 cpu 0] [null] [master.js:399] worker1 fork success! pid:19001, cpu: 1
2020-09-02 11:19:13.468 [INFO] [11732 cpu 0] [null] [cpu.js:121] taskset -cp 19001
2020-09-02 11:19:13.470 [INFO] [11732 cpu 0] [null] [master.js:451] worker1 pid=8072 exit event fired.
2020-09-02 11:19:13.472 [INFO] [11732 cpu 0] [null] [cpu.js:153]
pid 19001’s current affinity list: 0-7

2020-09-02 11:19:13.472 [INFO] [11732 cpu 0] [null] [cpu.js:160] taskset -cp 1 19001
2020-09-02 11:19:13.476 [INFO] [11732 cpu 0] [null] [cpu.js:174]
pid 19001’s current affinity list: 0-7
pid 19001’s new affinity list: 1

2020-09-02 11:19:14.162 [INFO] [19001 cpu 0] [null] [master.js:154] start worker…
2020-09-02 11:19:14.164 [INFO] [19001 cpu 0] [null] [master.js:155] config from: /data/release/tsw.config.js
2020-09-02 11:19:14.211 [INFO] [19001 cpuX 0] [null] [http.proxy.js:76] pid:19001 createServer ok
2020-09-02 11:19:14.311 [INFO] [19001 cpuX 0] [null] [http.proxy.js:97] receive message, cmd: listen
2020-09-02 11:19:14.312 [INFO] [19001 cpu1 0] [null] [http.proxy.js:249] cpu: 1, beforeStartup…
2020-09-02 11:19:14.312 [INFO] [19001 cpu1 0] [null] [http.proxy.js:255] cpu: 1, listen…
2020-09-02 11:19:14.315 [INFO] [19001 cpu1 0] [null] [http.proxy.js:263] cpu: 1, listen http ok 0.0.0.0:8081
2020-09-02 11:19:14.316 [INFO] [19001 cpu1 0] [null] [http.proxy.js:318] cpu: 1, listen http ok 100.113.190.39:12802
2020-09-02 11:19:14.316 [INFO] [19001 cpu1 0] [null] [http.proxy.js:271] start heartbeat
2020-09-02 11:19:14.316 [INFO] [19001 cpu1 0] [null] [http.proxy.js:286] switch to uid: user_00
2020-09-02 11:19:14.317 [INFO] [19001 cpu1 0] [null] [http.proxy.js:293] cpu: 1, afterStartup…
2020-09-02 11:19:19.906 [INFO] [11732 cpu 0] [null] [admin.actions.js:89] typeof config: object
check config ok

2020-09-02 11:19:19.906 [INFO] [11732 cpu 0] [null] [master.js:460] reload event fired.
2020-09-02 11:19:19.906 [INFO] [11732 cpu 0] [null] [master.js:480] cpu0 reload after 0ms
2020-09-02 11:19:19.906 [INFO] [11732 cpu 0] [null] [master.js:480] cpu1 reload after 1000ms
2020-09-02 11:19:19.906 [INFO] [11732 cpu 0] [null] [master.js:242] restart new worker instead of worker0 pid=11499
2020-09-02 11:19:19.909 [INFO] [11732 cpu 0] [null] [master.js:399] worker0 fork success! pid:19200, cpu: 0
2020-09-02 11:19:19.909 [INFO] [11732 cpu 0] [null] [cpu.js:121] taskset -cp 19200
2020-09-02 11:19:19.913 [INFO] [11732 cpu 0] [null] [cpu.js:153]
pid 19200’s current affinity list: 0-7

2020-09-02 11:19:19.913 [INFO] [11732 cpu 0] [null] [cpu.js:160] taskset -cp 0 19200
2020-09-02 11:19:19.917 [INFO] [11732 cpu 0] [null] [cpu.js:174]
pid 19200’s current affinity list: 0-7
pid 19200’s new affinity list: 0

2020-09-02 11:19:20.610 [INFO] [19200 cpu 0] [null] [master.js:154] start worker…
2020-09-02 11:19:20.612 [INFO] [19200 cpu 0] [null] [master.js:155] config from: /data/release/tsw.config.js
2020-09-02 11:19:20.659 [INFO] [19200 cpuX 0] [null] [http.proxy.js:76] pid:19200 createServer ok
2020-09-02 11:19:20.770 [INFO] [19200 cpuX 0] [null] [http.proxy.js:97] receive message, cmd: listen
2020-09-02 11:19:20.770 [INFO] [19200 cpu0 0] [null] [http.proxy.js:249] cpu: 0, beforeStartup…
2020-09-02 11:19:20.770 [INFO] [19200 cpu0 0] [null] [http.proxy.js:255] cpu: 0, listen…
2020-09-02 11:19:20.773 [INFO] [19200 cpu0 0] [null] [http.proxy.js:263] cpu: 0, listen http ok 0.0.0.0:8081
2020-09-02 11:19:20.774 [INFO] [19200 cpu0 0] [null] [http.proxy.js:318] cpu: 0, listen http ok 100.113.190.39:12801
2020-09-02 11:19:20.774 [INFO] [19200 cpu0 0] [null] [http.proxy.js:271] start heartbeat
2020-09-02 11:19:20.775 [INFO] [19200 cpu0 0] [null] [http.proxy.js:286] switch to uid: user_00
2020-09-02 11:19:20.775 [INFO] [19200 cpu0 0] [null] [http.proxy.js:293] cpu: 0, afterStartup…
2020-09-02 11:19:20.906 [INFO] [11732 cpu 0] [null] [master.js:242] restart new worker instead of worker1 pid=19001
2020-09-02 11:19:20.909 [INFO] [11732 cpu 0] [null] [master.js:399] worker1 fork success! pid:19258, cpu: 1
2020-09-02 11:19:20.909 [INFO] [11732 cpu 0] [null] [cpu.js:121] taskset -cp 19258
2020-09-02 11:19:20.910 [INFO] [19001 cpu1 0] [null] [index.js:35] report on disconnect event…
2020-09-02 11:19:20.911 [INFO] [11732 cpu 0] [null] [master.js:438] worker1 pid=19001 disconnect event fired.
2020-09-02 11:19:20.912 [INFO] [19001 cpu1 0] [null] [is-test.js:54] save h5test on disconnect event…
2020-09-02 11:19:20.913 [INFO] [11732 cpu 0] [null] [cpu.js:153]
pid 19258’s current affinity list: 0-7

2020-09-02 11:19:20.913 [INFO] [11732 cpu 0] [null] [cpu.js:160] taskset -cp 1 19258
2020-09-02 11:19:20.916 [INFO] [11732 cpu 0] [null] [cpu.js:174]
pid 19258’s current affinity list: 0-7
pid 19258’s new affinity list: 1

2020-09-02 11:19:21.610 [INFO] [19258 cpu 0] [null] [master.js:154] start worker…
2020-09-02 11:19:21.611 [INFO] [19258 cpu 0] [null] [master.js:155] config from: /data/release/tsw.config.js
2020-09-02 11:19:21.660 [INFO] [19258 cpuX 0] [null] [http.proxy.js:76] pid:19258 createServer ok
2020-09-02 11:19:21.774 [INFO] [19258 cpuX 0] [null] [http.proxy.js:97] receive message, cmd: listen
2020-09-02 11:19:21.774 [INFO] [19258 cpu1 0] [null] [http.proxy.js:249] cpu: 1, beforeStartup…
2020-09-02 11:19:21.775 [INFO] [19258 cpu1 0] [null] [http.proxy.js:255] cpu: 1, listen…
2020-09-02 11:19:21.777 [INFO] [19258 cpu1 0] [null] [http.proxy.js:263] cpu: 1, listen http ok 0.0.0.0:8081
2020-09-02 11:19:21.778 [INFO] [19258 cpu1 0] [null] [http.proxy.js:318] cpu: 1, listen http ok 100.113.190.39:12802
2020-09-02 11:19:21.778 [INFO] [19258 cpu1 0] [null] [http.proxy.js:271] start heartbeat
2020-09-02 11:19:21.779 [INFO] [19258 cpu1 0] [null] [http.proxy.js:286] switch to uid: user_00
2020-09-02 11:19:21.780 [INFO] [19258 cpu1 0] [null] [http.proxy.js:293] cpu: 1, afterStartup…
2020-09-02 11:19:23.470 [INFO] [11732 cpu 0] [null] [master.js:216] worker/1 8072 has killed
2020-09-02 11:19:29.910 [INFO] [11732 cpu 0] [null] [master.js:216] worker/0 11499 has killed
2020-09-02 11:19:29.910 [INFO] [11732 cpu 0] [null] [master.js:438] worker0 pid=11499 disconnect event fired.
2020-09-02 11:19:29.933 [INFO] [11732 cpu 0] [null] [master.js:451] worker0 pid=11499 exit event fired.
2020-09-02 11:19:30.909 [INFO] [11732 cpu 0] [null] [master.js:216] worker/1 19001 has killed
2020-09-02 11:19:30.913 [INFO] [11732 cpu 0] [null] [master.js:451] worker1 pid=19001 exit event fired.

  • 3
    点赞
  • 6
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值