How to debug an infinite loop in Node.js production code

Print the stack trace of an infinite loop in a Node.js application running in a Docker container using GNU debugger (GDB) and Kubernetes’ livenessProbe.

Debugging infinite loops in Node.js code locally is easy — just launch the debugger, break the execution, see where your code is stuck, fix and you’re done. However, in production systems, this becomes much more complicated.

Node.js doesn’t have any out-of-the-box tool to break a running program and print its stack trace. So, when your production code suddenly peaks at 100% CPU, it’s tricky to find out where exactly it got stuck. Neither --prof nor --cpu-prof (native Node.js profiling tools provided with the V8 debugger) helped, since the infinite loop in the application code occurred non-deterministically.

At Apify, we had this type of problem in a production application running inside a stateless Kubernetes (K8s) container. The application is a simple express.js based web server. This article describes the solution that worked for us. Hopefully, it can also help you.

TL;DR — We used a script based on this GitHub gist, which attaches the GNU debugger (GDB) to Node.js processes to print the leaking code’s stack trace. We had to run the script with K8s’ livenessProbe check to get the stack trace and save it to a persistent volume.

Using the GDB debugger in the app container

As a Node.js developer with a basic knowledge of V8 and the underlying C++ code, you probably haven’t used GDB for debugging your Node.js applications. You probably have no use for it most of the time, but in this specific case, GDB proved to be extremely useful.

GDB allows you to attach the debugger to a running Node.js process and set up a breakpoint in C++ where the infinite loop occurs. This place in V8 is called the stack guard and we got the idea to use it from this GitHub gist (it includes an explanation of the whole script if you need to know more).

With some basic knowledge of GDB and V8’s stack guard, you can reproduce the steps that cause the infinite loop and print the stack trace of your app’s code where it occurs. The code below attaches a breakpoint to the stack guard and prints the stack trace.

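A minimal sketch of that command, following the approach in the gist; the V8 symbol names (v8::internal::Runtime_StackGuard and _v8_internal_Print_StackTrace) are the ones commonly used for this trick and may differ between Node.js/V8 versions.

# Attach GDB to a running Node.js process (replace $PID with its process ID),
# break inside V8's stack guard, and ask V8 to print the current JS stack trace.
gdb -p $PID -batch \
  -ex 'set pagination off' \
  -ex 'break v8::internal::Runtime_StackGuard' \
  -ex 'continue' \
  -ex 'call (void) _v8_internal_Print_StackTrace()' \
  -ex 'quit'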

You can easily test it by running a simple Docker container with GDB installed. First, run an infinite loop, then run the GDB command.

Below are the steps for testing it in your local terminal using Docker.

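A sketch of those steps, assuming a Debian-based node image; the image tag, file name (loop.js), and the --cap-add=SYS_PTRACE flag are illustrative choices, not the article's exact setup.

# 1. Start a throwaway container with ptrace allowed, so GDB can attach
docker run -it --rm --cap-add=SYS_PTRACE node:14 bash

# 2. Inside the container: install GDB (and procps for pgrep)
apt-get update && apt-get install -y gdb procps

# 3. Create a script with an infinite loop and run it in the background
cat > loop.js <<'EOF'
function myLoop() {
  while (true) {} // spins forever, pegging the CPU
}
myLoop();
EOF
node loop.js &

# 4. Attach GDB to the Node.js process and print the stack trace
gdb -p $(pgrep -n node) -batch \
  -ex 'set pagination off' \
  -ex 'break v8::internal::Runtime_StackGuard' \
  -ex 'continue' \
  -ex 'call (void) _v8_internal_Print_StackTrace()' \
  -ex 'quit'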

After running these commands, your terminal should display myLoop function’s stack trace.

[Image: The printed stack trace in the terminal]

Update K8s deployment to use the GDB script

Now that you know how to get the infinite loop’s stack trace, you can use it in the production container. First, add GDB to your Docker container. In this case, update the Dockerfile with the commands used in the test.

apt-get update
apt-get install -y gdb

Below is the Dockerfile for this scenario.

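A minimal sketch of such a Dockerfile; the base image, working directory, and start command are assumptions for illustration, not the article's actual setup.

FROM node:14-slim

# Install GDB (plus procps for pgrep and curl for the health check)
# so the liveness probe script can attach to the running Node.js process.
RUN apt-get update \
  && apt-get install -y gdb procps curl \
  && rm -rf /var/lib/apt/lists/*

WORKDIR /usr/src/app

COPY package*.json ./
RUN npm ci --only=production
COPY . .

CMD ["node", "server.js"]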

Now you have GDB installed in your Docker container, you need to ensure that the GDB command will be attached in case of an infinite loop. As mentioned above, our loop was caused non-deterministically, so we used the liveness probe command to find it.

In our case, we had a basic HTTP liveness probe check set up. It checks the /health-check path every 5 seconds, allowing 3 failed attempts.

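The original probe looked roughly like this (a sketch; the port and the exact timing fields are assumptions):

livenessProbe:
  httpGet:
    path: /health-check
    port: 3000
  periodSeconds: 5
  timeoutSeconds: 2
  failureThreshold: 4   # three failures are tolerated, the fourth kills the pod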

If this probe fails a 4th time, the K8s scheduler declares the container dead and replaces it in the pool. That moment in the container’s runtime, just before it is declared dead, is where the GDB command needs to run.

You want to preserve the loop-causing behavior; however, if the health check fails, the GDB script should run and save the infinite loop’s stack trace into a specific file. The bash script below does exactly that.

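A sketch of such a script, assuming the app listens on port 3000, exposes /health-check, and that traces go into a debugger folder inside the app directory (all assumptions; adapt the port, paths, and V8 symbol names to your setup):

#!/bin/bash
# liveness_probe.sh - mimics the HTTP liveness probe, but dumps a stack
# trace with GDB once the app has failed the check 4 times in a row.

HEALTH_CHECK_URL="http://localhost:3000/health-check"   # assumed port and path
STACK_TRACE_DIR="/usr/src/app/debugger"                 # assumed output folder

for attempt in 1 2 3 4; do
    # Same check as the original HTTP probe
    if curl -sf --max-time 2 "$HEALTH_CHECK_URL" > /dev/null; then
        exit 0   # healthy - report success to Kubernetes
    fi
    # Wait 5 seconds between attempts, as the HTTP probe did
    [ "$attempt" -lt 4 ] && sleep 5
done

# All 4 checks failed - assume an infinite loop, attach GDB to the
# Node.js process and save the stack trace before the pod is replaced.
mkdir -p "$STACK_TRACE_DIR"
gdb -p "$(pgrep -n node)" -batch \
    -ex 'set pagination off' \
    -ex 'break v8::internal::Runtime_StackGuard' \
    -ex 'continue' \
    -ex 'call (void) _v8_internal_Print_StackTrace()' \
    -ex 'quit' \
    > "$STACK_TRACE_DIR/stack-trace-$(date +%s).txt" 2>&1

exit 1   # report failure so Kubernetes restarts the container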

Save this script as liveness_probe.sh in your app’s root directory. You can see that the bash script does exactly the same thing as the HTTP liveness probe. However, if the health check fails 4 times, it runs the GDB command and prints the stack trace.

To use this script in our app, we needed to edit the liveness probe in the K8s deployment specification as shown below.

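A sketch of the updated probe; the field values are illustrative, the important part is running the script via an exec probe with a longer period:

livenessProbe:
  exec:
    command:
      - /bin/bash
      - liveness_probe.sh
  periodSeconds: 40     # enough time for 4 HTTP checks 5 seconds apart plus GDB
  timeoutSeconds: 35
  failureThreshold: 1   # the script already retried internally, so fail fast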

This ensures our health check script runs every 40 seconds, which is enough time to run the HTTP probe 4 times at 5-second intervals. But be careful: since we’re using a debugger here, we need to allow process tracing with the SYS_PTRACE capability.

We can do this using securityContext in the K8s deployment.

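For example (a sketch; this goes on the app container in the Deployment spec):

securityContext:
  capabilities:
    add:
      - SYS_PTRACE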

Saving the stack trace file to a persistent volume

Once you are able to capture the loop and print its stack trace to a specific file, you need to ensure that the file will not be deleted after a restart. The application runs stateless, so after the container restarts, you lose all the data in memory and storage.

To attach a persistent volume to your K8s pod, you can follow these steps. The attachable volume is a little different on each managed K8s cluster. Our app uses the AWS Elastic Kubernetes Service (EKS), which is easily compatible with the Elastic File System (EFS).

You can do a very basic setup of EFS by running the command below.

aws efs create-file-system

From the output, you will need the FileSystemId property for further use. To attach EFS as a persistent volume to your EKS cluster, launch the Amazon EFS CSI Driver. After installing it, let your application know about it by creating a StorageClass K8s resource.

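A sketch of such a StorageClass for the EFS CSI driver (the name efs-sc is illustrative):

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com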

Next, create a persistent volume and persistent volume claim. Note: Use the FileSystemId as the volumeHandle.

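A sketch of both resources; replace fs-12345678 with the FileSystemId returned by aws efs create-file-system (the names and the storage size are illustrative):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: debugger-pv
spec:
  capacity:
    storage: 5Gi            # EFS ignores this, but the field is required
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678   # your FileSystemId
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: debugger-pvc
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi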

Finally, mount the persistent volume claim to the deployment.

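A sketch of the relevant part of the Deployment’s pod template; the container name is illustrative, and the mount path matches the debugger folder used in liveness_probe.sh above:

spec:
  containers:
    - name: app                     # illustrative container name
      # ...image, ports, livenessProbe...
      volumeMounts:
        - name: debugger-volume
          mountPath: /usr/src/app/debugger
  volumes:
    - name: debugger-volume
      persistentVolumeClaim:
        claimName: debugger-pvc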

When the persistent volume is set up, use SSH to connect to one of the app’s containers. The files containing the stack traces will be in the debugger folder.

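For example, with kubectl exec (an equivalent alternative to plain SSH), assuming the mount path from the sketches above:

# List the collected stack traces inside a running pod
kubectl exec -it <pod-name> -- ls -l /usr/src/app/debugger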

Conclusion

To summarize, our app had a non-deterministic infinite loop, which occurred only on production. We identified it by attaching the GNU debugger to the app’s Node.js processes, which allowed us to print the leaking code’s stack trace. We then ran Kubernetes’ livenessProbe check to get the stack trace and save it to a persistent volume.

In our case, the infinite loop was caused by a third-party package.

We hope you will find this article useful if you encounter an infinite loop in your Node.js application.

Additionally, we added a sidecar container to the K8s cluster to sync the stack trace files directly to an AWS S3 bucket. If you are interested in how we did it, let us know in the comments, and we will describe it in a future blog post.

Translated from: https://blog.apify.com/how-to-debug-an-infinite-loop-in-node-js-production-code-9ec0e1442da0
