Load Testing HAProxy (Part 2)

by Sachin Malhotra

This is the second part in a three-part series on performance testing of the famous TCP load balancer and reverse proxy, HAProxy. If you haven't gone through the previous post, I would highly suggest you do so to get some context.

Load Testing HAProxy (Part 1): Load Testing? HAProxy? If all this seems greek to you, don't worry. I will provide inline links to read up on what… (medium.com)

This post will focus on the TCP port exhaustion problem and how we can deal with it. In the last post we talked about how to tune the kernel-level and process-level ulimit settings. This post focuses on modifying the sysctl settings to get past the port exhaustion limits.

SYSCTL Local Port Range and Orphaned Sockets

Port exhaustion is a problem that will cause TCP communications with other machines over the network to fail. Most of the time there is a single process that leads to this problem, and restarting it will fix the issue temporarily. It will, however, come back to bite you in a few hours or days depending on the system load.

Port exhaustion does not mean that the ports actually get tired. Of course that is not possible, because the computer is not human and ports are not capable of getting tired. The truth is much more insidious. Port exhaustion simply means that the system does not have any ephemeral ports left to communicate with other machines/servers.

Before going further, let us understand what constitutes a TCP connection and what an inbound and an outbound connection really mean.

In the majority of cases, whenever we talk about TCP connections, high scalability, and the ability to support concurrent connections, we usually refer to the number of inbound connections.

Say HAProxy is listening on port 443 for new inbound connections. If we say that HAProxy can support X concurrent connections, what we really mean is X incoming connections, all of them established on port 443 of the HAProxy machine.

If these connections are inbound for HAProxy, then they have to be outbound for the client machines where they originated. Any sort of communication from the clients requires them to initiate outbound connections to the servers.

When a connection is established over TCP, a socket is created on both the local and the remote host. These sockets are then connected to create a socket pair, which is described by a unique 4-tuple consisting of the local IP address and port along with the remote IP address and port.
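
As an illustration, here is one way to see these 4-tuples on a live system (the addresses below are made up):

ss -tn 'dport = :443'

State  Recv-Q  Send-Q  Local Address:Port     Peer Address:Port
ESTAB  0       0       192.168.0.122:40233    192.168.0.168:443
ESTAB  0       0       192.168.0.122:40234    192.168.0.168:443

Each row is one socket pair, described by its unique 4-tuple.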

If you understood the concept of the quadruple, you will realise that in an outbound connection, or rather multiple outbound connections to the SAME backend server, two things always remain the same: the destination IP and the destination port. Assuming we are only taking into account a single client machine, the client IP will also remain the same.

This means that the number of outbound connections is dependent on the number of client ports that can be used for establishing the connection. While establishing an outbound connection, the source port is randomly selected from the ephemeral port range, and this port gets freed up once the connection is destroyed. That's why such ports are called ephemeral ports.

By default, the total number of local ephemeral ports available is around 28,000.
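
You can check the range yourself; on many Linux distributions the default looks like this, which works out to roughly 28,000 usable ports:

cat /proc/sys/net/ipv4/ip_local_port_range
32768	60999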

Now you might be thinking that 28k is a pretty large number, and what could possibly cause 28k connections to get used up at a single point in time? In order to understand this, we have to understand the TCP connection lifecycle.

During the TCP handshake, the connection state goes from SYN_SENT → SYN_RECV → ESTABLISHED. Once the connection is in the ESTABLISHED state, it means that the TCP connection is now active. However, once the connection is terminated, the local port that was being used does not become available again immediately.

The connection enters a state known as the TIME_WAIT state for a period of 120 seconds before it is finally terminated. This is a kernel-level setting that exists to allow any delayed or out-of-order packets to be ignored by the network.

If you do the math, it won't take more than about 230 new connections per second before the supposedly large limit of 28,000 ephemeral ports on the system is reached. This limit is very easy to hit on proxies like HAProxy or NGINX, because all the traffic is routed through them to the backend servers.
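
To see where that number comes from: every port used for a connection stays unavailable for the 120 seconds of TIME_WAIT, so in steady state

28,000 ports / 120 seconds ≈ 233 new outbound connections per second

is all the system can sustain. Beyond that rate, new connection attempts start failing with errors like EADDRNOTAVAIL.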

When a connection enters the TIME_WAIT state, it is known as an orphaned socket, because the TCP socket in this case is not held by any socket descriptor but is still held by the system for the designated time, i.e. 120 seconds by default.
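
A quick way to count how many of these orphaned sockets are currently sitting in TIME_WAIT on a machine (the -H flag, available in recent versions of iproute2, suppresses the header line):

ss -Htan state time-wait | wc -l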

How to detect this?

Enough with all the theoretical stuff. Let's jump in and see how we can identify whether this limit has been hit on a system. There are two commands I absolutely love to use to find out the number of TCP connections established on the system.

ss (Socket Statistics)

The socket statistics command is a sort of replacement for the famous netstat command, and it is much faster than netstat at rendering information because it fetches the connection info directly from kernel space. In order to get the hang of the different options supported by the ss command, check out:

10 examples of Linux ss command to monitor network connections: In a previous tutorial we saw how to use the netstat command to get statistics on network/socket connections. However… (www.binarytides.com)

The `ss -s` command will show the total number of established TCP connections on the machine. If you see this approach the 28,000 mark, it is very much possible that the ephemeral ports have been exhausted on that machine. BEWARE: this number might be higher than 28k if multiple services are running on the same machine on different ports.
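
A sketch of what the summary looks like (the numbers here are illustrative):

ss -s

Total: 28532 (kernel 28790)
TCP:   28104 (estab 27890, closed 150, orphaned 34, synrecv 0, timewait 140/0), ports 0

The estab and timewait counts are the ones to watch when you suspect port exhaustion.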

Netstat

The netstat command is a very famous command that provides information about all sorts of connections established on the machine’s networking stack.

sudo netstat -anptl

This will show you the details about all the connections on the machine. The details include:

  • local address
  • remote address
  • connection state
  • process pid

We can also use this to see whether a single process has established 28k connections to an outbound server, which gives us insight into the port exhaustion problem.

For example, on one of our machines a process with pid 9758 had established multiple connections with the foreign machine with IP 192.168.0.168 on port 443, and on the source side of things there were numerous different ports in use.

sachinm@ip-192-168-0-122:~$ sudo netstat -anptl | grep '192.168.0.168:443' | cut -c69-79 | sort | uniq -c | sort -rn
5670 ESTABLISHED

This modified command will show the status of the different connections established with 192.168.0.168 on port 443. Currently there are 5670 connections. If this number were to reach 28k, you should look at options to increase the ephemeral port range on the machine.

Let's look at another interesting command that you can issue at the server end or the proxy end to find out how many inbound connections have been established, and by which IPs. For example, check out the result of the below command:

ss -tan 'sport = :443' | awk '{print $(NF)" "$(NF-1)}' | sed 's/:[^ ]*//g' | sort | uniq -c

This shows that there are about 14 different machines that have established around 2,300 connections each with 192.168.0.168. If you look at the command closely, it filters for connections whose local (source) port is 443, strips the port number off each peer address, and then counts the connections per remote IP.

Enough with finding the problem already. Let’s dive straight into finding the solution(s) to this problem.

What's the way out?

Well, don't be afraid, because sysctl just happens to be a friendly monster. There are many ways by which we can solve this problem.

Approach 1

One of the most practical approaches to solve this problem, and one that you most likely will (or rather should) end up using, is to increase the local ephemeral port range to the maximum possible value. As mentioned before, the default range is very small.

echo 1024 65535 > /proc/sys/net/ipv4/ip_local_port_range

This will increase the local port range to a much bigger value. We cannot increase the range beyond this, as there can only be a maximum of 65,535 ports, and the first 1,024 are reserved for select services and purposes.
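
Writing to /proc does not survive a reboot. A sketch of the usual way to make the change persistent via sysctl:

# Apply immediately
sysctl -w net.ipv4.ip_local_port_range="1024 65535"

# Persist across reboots
echo 'net.ipv4.ip_local_port_range = 1024 65535' >> /etc/sysctl.conf
sysctl -p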

Note that you might still get bottlenecked on this issue. However, instead of 28,000 ports being used up locally, it will be 64,000 ports. Not a foolproof solution, but it is something you can do to give yourself some breathing room.

Does this mean I can only get about 64k concurrent connections from a single client machine? The answer is NO.

Consider two processes on a single client machine, each connecting to a different backend server or proxy. In this scenario, the client machine will be able to generate about 120k concurrent connections, because the two processes are connecting to two different destination IPs, and each destination IP gets its own full ephemeral port range.

Approach 2

Another simple solution is to enable a Linux TCP option called tcp_tw_reuse. This option enables the Linux kernel to reclaim a connection slot from a connection in the TIME_WAIT state and reallocate it to a new connection.

--> vim /etc/sysctl.conf

--> Add the following lines at the end:

# Allow reuse of sockets in TIME_WAIT state for new connections
# only when it is safe from the network stack's perspective.
net.ipv4.tcp_tw_reuse = 1

--> Reload the sysctl settings:

sysctl -p
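
Two things worth knowing about this option: it only applies to outgoing connections, and it relies on TCP timestamps (net.ipv4.tcp_timestamps, enabled by default) to decide when reuse is safe. You can confirm it took effect with:

sysctl net.ipv4.tcp_tw_reuse
net.ipv4.tcp_tw_reuse = 1
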
Approach 3

Use more server ports. Till now we have talked about the port exhaustion problem arising because, in the quadruple logic discussed before, the destination IP, destination port, and source IP remain constant. The only thing that changes is the client port.

However, if the server listens on two different ports instead of one, then we have twice the number of ephemeral ports available, because each (destination IP, destination port) pair gets its own ephemeral port range. Clubbed with the first approach, this gives you about 120k concurrent connections from a single client machine.

You have to, however, take care that running the server on two ports (which essentially means running two servers on the same machine) does not have a huge impact on the hardware.
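
A rough sketch of what this looks like in haproxy.cfg; the frontend and backend names here are hypothetical:

frontend fe_main
    mode tcp
    bind *:443
    bind *:8443
    default_backend be_servers

Clients can then spread their connections across :443 and :8443, doubling the usable 4-tuple space against this one machine.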

Approach 4

In a real production scenario, you may have millions of concurrent users simultaneously hitting the system. But in a load testing scenario, these users have to be artificially generated by a client running on a machine.

Here again, the 65k port limit comes to bite on the client side. The only way to overcome this from the client's perspective is to increase the number of client machines generating the load. As you will read in the next part of this series, we had to use about 14 different machines to generate the kind of load we wanted to test HAProxy with.

Putting it all together

There isn't one single configuration that will solve all your woes and work like a charm. It is always a combination of multiple things that works out in the end.

For us, as a prerequisite to load testing HAProxy, we followed approach #1 and approach #2, and eventually approach #3, to generate a huge… huge load of 2 million concurrent connections on a single HAProxy machine.

Here's the final part of this series, where I'll put together all the components that went into generating this kind of load, the tunings we did, and the learnings that came out of it.

Do let me know how this blog post helped you and stay tuned for the final part in this series of posts. Also, please recommend (❤) this post if you think this may be useful for someone.

Originally published at: https://www.freecodecamp.org/news/load-testing-haproxy-part-2-4c8677780df6/
