NGINX Socket Sharding

Socket Sharding


Socket sharding was first introduced in NGINX 1.9.1. This feature leverages the SO_REUSEPORT socket option introduced in version 3.9 of the Linux kernel. When the option is enabled, the Linux kernel itself distributes new connections evenly across the NGINX worker processes. The worker processes then do the work of request limiting, caching, load balancing, and everything else you have configured.

Without SO_REUSEPORT, new connections are put up for grabs among all available worker processes: the first worker to take a connection off the queue gets it. Because there is no algorithm for distributing the load evenly, it can easily become skewed, with a few worker processes taking the majority of the load while others sit underutilized. It's also inefficient to have processes fight over packets, as this can lead to lock contention.

Socket sharding can improve performance up to 3x by ensuring work is distributed evenly among NGINX worker processes. To enable this functionality, add the new reuseport parameter to existing listen directives.

server {
    listen 12345 reuseport;
    # ...
}

To learn more about this feature, see the blog post Socket Sharding in NGINX Release 1.9.1, reproduced below.

Note: This feature requires Linux kernel version 3.9 or later. Ubuntu 13.10 and later and Red Hat Enterprise Linux 7 and later include the required functionality.

 

Socket Sharding in NGINX Release 1.9.1



NGINX 1.9.1 introduces a new feature that enables use of the SO_REUSEPORT socket option, which is available in newer versions of many operating systems, including DragonFly BSD and Linux (kernel version 3.9 and later). This socket option allows multiple sockets to listen on the same IP address and port combination. The kernel then load balances incoming connections across the sockets.

Editor – For NGINX Plus users, this feature is supported in NGINX Plus Release 7 (R7) and later. For an overview of all the new features in that release, see Announcing NGINX Plus R7 on our blog.

The SO_REUSEPORT socket option has many potential real‑world applications. Other services can use it for easy rolling upgrades of executables (NGINX already supports rolling upgrades through different means). For NGINX, enabling this socket option can improve performance in certain scenarios by reducing lock contention.

As depicted in the figure, when the SO_REUSEPORT option is not enabled, a single listening socket notifies workers about incoming connections, and each worker tries to take a connection.

 

 With the SO_REUSEPORT option enabled, there are multiple socket listeners for each IP address and port combination, one for each worker process. The kernel determines which available socket listener (and by implication, which worker) gets the connection. This can reduce lock contention between workers accepting new connections, and improve performance on multicore systems. However, it can also mean that when a worker is stalled by a blocking operation, the block affects not only connections that the worker has already accepted, but also connection requests that the kernel has assigned to the worker since it became blocked.

 

Configuring Socket Sharding


To enable the SO_REUSEPORT socket option, include the new reuseport parameter in the listen directive for HTTP or TCP (stream module) traffic, as in these examples:

http {
     server {
          listen 80 reuseport;
          server_name  localhost;
          # ...
     }
}

stream {
     server {
          listen 12345 reuseport;
          # ...
     }
}

Including the reuseport parameter also disables the accept_mutex directive for the socket, because the mutex is redundant with reuseport. It can still be worth setting accept_mutex if there are ports on which you don’t set reuseport.
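
As a rough sketch of how the two settings interact (the ports and the second server block are illustrative assumptions, not part of the examples above), accept_mutex can stay on for listeners that don't use reuseport:

events {
     # accept_mutex applies only to listen sockets without reuseport
     accept_mutex on;
}

http {
     server {
          # Kernel distributes connections; accept_mutex is ignored for this socket
          listen 80 reuseport;
          # ...
     }
     server {
          # No reuseport here, so workers contend and accept_mutex governs the accepts
          listen 8080;
          # ...
     }
}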

 

Benchmarking Performance with reuseport


I ran a wrk benchmark with 4 NGINX workers on a 36‑core AWS instance. To eliminate network effects, I ran both client and NGINX on localhost, and also had NGINX return the string OK instead of a file. I compared three NGINX configurations: the default (equivalent to accept_mutex on), with accept_mutex off, and with reuseport. As shown in the figure, reuseport increases requests per second by 2 to 3 times, and reduces both latency and the standard deviation for latency.
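
As a minimal sketch of that kind of setup (the port and location are assumptions; the post doesn't show its exact test configuration), the server can return the string directly so no file is read from disk:

http {
     server {
          listen 80 reuseport;
          location / {
               # Respond with the literal string "OK" so no disk I/O is involved
               return 200 "OK";
          }
     }
}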

[Figure: wrk benchmark results (requests per second and latency) for the default, accept_mutex off, and reuseport configurations]

I also ran a related benchmark with the client and NGINX on separate hosts and with NGINX returning an HTML file. As shown in the following table, with reuseport the decrease in latency was similar to the previous benchmark, and the standard deviation decreased even more dramatically (almost ten‑fold). Other results (not shown in the table) were also encouraging. With reuseport, the load was spread evenly across the worker processes. In the default condition (equivalent to accept_mutex on), some workers got a higher percentage of the load, and with accept_mutex off all workers experienced high load.

                    Latency (ms)    Latency stdev (ms)    CPU Load
Default             15.65           26.59                  0.3
accept_mutex off    15.59           26.48                 10
reuseport           12.35            3.15                  0.3

In these benchmarks, the rate of connection requests is high but the requests don’t require extensive processing. Other preliminary testing also indicates that reuseport improves performance the most when traffic matches this profile. (The reuseport parameter is not available on the listen directive in the mail context, for example, because email traffic definitely does not match the profile.) We encourage you to test reuseport to determine whether it improves performance in your NGINX deployment, rather than applying it wholesale. For some tips on testing NGINX performance, check out Konstantin Pavlov’s talk at nginx.conf 2014.

Acknowledgments

Thanks to Yingqi Lu at Intel and Sepherosa Ziehau, who each contributed a solution to the NGINX project that enables use of the SO_REUSEPORT socket option. The NGINX team combined ideas from both contributions to create what we believe is an ideal solution.
