在跑es压测的时候发现,有很多线程卡在java.net.InetAddress$NameServiceAddresses.get的一个锁上面,本文记录一下整个问题的分析过程和根因
现象
用户在跑es的时候使用jstack能看到很多线程BLOCKED
"pool-3-thread-1" #456 prio=5 os_prio=0 cpu=40493.55ms elapsed=4118.31s tid=0x00007f1b3017d9f0 nid=0x48a83 waiting for monitor entry [0x00007f1b21bda000]
java.lang.Thread.State: BLOCKED (on object monitor)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:835)
- waiting to lock <0x000000079780efb0> (a java.net.InetAddress$NameServiceAddresses)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:886)
at java.net.InetAddress.getAllByName0(java.base@15.0.1/InetAddress.java:1507)
at java.net.InetAddress.getAllByName0(java.base@15.0.1/InetAddress.java:1427)
at java.net.SocketPermission.getIP(java.base@15.0.1/SocketPermission.java:814)
at java.net.SocketPermission.impliesIgnoreMask(java.base@15.0.1/SocketPermission.java:976)
at java.net.SocketPermissionCollection.implies(java.base@15.0.1/SocketPermission.java:1446)
at java.security.Permissions.implies(java.base@15.0.1/Permissions.java:178)
at sun.security.provider.PolicyFile.implies(java.base@15.0.1/PolicyFile.java:994)
at sun.security.provider.PolicySpiFile.engineImplies(java.base@15.0.1/PolicySpiFile.java:75)
at java.security.Policy$PolicyDelegate.implies(java.base@15.0.1/Policy.java:796)
at org.elasticsearch.bootstrap.ESPolicy.implies(ESPolicy.java:86)
at java.security.ProtectionDomain.implies(java.base@15.0.1/ProtectionDomain.java:321)
at java.security.ProtectionDomain.impliesWithAltFilePerm(java.base@15.0.1/ProtectionDomain.java:353)
at java.security.AccessControlContext.checkPermission(java.base@15.0.1/AccessControlContext.java:450)
at java.security.AccessController.checkPermission(java.base@15.0.1/AccessController.java:1036)
at java.lang.SecurityManager.checkPermission(java.base@15.0.1/SecurityManager.java:408)
at java.lang.SecurityManager.checkConnect(java.base@15.0.1/SecurityManager.java:907)
at java.net.InetAddress.getLocalHost(java.base@15.0.1/InetAddress.java:1628)
分析是被下面的线程占用了锁
"pool-3-thread-64" #596 prio=5 os_prio=0 cpu=42789.62ms elapsed=4107.41s tid=0x00007f1b1c28a360 nid=0x48b33 runnable [0x00007f0ce31f0000]
java.lang.Thread.State: RUNNABLE
at java.net.Inet4AddressImpl.lookupAllHostAddr(java.base@15.0.1/Native Method)
at java.net.InetAddress$PlatformNameService.lookupAllHostAddr(java.base@15.0.1/InetAddress.java:932)
at java.net.InetAddress.getAddressesFromNameService(java.base@15.0.1/InetAddress.java:1517)
at java.net.InetAddress$NameServiceAddresses.get(java.base@15.0.1/InetAddress.java:851)
- locked <0x000000079780efb0> (a java.net.InetAddress$NameServiceAddresses)
at java.net.InetAddress.getAllByName0(java.base@15.0.1/InetAddress.java:1507)
at java.net.InetAddress.getLocalHost(java.base@15.0.1/InetAddress.java:1639)
嗯,看起来是在通过本机主机名拿本机ip的时候慢了
这个时候执行一下hostname,发现hostname的输出并没有在/etc/hosts中配置
尝试把主机名配置到/etc/hosts中,解析为127.0.0.1,问题就能得到恢复
那么疑问来了,在解析主机名的时候,配置hosts和不配置的速度差别有多少呢?为什么会block呢?
查看jdk 1.5.0 源码 看一下NameServiceAddresses.get的实现
https://github.com/openjdk/jdk/blob/jdk-15-ga/src/java.base/share/classes/java/net/InetAddress.java#L835
发现有一个synchronized,ok锁的位置确认,继续向下
观察到runnable的线程正在执行的函数是
java.net.Inet4AddressImpl.lookupAllHostAddr(java.base@15.0.1/Native Method)
是一个native函数
jdk的实现在这里
https://github.com/openjdk/jdk/blob/jdk-15-ga/src/java.base/unix/native/libnet/Inet4AddressImpl.c#L86
在109行会调用到getaddrinfo函数,是glibc提供的
Given node and service, which identify an Internet host and a
service, getaddrinfo() returns one or more addrinfo structures,
each of which contains an Internet address that can be specified
in a call to bind(2) or connect(2).
ok, 我们写一个demo,来测试一下配置hosts的主机名和不配置hosts的主机名解析速度的差异,我们就将jdk中的实现抠出来
#include <ctype.h>
#include <errno.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>
#include <netinet/ip_icmp.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <stdio.h>
#include <arpa/inet.h>
int main(int argc, char *argv[]) {
if (argc < 2) {
printf("arg err\n");
return -1;
}
struct addrinfo hints, *rp = NULL, *result = NULL, *resNew = NULL, *last = NULL,
*iterator;
int error;
char str[INET_ADDRSTRLEN] = {0};
// try once, with our static buffer
memset(&hints, 0, sizeof(hints));
hints.ai_flags = AI_CANONNAME;
hints.ai_family = AF_INET;
for (int i = 0; i < 100; i ++ )
error = getaddrinfo(argv[1], NULL, &hints, &result);
return 0;
}
我们看执行100次getaddrinfo的性能区别
看一下当前/etc/hosts的配置
[root@VM-0-13-centos c]# cat /etc/hosts
127.0.0.1 VM-0-13-centos VM-0-13-centos
127.0.0.1 localhost.localdomain localhost
127.0.0.1 localhost4.localdomain4 localhost4
127.0.0.1 f11.maxxipoint.com
::1 VM-0-13-centos VM-0-13-centos
::1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
将当前hostname修改一下,使之不存在于/etc/hosts中
[root@VM-0-13-centos c]# hostname wanglihost
[root@VM-0-13-centos c]# hostname
wanglihost
由于用户机器上开启了nscd,我们也开启
nscd 全称是 “Name Service Cache Daemon”,开启之后glibc会通过unixsocket /var/run/nscd/socket与之交互
看一下性能对比
[root@VM-0-13-centos c]# time ./getaddrinfo VM-0-13-centos
real 0m0.001s
user 0m0.001s
sys 0m0.000s
[root@VM-0-13-centos c]# time ./getaddrinfo wanglihost
real 0m0.463s
user 0m0.001s
sys 0m0.003s
看起来差了400倍左右的速度,难怪不配置hosts会block
但是不配置hosts对glibc的行为影响到底是什么呢
抓一下strace,发现有这样的区别
共同点
connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = 0
sendto(3, "\2\0\0\0\r\0\0\0\6\0\0\0hosts\0", 18, MSG_NOSIGNAL, NULL, 0) = 18
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
recvmsg(3, {msg_name=NULL, msg_namelen=0, msg_iov=[{iov_base="hosts\0", iov_len=6}, {iov_base="\310O\3\0\0\0\0\0", iov_len=8}], msg_iovlen=2, msg_control=[{cmsg_len=20, cmsg_level=SOL_SOCKET, cmsg_type=SCM_RIGHTS, cmsg_data=[4]}], msg_controllen=20, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 14
mmap(NULL, 217032, PROT_READ, MAP_SHARED, 4, 0) = 0x7f2741251000
无论主机名称是否在/etc/hosts中,glibc都会连接/var/run/nscd/socket
并发送一个hosts相关命令
之后mmap一段内存
不同点
不在/etc/hosts中的主机名称,会在之后不断和nscd通过socket交互
sendto(3, "\2\0\0\0\16\0\0\0\f\0\0\0wanglihost1\0", 24, MSG_NOSIGNAL, NULL, 0) = 24
poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 5000) = 1 ([{fd=3, revents=POLLIN|POLLHUP}])
read(3, "\2\0\0\0\1\0\0\0\2\0\0\0\24\0\0\0\f\0\0\0\0\0\0\0", 24) = 24
read(3, "\254\25\0\r\376\200\0\0\0\0\0\0PT\0\377\3762\327i\2\nwanglihost"..., 34) = 34
为了弄清楚实现
我们查找getaddrinfo在glibc中的实现
https://codebrowser.dev/glibc/glibc/sysdeps/posix/getaddrinfo.c.html#getaddrinfo
发现一段跟nscd相关的代码
https://codebrowser.dev/glibc/glibc/sysdeps/posix/getaddrinfo.c.html#610
#ifdef USE_NSCD
...
int err = __nscd_getai (name, &air, &h_errno);
继续看__nscd_getai的实现
https://codebrowser.dev/glibc/glibc/nscd/nscd_getai.c.html#62
mapped = __nscd_get_map_ref (GETFDHST, "hosts", &__hst_map_handle,
&gc_cycle);
继续看__nscd_get_map_ref的实现
__nscd_get_map_ref (request_type type, const char *name,
volatile struct locked_map_ptr *mapptr, int *gc_cyclep)
{
struct mapped_database *cur = mapptr->mapped;
if (cur == NO_MAPPING)
return cur;
if (!__nscd_acquire_maplock (mapptr))
return NO_MAPPING;
cur = mapptr->mapped;
if (__glibc_likely (cur != NO_MAPPING))
{
/* If not mapped or timestamp not updated, request new map. */
if (cur == NULL
|| (cur->head->nscd_certainly_running == 0
&& cur->head->timestamp + MAPPING_TIMEOUT < time_now ())
|| cur->head->data_size > cur->datasize)
cur = __nscd_get_mapping (type, name,
(struct mapped_database **) &mapptr->mapped);
if (__glibc_likely (cur != NO_MAPPING))
{
if (__builtin_expect (((*gc_cyclep = cur->head->gc_cycle) & 1) != 0,
0))
cur = NO_MAPPING;
else
atomic_increment (&cur->counter);
}
}
mapptr->lock = 0;
return cur;
}
可以看到这里会去访问一个缓存的mapptr,这个缓存对象是根据传入的name(这里是"hosts")去向nscd要的,
具体怎么要的呢,看__nscd_get_mapping的实现
https://codebrowser.dev/glibc/glibc/nscd/nscd_helper.c.html#__nscd_get_mapping
int sock = open_socket (type, key, keylen);
继续看open_socket实现
https://codebrowser.dev/glibc/glibc/nscd/nscd_helper.c.html#open_socket
strcpy (sun.sun_path, _PATH_NSCDSOCKET);
if (__connect (sock, (struct sockaddr *) &sun, sizeof (sun)) < 0
&& errno != EINPROGRESS)
这里_PATH_NSCDSOCKET就是"/var/run/nscd/socket"
这就和strace对应上了
总结
调用逻辑如下
java.net.InetAddress$NameServiceAddresses.get
->
lookupAllHostAddr
->
getaddrinfo
->
创建socket连接nscd
->
发送hosts命令,nscd会创建共享内存将hosts信息写入,客户端(glibc)这边mmap读入
->
如果没有读到,继续sendto 主机名信息给nscd
却别就在如果hosts中有主机名,那么这个信息就在共享内存中
这个行为的区别会造成400倍的性能差异
分析完毕