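PHP socket keepalive: how Linux keepalive probes affect the application-layer socket API

Problem

Most people are familiar with TCP keepalive. This article assumes the reader already knows how keepalive probing is triggered, and looks at what the user of a socket sees once keepalive fires.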

Keepalive settings

Edit /etc/sysctl.conf

ubuntu# vim /etc/sysctl.conf

ubuntu# sysctl -p

fs.file-max = 131072

net.ipv4.tcp_keepalive_time = 10

net.ipv4.tcp_keepalive_intvl = 5

net.ipv4.tcp_keepalive_probes = 3

Verify

ubuntu# sysctl -a | grep keepalive

net.ipv4.tcp_keepalive_intvl = 5

net.ipv4.tcp_keepalive_probes = 3

net.ipv4.tcp_keepalive_time = 10
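
With these values the first probe goes out after 10 seconds of idleness, and the connection is given up after 3 unanswered probes spaced 5 seconds apart, so a dead peer should be noticed after roughly 10 + 5 * 3 = 25 seconds. The sketch below works through that arithmetic and also shows how the same three values could be set per socket instead of system-wide. It is a minimal Python 3 sketch, not part of the original experiment: the function names are hypothetical, and the `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` socket options are assumed to be available (they are Linux-specific names, so the code guards for their absence).

```python
import socket


def keepalive_detection_seconds(idle, interval, probes):
    """Approximate seconds from the last received packet until the kernel gives up."""
    return idle + interval * probes


def enable_keepalive(sock, idle=10, interval=5, probes=3):
    """Turn on SO_KEEPALIVE and, where available, override the sysctl defaults per socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These option names are platform-specific, so skip any the
    # socket module does not expose on this system.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
```

Calling `enable_keepalive(sock)` before or after `connect()` would give that one socket the same timing as the sysctl values above without touching other connections on the host.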

tcp_server.py

import socket
import sys

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # allow quick restarts on the same port
server_address = ('localhost', 22345)
sock.bind(server_address)
sock.listen(1)
connection, client_address = sock.accept()
while True:
    data = connection.recv(1024)
    print("data", data)
    if not data:  # empty result: the peer closed the connection
        break

tcp_client.py

import socket
import sys
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # only the client enables keepalive
server_address = ('localhost', 22345)
sock.connect(server_address)
time.sleep(999999999)  # stay idle so that only keepalive probes are exchanged

[Screenshot: packet capture of the connection between tcp_client and tcp_server]

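As can be seen, because tcp_client enabled SO_KEEPALIVE, tcp_client is the side that sends keepalive probes to tcp_server.

If tcp_server enables SO_KEEPALIVE instead, tcp_server sends the keepalive probes to tcp_client.

If both tcp_server and tcp_client enable SO_KEEPALIVE, probes are sent in both directions.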
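
The reason the probing direction follows the option is that SO_KEEPALIVE is a per-socket setting: each end decides independently whether its own kernel sends probes. The following is a small Python 3 sketch, under the assumption of a loopback connection on an OS-chosen port (the function name `keepalive_flags` is hypothetical), that reads the flag back on both ends of one connection.

```python
import socket


def keepalive_flags():
    """Connect a client to a local listener and report SO_KEEPALIVE on both ends."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # client side only
    client.connect(listener.getsockname())
    server_side, _ = listener.accept()

    try:
        client_on = client.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
        server_on = server_side.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
        return client_on, server_on
    finally:
        for s in (client, server_side, listener):
            s.close()
```

Here only the client's flag is set, which matches the first case above: the client probes and the server merely answers.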

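Impact on the application-layer socket API

Preparation

To make keepalive actually fire, Docker is used to simulate an unplugged network cable.

Prepare an ubuntu image with docker, python, vim, and tcpdump installed, and create the Docker network.

Start the containers, then change the keepalive settings.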

ubuntu# sudo docker run -it \
    --volume=//home/enjolras/code_repo/python/keepalive_test://home/enjolras/code_repo/python/keepalive_test \
    --detach=true \
    --name=tcp_server \
    --privileged=true \
    --network=multi-host-network \
    ubuntu_with_python
08f89dcff3547bb15c7aed975dfa5a0821e4d0246d6d812e02fd1470f3cef6c3

ubuntu# sudo docker run -it \
    --volume=//home/enjolras/code_repo/python/keepalive_test://home/enjolras/code_repo/python/keepalive_test \
    --detach=true \
    --name=tcp_client \
    --privileged=true \
    --network=multi-host-network \
    ubuntu_with_python

Impact on blocking send/recv

tcp_server

import socket
import sys

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_address = ('0.0.0.0', 22345)
sock.bind(server_address)
sock.listen(1)
connection, client_address = sock.accept()
connection.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # keepalive on the accepted connection
data = connection.recv(1024)  # blocks until data, EOF, or an error arrives
print("data", data)

tcp_client

import socket
import sys
import time

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
server_address = ('tcp_server', 22345)  # container name resolves on the Docker network
sock.connect(server_address)
time.sleep(999999999)  # stay idle so that only keepalive probes are exchanged

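send/recv learn about a disconnection detected by keepalive through an exception or an error code.

As the capture below shows, tcp_server and tcp_client send keepalive probes to each other.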

root@0b3f1ee81446:/# tcpdump -i any port 22345

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode

listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes

12:29:34.491239 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [S], seq 2347845399, win 28200, options [mss 1410,sackOK,TS val 951128354 ecr 0,nop,wscale 7], length 0

12:29:34.491279 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [S.], seq 1169988006, ack 2347845400, win 27960, options [mss 1410,sackOK,TS val 2298965862 ecr 951128354,nop,wscale 7], length 0

12:29:34.491299 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951128354 ecr 2298965862], length 0

12:29:44.666952 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2298976038 ecr 951128354], length 0

12:29:44.666969 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951138530 ecr 2298965862], length 0

12:29:44.666978 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2298976038 ecr 951128354], length 0

12:29:44.666987 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951138530 ecr 2298976038], length 0

12:29:54.907019 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2298986278 ecr 951138530], length 0

12:29:54.907054 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951148770 ecr 2298976038], length 0

12:29:54.907059 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951148770 ecr 2298976038], length 0

12:29:54.907062 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2298986278 ecr 951138530], length 0

Cut the network between tcp_server and tcp_client by disconnecting tcp_client from the Docker network.

ubuntu# docker network disconnect multi-host-network tcp_client

After three consecutive probes go unanswered, tcp_server sends an RST to tcp_client.

12:31:47.547010 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951261408 ecr 2299088676], length 0

12:31:47.547019 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2299098916 ecr 951251168], length 0

12:31:47.547061 IP tcp_client.multi-host-network.57130 > 0b3f1ee81446.22345: Flags [.], ack 1, win 221, options [nop,nop,TS val 951261408 ecr 2299098916], length 0

12:31:57.787226 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2299109156 ecr 951261408], length 0

12:32:02.906612 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2299114276 ecr 951261408], length 0

12:32:08.026829 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [.], ack 1, win 219, options [nop,nop,TS val 2299119396 ecr 951261408], length 0

12:32:13.146776 IP 0b3f1ee81446.22345 > tcp_client.multi-host-network.57130: Flags [R.], seq 1, ack 1, win 219, options [nop,nop,TS val 2299124516 ecr 951261408], length 0

As shown below, once keepalive detects that the socket is in a bad state, the caller is notified through an exception or an error code.

3f1ee81446:/home/enjolras/code_repo/python/keepalive_test# python tcp_serv

Traceback (most recent call last):
  File "tcp_server.py", line 11, in <module>
    data = connection.recv(1024)
socket.error: [Errno 110] Connection timed out
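
In application code this error is usually caught rather than left to crash the process. The following is a minimal Python 3 sketch of one way to do that, assuming the caller wants `None` when the kernel has given up on the peer; the helper name `recv_or_none` is hypothetical. In Python 3 `socket.error` is an alias of `OSError`, and Errno 110 on Linux corresponds to `errno.ETIMEDOUT`. Note that ETIMEDOUT is not exclusive to keepalive, so the sketch only reports that the connection timed out at the TCP level.

```python
import errno


def recv_or_none(connection, bufsize=1024):
    """Return received bytes, or None if the connection timed out at the TCP level.

    Any other socket error is re-raised unchanged.
    """
    try:
        return connection.recv(bufsize)
    except OSError as exc:
        if exc.errno == errno.ETIMEDOUT:  # Errno 110 on Linux
            return None
        raise
```

A caller would still need to treat an empty bytes result as a clean close by the peer, which is a different case from the timeout handled here.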

Impact on select

tcp_server

import socket
import sys
import select

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_address = ('0.0.0.0', 22345)
sock.bind(server_address)
sock.listen(1)
connection, client_address = sock.accept()
connection.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
readable, writable, exceptional = select.select([connection], [], [])  # no timeout: wait for an event
print("readable", readable, writable, exceptional)
data = connection.recv(1024)
print("data", data)

select reports a readable event for the socket, and the recv that follows fails with the keepalive error.

3f1ee81446:/home/enjolras/code_repo/python/keepalive_test# python tcp_serv

('readable', [<socket._socketobject object at 0x...>], [], [])

Traceback (most recent call last):
  File "tcp_server.py", line 14, in <module>
    data = connection.recv(1024)
socket.error: [Errno 110] Connection timed out
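
A readable event therefore does not always mean that data is waiting: it can also signal a clean close or, as here, a pending error. The sketch below is one possible way to tell these cases apart after select returns. It is a Python 3 sketch with a hypothetical helper name (`wait_and_read`) and hypothetical result labels, not the code used in the experiment.

```python
import select
import socket


def wait_and_read(connection, timeout=None):
    """Wait in select, then classify what made the socket readable.

    Returns ("data", bytes), ("closed", b""), ("error", errno) or ("idle", None).
    """
    readable, _, _ = select.select([connection], [], [], timeout)
    if not readable:
        return "idle", None  # select timed out; nothing to report yet
    try:
        data = connection.recv(1024)
    except OSError as exc:
        # A keepalive failure lands here: the socket was readable, but recv raises.
        return "error", exc.errno
    if data == b"":
        return "closed", data  # the peer closed the connection cleanly
    return "data", data
```

In the experiment above, the expected result after the network is cut would be the "error" case with errno 110.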

Impact on epoll

No experiment was run for epoll; the behavior should be the same as with select.
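
For readers who want to run that experiment themselves, the following Python 3 sketch could serve as a starting point. It uses the standard `selectors` module, whose `DefaultSelector` is backed by epoll on Linux, and waits for a read event before calling recv. The helper name `read_when_ready` is hypothetical, and whether a keepalive failure surfaces exactly as it does with select remains the untested expectation stated above.

```python
import selectors
import socket


def read_when_ready(connection, timeout=None):
    """Wait for a read event via the platform's default selector, then recv.

    Returns the received bytes, or None if no event arrived within the timeout.
    A pending socket error (for example ETIMEDOUT) propagates as OSError from recv.
    """
    with selectors.DefaultSelector() as sel:
        sel.register(connection, selectors.EVENT_READ)
        events = sel.select(timeout)
    if not events:
        return None
    return connection.recv(1024)
```

Replacing the select call in the tcp_server script above with this helper and repeating the network disconnect would show whether recv again fails with Errno 110.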

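Conclusion

Once TCP keepalive detects that a connection is broken, the application layer is notified through a readable event, and the following recv fails with Errno 110 (Connection timed out). Without TCP keepalive and without an application-level heartbeat, the application has no way to learn the real state of the connection.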
