A Summary of TCP Keep-Alive

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/rainharder/article/details/47022903

TCP keep-alive lets a server detect whether a client connection has been broken, without adding any extra handling logic to the server.

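/proc/sys/net/ipv4/tcp_keepalive_time    Idle time on the connection, in seconds, before the first keepalive probe is sent
/proc/sys/net/ipv4/tcp_keepalive_intvl   Interval, in seconds, between two successive keepalive probes
/proc/sys/net/ipv4/tcp_keepalive_probes  Number of keepalive probes sent before the connection is judged to be broken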
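
To check what a given machine is actually configured with, the three files can simply be read. The following is a minimal sketch, assuming a Linux host; the helper name read_keepalive_sysctls and its base parameter are made up for this illustration and are not part of any library.

```python
from pathlib import Path

# Directory that holds the three keepalive sysctls on Linux
SYSCTL_DIR = Path("/proc/sys/net/ipv4")
KEEPALIVE_KEYS = ("tcp_keepalive_time", "tcp_keepalive_intvl", "tcp_keepalive_probes")


def read_keepalive_sysctls(base=SYSCTL_DIR):
    """Return {name: int value}; a value that cannot be read is reported as None."""
    values = {}
    for key in KEEPALIVE_KEYS:
        try:
            values[key] = int((Path(base) / key).read_text().strip())
        except (OSError, ValueError):
            values[key] = None  # not Linux, or the file is missing or unreadable
    return values


if __name__ == "__main__":
    for name, value in read_keepalive_sysctls().items():
        print(name, value)
```

The base parameter exists only so the function can be pointed at a different directory when experimenting; on a real system the default path is the one listed above.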

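Consider an established TCP connection. If no packets are exchanged in either direction for keepalive_time, the end that has keepalive enabled sends a keepalive probe. If no reply arrives, it resends the probe every keepalive_intvl, up to keepalive_probes times. If none of the probes is answered, it sends an RST packet and closes the connection. If a reply does arrive, the timer is reset.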
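
The procedure above implies a rough upper bound on how long a dead peer can go unnoticed on an idle connection: the idle time plus one interval per unanswered probe. The sketch below only does that arithmetic; the function name is hypothetical, and the result is an approximation that ignores timer granularity.

```python
def dead_peer_detection_seconds(keepalive_time, keepalive_intvl, keepalive_probes):
    """Approximate seconds from the last traffic until the connection is reset."""
    return keepalive_time + keepalive_intvl * keepalive_probes


# Usual Linux defaults: 7200 s idle, 75 s between probes, 9 probes
print(dead_peer_detection_seconds(7200, 75, 9))  # 7875
# A much more aggressive setting: 60 s idle, 5 s between probes, 3 probes
print(dead_peer_detection_seconds(60, 5, 3))     # 75
```

With the defaults, an idle connection to a vanished peer lingers for more than two hours, which is why the parameters are often tuned.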

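Whichever end wants to check the connection periodically is the one that enables keepalive. The other end does not have to enable it; it only responds passively to the probes, and that response is a basic requirement of the TCP protocol that has nothing to do with keepalive. The client and the server do not both need to enable keepalive.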

Remember that keepalive support, even if configured in the kernel, is not the default behavior in Linux. Programs must request keepalive control for their sockets using the setsockopt interface.

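In other words, although these are global settings, each program still has to enable keepalive itself for them to take effect.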

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

int keepalive = 1;    // enable the keepalive option
int keepidle = 60;    // start probing once the connection has seen no traffic for 60 seconds
int keepinterval = 5; // send probes 5 seconds apart
int keepcount = 3;    // number of probes to try; if the first probe is answered, the other 2 are not sent
setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, (void *)&keepalive, sizeof(keepalive));
// each socket can be given its own keepalive parameters
setsockopt(sockfd, SOL_TCP, TCP_KEEPIDLE, (void *)&keepidle, sizeof(keepidle));
setsockopt(sockfd, SOL_TCP, TCP_KEEPINTVL, (void *)&keepinterval, sizeof(keepinterval));
setsockopt(sockfd, SOL_TCP, TCP_KEEPCNT, (void *)&keepcount, sizeof(keepcount));

Remember that keepalive is not program-related, but socket-related, so if you have multiple sockets, you can handle keepalive for each of them separately.

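As the interface above also shows, keepalive settings are per socket.
These options are also inherited by sockets: on Linux, once they are set on a listening socket, the sockets later returned by accept for established connections inherit the same keepalive (heartbeat) settings.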
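
The inheritance claim can be checked over the loopback interface. The following is a small sketch, assuming Linux (where Python exposes TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT); the function name accepted_socket_keepalive is made up for this illustration.

```python
import socket


def accepted_socket_keepalive(idle=10, interval=3, count=2):
    """Set keepalive on a listening socket and report what the accepted socket has."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    listener.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    listener.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    listener.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    try:
        # Read the options back from the accepted socket, not the listener
        return {
            "SO_KEEPALIVE": conn.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE),
            "TCP_KEEPIDLE": conn.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE),
            "TCP_KEEPINTVL": conn.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL),
            "TCP_KEEPCNT": conn.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT),
        }
    finally:
        conn.close()
        client.close()
        listener.close()


if __name__ == "__main__":
    print(accepted_socket_keepalive())
```

On Linux the accepted socket should report keepalive as enabled with the same three values; other operating systems may behave differently, so it is worth running the check on the target platform.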

import socket

HOST = ''                 # Symbolic name meaning all available interfaces
PORT = 50007              # Arbitrary non-privileged port
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
# Enable keepalive on the listening socket and set its per-socket parameters
s.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPIDLE, 10)   # idle seconds before the first probe
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPINTVL, 3)   # seconds between probes
s.setsockopt(socket.SOL_TCP, socket.TCP_KEEPCNT, 2)     # probes before giving up

conn, addr = s.accept()
print('Connected by', addr)
# Echo everything back until the peer closes the connection
while True:
    data = conn.recv(1024)
    if not data:
        break
    conn.sendall(data)
conn.close()

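However, TCP's built-in keepalive has the following pitfall:
In the normal case, when the other end of the connection actively calls close, TCP notifies us, and we know that the connection has been closed. But if the other end suddenly drops off the network, or reboots or loses power, we do not know that the connection is gone. If we then send data that fails to get through, TCP retransmits it automatically. Retransmission takes precedence over keepalive, which means the keepalive probes never get sent while the retransmissions are pending. During that time we still do not know that the connection has failed; we only find out after retransmission has kept failing for a fairly long time.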

References

http://www.tldp.org/HOWTO/html_single/TCP-Keepalive-HOWTO/
http://blog.csdn.net/ctthuangcheng/article/details/8596818
http://blog.csdn.net/ctthuangcheng/article/details/9450087
