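When developing a TCP network application, needing to check the state of a socket connection usually means the application has timeliness requirements for that TCP channel.
Application requirements
1) The client needs to know whether the channel still provides a working data link.
2) The client needs to re-establish the connection after the channel fails.
3) The server needs to know the connection state of each connected client.
4) The server needs to promptly release the resources held for clients whose channel has failed.
Common solutions
There are usually two ways to address this during development:
1) Use the link detection built into the TCP stack (keepalive).
2) Implement detection with an application-level heartbeat (not covered in detail here; the simplest form is a heartbeat handshake).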
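Although the heartbeat approach is not covered in detail here, its timeout bookkeeping can be sketched in a few lines. This is only a minimal sketch: the `heartbeat_state` structure and both function names are hypothetical, and the actual sending and receiving of heartbeat messages is left out.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical bookkeeping for one peer; not part of any library. */
struct heartbeat_state {
    time_t last_seen;     /* when the last heartbeat reply arrived */
    int    interval_sec;  /* how often a heartbeat is sent */
    int    max_missed;    /* missed heartbeats tolerated before giving up */
};

/* Record that the peer answered at time `now`. */
static void heartbeat_mark_alive(struct heartbeat_state *hb, time_t now)
{
    hb->last_seen = now;
}

/* True once the peer has been silent for longer than
 * interval_sec * max_missed seconds. */
static bool heartbeat_expired(const struct heartbeat_state *hb, time_t now)
{
    double silent = difftime(now, hb->last_seen);
    return silent > (double)hb->interval_sec * hb->max_missed;
}
```

An application would call `heartbeat_mark_alive()` whenever a reply arrives and poll `heartbeat_expired()` from its timer loop, closing or re-establishing the connection once it returns true.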
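How keepalive works
When a TCP connection is established, the system sets up a series of timers, some of which drive the keepalive process. When the keepalive timer counts down to zero, the protocol stack sends a TCP keepalive probe packet, and a live remote end answers with a reply packet. If the probe is answered, the connection can be considered still valid. If it is not, the connection can be considered invalid and further action is needed. Note that this mechanism adds some extra overhead to TCP communication.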
Typical scenarios
Peer failure
 _____                                                     _____
|     |                                                   |     |
|  S  |                                                   |  C  |
|_____|                                                   |_____|
   ^                                                         ^
   |--->--->--->-------------- SYN -------------->--->--->---|
   |---<---<---<------------ SYN/ACK ------------<---<---<---|
   |--->--->--->-------------- ACK -------------->--->--->---|
   |                                                         |
   |                                       system crash ---> X
   |
   |                                     system restart ---> ^
   |                                                         |
   |--->--->--->-------------- PSH -------------->--->--->---|
   |---<---<---<-------------- RST --------------<---<---<---|
   |                                                         |
Network failure
 _____           _____                                     _____
|     |         |     |                                   |     |
|  S  |         | NAT |                                   |  C  |
|_____|         |_____|                                   |_____|
   ^               ^                                         ^
   |--->--->--->---|----------- SYN ------------->--->--->---|
   |---<---<---<---|--------- SYN/ACK -----------<---<---<---|
   |--->--->--->---|----------- ACK ------------->--->--->---|
   |               |                                         |
   |               | <--- connection deleted from table      |
   |               |                                         |
   |--->- PSH ->---| <--- invalid connection                 |
   |               |                                         |
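In both scenarios the application only learns about the failure through an error reported on the socket. As a rough sketch, assuming Linux behaviour: a reset from a restarted peer typically surfaces as ECONNRESET (or EPIPE on a later write), while a connection dropped after unanswered keepalive probes surfaces as ETIMEDOUT. The `link_state` enum and `classify_socket_error()` are hypothetical names used only for illustration.

```c
#include <errno.h>

/* Hypothetical classification used only in this sketch. */
enum link_state {
    LINK_OK,        /* no error, or a transient condition worth retrying */
    LINK_RESET,     /* peer answered with RST (for example after a restart) */
    LINK_TIMED_OUT, /* no answer at all (for example unanswered probes) */
    LINK_OTHER      /* any other error */
};

/* Map the errno left by a failed recv()/send() to a link state. */
static enum link_state classify_socket_error(int err)
{
    switch (err) {
    case 0:
    case EINTR:
    case EAGAIN:
        return LINK_OK;
    case ECONNRESET:
    case EPIPE:
        return LINK_RESET;
    case ETIMEDOUT:
        return LINK_TIMED_OUT;
    default:
        return LINK_OTHER;
    }
}
```

A caller would pass `errno` right after `recv()` or `send()` returns -1, then reconnect (client) or release the client's resources (server) for anything other than `LINK_OK`.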
Keepalive parameters
# cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
# cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
# cat /proc/sys/net/ipv4/tcp_keepalive_probes
9
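tcp_keepalive_time: how long the connection must be idle before the first keepalive probe is sent; the default of 7200 seconds is 2 hours.
tcp_keepalive_intvl: the interval between successive keepalive probes; the default is 75 seconds.
tcp_keepalive_probes: the number of consecutive probes that may go without an ACK before the connection is marked as broken; the default is 9.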
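Taken together, the three parameters determine roughly how long a dead peer can go unnoticed on an idle connection. The helper below is a small sketch of that arithmetic; the function name is hypothetical, and the result is an approximation that ignores scheduling jitter.

```c
/* Approximate seconds from the last traffic until the stack gives up on a
 * dead peer: the idle time before the first probe, plus one interval for
 * each unanswered probe. */
static long keepalive_detection_seconds(long idle_time, long interval, long probes)
{
    return idle_time + interval * probes;
}
```

With the defaults this gives 7200 + 75 × 9 = 7875 seconds, a little over 2 hours and 11 minutes, which is why the defaults are often too slow for applications with timeliness requirements.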
Setting the keepalive parameters
echo
# echo 600 > /proc/sys/net/ipv4/tcp_keepalive_time
# echo 60 > /proc/sys/net/ipv4/tcp_keepalive_intvl
# echo 20 > /proc/sys/net/ipv4/tcp_keepalive_probes
sysctl
# sysctl \
> net.ipv4.tcp_keepalive_time \
> net.ipv4.tcp_keepalive_intvl \
> net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
# sysctl -w \
> net.ipv4.tcp_keepalive_time=600 \
> net.ipv4.tcp_keepalive_intvl=60 \
> net.ipv4.tcp_keepalive_probes=20
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
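An application can also read the current system-wide values at runtime from the same /proc files. The following is a minimal sketch; `read_int_file()` is a hypothetical helper, not an existing API.

```c
#include <stdio.h>

/* Read one integer from a file such as
 * /proc/sys/net/ipv4/tcp_keepalive_time.
 * Returns 0 on success and stores the value in *out, -1 on failure. */
static int read_int_file(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

For instance, `read_int_file("/proc/sys/net/ipv4/tcp_keepalive_time", &v)` would be expected to store 600 in `v` after the settings above have been applied.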
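C programming API
int setsockopt(int s, int level, int optname,
               const void *optval, socklen_t optlen)
Keepalive is enabled per socket with the SO_KEEPALIVE option at the SOL_SOCKET level. On Linux, three further options at the IPPROTO_TCP level (declared in <netinet/tcp.h>) override the system-wide settings for that socket only:
TCP_KEEPCNT: overrides tcp_keepalive_probes
TCP_KEEPIDLE: overrides tcp_keepalive_time
TCP_KEEPINTVL: overrides tcp_keepalive_intvl
For example, the following program checks and enables SO_KEEPALIVE on a socket: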
/* --- begin of keepalive test program --- */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

int main(void)
{
    int s;
    int optval;
    socklen_t optlen = sizeof(optval);

    /* Create the socket */
    if ((s = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0) {
        perror("socket()");
        exit(EXIT_FAILURE);
    }

    /* Check the status for the keepalive option */
    if (getsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen) < 0) {
        perror("getsockopt()");
        close(s);
        exit(EXIT_FAILURE);
    }
    printf("SO_KEEPALIVE is %s\n", (optval ? "ON" : "OFF"));

    /* Set the option active */
    optval = 1;
    optlen = sizeof(optval);
    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, optlen) < 0) {
        perror("setsockopt()");
        close(s);
        exit(EXIT_FAILURE);
    }
    printf("SO_KEEPALIVE set on socket\n");

    /* Check the status again */
    if (getsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen) < 0) {
        perror("getsockopt()");
        close(s);
        exit(EXIT_FAILURE);
    }
    printf("SO_KEEPALIVE is %s\n", (optval ? "ON" : "OFF"));

    close(s);
    exit(EXIT_SUCCESS);
}
/* --- end of keepalive test program --- */
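The test program only toggles SO_KEEPALIVE, so the socket still uses the system-wide timers. The sketch below shows one way the three TCP-level options listed earlier could be applied to a single socket. It assumes Linux, where these options are declared in <netinet/tcp.h>, and `enable_keepalive()` is a hypothetical helper name.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Enable keepalive on socket s and override the system-wide timers for
 * this socket only. Returns 0 on success, -1 on failure (errno is set). */
static int enable_keepalive(int s, int idle_sec, int intvl_sec, int probes)
{
    int on = 1;

    /* Turn keepalive on for this socket */
    if (setsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* Idle time before the first probe (overrides tcp_keepalive_time) */
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPIDLE, &idle_sec, sizeof(idle_sec)) < 0)
        return -1;
    /* Interval between probes (overrides tcp_keepalive_intvl) */
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_sec, sizeof(intvl_sec)) < 0)
        return -1;
    /* Unanswered probes tolerated (overrides tcp_keepalive_probes) */
    if (setsockopt(s, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof(probes)) < 0)
        return -1;
    return 0;
}
```

Calling `enable_keepalive(s, 600, 60, 20)` on a TCP socket would mirror the sysctl values used earlier, but without affecting any other socket on the system.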