Yesterday, every time I started the client, the following kept showing up:
Tue Jul 17 16:06:21 2012 us=390000 Attempting to establish TCP connection with 192.168.1.86:10443 [nonblock]
Tue Jul 17 16:06:21 2012 us=390000 TCP: connect to 192.168.1.86:10443 failed, will try again in 5 seconds: Operation would block (WSAEWOULDBLOCK)
Tue Jul 17 16:06:26 2012 us=390000 TCP: connect to 192.168.1.86:10443 failed, will try again in 5 seconds: Operation would block (WSAEWOULDBLOCK)
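At first I suspected the client, but its configuration showed nothing abnormal, so I had to diagnose the problem from the server side.
I checked the server log: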
Tue Jul 17 10:48:20 2012 us=189690 192.168.1.189:52252 SIGUSR1[soft,connection-reset] received, client-instance restarting
Tue Jul 17 10:48:25 2012 us=186285 192.168.1.189:52254 Connection reset, restarting [0]
Tue Jul 17 10:48:25 2012 us=186317 192.168.1.189:52254 SIGUSR1[soft,connection-reset] received, client-instance restarting
Tue Jul 17 10:48:30 2012 us=188313 192.168.1.189:52256 Connection reset, restarting [0]
Tue Jul 17 10:48:30 2012 us=188344 192.168.1.189:52256 SIGUSR1[soft,connection-reset] received, client-instance restarting
Tue Jul 17 10:48:35 2012 us=188504 192.168.1.189:52258 Connection reset, restarting [0]
Tue Jul 17 10:48:35 2012 us=188536 192.168.1.189:52258 SIGUSR1[soft,connection-reset] received, client-instance restarting
Tue Jul 17 10:48:45 2012 us=188543 192.168.1.189:52265 Connection reset, restarting [0]
Tue Jul 17 10:48:45 2012 us=188576 192.168.1.189:52265 SIGUSR1[soft,connection-reset] received, client-instance restarting
Tue Jul 17 10:48:50 2012 us=187213 192.168.1.189:52270 Connection reset, restarting [0]
Tue Jul 17 10:48:50 2012 us=187244 192.168.1.189:52270 SIGUSR1[soft,connection-reset] received, client-instance restarting
Tue Jul 17 10:48:55 2012 us=185208 192.168.1.189:52287 Connection reset, restarting [0]
Tue Jul 17 10:48:55 2012 us=185239 192.168.1.189:52287 SIGUSR1[soft,connection-reset] received, client-instance restarting
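It looked as if the client was dropping the connection on its own. A packet capture showed the same thing: the client sent a SYN packet and then moved on to a new source port.
After changing the settings, I found that the client connects fine over UDP.
After a full day of reading the configuration documentation, I switched to the official standard client and everything worked, which left me floored.
Comparing against the official client's log shows: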
Tue Jul 17 16:13:32 2012 us=250000 Attempting to establish TCP connection with 192.168.1.86:10443
Tue Jul 17 16:13:32 2012 us=250000 TCP connection established with 192.168.1.86:10443
The official client's log has no [nonblock] tag: it uses a blocking connect.
The code in open*** that handles the connection is in socket.c:
int
open***_connect (socket_descriptor_t sd,
                 struct open***_sockaddr *remote,
                 int connect_timeout,
                 volatile int *signal_received)
{
  int status = 0;

#ifdef CONNECT_NONBLOCK
  set_nonblock (sd);
  status = connect (sd, (struct sockaddr *) &remote->sa, sizeof (remote->sa));
  if (status)
    status = open***_errno_socket ();
  if (status == EINPROGRESS)
    {
      while (true)
        {
          fd_set writes;
          struct timeval tv;

          FD_ZERO (&writes);
          FD_SET (sd, &writes);
          tv.tv_sec = 0;
          tv.tv_usec = 0;

          status = select (sd + 1, NULL, &writes, NULL, &tv);

          if (signal_received)
            {
              get_signal (signal_received);
              if (*signal_received)
                {
                  status = 0;
                  break;
                }
            }
          if (status < 0)
            {
              status = open***_errno_socket ();
              break;
            }
          if (status <= 0)
            {
              if (--connect_timeout < 0)
                {
                  status = ETIMEDOUT;
                  break;
                }
              open***_sleep (1);
              continue;
            }

          /* got it */
          {
            int val = 0;
            socklen_t len;

            len = sizeof (val);
            if (getsockopt (sd, SOL_SOCKET, SO_ERROR, (void *) &val, &len) == 0
                && len == sizeof (val))
              status = val;
            else
              status = open***_errno_socket ();
            break;
          }
        }
    }
#else
  status = connect (sd, (struct sockaddr *) &remote->sa, sizeof (remote->sa));
  if (status)
    status = open***_errno_socket ();
#endif

  return status;
}
The CONNECT_NONBLOCK macro is defined in syshead.h:
/*
* Is non-blocking connect() supported?
*/
#if defined(HAVE_GETSOCKOPT) && defined(SOL_SOCKET) && defined(SO_ERROR) && defined(EINPROGRESS) && defined(ETIMEDOUT)
#define CONNECT_NONBLOCK
#endif
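This file has never been modified, so it is probably something in the build environment that causes CONNECT_NONBLOCK to be defined, which in turn makes open***_connect take the non-blocking path.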
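To see which of those conditions a given toolchain satisfies, a small diagnostic along the following lines can be compiled with the same compiler and headers as the client. This is only a sketch: the function names report_macro and count_missing_nonblock_macros are made up for illustration, and HAVE_GETSOCKOPT is assumed to come from the build configuration rather than from a system header, so it is printed but not counted.

```c
#include <errno.h>
#include <stdio.h>
#ifdef _WIN32
#include <winsock2.h>
#else
#include <sys/socket.h>
#endif

/* Hypothetical helper: print one macro's state, return 1 if it is missing. */
static int
report_macro (const char *name, int is_defined)
{
  printf ("%-16s %s\n", name, is_defined ? "defined" : "NOT defined");
  return is_defined ? 0 : 1;
}

/* Counts how many of the system-header macros tested by syshead.h are
   missing. HAVE_GETSOCKOPT is reported only, because it is assumed to be
   set by the build configuration and not by these headers. */
int
count_missing_nonblock_macros (void)
{
  int missing = 0;

#ifdef HAVE_GETSOCKOPT
  report_macro ("HAVE_GETSOCKOPT", 1);
#else
  report_macro ("HAVE_GETSOCKOPT", 0);
#endif

#ifdef SOL_SOCKET
  missing += report_macro ("SOL_SOCKET", 1);
#else
  missing += report_macro ("SOL_SOCKET", 0);
#endif

#ifdef SO_ERROR
  missing += report_macro ("SO_ERROR", 1);
#else
  missing += report_macro ("SO_ERROR", 0);
#endif

#ifdef EINPROGRESS
  missing += report_macro ("EINPROGRESS", 1);
#else
  missing += report_macro ("EINPROGRESS", 0);
#endif

#ifdef ETIMEDOUT
  missing += report_macro ("ETIMEDOUT", 1);
#else
  missing += report_macro ("ETIMEDOUT", 0);
#endif

  return missing;
}
```

If the function returns 0 and HAVE_GETSOCKOPT is also set in the build, CONNECT_NONBLOCK will be defined and the non-blocking branch is the one that gets compiled.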
This turned out to be a bug in open***: its author does not seem very familiar with Windows programming.
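set_nonblock (sd); on Windows this switches the socket to non-blocking mode (Winsock sockets are blocking by default), so the connect below returns immediately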
status = connect (sd, (struct sockaddr *) &remote->sa, sizeof (remote->sa));
if (status)
status = open***_errno_socket ();
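Getting 10035 (WSAEWOULDBLOCK) back here is normal for a non-blocking connect,
but the test if (status == EINPROGRESS ) is wrong: EINPROGRESS is 115 /* Operation now in progress */, which never equals 10035, so the wait loop is never entered and the pending connect is reported as a failure.
I changed the test to: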
if (status == WSAEWOULDBLOCK || status == EINPROGRESS )
So far this works.
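The same check can be pulled into a small helper so that the platform difference lives in one place. This is a minimal sketch, not code from open***: the name connect_is_pending is hypothetical, and it assumes the error value passed in is what open***_errno_socket () returned right after connect.

```c
#include <errno.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Hypothetical helper: returns 1 if err means "non-blocking connect started,
   wait for the socket to become writable", 0 for any other value. */
int
connect_is_pending (int err)
{
#ifdef _WIN32
  /* Winsock reports a pending non-blocking connect as WSAEWOULDBLOCK (10035). */
  if (err == WSAEWOULDBLOCK)
    return 1;
#endif
#ifdef EINPROGRESS
  /* POSIX systems report the same situation as EINPROGRESS. */
  if (err == EINPROGRESS)
    return 1;
#endif
  return 0;
}
```

With such a helper the condition would read if (connect_is_pending (status)), and the select loop is entered on both Windows and POSIX systems.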
Reposted from: https://blog.51cto.com/4301814/1896196