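1 About the SIGPIPE signal
On Unix systems, if a process calls send or write on a socket whose connection the peer has already closed or shut down, the kernel delivers SIGPIPE to the calling process, and the default action for that signal is to terminate the process. (recv never raises SIGPIPE; it reports a closed connection through its return value.) Because the process dies without reporting anything, it is hard for the application to find out why it exited.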
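The SIGPIPE signal:
Calling read on a socket that has already received a FIN returns 0 once the receive buffer is empty; this is the usual indication that the connection has been closed. The first write on such a socket still reports success as long as the send buffer has room, but the segment it sends makes the peer reply with an RST, because the peer has called close and fully closed its socket, so it neither sends nor receives data. A second write (assuming the RST has arrived by then) therefore raises SIGPIPE, which terminates the process. If SIGPIPE is ignored, that second write returns -1 with errno set to EPIPE.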
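How to handle it:
Call signal(SIGPIPE, SIG_IGN) once during initialization to ignore the signal. From then on, send or write on the broken connection returns -1 with errno set to EPIPE, and the application can close the socket or take whatever other action fits.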
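The behaviour described above can be tried out with a minimal sketch like the following. The helper names ignore_sigpipe and errno_after_write_to_closed_peer are made up for illustration. It uses a Unix-domain socketpair instead of TCP, so the very first write already fails; with TCP the failure would normally show up only after the RST has arrived.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Ignore SIGPIPE once at start-up. Returns 0 on success, -1 on failure. */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/*
 * Hypothetical demo helper: write to a socket whose peer is already
 * closed. Returns the errno reported by write(), 0 if the write
 * unexpectedly succeeded, or -1 if the socket pair could not be created.
 */
int errno_after_write_to_closed_peer(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    close(sv[1]);                      /* the peer goes away */
    ssize_t n = write(sv[0], "x", 1);  /* kills the process unless SIGPIPE is ignored */
    int err = (n == -1) ? errno : 0;

    close(sv[0]);
    return err;
}
```

Call ignore_sigpipe() first; afterwards errno_after_write_to_closed_peer() should report EPIPE instead of terminating the process.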
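Besides ignoring SIGPIPE, a server that uses fork also has to clean up its terminated children so that they do not linger as zombie processes. One way is: signal(SIGCHLD, SIG_IGN); which leaves the reaping to the system. Terminated children are then cleaned up automatically and never become zombies.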
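A small sketch of the SIGCHLD approach, assuming Linux semantics (the helper name is invented for this example): once SIGCHLD is ignored, a terminated child is reaped automatically, so a later waitpid() has nothing left to collect and fails with ECHILD.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo helper: ignore SIGCHLD, fork a child that exits at
 * once, then call waitpid() on it. Returns the errno from waitpid(),
 * 0 if waitpid() unexpectedly reaped the child, or -1 on setup failure.
 */
int errno_after_wait_with_sigchld_ignored(void)
{
    if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(0);                      /* child: exit immediately */

    /* Parent: blocks until the child is gone, then reports an error
     * because the kernel has already discarded the child's status. */
    int status;
    pid_t r = waitpid(pid, &status, 0);
    return (r == -1) ? errno : 0;
}
```

The trade-off is that the parent can no longer read the children's exit status; a server that needs it would install a SIGCHLD handler that calls waitpid() instead.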
In ACE, send and receive timeouts are both implemented on top of select:
- ssize_t
- ACE::send_n_i (ACE_HANDLE handle,
- const void *buf,
- size_t len,
- int flags,
- const ACE_Time_Value *timeout,
- size_t *bt)
- {
- size_t temp;
- size_t &bytes_transferred = bt == 0 ? temp : *bt;
- ssize_t n;
- ssize_t result = 0;
- int error = 0;
- int val = 0;
- ACE::record_and_set_non_blocking_mode (handle, val);
- for (bytes_transferred = 0;
- bytes_transferred < len;
- bytes_transferred += n)
- {
- // Try to transfer as much of the remaining data as possible.
- // Since the socket is in non-blocking mode, this call will not
- // block.
- n = ACE_OS::send (handle,
- (char *) buf + bytes_transferred,
- len - bytes_transferred,
- flags);
- // Check for errors.
- if (n == 0 ||
- n == -1)
- {
- // Check for possible blocking.
- if (n == -1 &&
- (errno == EWOULDBLOCK || errno == ENOBUFS))
- {
- // Wait upto <timeout> for the blocking to subside.
- int rtn = ACE::handle_write_ready (handle,
- timeout);
- // Did select() succeed?
- if (rtn != -1)
- {
- // Blocking subsided in <timeout> period. Continue
- // data transfer.
- n = 0;
- continue;
- }
- }
- // Wait in select() timed out or other data transfer or
- // select() failures.
- error = 1;
- result = n;
- break;
- }
- }
- ACE::restore_non_blocking_mode (handle, val);
- if (error)
- {
- return result;
- }
- else
- {
- return ACE_Utils::truncate_cast<ssize_t> (bytes_transferred);
- }
- }
- int
- ACE::handle_ready (ACE_HANDLE handle,
- const ACE_Time_Value *timeout,
- int read_ready,
- int write_ready,
- int exception_ready)
- {
- #if defined (ACE_HAS_POLL) && defined (ACE_HAS_LIMITED_SELECT)
- ACE_UNUSED_ARG (write_ready);
- ACE_UNUSED_ARG (exception_ready);
- struct pollfd fds;
- fds.fd = handle;
- fds.events = read_ready ? POLLIN : POLLOUT;
- fds.revents = 0;
- int result = ACE_OS::poll (&fds, 1, timeout);
- #else
- ACE_Handle_Set handle_set;
- handle_set.set_bit (handle);
- // Wait for data or for the timeout to elapse.
- int select_width;
- # if defined (ACE_WIN32)
- // This arg is ignored on Windows and causes pointer truncation
- // warnings on 64-bit compiles.
- select_width = 0;
- # else
- select_width = int (handle) + 1;
- # endif /* ACE_WIN64 */
- int result = ACE_OS::select (select_width,
- read_ready ? handle_set.fdset () : 0, // read_fds.
- write_ready ? handle_set.fdset () : 0, // write_fds.
- exception_ready ? handle_set.fdset () : 0, // exception_fds.
- timeout);
- #endif /* ACE_HAS_POLL && ACE_HAS_LIMITED_SELECT */
- switch (result)
- {
- case 0: // Timer expired.
- errno = ETIME;
- /* FALLTHRU */
- case -1: // we got here directly - select() returned -1.
- return -1;
- case 1: // Handle has data.
- /* FALLTHRU */
- default: // default is case result > 0; return a
- // ACE_ASSERT (result == 1);
- return result;
- }
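The same idea can be sketched in plain C without ACE. This is a simplified illustration, not ACE's implementation: wait_writable and send_n_timeout are hypothetical names, the timeout applies to each wait rather than to the whole transfer, and SIGPIPE is assumed to be ignored already.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/*
 * Wait until fd is writable or timeout_ms elapses.
 * Returns 1 if writable, 0 on timeout, -1 on error (errno from select).
 * fd must be smaller than FD_SETSIZE. An EINTR restarts the full wait.
 */
int wait_writable(int fd, int timeout_ms)
{
    fd_set wfds;
    struct timeval tv;
    int rc;

    do {
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        rc = select(fd + 1, NULL, &wfds, NULL, &tv);
    } while (rc == -1 && errno == EINTR);

    return rc > 0 ? 1 : rc;
}

/*
 * Send exactly len bytes on a (normally non-blocking) socket.
 * Returns len on success, -1 on error or timeout (errno = ETIMEDOUT).
 */
ssize_t send_n_timeout(int fd, const void *buf, size_t len, int timeout_ms)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = send(fd, p + done, len - done, 0);
        if (n > 0) {
            done += (size_t)n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;                  /* interrupted: just retry */
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            int rc = wait_writable(fd, timeout_ms);
            if (rc == 1)
                continue;              /* buffer drained: resume sending */
            if (rc == 0)
                errno = ETIMEDOUT;
            return -1;
        }
        return -1;                     /* any other failure */
    }
    return (ssize_t)done;
}
```

As in the ACE code, the socket is only waited on after send reports that it would block, so the common case costs a single system call.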
Elsewhere, in lighttpd:
- // if (len == 0 || (len < 0 && errno != EAGAIN && errno != EINTR) ) {
- case -1:
- if (errno == EWOULDBLOCK || errno == EAGAIN || errno == EINPROGRESS)
- {
- return 0;
- }
- if (-1 == (cnt = accept(srv_socket->fd, (struct sockaddr *) &cnt_addr, &cnt_len))) {
- switch (errno) {
- case EAGAIN:
- #if EWOULDBLOCK != EAGAIN
- case EWOULDBLOCK:
- #endif
- case EINTR:
- /* we were stopped _before_ we had a connection */
- case ECONNABORTED: /* this is a FreeBSD thingy */
- /* we were stopped _after_ we had a connection */
- break;
- case EMFILE:
- /* out of fds */
- break;
- default:
- log_error_write(srv, __FILE__, __LINE__, "ssd", "accept failed:", strerror(errno), errno);
- }
- return NULL;
- }
- fcgi (connecting to the FastCGI backend):
- if (-1 == connect(fcgi_fd, fcgi_addr, servlen)) {
- if (errno == EINPROGRESS ||
- errno == EALREADY ||
- errno == EINTR) {
- if (hctx->conf.debug > 2) {
- log_error_write(srv, __FILE__, __LINE__, "sb",
- "connect delayed; will continue later:", proc->connection_name);
- }
- return CONNECTION_DELAYED;
- } else if (errno == EAGAIN) {
- if (hctx->conf.debug) {
- log_error_write(srv, __FILE__, __LINE__, "sbsd",
- "This means that you have more incoming requests than your FastCGI backend can handle in parallel."
- "It might help to spawn more FastCGI backends or PHP children; if not, decrease server.max-connections."
- "The load for this FastCGI backend", proc->connection_name, "is", proc->load);
- }
- return CONNECTION_OVERLOADED;
- } else {
- log_error_write(srv, __FILE__, __LINE__, "sssb",
- "connect failed:",
- strerror(errno), "on",
- proc->connection_name);
- return CONNECTION_DEAD;
- }
- }
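2 Return values and error codes of send and recv
2.1 EINTR
The call was interrupted by a signal before it completed; the read or write has to be issued again.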
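A common way to handle EINTR is a small retry wrapper. The sketch below uses a made-up name, read_retry, and simply repeats the call as long as it fails with EINTR.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: like read(), but restarts the call when it is
 * interrupted by a signal. Returns what read() returns otherwise.
 */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

The same loop shape works for write, send, and recv.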
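2.2 EAGAIN
When programming on Linux you run into many errors (reported through errno), and EAGAIN is one of the more common ones, typically in non-blocking operations. As the name suggests, it means "try again". It usually appears when an application performs a non-blocking operation on a file or socket. For example, if a file, socket, or FIFO was opened with the O_NONBLOCK flag and you keep calling read while there is no data to read, the program does not block waiting for data to become ready; read instead fails with EAGAIN, telling the application that nothing is available right now and that it should try again later. As another example, when a system call such as fork fails because some resource (for example virtual memory) is insufficient, it returns EAGAIN to suggest calling it again; the next attempt may succeed.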
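The non-blocking read case can be reproduced with a pipe. In this sketch set_nonblocking, try_read, and the TRY_READ_AGAIN constant are names invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical marker meaning "no data right now, try again later". */
#define TRY_READ_AGAIN (-2)

/* Add O_NONBLOCK to the descriptor's flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read without blocking. Returns the number of bytes read, 0 on end of
 * file, -1 on a real error, or TRY_READ_AGAIN when read() failed with
 * EAGAIN/EWOULDBLOCK because nothing is available yet.
 */
ssize_t try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return TRY_READ_AGAIN;
    return n;
}
```

In a real server the TRY_READ_AGAIN case would normally lead back to select, poll, or epoll rather than to a busy retry loop.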
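2.3 Handling EAGAIN
When a client sends a large packet with the socket send function, the call may fail with EAGAIN. The cause is that the size argument passed to send exceeds the value of tcp_sendspace, which defines how much data the kernel can buffer on behalf of the application's send calls. Once the application has set O_NDELAY or O_NONBLOCK on the socket, send returns EAGAIN whenever the send buffer is full.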
There are three ways to eliminate this error:
1. Increase tcp_sendspace so that it is larger than the size argument passed to send
---no -p -o tcp_sendspace=65536
2. Before calling send, use setsockopt to set a larger value for the SO_SNDBUF option