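Problem merging Apache logs from multiple web sites in a cluster.
  After consolidating the previously scattered web applications into a cluster, the multiple logs produced by the same site need to be merged. Merging the Apache logs of one site kept failing. To find out why, I compared that site's logs with those of the other sites and found a large number of internal dummy connection entries in the logs that could not be merged: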
::1 - - [01/Nov/2008:00:01:04 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:05 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:06 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:07 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:08 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:09 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:10 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
::1 - - [01/Nov/2008:00:01:11 +0800] "GET / HTTP/1.0" 200 95536 "-" "Apache/2.2.0 (Unix) PHP/5.2.3 (internal dummy connection)"
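  After removing the lines containing internal dummy connection, the logs merged normally, so these entries recorded in the Apache log appear to be the cause. Cleaning them out by hand every time is not practical, though; log analysis has to be automated. These entries can be kept out of the log by modifying the Apache configuration file: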
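# Anchor the pattern so that only the IPv6 loopback address matches,
# not every address that happens to contain "::1"
SetEnvIf Remote_Addr "^::1$" dontlog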
CustomLog "|/usr/local/sbin/cronolog /www/logs/XXX/access%Y%m%d.log" combined env=!dontlog
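The configuration change only affects log lines written from now on. For log files that already contain these entries, the manual cleanup can be scripted. The following is a minimal sketch in Python, assuming the entries can be recognised by the "(internal dummy connection)" suffix in the User-Agent field, as in the sample lines above. The function names strip_dummy_connections and clean_log are hypothetical helpers, not part of any Apache tool.

```python
# Marker that Apache appends to the User-Agent of its self-generated requests.
DUMMY_MARKER = "(internal dummy connection)"


def strip_dummy_connections(lines):
    """Yield only the log lines that are not internal dummy connections."""
    for line in lines:
        if DUMMY_MARKER not in line:
            yield line


def clean_log(src_path, dst_path):
    """Copy src_path to dst_path without dummy-connection lines.

    Returns the number of lines that were dropped.
    """
    dropped = 0
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if DUMMY_MARKER in line:
                dropped += 1
            else:
                dst.write(line)
    return dropped
```

Running clean_log on each access log before merging should leave only real client requests. Matching on the User-Agent marker rather than on the address ::1 keeps genuine local requests in the log.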
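  After the configuration change, the newly generated Apache logs no longer contain these entries. On the surface the logs no longer record this internal process communication, but that does not mean the underlying issue is gone. Digging a little deeper: in the same environment, why do these entries show up on only one site and not on the others? And why are they generated at all?
Apache deliberately puts a User-Agent on these requests so that administrators can track down where the odd-looking requests in their logs come from, as the comment in the source code says: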
/* Create the request string. We include a User-Agent so that
* adminstrators can track down the cause of the odd-looking
* requests in their logs.
*/
srequest = apr_pstrcat(p, "GET / HTTP/1.0\r\nUser-Agent: ",
ap_get_server_version(),
" (internal dummy connection)\r\n\r\n", NULL);
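According to the explanation quoted below, internal dummy connections are generated because Apache's MPM is running in prefork mode. The worker MPM is said not to use the pipe of death for communication between processes, so it would not produce internal dummy connections.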
It's defined in /server/mpm_common.c:

| This function connects to the server, then immediately closes the
| connection.
| This permits the MPM to skip the poll when there is only one listening
| socket, because it provides a alternate way to unblock an accept()
| when the pod is used.

pod=pipe of death.

| The pipe of death is used to tell all child processes that it is time
| to die gracefully.

So if you use the worker MPM which doesn't use a pod, there are no
internal dummy connections anymore.
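To make the mechanism concrete, here is a simplified illustration in Python of how a connection to the server's own listening socket unblocks a process waiting in accept(). It is a sketch under stated assumptions, not Apache's implementation: the names dummy_connection and run_demo are hypothetical, a thread stands in for a child process, and the pipe of death itself is left out.

```python
import socket
import threading


def dummy_connection(addr, server_version="Apache/2.2.0 (Unix)"):
    """Connect to the listener, send the marker request, then close."""
    request = (
        "GET / HTTP/1.0\r\n"
        "User-Agent: " + server_version + " (internal dummy connection)\r\n\r\n"
    )
    with socket.create_connection(addr, timeout=5) as conn:
        conn.sendall(request.encode("ascii"))


def run_demo():
    """Wake a 'child' blocked in accept() and return what it received."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    received = []

    def child():
        conn, _ = listener.accept()  # blocks until someone connects
        with conn:
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
            received.append(b"".join(chunks).decode("ascii"))

    worker = threading.Thread(target=child)
    worker.start()
    dummy_connection(listener.getsockname())  # the parent's wake-up call
    worker.join(timeout=5)
    listener.close()
    return received[0] if received else None


if __name__ == "__main__":
    print(run_demo())
```

Because the wake-up call is an ordinary HTTP request as far as the receiving side is concerned, it is handled and logged like any other request, which is how these entries end up in the access log.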
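I still have not figured out why only one of the many sites shows this problem. If I find the cause I will post a follow-up. If anyone knows, please point me in the right direction.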