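HTTP runs on top of TCP, but HTTP itself is stateless. In HTTP/1.0, the server closes the connection as soon as it has sent the requested data. Developers quickly realized this was wasteful, because HTTP requests rarely come alone: opening a single web page means requesting HTML, JS, CSS, images, and a whole series of other resources from the server. If every one of those requests has to go through its own TCP connection setup and teardown, that costs both time and bandwidth.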
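As a remedy, HTTP/1.0 implementations added the Connection header (an extension rather than part of the original specification). When requesting data, the browser can send Connection: keep-alive. If the server supports keep-alive, it includes Connection: keep-alive in its response headers as well, and the browser can keep using the existing TCP connection instead of opening a new one. If the server does not support it, the response either omits the Connection header or contains Connection: close.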
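Once HTTP/1.1 arrived, keep-alive became the default behavior. Even if the browser does not send Connection: keep-alive, the connection stays open as long as the server supports persistent connections. The server may still include Connection: keep-alive in its response, but it does not have to for the connection to persist. To end the connection after the current exchange, either side sends Connection: close.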
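The negotiation rules described above can be condensed into a few lines. This is only a minimal sketch of the decision, not how any particular browser or server implements it; the function name `is_persistent` and the tuple representation of the HTTP version are assumptions made for illustration.

```python
def is_persistent(http_version, connection_header=None):
    """Decide whether a connection should stay open after one exchange.

    http_version is a tuple such as (1, 0) or (1, 1) (an illustrative choice);
    connection_header is the raw value of the Connection header, or None.
    """
    # The Connection header is a comma-separated, case-insensitive token list.
    tokens = {
        token.strip().lower()
        for token in (connection_header or "").split(",")
        if token.strip()
    }

    # "close" always wins, whatever the protocol version.
    if "close" in tokens:
        return False

    # HTTP/1.1 and later: persistent by default.
    if http_version >= (1, 1):
        return True

    # HTTP/1.0: persistent only when keep-alive is explicitly requested.
    return "keep-alive" in tokens
```

For example, `is_persistent((1, 0))` is `False`, while `is_persistent((1, 0), "Keep-Alive")` and `is_persistent((1, 1))` are both `True`.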
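With this mechanism in mind, we should pay attention to keep-alive whenever we work on browser-like programs, web applications, or web crawlers. Used sensibly, keep-alive avoids repeated TCP connection setup and can noticeably improve the performance of both servers and client programs.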
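To see connection reuse from the client side, here is a small self-contained sketch using only the Python standard library. It starts a throwaway local HTTP/1.1 server and sends two requests over one `http.client.HTTPConnection`. The handler class, the function name `fetch_twice_on_one_connection`, and the request paths are all made up for this example; a real crawler would point at its actual target host instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    # With HTTP/1.1 the standard-library server keeps connections open by default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where the body ends, so the
        # connection can be reused for the next request.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_twice_on_one_connection():
    """Send two GET requests over one TCP connection; return the local ports used."""
    server = HTTPServer(("127.0.0.1", 0), KeepAliveHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()

    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    local_ports = []
    try:
        for path in ("/index.html", "/style.css"):
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # the body must be fully read before reusing the connection
            # The client-side port identifies the underlying TCP connection.
            local_ports.append(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return local_ports
```

If the connection is reused, both entries in the returned list are the same client-side port, meaning the second request did not trigger a new TCP handshake.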
While we're here, this is how keep-alive support is configured in Apache 2.2:
#
# KeepAlive: Whether or not to allow persistent connections (more than
# one request per connection). Set to "Off" to deactivate.
#
KeepAlive On

#
# MaxKeepAliveRequests: The maximum number of requests to allow
# during a persistent connection. Set to 0 to allow an unlimited amount.
# We recommend you leave this number high, for maximum performance.
#
MaxKeepAliveRequests 100

#
# KeepAliveTimeout: Number of seconds to wait for the next request from the
# same client on the same connection.
#
KeepAliveTimeout 15
These are the relevant settings in Apache's conf/httpd.conf. After changing them, restart Apache, and the response headers will look like this:
HTTP/1.1 200 OK^M
Date: Tue, 03 Jun 2014 17:13:58 GMT^M
Server: Apache/2.2.3 (Red Hat)^M
X-Powered-By: PHP/5.4.4^M
Content-Length: 46^M
Keep-Alive: timeout=15, max=100^M
Connection: Keep-Alive^M
Content-Type: text/html; charset=GBK^M
^M
Note: "^M" is how the vi editor on Linux displays the carriage return (\r) in each \r\n line ending.
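A client that wants to respect the server's limits can read them from the Keep-Alive response header shown above. The following is a rough sketch that assumes the simple `name=value, name=value` form seen in that response; the function name `parse_keep_alive` is made up for illustration, and parameters with non-numeric values are simply skipped.

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as "timeout=15, max=100".

    Returns a dict mapping lowercase parameter names to integer values.
    """
    params = {}
    for part in value.split(","):
        name, separator, raw = part.strip().partition("=")
        name = name.strip().lower()
        if not separator or not name:
            continue  # not a name=value pair
        raw = raw.strip().strip('"')
        if raw.isdigit():
            params[name] = int(raw)
    return params
```

With the header from the response above, `parse_keep_alive("timeout=15, max=100")` gives `{"timeout": 15, "max": 100}`: the server waits up to 15 seconds for the next request and allows up to 100 requests on the connection, matching KeepAliveTimeout and MaxKeepAliveRequests in the configuration.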