Reducing network traffic with Web caching

Server Web caches speed access to Web pages and ease network traffic

Rawn Shah
Independent technologist and freelance journalist

Find out three ways to use Web caching to route Web traffic to your site more efficiently. Whether you run an extranet, intranet, or Internet site, Web caching can give you more control of your resources. Learn what hardware you need and what caching software to consider. In the companion article, "Setting up a cache server", see in detail how to configure a Squid cache proxy, with example code and parameters.

Like the mass transit systems that move groups of people between popular destinations, Web caching systems do the same for URL requests for popular Web sites. You can use Web caches to put users on the express track to their destinations.

Web caching stores local copies of popular Web pages so users can access them faster. A cache aggregates all the individual requests for a Web page and sends a single request as their proxy to the origin site, as the requested Web site is called. (But don't confuse a Web cache with a proxy server. The latter serves as an intermediary to place a firewall between network users and the outside world. A proxy server makes your outgoing network connection more secure, but it does little to reduce network traffic.) When the cache receives its copy of the contents, it then makes further copies and passes them on to the requesting users.
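
In code form, the core behavior is easy to see. The following is a minimal, hypothetical sketch (the names and the stand-in fetch function are invented for illustration): the first request for a URL triggers a single fetch from the origin site, and every later request is answered from the stored copy.

    # Hypothetical sketch: one fetch from the origin site serves many users.

    store = {}  # URL -> page contents held by the cache

    def fetch_from_origin(url):
        return f"<contents of {url}>"   # stand-in for a real HTTP fetch

    def handle_request(url):
        if url not in store:            # first request: act as the proxy
            store[url] = fetch_from_origin(url)
        return store[url]               # later requests: serve the stored copy

    for _ in range(3):                  # three user requests, one origin fetch
        handle_request("http://www.example.com/")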

Caching in

Web caches can help reduce the load on a Web server by reducing the number of incoming requests; browsers retrieve portions of data from the cache rather than directly from the server. However, most Web content providers have no control over which users, or how many, arrive at their site, so the cache server needs to sit near the users' end rather than near the Web servers. (Web load-balancing schemes distribute the incoming load across multiple servers at the Web content provider's end, but that's a whole other story.) The most obvious beneficiary of Web caching is the user, who avoids some traffic snarls when browsing.

The network administrator and the remote Web site also reap benefits. According to the National Laboratory for Applied Network Research (NLANR), large caches with lots of clients may field as many as 50% of the hits that would otherwise travel through a network individually to the origin site. A typical cache would easily field about 30% of the intended hits, says the NLANR's 1996 research. Thus, statistically speaking, a Web cache could eliminate at least 30% of the Web traffic that would normally be going out over a WAN line. If you're paying dollars for megabytes, Web caching can then save you considerable sums in a relatively short time. Even if you have a flat-rate WAN connection, caching improves customer satisfaction levels, because it speeds access for all.
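
As a back-of-the-envelope illustration of that 30% figure (the traffic volume and price below are hypothetical, not from NLANR), the savings on a usage-billed WAN line work out as follows:

    # Worked example of the 30% figure on a usage-billed WAN link.
    web_traffic_gb_per_month = 100   # hypothetical outbound Web volume
    hit_rate = 0.30                  # typical cache, per NLANR's 1996 research
    cost_per_gb = 10.00              # hypothetical dollars per gigabyte

    saved_gb = web_traffic_gb_per_month * hit_rate
    print(f"{saved_gb:.0f} GB and ${saved_gb * cost_per_gb:.2f} saved per month")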

PCs have memory caches for code that's called often, and most browser programs have local caches that store recently surfed Web pages either in memory or on disk. A Web cache also stores frequently accessed Web pages, but it operates on a grander scale.

Caching on a grand scale

Global cache projects aim to reduce Internet traffic jams

On the global scale of the Internet, Web caches can lighten the overall burden of traffic through the numerous high-speed links between top-tier Internet service providers. By providing a cache hierarchy that maps to the provider's network topology, ISPs can create regional cache areas supported by groups of cache servers. Each region's cache contains data from Web sites in other regions. Thus, a programmer in California who wants to look up a page located in France would automatically be directed toward the cache server in California that holds a copy of the needed page, rather than pulling it straight down from halfway across the world.

Because of the public benefit of reducing Internet traffic, the National Science Foundation (NSF) has supported research projects that enable large-scale caching systems. One such project from the NLANR is currently investigating a multilevel cache system based on the national supercomputing centers located in public institutions and universities around the United States. It began when a study back in 1993 suggested that several strategically placed FTP servers could reduce the overall traffic on the then-NSFNet backbone by 44 percent.

Internet topology has changed significantly since 1993, but the basic precepts still hold true. In fact, the IRCACHE project from the NLANR has cache servers located near or at the recommended central exchange points.

In the United Kingdom, a similar project is part of the Joint Academic Network (JANET), a national academic network service, and of its next-generation successor, the SuperJANET system. This national cache service is also available for public use and for cache-peering arrangements.

Both the IRCACHE and JANET cache projects offer open participation, letting the public combine their own cache servers with the projects' distributed systems. This gives you the benefit of a global cache system reinforcing your own, which in turn speeds access for your users. For more details on participating, visit the sites listed in Resources.

Webrouting with cache servers

In addition to reducing outgoing traffic by bundling duplicate requests from browsers, Web caches act like custom-dispatched express trains to solve the problem of Webrouting: how to send Web traffic efficiently over a network. While Internet Protocol routing handles the low-level direction of individual IP packets irrespective of their contents, Webrouting directs application-specific HTTP traffic across the network. Because Web traffic constitutes the bulk of all Internet traffic, improving Webrouting can improve the overall performance of the Internet.

Webrouting depends upon IP routing because Web traffic flows only along the paths defined as IP routes. However, a single Web flow can change from server to server as it is redirected by different Web routers. A Web server can use an HTTP redirect response to send Web requests to other servers for processing. Web caches themselves redirect client and server traffic locally or to other caches to provide faster access to pages. Finally, load-balancing devices for Web servers can redirect incoming client requests to a group of servers in the same location or in other network locations to distribute the incoming requests evenly among the servers. You can think of all these devices as Webrouters directing HTTP traffic.
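
For instance, an HTTP redirect is just a 3xx status code with a Location header naming another machine. Here is a minimal sketch using Python's standard http.server module; the redirect target host is a hypothetical placeholder:

    # Minimal sketch of HTTP-level redirection; "cache1.example.com" is a
    # hypothetical placeholder for another server in the Webrouting chain.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class RedirectingHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Answer with a 302 redirect pointing the client elsewhere.
            self.send_response(302)
            self.send_header("Location", "http://cache1.example.com" + self.path)
            self.end_headers()

    HTTPServer(("", 8080), RedirectingHandler).serve_forever()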

The process of Webrouting with cache servers begins after the Web request leaves the client browser workstation:

  1. The cache server receives the request one of three ways: the request can be sent directly to the server; the server can actively monitor network traffic and pick out requests from the flow; other network devices can pick out the traffic and send it to the cache server.
  2. Then the cache resolves the Web request. It has to determine if the requested page is stored within its cache database. If not, it checks its partner cache servers, if any, for the requested data.
  3. Finally, the cache server returns the data to the client, either from its own database, from a partner's database, or from the original Web server.

Just as public transit systems use buses, trains, trolleys, shuttles, taxis, and ferries, this three-step Receive-Process-Return process has been implemented in various forms.

Receiving the Web request

The most basic method for diverting requests to a cache is to configure the browser to point to the cache as its proxy server, an option on most popular browsers. The client browser then sends a request for a URL directly to the cache server to retrieve a document. This method ensures that the cache does the greatest possible amount of request processing: every request goes through the cache server. One downside of this method is that you cannot always control whether the browser uses a proxy; thus, clever users who understand that this is a typical configuration option may try to bypass the proxy. And another downside: When you have hundreds or thousands of desktops and Web browsers to configure, this method can turn into a management headache.
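
For illustration, here is what a proxied request looks like from the client side, sketched with the third-party Python requests library; the cache host name is a hypothetical placeholder, and port 3128 is merely Squid's customary default:

    import requests  # third-party HTTP client library

    # Hypothetical cache host; 3128 is Squid's customary listening port.
    proxies = {"http": "http://webcache.example.com:3128"}

    # The request goes to the cache server, which answers from its store
    # or fetches the page from the origin site on the client's behalf.
    response = requests.get("http://www.example.com/index.html", proxies=proxies)
    print(response.status_code, len(response.content))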

Transparent proxy caching also diverts all traffic to the cache server. A cache server sits directly on the data path between the clients and the remote Web sites and intercepts all outgoing requests. The cache examines every packet of data to look for Web requests, so in essence it serves as an advanced form of a packet filter.

External packet filters and IP Layer-4 and Layer-7 switches can also handle and route client requests. These devices examine the packets that are going out of the network to identify Web requests and redirect them to the cache server. A packet filter can examine any or all of the contents of the packet and, based upon some predefined policy, redirect the traffic appropriately.

At the transport layer, a Layer-4 switch redirects TCP or UDP traffic to an appropriate destination; because all HTTP traffic is TCP-based, such a switch picks out likely Web requests (typically TCP traffic to port 80) and passes them to the cache.

At the application layer of the ISO stack, a Layer-7 switch looks for application-specific protocols such as HTTP and directs them to appropriate destinations.
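
The difference between the two is easy to express in code. In this hypothetical sketch, the Layer-4 test keys on the TCP port number alone, while the Layer-7 test peeks into the payload for an HTTP request line; real switches make these decisions in hardware or in the kernel, not in Python:

    # Hypothetical sketch: how Layer-4 and Layer-7 classification differ.

    HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ")

    def layer4_is_web(dst_port: int) -> bool:
        # Transport layer: decide from the TCP port number alone.
        return dst_port == 80

    def layer7_is_web(payload: bytes) -> bool:
        # Application layer: inspect the payload for an HTTP request line.
        return payload.startswith(HTTP_METHODS)

    # A Layer-4 switch would redirect any port-80 packet; a Layer-7 switch
    # first confirms the payload really is HTTP before doing so.
    print(layer4_is_web(80), layer7_is_web(b"GET /index.html HTTP/1.0\r\n"))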

Comparing methods for handling requests

Configuring every client Web browser can be a tedious task; transparent proxy caches are more practical for deployment on large networks or in organizations without strict control of the network. For example, an ISP can use transparent proxy caches for its dial-up modem clients without their ever knowing about it. Such a cache server would have to sit closest to the outgoing WAN connection to provide the maximum benefit.

Transparent proxy caches work much more slowly, however, because the cache server has to process every single IP packet that goes through the network to look for Web packets. Thus transparent proxy caches require the fastest processors and fast dual network links.

Using external packet filters or layer-specific switches optimizes the function of the device. In fact, some implementations have their own protocols that monitor the activity of multiple caches for the purposes of load-balancing.

Processing the Web request

Once the cache server receives a Web request, it checks its database to see if it has the contents of the requested page stored somewhere.

Web caching originally began as a single-server system that contained all the data of the cache. Although that's effective, cache servers tend to grow large. A single server runs out of disk space to store the requested pages or cannot process the incoming requests fast enough. Eventually single-server schemes gave way to distributed cache servers working either hierarchically or in parallel, or both. These servers balance among themselves the amount of cached information they contain, placing the most commonly requested data at the top of their hierarchy for the most people to see and the least commonly requested data at the bottom, closer to the specific users who need them.

Some cache server software actually works as extensions to existing Web server products. In such a case, there's no point in logging Web access entries, so the administrator should either disable or limit logging to the server log file. A cache contains continuously changing information, and unless you know what each cached entry contains (which actual Web site it goes to), you won't know where the client was going. Cache logs may also get fairly large, because all your users will be contributing to it. It can consume disk space as quickly as third-graders do candy.

Single-level caching

A cache server is essentially a proxy Web client that stores a lot of pages locally. The server responds to requests by sending along the requested Web page if it's available.

A successful retrieval from the local cache is called a cache hit, and an unsuccessful one is called a cache miss. On a miss, the server begins its own access to the requested URL. Such a first-time access to a page forces the cache server to contact the origin Web server that hosts the page. The cache server checks to see if the page can be cached, retrieves the data to cache locally, and, at the same time, passes the contents through to the client. The user may never realize that the cache sits between the client and server except in special circumstances.
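
Whether a page "can be cached" is decided from its response headers. The function below is a simplified, hypothetical version of that decision based on the standard HTTP Cache-Control and Expires headers; production caches such as Squid apply many more rules:

    # Hypothetical, simplified cacheability test on HTTP response headers.

    def is_cacheable(headers: dict) -> bool:
        cache_control = headers.get("Cache-Control", "").lower()
        if "no-store" in cache_control or "private" in cache_control:
            return False      # origin forbids shared caching
        if "no-cache" in cache_control:
            return False      # must revalidate; treat as uncacheable here
        # An explicit freshness lifetime makes the page safely cacheable.
        return "max-age" in cache_control or "Expires" in headers

    print(is_cacheable({"Cache-Control": "max-age=3600"}))  # True
    print(is_cacheable({"Cache-Control": "no-store"}))      # False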

A single cache server is the cheapest solution for improving Webrouting, but its effectiveness is limited by the capacity of the server. By combining a firewall, an IP router, and a cache together, vendors have created a single-box solution that works well for small office intranets. To go even cheaper, you can build a device with similar capabilities using a PC, the Linux operating system, and open-source software available publicly.

Parallel and load-balanced caching

A single cache server can handle only so many requests at a time, and even pumping up the machine with memory, disk space, and processors takes its capacity only so far. A better way to handle high-volume requests is to keep several cache servers running in parallel, handling requests from the same clients or different groups of clients. These parallel cache servers usually contain identical data and communicate changes among themselves.

An enhancement to the parallel-server method involves creating a load-balancing system for the parallel servers. All the servers handle the same group of clients and balance the load of incoming requests among themselves.

Multilevel caching

A multilevel cache spreads the cached data across several servers throughout the network. The top-level caching server holds the most commonly accessed pages, and the lowest-level caching servers hold the least commonly accessed pages. The various levels combine in a network of cache servers called a Web caching mesh. The caches communicate among themselves, using HTTP and special cache-coordination protocols, to divide the contents appropriately and maintain consistency among the servers.

Multilevel caching works almost the same as caching with single-cache servers. However, if there is a cache miss at one server level, the request is propagated up to the next higher level to see if that cache contains the data. Only when the request hits the top level and still encounters a cache miss will the cache server go directly to the origin Web site to retrieve the data. (You can customize this configuration of multilevel caching. Typically it looks at the nearest cache server before going up the chain to the top-level server, which might be several hops away.)
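
In outline, the lookup is a climb through the hierarchy. The sketch below is a hypothetical illustration of that logic; the class, method, and level names are invented for the example:

    # Hypothetical sketch of a multilevel cache lookup: on a miss, the
    # request climbs one level; only the top level contacts the origin site.

    class CacheLevel:
        def __init__(self, name, parent=None):
            self.name, self.parent, self.store = name, parent, {}

        def get(self, url):
            if url in self.store:            # cache hit at this level
                return self.store[url]
            if self.parent is not None:      # miss: ask the next level up
                page = self.parent.get(url)
            else:                            # top level: go to the origin
                page = fetch_from_origin(url)
            self.store[url] = page           # keep a copy on the way down
            return page

    def fetch_from_origin(url):
        return f"<contents of {url}>"        # stand-in for a real HTTP fetch

    top = CacheLevel("national")
    regional = CacheLevel("regional", parent=top)
    print(regional.get("http://www.example.com/"))  # misses everywhere, then cached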

Multilevel cache systems work very well when a very large number of clients (in the tens or hundreds of thousands) access the system. Furthermore, if your many clients are spread widely across a WAN or the Internet, it's an even better solution.

Returning the Web request

Returning the results of a cache request is still a simple process. Basically, the cache that contains the requested data examines the request packet, takes the source IP address, and sends the data to the client under the guise of the origin Web server's identity.

Choosing protocols and options for multiple servers

Coordinating the contents of a cache among multiple servers is a challenge. As soon as you add a second cache server to the system, you encounter this problem: how do you maintain the consistency among the multiple servers that should contain identical data? If you add multiple levels of cache servers, you have to ask two other questions: how do you know what the other caches contain, and how do you redirect the request to the appropriate cache?

This is where cache protocols come in. There are three main types:

  • Query protocols send messages to other caches in a multilevel system to discover if they contain the needed data.
  • Redirect protocols forward the client request to the cache server in the multilevel system that contains the needed data.
  • Multicast protocols combine Query and Redirect protocols using multicast network communications.

Multicast cache protocols work in concert with all cache servers at the same time. Multicasting is the ability to create a virtual network of computers that can communicate directly with every other member at the same time. Multicasting is a function of the IP network protocol, with the help of special multicast routers and protocol stacks. With such cache protocols, a cache server can query all the other servers at the same time to find out if they contain the needed data. In addition, a client request sent to such a multicast group is automatically sent to all members, obviating any redirection. Within the group, one of the cache servers recognizes the requested URL as within its domain of responsibility and sends the data appropriately.
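
To make the query side concrete, here is a minimal, hypothetical sketch of one cache asking a multicast group whether any member holds a page. The group address, port, and message format are invented for illustration and do not belong to any real cache protocol:

    import socket

    MCAST_GROUP, MCAST_PORT = "239.1.1.1", 4827   # hypothetical group and port

    # One datagram reaches every cache server in the multicast group at once.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
    sock.sendto(b"HAVE http://www.example.com/ ?", (MCAST_GROUP, MCAST_PORT))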

The problem with multicast protocols is that they are still not very popular. What's more, multicasting over the current Internet Protocol isn't really efficient because all of the Internet is connected by a mass of single point-to-point, or unicast, links, which defeats the purpose of multicasting. Still the software methods exist, and within intranets it is possible to set them up. The future generation of the Internet Protocol, called IPv6, allows real multicasting to take place, but it will be some time before it's widely implemented.

Setting protocol options for cache servers

There are four options for caching protocols:

  • The Internet Cache Protocol (ICP) is the first cache query protocol documented as an informational standard by the Internet Engineering Task Force. It was developed during research conducted in 1996 by the Harvest project, one of the early Web-caching projects. In a multilevel cache, ICP sends queries between the cache servers to check for specific URLs in other caches in the mesh. Unfortunately, ICP becomes inefficient beyond a certain number of distributed cache servers. If you are setting up one or two caches, this limitation of ICP poses no problem. On the other hand, if you're setting up a large multilevel cache with more than ten servers, ICP caches will spend too much of their time propagating queries, which reduces efficiency. ICP also contains no real security to protect the communications between the cache servers.
  • The HyperText Caching Protocol (HTCP) is a better query protocol that is used to discover cache servers on local networks and to inquire if URLs are contained on the servers. It includes the HTTP headers from the original client request so that the cache server may process them, if necessary, as part of the request.
  • The Cache Array Routing Protocol (CARP) is a redirect protocol for a multilevel cache system. Each cache is programmed with a list of all the other cache servers in the system. The cache server uses a hash function that maps each URL to a given cache server (a sketch of this mapping follows the list). It then sends a CARP message containing the original HTTP request to that cache server to fulfill. Microsoft's Proxy Server implements CARP.
  • Cisco's proprietary Web Cache Control Protocol (WCCP) handles request redirection to a cache mesh from a router. One of the cache servers can send a WCCP message to the router to define the mapping between URLs and cache servers. The router processes outgoing packets and looks for HTTP traffic; it then uses a hash function to determine which cache server should process the URL in each request and redirects the traffic to the server with WCCP.
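
The heart of CARP is the hash mapping. The sketch below illustrates the general idea with a "highest score wins" hash over the combined server name and URL; the hashing details are simplified and the server names are hypothetical. Because every cache computes the same mapping, no query traffic is needed:

    # Simplified sketch of CARP-style URL-to-server mapping: combine the URL
    # with each server name, hash, and pick the server with the highest score.
    import hashlib

    CACHES = ["cache1.example.com", "cache2.example.com", "cache3.example.com"]

    def pick_cache(url: str) -> str:
        def score(server: str) -> int:
            digest = hashlib.md5((server + url).encode()).digest()
            return int.from_bytes(digest[:8], "big")
        return max(CACHES, key=score)

    print(pick_cache("http://www.example.com/index.html"))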

Selecting hardware for cache servers

Essentially, a cache server is a heavy-duty network file server. Unlike a proxy or firewall server, which can run on fairly low-powered machines (even 486 machines can work well as firewalls), a cache server needs processing power and speed.

To be most effective, a cache server needs fast network connections to the internal LAN network and the external WAN. Typically, plan for a cache storage capacity of several gigabytes on disk, as well as at least 128 MB of RAM, preferably gigabytes of RAM. By increasing the RAM storage, you directly increase the performance of the system, because direct accesses to physical memory work much faster than accesses to disk-stored caches.

Also, a fast processor can help, but a multiprocessor system, even with slower CPUs, can perform better by handling more requests simultaneously. Cache server administrators recognize that RAM and disk storage are the most important performance factors.

A Linux-based cache server running on a dual-processor 350 MHz Pentium II system with 512 MB of RAM, 25 GB of SCSI disk space, and dual 100 Mbps Ethernet connection--an estimated price between $2,500 and $5,000--should be able to handle one to two million requests a day, serving between 1,000 and 10,000 users.
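
As a rough sanity check on that sizing claim (simple arithmetic, not a benchmark): two million requests spread evenly over a day averages roughly 23 per second, with real peaks several times higher.

    # Rough load arithmetic for the sizing claim above.
    requests_per_day = 2_000_000
    seconds_per_day = 24 * 60 * 60
    print(requests_per_day / seconds_per_day)   # about 23 requests per second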

Typically, the cache server does not need any intervention from a sysadmin, so the best choice is a fairly stable, reliable platform that can run unattended. Both commercial and freeware cache software is available.

Cache server software products
Software | Vendor/Developer | Caching Type | Platform
Apache Web Server caching module* | Apache Information Services | Single | AIX, BSD/OS, Digital UNIX, FreeBSD, HP-UX, IRIX, Linux, NetBSD, NextStep, SunOS, Solaris, SCO Unix, Windows NT
BorderManager FastCache | Novell | Single, multilevel | NetWare
Cache Engine | Cisco | Single, multilevel, load-balancing | Custom hardware
CacheFlow Series | CacheFlow | Single, multilevel, load-balancing | Custom hardware
CacheRaq 2 | Cobalt Networks | Single | Custom hardware appliance
DeleGate* | MITI ETL | Single, multilevel, load-balancing | AIX, EWS4800, HP-UX, HI-UX, IRIX, NextStep, NEWS-OS, Digital UNIX, Solaris, SunOS, BSD/OS, FreeBSD, Linux, NetBSD, OpenBSD, Windows 95/NT, OS/2
HTTPd Proxy Cache* | CERN, World Wide Web Consortium | Single | AIX, BSD/OS, Digital UNIX, FreeBSD, HP-UX, IRIX, Linux, NetBSD, NextStep, SunOS, Solaris, Unixware
Internet Caching System | Novell | Single, multilevel, load-balancing | Custom hardware
Jigsaw caching proxy module* | World Wide Web Consortium | Single | Java
NetCache | Network Appliance | Single, multilevel, load-balancing | Custom hardware appliance
Netra Proxy Server | Sun Microsystems | Single, multilevel | Solaris, custom hardware
Proxy Server | AOL/Netscape | Single, multilevel, load-balancing | AIX, HP-UX, IRIX, Solaris, Windows NT
Proxy Server | Microsoft | Single, multilevel, load-balancing | Windows NT
Squid* | NLANR | Single, multilevel, load-balancing | AIX, Digital UNIX, FreeBSD, HP-UX, IRIX, Linux, NetBSD, NextStep, SunOS, Solaris, SCO Unix, OS/2
Traffic Server | Inktomi | Single, multilevel, load-balancing | Digital Unix, FreeBSD, IRIX, Solaris, Windows NT
WebSphere Performance Pack Cache Manager or Web Traffic Express | IBM | Single, multilevel, load-balancing | AIX, Linux, OS/400, Solaris, Windows NT
*Freeware or open source software

Resources

Find more network statistics at the National Laboratory for Applied Network Research: http://www.nlanr.net

For info on the legal ramifications of using caches to access sites, read this synopsis of the Digital Millennium Copyright Act: http://www.arl.org/info/frn/copy/band.html

To learn how to set up your own cache server, read "Setting up a cache server" here on developerWorks.

Download Squid: http://squid.nlanr.net

Explore the Squid FAQ: http://squid.nlanr.net/Squid/FAQ/

Download Jigsaw caching proxy module: http://www.w3.org/Jigsaw/

Download DeleGate: http://wall.etl.go.jp/delegate/

Download HTTPd Proxy Cache from the World Wide Web Consortium CERN HTTPd server: http://www.w3.org/Daemon/

Find out more about how to install Linux programs for turning Squid implementations into transparent proxy caches:

  • Transparent Proxy Caching with Squid: http://squid.nlanr.net/Squid/FAQ/FAQ-17.html
  • The IP Filter package: http://cheops.anu.edu.au/~avalon/ip-filter.html
  • Two HowTo documents on packet filtering on Linux--HowTo for ipchains: http://www.rustcorp.com/linux/ipchains/HOWTO.html and HowTo for Firewalls: http://penguin.spd.louisville.edu/LDP/HOWTO/Firewall-HOWTO.html

If you're interested in participating in global caching systems, take a look at the following two links:

  • The NLANR IRCACHE Project: http://www.ircache.net
  • The UK Joint Academic Network Cache Project: http://wwwcache.ja.net

For details about the cache server products described in this article, visit the links in the commercial and freeware cache software table.

Rawn Shah is an independent technologist and freelance journalist based in Tucson, Arizona, covering topics of networking and cross-platform integration since 1993. He can be reached at rawn@rtd.com.


