HTTP Caching Headers (Cache-Control) Explained Through RFC 7234

1 Default Caching Policy

According to Section 6.1 of [RFC 7231] Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content:

Responses with status codes that are defined as cacheable by default (e.g., 200, 203, 204, 206, 300, 301, 404, 405, 410, 414, and 501 in this specification) can be reused by a cache with heuristic expiration unless otherwise indicated by the method definition or explicit cache controls [RFC7234]; all other status codes are not cacheable by default.

According to Section 3.2 of [RFC 7234] Hypertext Transfer Protocol (HTTP/1.1): Caching:

A shared cache MUST NOT use a cached response to a request with an Authorization header field (Section 4.2 of [RFC7235]) to satisfy any subsequent request unless a cache directive that allows such responses to be stored is present in the response.
In this specification, the following Cache-Control response directives (Section 5.2.2) have such an effect: must-revalidate, public, and s-maxage.

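When a response carries no caching header fields, the following default policy applies:

  • Cacheable by default: status codes 200, 203, 204, 206, 300, 301, 404, 405, 410, 414, and 501. The cache works out the expiration time on its own (heuristic expiration). If the request that produced such a response carried an Authorization header field, the response may by default be stored only by private caches. Conversely, if the request carried no Authorization header field, both shared and private caches may store the response by default.
  • Not cacheable by default: all other status codes.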
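The default rule above can be sketched as a simple lookup. This is a minimal illustration, not a full cacheability check: the function name is hypothetical, and the method definition and explicit cache controls, which the RFC says can override the default, are ignored.

```python
# Status codes that RFC 7231 Section 6.1 defines as cacheable by default.
DEFAULT_CACHEABLE_STATUS = frozenset(
    {200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501}
)


def is_cacheable_by_default(status: int) -> bool:
    """Return True if a response with this status may be reused with
    heuristic expiration when no explicit cache controls are present."""
    return status in DEFAULT_CACHEABLE_STATUS
```

For example, `is_cacheable_by_default(404)` is true, while `is_cacheable_by_default(302)` is false because 302 is not in the list.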

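When a response does carry caching header fields, the directives in those fields take precedence. The caching-related header fields are:

HTTP/1.1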

  • Cache-Control
  • Age
  • Warning

HTTP/1.0

  • Expires
  • Pragma

Private cache: browser cache
Shared cache: proxy cache, gateway cache (CDNs)

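2 Cache-Control Directives in Responses

The origin server passes the following directives in the Cache-Control header field to explicitly declare the caching policy for this response.
[Figure: Cache-Control directives available in HTTP responses]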
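To work with these directives programmatically, the field value first has to be split into directive names and optional arguments. The following is a minimal sketch of such a parser; the function name is hypothetical, and it assumes well-formed input (backslash escapes inside quoted strings are not handled).

```python
def parse_cache_control(header_value: str) -> dict:
    """Parse a Cache-Control field value into {directive: argument or None}."""
    parts, current, in_quotes = [], [], False
    # Split on commas, but not on commas inside a quoted-string argument.
    for ch in header_value:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))

    directives = {}
    for part in parts:
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive, so normalize to lower case.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives
```

A value such as `max-age=3600, must-revalidate` becomes `{"max-age": "3600", "must-revalidate": None}`; arguments are kept as strings so the caller decides how to interpret them.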

2.1 Whether Caching Is Allowed

no-store: explicitly declares that no cache may store this response or the request that produced it.

RFC 7234 Section 5.2.2.3
The “no-store” response directive indicates that a cache MUST NOT store any part of either the immediate request or response. This directive applies to both private and shared caches. “MUST NOT store” in this context means that the cache MUST NOT intentionally store the information in non-volatile storage, and MUST make a best-effort attempt to remove the information from volatile storage as promptly as possible after forwarding it.
This directive is NOT a reliable or sufficient mechanism for ensuring privacy. In particular, malicious or compromised caches might not recognize or obey this directive, and communications networks might be vulnerable to eavesdropping.

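2.2 Who May Cache

public: explicitly declares that any cache may store this response, even if the response would normally be non-cacheable or storable only by a private cache.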

RFC 7234 Section 5.2.2.5
The “public” response directive indicates that any cache MAY store the response, even if the response would normally be non-cacheable or cacheable only within a private cache. (See Section 3.2 for additional details related to the use of public in response to a request containing Authorization, and Section 3 for details of how public affects responses that would normally not be stored, due to their status codes not being defined as cacheable by default; see Section 4.2.2.)

private

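  • Without field-names: explicitly declares that only private caches may store this response; shared caches must not. This is typically because the response contains information specific to one user.
  • With field-names: explicitly declares that only private caches may store the listed header fields; a shared cache may store the remainder of the response. Note: this form is not widely implemented.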

RFC 7234 Section 5.2.2.6
The “private” response directive indicates that the response message is intended for a single user and MUST NOT be stored by a shared cache. A private cache MAY store the response and reuse it for later requests, even if the response would normally be non-cacheable.
If the private response directive specifies one or more field-names, this requirement is limited to the field-values associated with the listed response header fields. That is, a shared cache MUST NOT store the specified field-names(s), whereas it MAY store the remainder of the response message.
Note: This usage of the word “private” only controls where the response can be stored; it cannot ensure the privacy of the message content. Also, private response directives with fieldnames are often handled by caches as if an unqualified private directive was received; i.e., the special handling for the qualified form is not widely implemented.
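The storage rules for no-store and private can be sketched as a small decision function. This is an illustration under simplifying assumptions: the function name is hypothetical, the directives are assumed to be available as a dict keyed by lower-case name, other storage conditions (status code, method, Authorization) are ignored, and a qualified private directive is treated like the unqualified form, which the RFC notes many caches do.

```python
def may_store(response_directives: dict, shared_cache: bool) -> bool:
    """Decide whether a cache may store a response, looking only at the
    no-store and private response directives."""
    if "no-store" in response_directives:
        # Applies to both private and shared caches.
        return False
    if shared_cache and "private" in response_directives:
        # Intended for a single user: shared caches must not store it.
        return False
    return True
```

With these rules, a browser cache (`shared_cache=False`) may store a response marked `private`, while a proxy or CDN (`shared_cache=True`) may not.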

2.3 Whether Validation with the Origin Server Is Required Before Reuse

no-cache

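  • Without field-names: the whole response must be validated with the origin server before it is reused.
  • With field-names: the listed response header fields must be validated with the origin server before they are reused; the remainder of the response can be reused directly, subject to the other directives. Note: this form is not widely implemented.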

RFC 7234 Section 5.2.2.2
The “no-cache” response directive indicates that the response MUST NOT be used to satisfy a subsequent request without successful validation on the origin server. This allows an origin server to prevent a cache from using it to satisfy a request without contacting it, even by caches that have been configured to send stale responses.
If the no-cache response directive specifies one or more field-names, then a cache MAY use the response to satisfy a subsequent request, subject to any other restrictions on caching. However, any header fields in the response that have the field-name(s) listed MUST NOT be sent in the response to a subsequent request without successful revalidation with the origin server. This allows an origin server to prevent the re-use of certain header fields in a response, while still allowing caching of the rest of the response.
Note: Although it has been back-ported to many implementations, some HTTP/1.0 caches will not recognize or obey this directive. Also, no-cache response directives with field-names are often handled by caches as if an unqualified no-cache directive was received; i.e., the special handling for the qualified form is not widely implemented.

Note: the HTTP/1.0 Pragma field also has a no-cache directive. In requests it has the same meaning as Cache-Control: no-cache.

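2.4 Whether Revalidation Is Mandatory After the Response Becomes Stale

must-revalidate: the origin server uses this directive to force a cache to revalidate with it once the response has become stale. If the cache cannot reach the origin server, it must answer the client with 504 (Gateway Timeout) rather than with a stale response that has not been revalidated. must-revalidate is a strict directive and should be used only for important responses where serving a stale copy would have serious consequences (for example, financial transactions). Because a cache may be configured to ignore expiration times, and a client may be willing to accept stale responses (through the max-stale directive), the origin server needs a mechanism such as must-revalidate to make its requirement binding.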

RFC 7234 Section 5.2.2.1
The “must-revalidate” response directive indicates that once it has become stale, a cache MUST NOT use the response to satisfy subsequent requests without successful validation on the origin server.
The must-revalidate directive is necessary to support reliable operation for certain protocol features. In all circumstances a cache MUST obey the must-revalidate directive; in particular, if a cache cannot reach the origin server for any reason, it MUST generate a 504 (Gateway Timeout) response.
The must-revalidate directive ought to be used by servers if and only if failure to validate a request on the representation could result in incorrect operation, such as a silently unexecuted financial transaction.

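proxy-revalidate: the must-revalidate directive restricted to shared caches. Once the response is stale, a shared cache must revalidate it first, whereas a private cache ignores the directive and need not revalidate.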

RFC 7234 Section 5.2.2.7
The “proxy-revalidate” response directive has the same meaning as the must-revalidate response directive, except that it does not apply to private caches.
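The difference between the two directives comes down to which kind of cache they bind. A minimal sketch, assuming the response directives are available as a dict keyed by lower-case name (the function name is hypothetical):

```python
def must_revalidate_when_stale(response_directives: dict, shared_cache: bool) -> bool:
    """Return True if a stale stored response must not be served without
    successful validation on the origin server."""
    if "must-revalidate" in response_directives:
        # Binds every cache, private or shared.
        return True
    # proxy-revalidate has the same meaning but does not apply to private caches.
    return shared_cache and "proxy-revalidate" in response_directives
```

When this returns true and the origin server cannot be reached, the cache answers with 504 (Gateway Timeout) instead of the stale response.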

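2.5 Explicit Expiration Times Declared by the Origin Server

max-age: the freshness lifetime in seconds. Once the age of the response exceeds this value, the response is stale.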

RFC 7234 Section 5.2.2.8
The “max-age” response directive indicates that the response is to be considered stale after its age is greater than the specified number of seconds.

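s-maxage: an expiration time intended only for shared caches; it implies the proxy-revalidate directive. When a response contains s-maxage together with max-age or Expires, a shared cache must use s-maxage. Private caches always ignore this directive.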

RFC 7234 Section 5.2.2.9
The “s-maxage” response directive indicates that, in shared caches, the maximum age specified by this directive overrides the maximum age specified by either the max-age directive or the Expires header field. The s-maxage directive also implies the semantics of the proxy-revalidate response directive.
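The precedence among s-maxage, max-age, and Expires can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, directives are a dict of string arguments keyed by lower-case name, headers use the exact keys "Expires" and "Date", and malformed dates (which a real cache has to treat as already expired) are not handled.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(directives: dict, headers: dict, shared_cache: bool) -> Optional[int]:
    """Return the explicit freshness lifetime in seconds, or None when the
    response carries no explicit expiration (heuristic expiration, which a
    cache would fall back to, is not modeled here)."""
    # s-maxage overrides max-age and Expires, but only in shared caches.
    if shared_cache and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    # max-age overrides Expires for every cache.
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    # Expires is an absolute time; subtract Date to get a lifetime.
    if "Expires" in headers and "Date" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return int((expires - date).total_seconds())
    return None
```

For a response with `max-age=60, s-maxage=600`, a shared cache gets 600 seconds and a private cache gets 60 seconds.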

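2.6 Whether Intermediaries May Transform the Response

no-transform: some intermediaries may transform the response headers and body. For example, a proxy might convert between image formats to save cache space or to reduce traffic on a slow link. This directive forbids such transformations.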

RFC 7234 Section 5.2.2.4
The “no-transform” response directive indicates that an intermediary (regardless of whether it implements a cache) MUST NOT transform the payload, as defined in Section 5.7.2 of [RFC7230].
RFC 7230 Section 5.7.2
Some intermediaries include features for transforming messages and their payloads. A proxy might, for example, convert between image formats in order to save cache space or to reduce the amount of traffic on a slow link. However, operational problems might occur when these transformations are applied to payloads intended for critical applications, such as medical imaging or scientific data analysis, particularly when integrity checks or digital signatures are used to ensure that the payload received is identical to the original.

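3 Other Caching-Related Fields in Response Headers

Age: a separate HTTP/1.1 header field. A cache uses Age to state the age of the response in seconds, that is, the time elapsed since the response was generated or validated by the origin server. An Age field in a response indicates that the response came from a cache rather than directly from the origin server. Note: the converse does not hold. A missing Age field does not prove that the response came directly from the origin server, because it may have come from an HTTP/1.0 cache that does not implement Age.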

RFC 7234 Section 5.1
The “Age” header field conveys the sender’s estimate of the amount of time since the response was generated or successfully validated at the origin server. Age values are calculated as specified in Section 4.2.3.
The presence of an Age header field implies that the response was not generated or validated by the origin server for this request. However, lack of an Age header field does not imply the origin was contacted, since the response might have been received from an HTTP/1.0 cache that does not implement Age.
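The age calculation that the quote refers to (RFC 7234 Section 4.2.3) can be written out as a short function. This is a sketch: the function and parameter names follow the terms used in that section, all times are assumed to be seconds on the cache's own clock, and parsing of the Age and Date header fields is left out.

```python
def current_age(age_value: int, date_value: int, request_time: int,
                response_time: int, now: int) -> int:
    """Compute the current age of a stored response in seconds."""
    # Age implied by the Date header, never negative.
    apparent_age = max(0, response_time - date_value)
    # Age reported by upstream caches, corrected for network delay.
    response_delay = response_time - request_time
    corrected_age_value = age_value + response_delay
    # Take the more conservative of the two estimates.
    corrected_initial_age = max(apparent_age, corrected_age_value)
    # Add the time the response has been sitting in this cache.
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

The result is the value a cache sends in the Age header field and compares against the freshness lifetime.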

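Expires: a separate HTTP/1.0 header field that gives the expiration time as an absolute value (an HTTP-date timestamp) on the origin server's clock. When it appears together with the max-age directive of the HTTP/1.1 Cache-Control header field, max-age takes precedence. Expires and max-age are usually sent together for compatibility with components that support only HTTP/1.0.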

RFC 7234 Section 5.3
The “Expires” header field gives the date/time after which the response is considered stale. See Section 4.2 for further discussion of the freshness model.
If a response includes a Cache-Control field with the max-age directive (Section 5.2.2.8), a recipient MUST ignore the Expires field. Likewise, if a response includes the s-maxage directive (Section 5.2.2.9), a shared cache recipient MUST ignore the Expires field. In both these cases, the value in Expires is only intended for recipients that have not yet implemented the Cache-Control field.

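Pragma: a separate HTTP/1.0 header field whose no-cache directive has the same meaning as the no-cache directive of the HTTP/1.1 Cache-Control header field. The two are usually sent together for compatibility with components that support only HTTP/1.0. When Pragma and Cache-Control indicate different caching policies, Cache-Control takes precedence.

Note: RFC 7234 shows that Pragma was designed as a request header, not a response header. A few caches recognize Pragma in responses, but most ignore it. Therefore, to set no-cache on a response, always use the Cache-Control response header; setting Pragma alone is not enough.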

RFC 7234 Section 5.4
The “Pragma” header field allows backwards compatibility with HTTP/1.0 caches, so that clients can specify a “no-cache” request that they will understand (as Cache-Control was not defined until HTTP/1.1). When the Cache-Control header field is also present and understood in a request, Pragma is ignored.
When the Cache-Control header field is not present in a request, caches MUST consider the no-cache request pragma-directive as having the same effect as if “Cache-Control: no-cache” were present (see Section 5.2.1).

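Warning: a separate HTTP/1.1 header field that carries warning information about caching or transformations. The warn-codes and their descriptions are shown in the figure below.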

RFC 7234 Section 5.5
The “Warning” header field is used to carry additional information about the status or transformation of a message that might not be reflected in the status code. This information is typically used to warn about possible incorrectness introduced by caching operations or transformations applied to the payload of the message.
[Figure: warn-codes and their descriptions]

4 Shared Caches and Private Caches

4.1 Default Policy for Shared Caches

According to Section 3.2 of [RFC 7234] Hypertext Transfer Protocol (HTTP/1.1): Caching:

A shared cache MUST NOT use a cached response to a request with an Authorization header field (Section 4.2 of [RFC7235]) to satisfy any subsequent request unless a cache directive that allows such responses to be stored is present in the response.
In this specification, the following Cache-Control response directives (Section 5.2.2) have such an effect: must-revalidate, public, and s-maxage.

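  • If the request carried an Authorization header field, the response may be stored only by private caches.
  • If the request carried no Authorization header field, both shared and private caches may store the response.

When the response contains any of the following Cache-Control directives, a shared cache may store it even though the request carried an Authorization header field: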

  • must-revalidate
  • public
  • s-maxage
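The Section 3.2 rule for shared caches can be sketched as a single check. The function name is hypothetical; request headers are assumed to be a dict of field names, and response directives a dict keyed by lower-case directive name.

```python
# Response directives that allow a shared cache to reuse a response to a
# request that carried Authorization (RFC 7234 Section 3.2).
ALLOWING_DIRECTIVES = ("must-revalidate", "public", "s-maxage")


def shared_cache_may_reuse(request_headers: dict, response_directives: dict) -> bool:
    """Apply the Authorization rule for shared caches."""
    # Header field names are case-insensitive.
    has_authorization = any(name.lower() == "authorization" for name in request_headers)
    if not has_authorization:
        return True
    return any(d in response_directives for d in ALLOWING_DIRECTIVES)
```

Note that the check looks at the request for Authorization and at the response for the allowing directives.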

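4.2 Setting Private Caching According to the Authentication and Session Management Mechanisms

Cache-Control: private does not need to be set manually

  • The standard HTTP authentication schemes (Basic, Digest, Mutual) pass credentials in the Authorization header field, so responses are private-cache-only by default and no manual setting is needed.

Cache-Control: private must be set manually when the project uses any of the following mechanisms

  • Form-based authentication, which passes credentials in the request body
  • Cookie-based session management, which passes login credentials in cookies
  • Session-based session management, which passes login credentials in cookies, URL query parameters, or the request/response body
  • Token-based session management, which passes login credentials in other header fields, URL query parameters, or the request/response body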

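5 Validation

A cache sends a conditional request ([RFC 7232] HTTP/1.1 Conditional Requests) to the next inbound server, asking it to decide whether the stored response can be reused; the cache then updates its stored response according to the answer. This process is called "validation".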

When a cache has one or more stored responses for a requested URI, but cannot serve any of them (e.g., because they are not fresh, or one cannot be selected; see Section 4.1), it can use the conditional request mechanism [RFC7232] in the forwarded request to give the next inbound server an opportunity to select a valid stored response to use, updating the stored metadata in the process, or to replace the stored response(s) with a new response. This process is known as “validating” or “revalidating” the stored response.

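5.1 Two Kinds of Validators

(1) Defined by HTTP/1.0: the origin server provides the Last-Modified response header, and the cache sends that value in the If-Modified-Since request header.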

One such validator is the timestamp given in a Last-Modified header field (Section 2.2 of [RFC7232]), which can be used in an If-Modified-Since header field for response validation, or in an If-Unmodified-Since or If-Range header field for representation selection (i.e., the client is referring specifically to a previously obtained representation with that timestamp).

(2) Defined by HTTP/1.1: the origin server provides the ETag response header, and the cache sends that value in the If-None-Match request header.

Another validator is the entity-tag given in an ETag header field (Section 2.3 of [RFC7232]). One or more entity-tags, indicating one or more stored responses, can be used in an If-None-Match header field for response validation, or in an If-Match or If-Range header field for representation selection (i.e., the client is referring specifically to one or more previously obtained representations with the listed entity-tags).
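Turning stored validators into a conditional request is mostly a matter of copying header values. A minimal sketch, assuming the stored response headers are a dict with the exact keys "ETag" and "Last-Modified" (the function name is hypothetical):

```python
def build_conditional_headers(stored_response_headers: dict) -> dict:
    """Build the conditional request headers for validating a stored response."""
    conditional = {}
    etag = stored_response_headers.get("ETag")
    if etag:
        # The entity-tag is sent back unchanged in If-None-Match.
        conditional["If-None-Match"] = etag
    last_modified = stored_response_headers.get("Last-Modified")
    if last_modified:
        # The timestamp is sent back unchanged in If-Modified-Since.
        conditional["If-Modified-Since"] = last_modified
    return conditional
```

If the stored response has neither validator, the result is empty and the cache can only send an unconditional request.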

5.2 Validation Results

304 (Not Modified): the stored response can be reused.

A 304 (Not Modified) response status code indicates that the stored response can be updated and reused; see Section 4.3.4.

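A full response (that is, one with a payload body): the stored response cannot be reused; the cache answers with this response and may replace the stored one with it.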

A full response (i.e., one with a payload body) indicates that none of the stored responses nominated in the conditional request is suitable. Instead, the cache MUST use the full response to satisfy the request and MAY replace the stored response(s).

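5xx (Server Error): a server error. On receiving this response, the cache has two options: forward the 5xx (Server Error) response to the client, or reuse the previously stored response.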

However, if a cache receives a 5xx (Server Error) response while attempting to validate a response, it can either forward this response to the requesting client, or act as if the server failed to respond. In the latter case, the cache MAY send a previously stored response (see Section 4.2.4).
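The three outcomes can be sketched as one dispatch function. This is an illustration, not a complete cache: the function name, the action labels, and the `serve_stored_on_error` switch are hypothetical, updating stored header fields after a 304 is not modeled, and serving a stored response on 5xx is still subject to the stale-response rules (for example must-revalidate).

```python
def handle_validation(status: int, stored_body: bytes, received_body: bytes,
                      serve_stored_on_error: bool = False):
    """Return (action, body_to_send) for the result of a validation request."""
    if status == 304:
        # Not Modified: the stored response is reused.
        return "reuse", stored_body
    if 500 <= status <= 599:
        if serve_stored_on_error:
            # Act as if the server failed to respond.
            return "serve-stored", stored_body
        return "forward", received_body
    # A full response: use it and replace the stored response.
    return "replace", received_body
```

For example, `handle_validation(304, b"old", b"")` yields `("reuse", b"old")`.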

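6 Cache-Control Directives in Requests

The client passes the following directives in the Cache-Control header field to explicitly state the conditions under which a cache may answer this request with a stored response.
[Figure: Cache-Control directives available in HTTP requests]

6.1 Whether Storing Is Allowed

no-store: the client asks every cache not to store this request or its response.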

RFC 7234 Section 5.2.1.5

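6.2 Whether Validation with the Origin Server Is Required Before Reuse

no-cache: the client requires that any cached response used to answer this request has been successfully validated with the origin server. Note: the no-cache request directive takes no field-name argument.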

RFC 7234 Section 5.2.1.4
The “no-cache” request directive indicates that a cache MUST NOT use a stored response to satisfy the request without successful validation on the origin server.

6.3 Whether Intermediaries May Transform the Response

no-transform: the client does not allow intermediaries to apply transformations.

RFC 7234 Section 5.2.1.6

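6.4 Freshness the Client Is Willing to Accept

max-age: the client is unwilling to accept a response whose age exceeds this value (seconds). Unless the max-stale directive is also present, the client does not accept stale responses.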

RFC 7234 Section 5.2.1.1
max-age: The “max-age” request directive indicates that the client is unwilling to accept a response whose age is greater than the specified number of seconds. Unless the max-stale request directive is also present, the client is not willing to accept a stale response.

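max-stale: the client is willing to accept a stale response, provided it has exceeded its freshness lifetime by no more than this value (seconds). If no value is given, the client accepts a stale response of any age.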

RFC 7234 Section 5.2.1.2
The “max-stale” request directive indicates that the client is willing to accept a response that has exceeded its freshness lifetime. If max-stale is assigned a value, then the client is willing to accept a response that has exceeded its freshness lifetime by no more than the specified number of seconds. If no value is assigned to max-stale, then the client is willing to accept a stale response of any age.

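min-fresh: the client wants a response that will remain fresh for at least this long, that is, one whose freshness lifetime is no less than its current age plus this value (seconds).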

RFC 7234 Section 5.2.1.3
min-fresh: The “min-fresh” request directive indicates that the client is willing to accept a response whose freshness lifetime is no less than its current age plus the specified time in seconds. That is, the client wants a response that will still be fresh for at least the specified number of seconds.
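The three request directives above can be combined into one acceptance check. This is a sketch of one reading of the rules: the function name is hypothetical, the request directives are a dict of string arguments keyed by lower-case name (None for a directive without a value), a response counts as fresh while its lifetime is greater than its age, and response directives such as must-revalidate, which take precedence over max-stale, are not considered.

```python
def acceptable_to_client(age: int, lifetime: int, request_directives: dict) -> bool:
    """Check a stored response against max-age, min-fresh and max-stale."""
    max_age = request_directives.get("max-age")
    if max_age is not None and age > int(max_age):
        return False  # older than the client is willing to accept
    min_fresh = request_directives.get("min-fresh")
    if min_fresh is not None and lifetime < age + int(min_fresh):
        return False  # would not stay fresh for long enough
    if lifetime > age:
        return True  # still fresh
    # The response is stale: only acceptable if max-stale allows it.
    if "max-stale" not in request_directives:
        return False
    max_stale = request_directives["max-stale"]
    if max_stale is None:
        return True  # no value: a stale response of any age is accepted
    return age - lifetime <= int(max_stale)
```

When the check fails, the cache has to validate the stored response or forward the request instead of answering from storage.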

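6.5 The Client Accepts Only Cached Responses

only-if-cached: the client asks the cache to answer only from stored responses. If the cache holds no matching stored response, it should return a 504 (Gateway Timeout) status code.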

RFC 7234 Section 5.2.1.7
The “only-if-cached” request directive indicates that the client only wishes to obtain a stored response. If it receives this directive, a cache SHOULD either respond using a stored response that is consistent with the other constraints of the request, or respond with a 504 (Gateway Timeout) status code. If a group of caches is being operated as a unified system with good internal connectivity, a member cache MAY forward such a request within that group of caches.

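7 Summary

For status codes that are cacheable by default:

First request
When the request carries the no-store directive, the cache stores neither the request nor its response.
When the request does not carry the no-store directive:

  • If the response from the origin server carries no-store: the cache stores neither the request nor its response.
  • If the response from the origin server does not carry no-store either: the cache stores the request and its response.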

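Subsequent requests
The cache finds a stored response and checks whether validation is required:

  • If the request carries no-cache: the response must be validated successfully before it is reused for this request.
  • If the response carries no-cache: the response must be validated successfully before it is reused for any request.
  • If neither the request nor the response carries no-cache: the response can be reused for this request without validation.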

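When the response can be reused, the cache checks whether it has expired (the expiration time comes from max-age or Expires in the response, or from the cache's own heuristic).
When the response can be reused and has not expired, the cache checks whether it meets the freshness the request asks for (the max-age and min-fresh request directives).

  • If it is fresh enough: it is reused for this request directly.
  • If it is not fresh enough: it must be revalidated successfully, which makes it fresh again, before it is reused for this request.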

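When the response can be reused but has expired, the cache checks whether revalidation is required:

  • If the response carries must-revalidate: the stale response must be revalidated successfully before it is reused for any request.
  • If the request does not carry max-stale: the stale response must be revalidated successfully before it is reused for this request.
  • If the request carries max-stale: a response that is stale by no more than the max-stale value can be reused for this request directly.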

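Strong caching: set a freshness lifetime; within it the response is reused directly without validation with the origin server, which maximizes cache efficiency.
Negotiated caching: set no freshness lifetime; every reuse requires validation with the origin server, which improves cache efficiency while preserving accuracy.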

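Setting caching rules by resource type
For long-lived, unchanging resources served by the front end (static files, images, CSS, and so on), use strong caching:

  • Set a long expiration time: configure Apache to add the Expires and Cache-Control: max-age response headers.
  • Coordinate with the front-end deployment strategy so that these URIs are updated together when a new version goes live.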
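The text above sets these headers through Apache configuration. Where the headers are produced by application code instead, the pair could be generated along the following lines. This is only a sketch: the function name and the one-year default are assumptions, not values from the text.

```python
import time
from email.utils import formatdate


def long_lived_headers(max_age_seconds: int = 31536000, now: float = None) -> dict:
    """Build Expires and Cache-Control: max-age headers for a static resource."""
    if now is None:
        now = time.time()
    return {
        # Used by HTTP/1.1 caches; takes precedence over Expires.
        "Cache-Control": f"max-age={max_age_seconds}",
        # Absolute HTTP-date for recipients that only understand Expires.
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }
```

Sending both headers keeps the two values consistent, because Expires is derived from the same lifetime as max-age.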

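For static resources served by the back end that change at irregular intervals (files and so on), use negotiated caching:

  • Set two cache validators: through Apache configuration or back-end code, add the response headers ETag, Last-Modified, Cache-Control: no-cache, and Pragma: no-cache.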

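For cacheable dynamic resources served by the back end (database query results and so on), use negotiated caching:

  • Set two cache validators: in back-end code, add the response headers ETag, Last-Modified, Cache-Control: no-cache, and Pragma: no-cache.
  • For conditional GET and HEAD requests issued by caches, the back-end code uses the validators to decide whether to return 304 (Not Modified).
  • The back-end code adds Cache-Control: private to responses for requests that carry authentication information.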
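The back-end side of negotiated caching can be sketched as a framework-independent handler. Everything here is illustrative: the function names and the hash-based entity-tag scheme are assumptions, only the ETag validator is shown (Last-Modified is left out for brevity), and If-None-Match is split on commas, which is safe for the hexadecimal tags generated here but not for arbitrary entity-tags.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive an entity-tag from the body (a hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    # If-None-Match uses weak comparison, so a W/ prefix is ignored.
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def respond(request_headers: dict, body: bytes, authenticated: bool = False):
    """Return (status, headers, body) for a GET that may be conditional."""
    etag = make_etag(body)
    # no-cache forces validation before every reuse; private keeps
    # responses to authenticated requests out of shared caches.
    cache_control = "private, no-cache" if authenticated else "no-cache"
    headers = {"ETag": etag, "Cache-Control": cache_control}
    candidates = [_strip_weak(t) for t in request_headers.get("If-None-Match", "").split(",")]
    if etag in candidates or "*" in candidates:
        return 304, headers, b""  # the cached copy is still current
    return 200, headers, body
```

A first request returns 200 with an ETag; repeating it with that tag in If-None-Match returns 304 with an empty body as long as the content is unchanged.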

References

  1. [RFC 7231] Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content
  2. [RFC 7234] Hypertext Transfer Protocol (HTTP/1.1): Caching
  3. [RFC 7232] Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests
  4. w3.org rfc2616 sec14.9
  5. The Warning response header field
  6. Translation: Web caching knowledge that web authors and developers need to know
  7. Caching Tutorial for Web Authors and Webmasters