Prefetching? At this age?

cunya6061

Original article: https://timkadlec.com/remembers/2020-06-17-prefetching-at-this-age/

A while back, I wrote about using Netlify and SpeedCurve to A/B test performance changes. The one I specifically mentioned was testing out Instant.Page on my site.

While the bulk of the post is about the A/B testing setup (which I am very happy with), I did note at the end that I was seeing some small improvements from Instant.Page, though the results were far from conclusive yet.

Alexandre, the creator of Instant.Page, suggested on Twitter that the gains I was seeing were small because Netlify passes an Age header that messes with prefetching.

Tim, Netlify sends a Age header that conflicts with prefetching. A prefetched page will get fetched again on navigation if its Age header is over 300. The small gain you are seeing are due to the navigation request being a 304 and not a 200.

This led me down an interesting little rabbit hole and, eventually, to a bug. I learned a few new things as I dug in, so I figured it was worth sharing for others as well (and for me to come back to when I inevitably forget the details).

First, before we dive in, let’s zero in on the critical components of what’s happening on my site specifically.

For all HTML responses, I pass a Cache-control: max-age=900, must-revalidate header. This tells the browser to cache the response for 15 minutes (900 / 60). After that, it has to revalidate: basically, talk to the server again to make sure the asset is still valid and a newer version isn’t available. As soon as the resource is revalidated, the 15 minutes starts over.
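
As a rough illustration of how that header breaks down, here is a minimal Python sketch. The `parse_cache_control` helper is hypothetical (written for this example, not taken from any library) and it only handles simple comma-separated directives like the ones above.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: value-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without "=" (such as must-revalidate) map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


directives = parse_cache_control("max-age=900, must-revalidate")
max_age_seconds = int(directives["max-age"])

print(max_age_seconds / 60)             # 15.0 (minutes of freshness)
print("must-revalidate" in directives)  # True
```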

Netlify also passes along an ‘Age’ header, indicating how long they’ve been caching the resource themselves (more on that in a bit). So, for example, if they’ve had the resource on their servers for 14 minutes, that would look like this:

age: 840

And finally, as a recap, Instant.Page works by using the prefetch resource hint to fetch links early, when someone hovers over the link instead of waiting for the next navigation to start.

Now let’s dive into each part of that and how they fit together.

The `Age` Header

The ‘Age’ header is used by upstream caching layers (Varnish, CDNs, other proxies, etc.) to indicate how long it’s been since a response was either generated or validated at the origin server. In other words, how long has that resource been sitting in that upstream cache?

It’s not just something that Netlify does—open just about any site and you’ll find resources with the ‘Age’ header set. That’s because if you’ve got something sitting between your origin and the browser caching your content, setting the ‘Age’ header is exactly what you’re supposed to be doing. It’s important information.

Let’s say you’re using a CDN to cache content on their edge servers instead of making visitors wait while assets are requested from wherever your origin server resides. The first time a resource is requested, the CDN is going to have to go out and make a connection to your origin server to get it. At that point, since the CDN just got the resource from the origin server, the age is ‘0’.

Then, depending on what you’ve set up at the CDN level and assuming the CDN can cache the resource, the CDN will start serving that resource as it’s requested without talking to the origin again. As it does this, the age of the resource gets older and older.

Eventually, the CDN needs to talk to the origin server again.

Let’s say your CDN is set to cache a resource for 15 minutes before it needs to validate that the resource is still fresh. After 15 minutes, the CDN talks to the origin and will either get a new version of that resource or verification that the resource is still valid. At that point, the age of the resource resets to ‘0’—we’ve got a fresh start since we know what we have on the CDN is the latest version.

The browser’s primary mechanisms for determining what to cache and for how long are headers like Expires (which provides an expiration date for the resource being served), Cache-control (a ton of stuff here, but specifically for duration is max-age), Last-Modified (the date at which the resource was last modified), and ETag (a unique version identifier for the object). (For more detail on all of those, Paul Calvano’s post on Heuristic Caching and Harry’s post about Cache-Control are both top-notch resources.)

Age, too, factors in.

Let’s say that your CDN is set to cache a resource for 15 minutes, and you’ve also told the browser to cache that resource for 15 minutes using the Cache-control header (Cache-control: max-age=900, must-revalidate). With two layers of caching, each at 15 minutes, that means we have a potential Time to Live (the time a resource is stored in a cache before it’s deleted or updated) of up to 30 minutes—if the browser requests the resource just before the CDN’s version expires, then it’s been sitting in cache for 15 minutes on the CDN and will sit in the browser cache for another 15 minutes—so 30 minutes total.
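
The arithmetic behind that 30-minute figure can be written out as a small Python sketch. The function name `worst_case_staleness` is made up for illustration, and the model assumes the browser never learns how long the CDN already held the response.

```python
def worst_case_staleness(cdn_ttl: int, browser_max_age: int) -> int:
    """Upper bound, in seconds, on how old a response can get when the
    browser has no Age header to tell it how long the CDN held it."""
    # The browser may fetch just before the CDN copy expires, then keep
    # its own copy for the full max-age on top of that.
    return cdn_ttl + browser_max_age


total = worst_case_staleness(cdn_ttl=900, browser_max_age=900)
print(total, total / 60)  # 1800 30.0
```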

For any sort of remotely dynamic content, this could be problematic. If the content changes in that upstream cache, we could still be serving an old version of the resource for 15 more minutes until the browser cache expires.

The Age header helps protect against this.

Let’s go back to our example, where the browser requests the resource just before the CDN’s version expires. Only this time, let’s say the CDN communicates how long it’s had the asset in cache by providing an Age header of 840 (14 minutes). The browser knows from the max-age directive that it’s ok to serve an asset that is 15 minutes old, and it knows that the asset has been sitting on the CDN for 14 minutes. So, the browser adjusts the TTL to 1 minute (15 minutes of browser TTL minus 14 minutes it’s already been on the CDN), protecting against this problem of cache layers stacking on top of each other.
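
That adjustment is a simple subtraction. Here is a minimal Python sketch of it, with `remaining_freshness` as a hypothetical helper name; real browsers also account for time spent in transit and in their own cache, which this ignores.

```python
def remaining_freshness(max_age: int, age: int) -> int:
    """Seconds the browser may still serve the response from its cache,
    given max-age and the Age header it arrived with."""
    return max(0, max_age - age)


print(remaining_freshness(max_age=900, age=840))  # 60 (one minute left)
print(remaining_freshness(max_age=900, age=0))    # 900 (fresh from the origin)
```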

This can all get a bit funky if the max-age directive you’re passing to the browser doesn’t align with how long you’re caching the resource upstream.

For example, if you’re telling your CDN to cache a file for a week, but you’re only telling the browser to cache that resource for 15 minutes, then as soon as the Age of that resource exceeds 900 (15*60) the browser will no longer consider that resource safe to cache. Every time it sees the response, it will note that the age is past the maximum TTL it’s been told to pay attention to, so it goes back out to the servers to try to find a new version.
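
To make the mismatch concrete, here is a small Python sketch that checks whether a response is still usable when it reaches the browser, for a few Age values over the CDN's week-long TTL. `fresh_on_arrival` is an illustrative name, and the comparison is a simplification of what a browser actually does.

```python
MAX_AGE = 15 * 60          # what the browser is told: 900 seconds
CDN_TTL = 7 * 24 * 3600    # what the CDN is told: one week


def fresh_on_arrival(age: int, max_age: int = MAX_AGE) -> bool:
    """True if the response still has freshness left when it arrives."""
    return age < max_age


# Sample Age values across the week the CDN holds the file.
for age in (0, 600, 899, 901, 3600, CDN_TTL - 1):
    print(age, fresh_on_arrival(age))
```

Under these assumptions, only responses served in the first 15 minutes of the CDN's week are cacheable by the browser; everything after that is stale the moment it arrives.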

There are times where having mismatched TTLs at a caching layer and at the browser may make sense. It's pretty quick to purge the cache (basically, empty it out) for most CDNs. So sometimes what you'll see is folks set a long TTL at the CDN layer and a short one at the browser level. Then, if the content does need to change, they can purge the CDN cache quickly and all they have to wait for is the browser to get past whatever short TTL they've set there. In those cases, it makes sense from a performance standpoint not to pass the Age header so that the browser can keep caching.

How `prefetch` works

When you use the prefetch resource hint (which is what Instant.Page does), you’re telling the browser to go grab that resource even though it hasn’t been requested by the current page, and put it into cache.

For example, the following tells the browser to grab the about page and store it.

<link rel="prefetch" href="/about" as="document" />

The browser will request the resource at a very low priority during idle time so that the resource doesn’t compete with anything needed for the current navigation.

As with any request that gets cached, how long it’s cached depends on the caching headers. But with prefetch, there’s an added wrinkle.

The entire point of prefetch is to have something stored for the next navigation: making a prefetch for a resource that expires before that next navigation is wasted work and wasted bytes.

For this reason, Chromium-based browsers have a period of five minutes where they’ll cache any prefetched resources regardless of any other caching indicators (unless no-store has explicitly been set in the Cache-control header). After that window has expired, the normal Cache-control directives kick in, minus that initial window.

In my case, I serve HTML documents with a max-age of 15 minutes. That means Chrome will save that prefetched resource for 15 minutes so this 5 minute window doesn’t really do anything special.

But if you served an asset with a max-age of 0, then Chrome is still going to hold that resource for 5 minutes before having to revalidate it. The main takeaway here is that to avoid wasted work, the browser ignores the usual indicators of freshness for a period of time.

Firefox, on the other hand, does not have this little extra window for prefetched resources—it treats them like any other cached object, paying attention to the caching headers as normal. So, if (for example) the max-age is 0 for a prefetched resource, Firefox will make the request as directed using prefetch and then make the request again once it discovers it on the next navigation.
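
The difference between the two browsers can be sketched as a toy Python model. This is only a reading of the behavior described above, not either browser's actual logic: the function name and the `"chromium"` / `"firefox"` labels are invented for the example, and the Age header is left out entirely (treated as 0).

```python
PREFETCH_WINDOW = 5 * 60  # Chromium's five-minute grace period, in seconds


def reusable_on_navigation(browser: str, seconds_since_prefetch: int,
                           max_age: int, no_store: bool = False) -> bool:
    """True if the prefetched copy can serve the next navigation
    without another network request (simplified model, Age ignored)."""
    if no_store:
        return False  # no-store responses are never kept
    if browser == "chromium":
        # Kept for five minutes or for max-age, whichever is longer.
        return seconds_since_prefetch < max(PREFETCH_WINDOW, max_age)
    if browser == "firefox":
        # Treated like any other cached response.
        return seconds_since_prefetch < max_age
    raise ValueError(f"unknown browser: {browser}")


# max-age=0, navigation one minute after the prefetch:
print(reusable_on_navigation("chromium", 60, max_age=0))  # True
print(reusable_on_navigation("firefox", 60, max_age=0))   # False
```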

Bringing it all together

Phew. Ok. So we know what the Age header does, we know how the browser uses it to determine caching, and we know that Chromium-based browsers ignore all the usual freshness indicators when it comes to prefetch, at least for a short period of time, and Firefox does not.

All of this means that in Firefox, if the Age exceeds the max-age directive, then the prefetched resource is going to result in two requests: once for the actual prefetch and, because the asset is older than the TTL, once again on the next navigation.

In Chromium-based browsers, it seems like Age shouldn’t impact prefetch behavior at all. If Chromium ignores other caching directives, why is Age any different? It seems like a bug.

Which is exactly the conclusion Yoav came to:

To clarify, sounds like a Chromium bug. Sending Age headers for cached resources is what caches are supposed to do. And indeed, the 5 minutes calculation includes the Age header, which IMO makes little sense https://source.chromium.org/chromium/chromium/src/+/master:net/http/http_cache_transaction.cc;l=2716;drc=2f11470d7ad8963a9add116df64d2edd1b85d3a4;bpv=1;bpt=1?originalUrl=https:%2F%2Fcs.chromium.org%2F

The bug is the source of what Alexandre was noting. Since Age is being included in the prefetch caching considerations, any prefetched resource in Chrome with an Age higher than either that 5 minute window or the max-age (whichever is longer) can’t be cached, so the request happens twice: once on prefetch and once on the next navigation.
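
Here is a toy Python model of the difference, assuming the bug works the way it is described here. Neither function is Chromium's code: `reusable_with_bug` counts the Age header against the prefetch window, while `reusable_without_bug` shows the behavior argued for in this post, where the five-minute window starts at the prefetch itself. Both names are made up for the example.

```python
PREFETCH_WINDOW = 5 * 60  # seconds


def reusable_with_bug(age_header: int, seconds_since_prefetch: int,
                      max_age: int) -> bool:
    """Age counts toward both the prefetch window and max-age."""
    current_age = age_header + seconds_since_prefetch
    return current_age < max(PREFETCH_WINDOW, max_age)


def reusable_without_bug(age_header: int, seconds_since_prefetch: int,
                         max_age: int) -> bool:
    """The prefetch window ignores Age; normal freshness applies afterwards."""
    if seconds_since_prefetch < PREFETCH_WINDOW:
        return True
    return age_header + seconds_since_prefetch < max_age


# Prefetched with "age: 840", max-age=900, navigation 90 seconds later:
print(reusable_with_bug(840, 90, 900))     # False, so a second request is made
print(reusable_without_bug(840, 90, 900))  # True, the prefetched copy is used
```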

In my specific case, while the bug’s behavior is definitely not ideal, it also doesn’t jump out in the metrics on the aggregate because of my service worker. When the request gets prefetched, the service worker caches it. On that next navigation, the request gets made again, but the service worker has it at the ready, which accounts for why I’m seeing some performance improvements even with the bug.

Now, if we ignore the prefetch-specific issues here, we do still have an issue with the way Netlify handles the Age header. Netlify is, interestingly, both the CDN and the origin here. Typically, whenever the CDN has to revalidate that a resource is still fresh with the origin, it will reset the Age header back to 0.

In this case, because Netlify essentially is our origin, there’s no other layer somewhere for Netlify to revalidate with. The buck stops here, or something like that.

By passing the Age header along, and only updating it when the content is changed or cache is explicitly cleared, Netlify creates a situation where the browser will always have to go back to the server (Netlify) to see if the resource is fresh, regardless of that max-age window. The only way around this is to set a very long max-age or make sure to clear your Netlify cache on a semi-regular basis.

I suspect Netlify shouldn’t be passing the Age header down at all. Or, if that header is being applied at their edge layer (I’m not 100% clear on their architecture), then whenever their edge layer has to revalidate with the original source, they should be updating that Age at that point to avoid the issue of an ever-increasing Age.

Where do we go from here?

So, how do we make sure that our prefetched resources are as performant as possible?

First things first: measure. I tried to emphasize this in my last post, but the data about the impact of this approach on my site says nothing about the impact on other sites. In my situation, I’m seeing a small improvement in most situations even with the bug in place. Your mileage may vary. Testing performance changes is good.

From the Chromium side of things, don’t worry about it. Yoav was all over it, and a fix has already landed.

Firefox, however, is another story. It seems they’ve been contemplating making this change for a while now, so it’s a matter of prioritizing the work. In the meantime, there are a few things to keep in mind.

One, if you have a service worker in place and you’re using an approach where the service worker serves from the cached version first, that helps to offset the double request penalty you might otherwise pay. The first request puts it in the service worker cache, the second gets pulled from there before it has to go any further.
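
The cache-first idea can be modeled in a few lines. A real service worker would be written in JavaScript against the browser's Cache and fetch APIs; the Python class below is only a language-neutral sketch of the lookup order, and `CacheFirstWorker` and its fake network function are invented for the example.

```python
from typing import Callable, Dict


class CacheFirstWorker:
    """Answers from its own cache first and only then goes to the network."""

    def __init__(self, fetch_from_network: Callable[[str], str]) -> None:
        self._fetch = fetch_from_network
        self._cache: Dict[str, str] = {}
        self.network_requests = 0

    def handle(self, url: str) -> str:
        if url in self._cache:
            return self._cache[url]   # served without touching the network
        self.network_requests += 1
        body = self._fetch(url)
        self._cache[url] = body       # remember it for the next request
        return body


worker = CacheFirstWorker(lambda url: f"<html>{url}</html>")
worker.handle("/about")  # the prefetch: goes to the network
worker.handle("/about")  # the navigation: answered from the worker's cache
print(worker.network_requests)  # 1
```

The browser still issues the second request, but in this model it never leaves the service worker, which is why the double request costs much less.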

If you don’t have a service worker in place, then you’re going to have to make a decision regarding the Age header.

If you don’t pass the Age header, then Firefox can cache the resource according to your cache headers regardless of whether the age of the resource on the CDN (or proxy) is longer than the max-age communicated to the browser, but it does introduce the risk of extending the total TTL as we saw above. If your max-age directive is set to a short duration and you can quickly purge the upstream cache, you reduce the pain here a little.

If you do pass the Age header along, you avoid longer total TTL issues, but you now risk issuing double requests for every prefetched resource as the age of the cached resource gets older. If the resource changes frequently in the upstream cache, or if you are passing a long max-age directive to the browser, the severity of this risk is reduced a little.

In the end, this comes down to a combination of what services and tools you’re using for those upstream caches, and how frequently your prefetched resources may change.

Translated from: https://timkadlec.com/remembers/2020-06-17-prefetching-at-this-age/
