Repost: Wikipedia's CPU cache knowledge

http://en.wikipedia.org/wiki/CPU_cache#Associativity

==Associativity==

[[Image:Cache,associative-fill-both.png|thumb|450px|Which memory locations can be cached by which cache locations]]


The placement policy decides where in the cache a copy of a particular entry of main memory will go. If the placement policy is free to choose any entry in the cache to hold the copy, the cache is called '''fully associative'''. At the other extreme, if each entry in main memory can go in just one place in the cache, the cache is '''direct mapped'''. Many caches implement a compromise in which each entry in main memory can go to any one of N places in the cache, and are described as '''N-way set associative'''. For example, the level-1 data cache in an [[AMD Athlon]] is two-way set associative, which means that any particular location in main memory can be cached in either of two locations in the level-1 data cache.
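
As a concrete illustration, the C sketch below splits a lookup address into a tag, a set index, and a byte offset for a set-associative cache. The geometry (32&nbsp;KB capacity, 64-byte lines, two ways) is an assumption made for the example, not taken from any real processor's data sheet:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdio.h>

/* Hypothetical geometry chosen for illustration only. */
#define LINE_SIZE  64u                                   /* bytes per line  */
#define NUM_WAYS   2u                                    /* associativity   */
#define CACHE_SIZE (32u * 1024u)                         /* total capacity  */
#define NUM_SETS   (CACHE_SIZE / (LINE_SIZE * NUM_WAYS)) /* here: 256 sets  */

int main(void) {
    uint32_t addr = 0x12345678u;

    uint32_t offset = addr % LINE_SIZE;              /* byte within the line */
    uint32_t set    = (addr / LINE_SIZE) % NUM_SETS; /* which set to search  */
    uint32_t tag    = addr / (LINE_SIZE * NUM_SETS); /* identifies the line  */

    /* The line may sit in any of the NUM_WAYS entries of this one set,
       so only NUM_WAYS tags need to be compared per lookup. */
    printf("addr=0x%08x -> tag=0x%x set=%u offset=%u\n",
           (unsigned)addr, (unsigned)tag, (unsigned)set, (unsigned)offset);
    return 0;
}
</syntaxhighlight>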


Associativity is a [[trade-off]]. If there are ten places to which the placement policy could have mapped a memory location, then to check whether that location is in the cache, ten cache entries must be searched. Checking more places takes more power, chip area, and potentially time. On the other hand, caches with more associativity suffer fewer misses (see conflict misses, below), so the CPU wastes less time reading from the slow main memory. The rule of thumb is that doubling the associativity, from direct mapped to two-way, or from two-way to four-way, has about the same effect on hit rate as doubling the cache size. Increases in associativity beyond four-way have much less effect on the hit rate,<ref>[http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5234663 IEEE Xplore - Phased set associative cache design for reduced power consumption]. Ieeexplore.ieee.org (2009-08-11). Retrieved on 2013-07-30.</ref> and are generally made for other reasons (see virtual aliasing, below).


In order from simpler but worse to more complex but better:
* direct mapped cache &mdash; the best (fastest) hit times, and so the best tradeoff for "large" caches
* two-way set associative cache
* two-way skewed associative cache &mdash; in 1993, this was the best tradeoff for caches whose sizes were in the 4&ndash;8&nbsp;KB range<ref name="Seznec">{{cite web |url=http://dl.acm.org/citation.cfm?doid=173682.165152 |doi=10.1145/173682.165152 |title=A Case for Two-Way Skewed-Associative Caches |accessdate=2007-12-13 |author=André Seznec}}</ref>
* four-way set associative cache
* fully associative cache &mdash; the best (lowest) miss rates, and so the best tradeoff when the miss penalty is very high.


===Direct-mapped cache===


Here each location in main memory can go in only one entry in the cache. It does not have a replacement policy as such, since there is no choice of which cache entry's contents to evict. This means that if two locations map to the same entry, they may continually knock each other out. Although simpler, a direct-mapped cache needs to be much larger than an associative one to give comparable performance, and its performance is less predictable. Let ''x'' be the block number in the cache, ''y'' the block number in memory, and ''n'' the number of blocks in the cache; then the mapping is given by the equation ''x'' = ''y'' mod ''n''.
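
A minimal sketch of this equation, assuming a hypothetical cache of ''n''&nbsp;=&nbsp;1024 blocks:

<syntaxhighlight lang="c">
#include <stdint.h>

#define NUM_BLOCKS 1024u  /* n: assumed number of blocks in the cache */

/* x = y mod n: memory block y can live only in cache block x. */
static uint32_t direct_map(uint32_t mem_block /* y */) {
    return mem_block % NUM_BLOCKS;
}
</syntaxhighlight>

Any two memory blocks whose numbers differ by a multiple of ''n'' (for example, blocks 7 and 1031) map to the same cache block and will evict each other, which is the source of the "knock each other out" behavior described above.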


=== Two-way set associative cache ===


If each location in main memory can be cached in either of two locations in the cache, one logical question is: ''which one of the two?'' The simplest and most commonly used scheme, shown in the right-hand diagram above, is to use the least significant bits of the memory location's index as the index for the cache memory, and to have two entries for each index. One benefit of this scheme is that the tags stored in the cache do not have to include that part of the main memory address which is implied by the cache memory's index. Since the cache tags have fewer bits, they take less area on the microprocessor chip and can be read and compared faster. Also, [[Cache algorithms|LRU]] replacement is especially simple, since only one bit needs to be stored for each pair of entries.
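
The following C sketch models one set of such a cache, including the single LRU bit mentioned above; the structure and its names are illustrative only, not a description of real hardware:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>

/* One set of a two-way cache: two tags, two valid bits, and a single
   LRU bit recording which way was used least recently. */
typedef struct {
    uint32_t tag[2];
    bool     valid[2];
    uint8_t  lru;  /* index (0 or 1) of the least-recently-used way */
} set_t;

/* Check both ways (a real cache compares them in parallel); on a hit,
   the other way becomes the least recently used one. */
static bool lookup(set_t *set, uint32_t tag) {
    for (int way = 0; way < 2; way++) {
        if (set->valid[way] && set->tag[way] == tag) {
            set->lru = (uint8_t)(1 - way);
            return true;               /* hit */
        }
    }
    return false;                      /* miss: the victim is way set->lru */
}
</syntaxhighlight>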


=== Speculative execution ===


One of the advantages of a direct mapped cache is that it allows simple and fast [[speculative execution|speculation]]. Once the address has been computed, the one cache index which might have a copy of that location in memory is known. That cache entry can be read, and the processor can continue to work with that data before it finishes checking that the tag actually matches the requested address.


The idea of having the processor use the cached data before the tag match completes can be applied to associative caches as well. A subset of the tag, called a ''hint'', can be used to pick just one of the possible cache entries mapping to the requested address. The entry selected by the hint can then be used in parallel with checking the full tag. The hint technique works best when used in the context of address translation, as explained below.
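
A rough C model of the hint technique, under the assumption that each way stores the low bits of its tag as the hint; all names here are hypothetical:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>

#define HINT_BITS 4u
#define HINT_MASK ((1u << HINT_BITS) - 1u)

/* Illustrative two-way set: alongside each full tag, the low HINT_BITS
   of that tag are kept as a "hint" so one way can be chosen before the
   full comparison finishes. */
typedef struct {
    uint32_t tag[2];
    uint8_t  hint[2];
    uint32_t data[2];
} set_t;

/* Select a way by hint, hand its data to the pipeline speculatively,
   then confirm with the full tag. A real design must squash the
   speculative result on a mismatch and handle two ways sharing a hint. */
static bool speculative_read(const set_t *s, uint32_t tag, uint32_t *out) {
    uint8_t h = (uint8_t)(tag & HINT_MASK);
    for (int way = 0; way < 2; way++) {
        if (s->hint[way] == h) {
            *out = s->data[way];        /* data used before the check ends */
            return s->tag[way] == tag;  /* full tag match confirms the hit */
        }
    }
    return false;  /* no hint matched: treat as a miss */
}
</syntaxhighlight>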


=== Two-way skewed associative cache ===


Other schemes have been suggested, such as the ''skewed cache'',<ref name="Seznec" /> where the index for way 0 is direct, as above, but the index for way 1 is formed with a [[hash function]]. A good hash function has the property that addresses which conflict with the direct mapping tend not to conflict when mapped with the hash function, so it is less likely that a program will suffer from an unexpectedly large number of conflict misses due to a pathological access pattern. The downside is extra latency from computing the hash function.<ref name="CK" /> Additionally, when it comes time to load a new line and evict an old line, it may be difficult to determine which existing line was least recently used, because the new line conflicts with data at different indexes in each way; [[Cache algorithms|LRU]] tracking for non-skewed caches is usually done on a per-set basis. Nevertheless, skewed-associative caches have major advantages over conventional set-associative ones.<ref>
[http://www.irisa.fr/caps/PROJECTS/Architecture/ Micro-Architecture] "Skewed-associative caches have ... major advantages over conventional set-associative caches."
</ref>
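
A minimal sketch of the two index functions, assuming 256 sets per way; the xor-fold hash is illustrative only and is not the function proposed in Seznec's paper:

<syntaxhighlight lang="c">
#include <stdint.h>

#define NUM_SETS 256u  /* assumed sets per way, a power of two */

/* Way 0 is indexed directly, as in a conventional cache. */
static uint32_t index_way0(uint32_t block) {
    return block % NUM_SETS;
}

/* Way 1 uses a hash so that blocks which collide in way 0 tend to
   spread out across different sets in way 1. */
static uint32_t index_way1(uint32_t block) {
    return (block ^ (block >> 8) ^ (block >> 16)) % NUM_SETS;
}
</syntaxhighlight>

For example, memory blocks 3 and 259 collide in way 0 (both index to set 3) but land in different sets of way 1.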


===Pseudo-associative cache===
A true set-associative cache tests all the possible ways simultaneously, using something like a [[content addressable memory]]. A pseudo-associative cache tests each possible way one at a time. Hash-rehash caches and column-associative caches are examples of pseudo-associative caches.


In the common case of finding a hit in the first way tested, a pseudo-associative cache is as fast as a direct-mapped cache, but it has a much lower conflict miss rate than a direct-mapped cache, closer to the miss rate of a fully associative cache.
<ref name="CK">
[http://www.stanford.edu/class/ee282/08_handouts/L03-Cache.pdf "Advanced Caching Techniques"] by C. Kozyrakis
</ref>
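
A sketch of this sequential probing, assuming a hash-rehash style cache whose second location is formed by flipping the top index bit (an illustrative rehash, not a specific published design); the full block number is stored as the tag to keep the example simple:

<syntaxhighlight lang="c">
#include <stdint.h>
#include <stdbool.h>

#define NUM_SETS 256u

/* One direct-mapped entry; a real design stores fewer tag bits. */
typedef struct {
    uint32_t block;
    bool     valid;
} line_t;

/* Probe the direct-mapped location first; only on a miss there, probe
   a second, rehashed location. */
static bool pa_lookup(const line_t cache[NUM_SETS], uint32_t block) {
    uint32_t first  = block % NUM_SETS;        /* fast path: one probe */
    uint32_t second = first ^ (NUM_SETS >> 1); /* slower second probe  */

    if (cache[first].valid && cache[first].block == block)
        return true;  /* hit as fast as a direct-mapped cache */
    return cache[second].valid && cache[second].block == block;
}
</syntaxhighlight>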