Chapter 4 Cache Memory

  1. Key characteristics of memory systems
    • Location:
      • Processor (registers)
      • Internal: including cache inside the CPU
      • External
      • Offline
    • Capacity:
      • word length
        • Internal memory: 8, 16, 32 bits
      • number of words
    • Unit of transfer:
      • Internal:
        • Governed by data bus width
      • External:
        • block
      • Addressable Unit
        • Number of addressable units N = 2^A, where A is the address length in bits
    • Access method:
      • sequential
        • start at the beginning -----> read in order
        • Access time depends on the target location and the previous location
      • direct
        • individual blocks have unique addresses; jump to the vicinity, then search sequentially
        • Access time depends on the target location and the previous location (block)
      • random: address-based
        • individual addresses identify locations exactly
        • Independent of location or previous access
      • associative: content-based -----> cache
        • Independent of location or previous access
    • Performance:
      • access time/latency
      • access cycle (cycle time)
        • access time + transient/recovery time before the next access can start
        • Used for random-access memory
      • bandwidth/transfer rate
        • 1 / (cycle time)
    • Physical type:
      • semiconductor
      • magnetic
      • optical
      • magneto-optical 
    • Physical characteristics:
      • volatile/nonvolatile
      • erasable/nonerasable
    • Organisation: how to form bits to words
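The capacity and performance bullets above reduce to simple arithmetic. A minimal sketch (the concrete values are illustrative, not from the notes):

```python
# Sketch of the capacity/performance arithmetic above; values are illustrative.

def addressable_units(address_bits: int) -> int:
    """Number of addressable units N = 2^A for an A-bit address."""
    return 2 ** address_bits

def transfer_rate(cycle_time_ns: float) -> float:
    """For random-access memory, transfer rate = 1 / cycle time (accesses/second)."""
    return 1e9 / cycle_time_ns

print(addressable_units(16))  # a 16-bit address reaches 65536 units
print(transfer_rate(100.0))   # a 100 ns cycle gives 1e7 accesses per second
```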
  2. Memory hierarchy
    • Why we need hierarchy architecture
  3. Locality of Reference
    • Reasons:
      • Programs typically contain many loops and subroutines -- repeated references
      • Table and array involve access to clustered data set -- repeated references
      • Programs and data are usually laid out sequentially, so the instructions or data to be accessed next are typically near the current instruction or data.
  4. Performance of a simple two-level memory
  1. Cache memory principles
    • A relatively large and slow main memory is paired with a smaller, faster cache memory
    • Cache is small and expensive (per bit)
  2. Key Techniques in Cache Design
    • Size
    • Mapping Function (is memory addressed by word or by byte?)
      • Direct mapping (prone to thrashing)
        • i = j mod m
          • i is the cache line number
          • j is the memory block number
          • m is the number of lines in the cache
        • Method: the memory address is split into 3 fields:
          • Low w bits identify the word within a block
          • Middle r bits identify the cache line
          • Remaining (leftmost) bits are the tag, compared to decide whether the cached data is the data needed
            • The tag identifies which memory block currently occupies the line
        • Advantages:
          • Simple
          • Inexpensive to implement
        • Disadvantage:
          • Fixed location for given block
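The i = j mod m rule and the three-field split above can be sketched in a few lines; the field widths here (2-bit word field, 14-bit line field) are assumptions for illustration:

```python
# Direct-mapped address decomposition: | tag | line (r bits) | word (w bits) |
# The field widths below are illustrative assumptions.

WORD_BITS = 2   # w: word within the block
LINE_BITS = 14  # r: cache line, so the cache has m = 2^14 lines

def split_address(addr: int):
    """Return the (tag, line, word) fields of a memory address."""
    word = addr & ((1 << WORD_BITS) - 1)
    line = (addr >> WORD_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (WORD_BITS + LINE_BITS)
    return tag, line, word

def line_for_block(j: int, m: int = 1 << LINE_BITS) -> int:
    """i = j mod m: block j always lands in line i (the fixed location)."""
    return j % m
```

Because the line field is just the low bits of the block number, `split_address(addr)` and `line_for_block(addr >> WORD_BITS)` agree on the line.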
      • Associative mapping
        • A main memory block can load into any line of cache
        • Procedure (for working exercises):
          1. Find the tag size in the cache, then compare it with the memory address size to get the word-field size (which indicates the number of words in each memory block)
          2. Ignoring the word bits, the remaining address bits, taken from the right side to the left, form the tag stored in the cache
        • Advantage:
          • A memory block can be mapped into any line of the cache
          • Replacing is flexible
        • Disadvantage:
          • Parallel comparing circuit is needed
          • Complex and expensive
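In fully associative mapping, only the word field is stripped off; everything above it is the tag, and every line's tag must be compared at once. A sketch where a dict stands in for the parallel comparator hardware (the 2-bit word field is an assumption):

```python
# Fully associative lookup sketch; a dict models the parallel tag comparison.

WORD_BITS = 2  # assumed word-field width

def tag_of(addr: int) -> int:
    """The whole address above the word field is the tag (= block number)."""
    return addr >> WORD_BITS

cache = {}  # tag -> block contents; hardware compares all tags in parallel

def lookup(addr: int):
    """Hit if any line's tag matches; None models a miss."""
    return cache.get(tag_of(addr))

cache[tag_of(0x1234)] = "block data"
print(lookup(0x1235))  # same block, different word -> hit
print(lookup(0x2000))  # different block -> miss (None)
```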
      • Set associative mapping
        • Direct mapping between a memory block and a cache set; fully associative mapping within the set
        • A cache of m lines is divided into v sets, k lines/set
          • Set number of cache i = block number of memory mod v
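A toy model of the set-associative compromise, assuming v = 4 sets and k = 2 lines per set (2-way):

```python
# Set-associative sketch: the set index is direct-mapped, the line within
# the set is associative. V and K are illustrative assumptions.

V, K = 4, 2  # v = 4 sets, k = 2 lines per set (2-way)

cache = [[] for _ in range(V)]  # each set holds up to K block numbers

def set_index(j: int) -> int:
    """Direct part: block j may only live in set i = j mod v."""
    return j % V

def insert(j: int):
    """Associative part: j may occupy any of the k lines in its set."""
    s = set_index(j)
    if j not in cache[s]:
        if len(cache[s]) == K:  # set full -> a replacement algorithm runs
            cache[s].pop(0)     # FIFO within the set, for the sketch
        cache[s].append(j)

for block in (0, 4, 8, 1):  # blocks 0, 4, 8 all contend for set 0
    insert(block)
print(cache[0])  # [4, 8]: block 0 was replaced within its set
print(cache[1])  # [1]
```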
    • Replacement Algorithm
      • Direct mapping
        • No choice
        • Each block only maps to one line(fixed)
        • Replace that line
      • Associative & Set associate
        • LRU
          • e.g., for a 2-way set, a single 1/0 USE bit marks the most recently used line
        • FIFO
        • Least frequently used
        • Random
          • Performance is almost the same as LRU
        • Optimal (cannot be implemented in practice)
          • Replace the line that will be accessed farthest in the future
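The policies above are easiest to compare with a small simulator. Here is LRU sketched with an `OrderedDict` that keeps lines in recency order (illustrative; real hardware tracks recency with bits, not a dict):

```python
from collections import OrderedDict

class LRUCache:
    """Tracks which blocks are resident; evicts the least recently used."""

    def __init__(self, lines: int):
        self.lines = lines
        self.resident = OrderedDict()  # block -> None, oldest first

    def access(self, block: int) -> bool:
        """Return True on a hit, False on a miss (replacing if full)."""
        if block in self.resident:
            self.resident.move_to_end(block)   # now most recently used
            return True
        if len(self.resident) == self.lines:
            self.resident.popitem(last=False)  # evict least recently used
        self.resident[block] = None
        return False

c = LRUCache(2)
print([c.access(b) for b in (1, 2, 1, 3, 2)])
# [False, False, True, False, False]: accessing 3 evicts 2, not the fresher 1
```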
    • Write Policy
      • More complex than read policy
        • Data consistency between cache, memory and other caches
      • Write through
        • write memory while writing cache
        • Advantage:
          • Simplest
        • Disadvantages:
          • Lots of write traffic, resulting in a bus bottleneck
          • Slows down writes
      • Write back
        • Writes go to the cache only; a modified (dirty) flag is set on the written line, and when the line is replaced, that line is written back to memory (compared with write-through, it is like the git workflow: add -----> commit)
        • Suitable for iterative operations and for systems where the I/O module connects directly to the cache
        • Disadvantages:
          • Part of the contents of main memory is temporarily invalid (stale)
          • Circuit is complex
          • Cache may become bottleneck
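The add -----> commit analogy can be made concrete: under write-back, a write only dirties the cache line, and main memory catches up at replacement time. A sketch with dicts standing in for cache and memory (illustrative):

```python
# Write-back sketch: writes set a dirty flag; memory is updated only when
# a dirty line is replaced. Data structures are illustrative.

memory = {0: 10}  # block number -> contents
cache = {}        # block number -> [contents, dirty flag]

def write(block: int, value: int):
    """Write-back: update the cache only and set the modified (dirty) flag."""
    cache[block] = [value, True]

def replace(block: int):
    """On replacement, a dirty line is written back to memory."""
    value, dirty = cache.pop(block)
    if dirty:
        memory[block] = value

write(0, 99)
print(memory[0])  # 10: main memory is temporarily stale (invalid)
replace(0)
print(memory[0])  # 99: the write reached memory only at replacement
```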
    • Block Size
    • Number of Caches
      • Single cache vs multilevel Cache
        • On-chip Cache (L1): short path to the CPU, fast, and reduces the frequency of bus accesses
        • Off-chip Cache (L2): accessed only on an L1 miss
      • Unified Cache and Split Cache
        • Unified Cache:
          • Stores both data & instructions, balancing the load automatically
          • Higher hit rate
        • Split Cache:
          • Data Cache & instruction Cache
          • Parallel operation
    • Cache addressing (logical/virtual vs physical)