[The BIOS as I Understand It] --> Cache (1)

LightSeed  

2009-11-12

 

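Before I begin, let me say that the CPU cache is a very interesting component. So far my study of it is limited to the 486 architecture, because the only book I have managed to find is "80486 System Architecture" (it goes into great detail, and I personally like books of this kind). I have heard there is a similar book for the P4 CPU, but it seems impossible to find in China. If you happen to have a copy, could you let me have a look? I would be very grateful!

Also, there is quite a lot to say about caches. There are really two major themes, data coherency and operating mechanism; once you have a grasp of these two, I think caches will no longer be difficult for you. I therefore plan to split my notes into three posts on this blog, which keeps each post shorter. Otherwise I worry that readers would fall asleep. Haha... (Just kidding.)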

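1. What is a cache

The following definition of a cache is quoted from Hudong Baike (互动百科):

Cache memory: in a computer, a cache is a high-speed buffer memory located between the CPU and the main memory, DRAM (Dynamic Random Access Memory). It is small but very fast, and is usually built from SRAM (Static Random Access Memory). The CPU is much faster than main memory, so when the CPU accesses data directly in memory it has to wait for a number of cycles (these wait cycles are clearly visible in the timing diagrams in the later posts). A cache keeps part of the data that the CPU has just used or uses repeatedly. If the CPU needs that data again, it can fetch it directly from the cache, which avoids repeated memory accesses, reduces the CPU's waiting time, and so improves system efficiency. Caches are divided into the L1 cache (level 1) and the L2 cache (level 2). The L1 cache is mainly integrated inside the CPU, while the L2 cache is integrated on the motherboard or on the CPU.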

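2. How a cache works

Again quoted from Hudong Baike:

A cache works because program accesses exhibit locality. Analysis of a large number of typical programs shows that, within a short time interval, the addresses a program generates tend to be concentrated in a small part of the memory's logical address space. Instruction addresses are laid out sequentially to begin with, and loops and subroutines are executed many times, so accesses to these addresses naturally cluster in time. This clustering is less pronounced for data than for instructions, but the storage and access of arrays and the choice of working locations can still keep memory addresses relatively concentrated. This pattern of frequent accesses to a local range of memory addresses, with few accesses outside that range, is called locality of reference.

Based on the principle of locality, a fast memory of relatively small capacity can be placed between main memory and the CPU's general-purpose registers. A portion of the instructions or data near the address of the currently executing instruction is loaded from main memory into this memory for the CPU to use for a while. This does a great deal to speed up program execution. This fast, small memory between main memory and the CPU is called the cache. Following this principle, the system keeps reading a modest block of the instructions that follow the current ones from memory into the cache, and then transfers them to the CPU at high speed, so that the two speeds are matched.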

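When the CPU requests data from memory, it normally accesses the cache first. Because locality cannot guarantee that the requested data is always in the cache, we speak of a hit rate: the probability that, at any given moment, the CPU finds the data it needs in the cache. The higher the hit rate, the more often data is obtained quickly. The cache is much smaller than main memory, but it must not be too small, or the hit rate will be too low. Nor does it need to be very large: that raises the cost, and beyond a certain size the hit rate no longer rises noticeably with capacity. As long as the cache and main memory keep a suitable mapping ratio, the hit rate remains quite high. The ratio of cache to memory is generally taken to be 4:1000, that is, a 128 kB cache can map 32 MB of memory and a 256 kB cache can map 64 MB of memory. In that case the hit rate is above 90%. For data that misses, the CPU has to fetch it directly from memory, and copies it into the cache at the same time, ready for the next access.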
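The 4:1000 figure can be checked against the two examples quoted above. The following is only a quick arithmetic sketch; the helper name `cache_per_thousand` is made up for this illustration.

```python
KB = 1024
MB = 1024 * KB

def cache_per_thousand(cache_bytes, memory_bytes):
    """Express the cache size per 1000 units of main memory."""
    return 1000 * cache_bytes / memory_bytes

# The two pairings quoted in the text.
ratio_128k = cache_per_thousand(128 * KB, 32 * MB)
ratio_256k = cache_per_thousand(256 * KB, 64 * MB)

print(ratio_128k, ratio_256k)
```

Both pairings give the same value, a little under 4 per 1000, so "4:1000" is a rounded figure.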


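3. Basic cache structure

A cache is usually implemented with associative memory. Each block of an associative memory carries extra information called a tag (explained in detail later). When the associative memory is accessed, the address is compared with every tag at the same time, and the block whose tag matches is the one accessed.

There are three basic cache structures:

① Fully associative cache

② Direct-mapped cache

③ Set-associative cache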

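Since I only have the 486 material at hand, I will walk through how the direct-mapped structure works. (Author's note: I believe the 486 works along the lines of the second type. Strictly speaking, its on-chip cache is 4-way set-associative, and each way is looked up much like a direct-mapped cache. Please correct me if I have misunderstood.) The idea is to assign an index field to every block position in the cache and to use a tag field to tell apart the different blocks that may occupy that position. A one-way direct-mapped cache divides main memory into pages, each the same size as the cache, so that an offset within a main-memory page maps directly to the same offset in the cache. The cache tag memory holds the main-memory page address (page number). (Author's note: from this page number the actual offset in main memory can be computed.) From the above it can be seen that a direct-mapped cache has an advantage over a fully associative cache in that lookups are fast. Its drawback is that when accesses alternate frequently between main-memory pages that map to the same cache location, the cache controller has to swap the contents over and over. How this structure actually operates on a real 486 is something we will see in the later posts.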
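The page-number-plus-offset mapping described above can be sketched in a few lines. This is only an illustration: `CACHE_SIZE`, `LINE_SIZE` and both function names are assumptions chosen for the example, not values taken from the 486 documentation.

```python
CACHE_SIZE = 8 * 1024   # assumed cache size in bytes (illustrative only)
LINE_SIZE = 16          # assumed line size in bytes (illustrative only)

def split_address(addr):
    """Split an address into (page number, line index, byte within line)."""
    page = addr // CACHE_SIZE            # stored in the tag memory
    offset_in_page = addr % CACHE_SIZE   # maps directly onto the cache offset
    index = offset_in_page // LINE_SIZE  # which cache line position
    byte = offset_in_page % LINE_SIZE    # which byte inside that line
    return page, index, byte

def rebuild_address(page, index, byte):
    """Recover the main-memory address from the tag and the cache position."""
    return page * CACHE_SIZE + index * LINE_SIZE + byte

page, index, byte = split_address(0x12345)
print(page, index, byte)
```

Two addresses that differ only in their page number land on the same `index`, which is exactly the situation that forces the controller to keep swapping contents.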


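4. Cache technology overview

4.1 Read order

When the CPU wants to read a piece of data, it first looks in the cache. If the data is found, it is read immediately and passed to the CPU. If it is not found, it is read from memory at a relatively slow speed and passed to the CPU, and at the same time the block containing it is loaded into the cache, so that later reads of that whole block can be served from the cache without going to memory.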

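It is this read mechanism that makes the CPU's cache hit rate very high (around 90% on most CPUs): about 90% of the data the CPU reads next is already in the cache, and only about 10% has to be read from memory. This greatly reduces the time the CPU spends reading memory directly, so the CPU rarely has to wait when reading data. In short, the CPU reads data from the cache first and from memory second.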
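The read order described above (cache first, then memory, with the whole block brought in on a miss) can be sketched as a toy simulation. The class `ReadCache` and the sizes `LINE_SIZE` and `NUM_LINES` are hypothetical and chosen only to keep the example small.

```python
LINE_SIZE = 16   # assumed bytes per cache line
NUM_LINES = 4    # assumed number of lines (tiny, for illustration)

class ReadCache:
    """Toy direct-mapped read cache in front of a bytes object."""

    def __init__(self, memory):
        self.memory = memory
        self.tags = [None] * NUM_LINES
        self.lines = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def read(self, addr):
        block = addr // LINE_SIZE
        index = block % NUM_LINES
        tag = block // NUM_LINES
        if self.tags[index] == tag:
            self.hits += 1                 # found in the cache
        else:
            self.misses += 1               # go to memory and fill the whole line
            start = block * LINE_SIZE
            self.lines[index] = self.memory[start:start + LINE_SIZE]
            self.tags[index] = tag
        return self.lines[index][addr % LINE_SIZE]

memory = bytes(range(256))
cache = ReadCache(memory)
values = [cache.read(a) for a in range(64)]   # sequential reads
hit_rate = cache.hits / (cache.hits + cache.misses)
print(cache.hits, cache.misses, hit_rate)
```

With sequential reads, only the first access to each line misses, which is why a workload with good locality sees a high hit rate.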

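4.2 Read hit rate

When the CPU finds the data it needs in the cache, this is called a hit. Only when the cache does not hold the data the CPU needs (a miss) does the CPU access memory. In theory, in a CPU with a two-level cache, the hit rate of the L1 cache is 80%, that is, the useful data the CPU finds in the L1 cache makes up 80% of the total, and the remaining 20% is read from the L2 cache. Because the data to be used next cannot be predicted exactly, the L2 hit rate is also around 80% (the useful data read from L2 makes up 16% of the total).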
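The 16% figure follows from multiplying the L1 miss rate by the L2 hit rate. A small sketch of that arithmetic, with a made-up helper name `two_level_breakdown`:

```python
def two_level_breakdown(l1_hit, l2_hit):
    """Return the fraction of reads served by L1, by L2, and by main memory."""
    from_l1 = l1_hit
    from_l2 = (1 - l1_hit) * l2_hit           # missed L1, hit L2
    from_memory = (1 - l1_hit) * (1 - l2_hit) # missed both levels
    return from_l1, from_l2, from_memory

l1, l2, mem = two_level_breakdown(0.80, 0.80)
print(l1, l2, mem)
```

Under these assumed rates, whatever misses both levels (about 4% of reads) has to come from main memory.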

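To keep the hit rate high, the contents of the cache should be replaced according to some algorithm. A common choice is the least recently used (LRU) algorithm (its use on the 486 is discussed in detail later), which evicts the line that has gone unused for the longest time. It is an efficient algorithm: its counter-based decision process pushes out data that was used heavily for a while and is no longer needed, which improves cache utilization.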
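The replacement rule can be sketched in software as follows. This is a plain LRU model built on Python's `OrderedDict`, not the hardware mechanism the 486 uses; the class name `LRUSet` and the capacity of 2 are assumptions for the example.

```python
from collections import OrderedDict

class LRUSet:
    """Toy set of cache lines with least-recently-used replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # oldest entry first, newest last

    def access(self, tag):
        """Touch a line; return the tag that was evicted, or None."""
        if tag in self.lines:
            self.lines.move_to_end(tag)   # hit: mark as most recently used
            return None
        evicted = None
        if len(self.lines) >= self.capacity:
            evicted, _ = self.lines.popitem(last=False)  # drop the oldest line
        self.lines[tag] = True
        return evicted

s = LRUSet(capacity=2)
evictions = [s.access(t) for t in ["A", "B", "A", "C"]]
print(evictions, list(s.lines))
```

Because "A" is touched again before "C" arrives, "B" is the line that has gone unused the longest and is the one evicted.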

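4.3 WB & WT

WB and WT stand for write back and write through. They are data update methods designed to keep the data in the cache and in main memory coherent. (Author's note: the IA32 programming guide describes the other memory attributes in some detail, but WB and WT are the two most typical methods, so I analyze them in detail here.)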

write through

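Under this mechanism, whenever the CPU writes data into the cache, the cache controller immediately writes the data to the corresponding location in main memory. (I have seen this immediate-update mechanism referred to as "snarf"; I could not find a translation of the word online, so I leave it as it is.) Main memory therefore always tracks the latest version of the data in the cache, so there is no risk of main memory losing new data. The obvious drawback is that every update causes a bus operation, so bus traffic becomes too frequent and system performance drops.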
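A minimal sketch of the write-through policy, assuming a cache modeled as a dictionary and a counter standing in for bus traffic. The names `WriteThroughCache` and `bus_writes` are invented for the illustration.

```python
class WriteThroughCache:
    """Toy write-through cache: every write also goes to main memory."""

    def __init__(self, memory):
        self.memory = memory    # dict: address -> value
        self.lines = {}         # cached copies
        self.bus_writes = 0     # number of writes that reached the bus

    def write(self, addr, value):
        self.lines[addr] = value    # update the cache
        self.memory[addr] = value   # and immediately update main memory
        self.bus_writes += 1

memory = {0x10: 0}
cache = WriteThroughCache(memory)
for v in (1, 2, 3):
    cache.write(0x10, v)

print(memory[0x10], cache.bus_writes)
```

Main memory is never stale, but three writes to the same address cost three bus operations.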

write back

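Under this mechanism, the tag of every block in the cache carries an update bit. After the CPU writes data to a block in the cache, that block's update bit is set (to 1, say). Because the cache is far faster than DRAM, if the check shows that the data to be updated is already in the cache, only the copy in the cache is updated and the write is kept off the bus, so it does not reach main memory. When a marked block later has to be replaced by new data, the cache first writes the old contents back to main memory and only then updates the block. (Author's note: the procedure is explained in detail later.)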
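The update bit and the deferred write can be sketched with a deliberately tiny model: a cache that holds a single line. The names `WriteBackCache`, `dirty` and `bus_writes` are assumptions for this example, and a real controller tracks many lines.

```python
class WriteBackCache:
    """Toy single-line write-back cache with an update (dirty) bit."""

    def __init__(self, memory):
        self.memory = memory    # dict: address -> value
        self.addr = None        # address currently held in the line
        self.value = None
        self.dirty = False      # the update bit
        self.bus_writes = 0     # number of writes that reached the bus

    def write(self, addr, value):
        if self.addr is not None and self.addr != addr:
            self.evict()        # line is needed for another address
        self.addr, self.value, self.dirty = addr, value, True

    def evict(self):
        if self.dirty:          # only modified data is written back
            self.memory[self.addr] = self.value
            self.bus_writes += 1
        self.addr, self.value, self.dirty = None, None, False

memory = {0x10: 0, 0x20: 0}
cache = WriteBackCache(memory)
for v in (1, 2, 3):
    cache.write(0x10, v)     # all three stay inside the cache
stale = memory[0x10]         # main memory has not seen them yet
cache.write(0x20, 9)         # replacing the line forces the write back

print(stale, memory[0x10], cache.bus_writes)
```

Compared with write through, the same three writes to one address cost a single bus operation, at the price of main memory being temporarily out of date.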

