Interleaving
Interleaving is an advanced technique used by high-end motherboards/chipsets to improve memory performance. Memory interleaving increases bandwidth by allowing simultaneous access to more than one chunk of memory. This improves performance because the processor can transfer more information to/from memory in the same amount of time, and helps alleviate the processor-memory bottleneck that is a major limiting factor in overall performance.
Interleaving works by dividing the system memory into multiple blocks. The most common numbers are two or four, called two-way or four-way interleaving, respectively. Each block of memory is accessed using a different set of control lines, and these are merged together on the memory bus. When a read or write to one block has begun, a read or write to another block can be overlapped with it. The more blocks there are, the more overlapping can be done. As an analogy, consider eating a plate of food with a fork. Two-way interleaving would mean dividing the food onto two plates and eating with both hands, using two forks. (Four-way interleaving would require two more hands. :^) ) Remember that here the processor is doing the "eating", and it is much faster than the forks (memory) "feeding" it (unlike a person, whose hands are generally faster).
In order to get the best performance from this type of memory system, consecutive memory addresses are spread over the different blocks of memory. In other words, if you have 4 blocks of interleaved memory, the system doesn't fill the first block, and then the second and so on. It uses all 4 blocks, spreading the memory around so that the interleaving can be exploited.
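The spreading of consecutive addresses can be sketched in a few lines of Python. This is only an illustration: the function name block_for and the choice of one word as the unit of interleaving are assumptions, not details given above.

```python
# Illustrative sketch: N-way interleaving with one word as the unit
# (the unit and the function name are assumptions, not from the article).

def block_for(word_index, num_blocks):
    """Return (block, offset_within_block) for a word in interleaved memory."""
    return word_index % num_blocks, word_index // num_blocks

# Eight consecutive words in a four-way interleaved system
blocks = [block_for(i, 4)[0] for i in range(8)]
print(blocks)  # [0, 1, 2, 3, 0, 1, 2, 3]
```

A sequential run of accesses therefore touches every block in turn instead of staying in one block.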
Interleaving is not supported by most PC motherboards, most likely due to cost. It is most helpful on high-end systems, especially servers, that have to process a great deal of information quickly. The Intel Orion chipset is one that does support memory interleaving.
http://www.pcguide.com/ref/ram/timingInterleaving-c.html
----------------------------------------------------------------------------------------------------------------------------
What is Interleaved Memory
The new Macintosh Centris 650 and Quadra 800 computers introduced today feature a newly designed memory controller which supports interleaved memory. The following article explains how interleaved memory works on these machines and how to configure the machines for maximum performance.
Interleaved Memory on the Centris 650 and Quadra 800
The main memory subsystem of the Macintosh Centris 650 and Quadra 800 computers makes use of a memory access technique called "interleaved memory". This memory organization serves to reduce the overall access time of the 68040 processor into DRAM. The following description illustrates how this memory organization works and why it results in reduced memory access time.
Non-interleaved Memory System
In a non-interleaved memory system, all of the first bank of memory, bank 0, is addressed before the first long word of the second bank of memory, bank 1; all of bank 1 is addressed before the first long word of bank 2; and so on. Figure 1 shows this organization for two banks of N long words. (A long word is 4 bytes, or 32 bits, and is the natural unit of memory for the 68040.)
The 68040 performs burst accesses (a single bus transaction that reads or writes 16 bytes in 4 adjacent long words) to move data between its caches and memory. All 16 bytes come from one bank of DRAM in a non-interleaved memory system, so the time required to complete the transfer depends directly on the access time of the DRAM. Figure 2 shows an example of such a burst access. The time needed to access the 2nd, 3rd, and 4th long words is shorter because a feature of the DRAMs called "page-mode access" is used.
Interleaved Memory System
In an interleaved memory system, there are still two physical banks of DRAM, but logically the system sees one bank of memory that is twice as large. In the interleaved bank, the first long word of bank 0 is followed by the first long word of bank 1, which is followed by the second long word of bank 0, which is followed by the second long word of bank 1, and so on. Figure 3 shows this organization for two physical banks of N long words. All even long words of the logical bank are located in physical bank 0 and all odd long words are located in physical bank 1.
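The two layouts can be compared with a small sketch. It assumes byte addresses starting at 0 and two banks of N long words each; the function names are hypothetical.

```python
LONG_WORD_BYTES = 4  # a long word is 4 bytes on the 68040

def non_interleaved(address, n):
    """Bank 0 holds the first n long words, bank 1 the next n."""
    lw = address // LONG_WORD_BYTES
    return lw // n, lw % n          # (physical bank, long word within bank)

def interleaved(address):
    """Even long words live in bank 0, odd long words in bank 1."""
    lw = address // LONG_WORD_BYTES
    return lw % 2, lw // 2          # (physical bank, long word within bank)

# One 16-byte burst starting at address 0, with N = 1024 long words per bank
burst = [0, 4, 8, 12]
print([non_interleaved(a, 1024)[0] for a in burst])  # [0, 0, 0, 0]
print([interleaved(a)[0] for a in burst])            # [0, 1, 0, 1]
```

In the non-interleaved layout the whole burst lands in one bank; in the interleaved layout it alternates between the two banks.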
The interleaved memory configuration is designed to speed up 68040 burst accesses by as much as 30%. (The actual improvement depends on the system clock speed and the DRAM access time.) Since the four long words of a burst access are spread across two physical banks of DRAM, the individual accesses can be overlapped to hide part, or all, of the DRAM access time delay, as shown below in Figure 4.
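Why overlapping shortens a burst can be seen in a toy timing model. The cycle counts used here are made-up placeholders, not Centris 650 or Quadra 800 timings. The model itself is also an assumption made for illustration: each bank delivers its first long word after a fixed number of cycles and further page-mode long words at a shorter interval, all banks start their access at the same time, and the bus carries one long word per cycle.

```python
def burst_cycles(banks, first, page, words=4):
    """Cycle at which the last long word of a burst has been transferred."""
    done = 0
    for i in range(words):
        nth_in_bank = i // banks            # banks are used in rotation
        ready = first + nth_in_bank * page  # when this bank has the data
        done = max(ready, done + 1)         # bus moves one long word per cycle
    return done

# Placeholder timings: 4 cycles for a first access, 2 for a page-mode access
print(burst_cycles(banks=1, first=4, page=2))  # 10
print(burst_cycles(banks=2, first=4, page=2))  # 7
```

With these placeholder numbers, the second bank's long word is already waiting once the first bank's long word has been transferred, so part of the DRAM access time is hidden. Real savings depend on the clock speed and DRAM access time, as noted above.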
Centris 650 / Quadra 800 Memory Organization
Physically, the DRAM in a Centris 650 or Quadra 800 system is organized as 10 banks of memory, where each bank is 32 bits wide and 4 or 16 MBytes deep. Logically, the DRAM is organized as 5 pairs of banks, any of which may or may not be interleaved. At system boot time, each pair of DRAM banks is examined; if they are the same size (4 or 16 MBytes) the interleaved memory configuration for that bank pair will be enabled. Otherwise, the bank pair will be left in the non-interleaved configuration. The memory controller in the C650/Q800 is capable of operating with some bank pairs in the interleaved configuration and some bank pairs in the non-interleaved configuration. The type of memory access which is performed is determined dynamically at the start of each cycle based on the value of an "interleave configuration register" within the memory controller. ROM accesses cannot be interleaved since there is only a single bank of ROM.
The C650/Q800 motherboard contains 4 or 8 MB of DRAM and 4 DRAM SIMM sockets. Systems which contain 8 MB on the motherboard (all Q800s and some C650s) already interleave the two 4 MB banks soldered to the motherboard. Systems which have only 4 MB soldered on the motherboard cannot interleave the single soldered 4 MB DRAM bank, although DRAM on SIMMs can still be interleaved.
Each DRAM SIMM can contain either one or two banks of DRAM. The C650 and Q800 use 72 pin DRAM SIMMs - these SIMMs have a 32-bit data path, allowing memory upgrades to be performed with a single SIMM. Single-sided SIMMs contain one DRAM bank; double-sided SIMMs contain two DRAM banks. A double-sided SIMM cannot contain an interleaved bank pair since there are not enough pins on the SIMM to accommodate the two 32-bit data buses required for interleaved memory. Interleaving can only be done between DRAM SIMM pairs.
The motherboard contains banks 0 & 1, SIMM slot 1 contains banks 2 & 3 (remember, SIMMs can be double-sided and contain 2 banks of DRAM), slot 2 contains banks 4 & 5, slot 3 contains banks 6 & 7, and slot 4 contains banks 8 & 9. SIMM slot pairs 1-2 and 3-4 are interleaved together whenever a bank pair is of the same size. For example, if 4 MB SIMMs are placed in both SIMM slots 1 and 2, then that memory will be interleaved (banks 2 & 4). If a double-sided 8 MB SIMM (i.e., a SIMM with two 4 MB banks on it) is placed in slot 1, and a single-sided 4 MB SIMM is placed in slot 2, then two of the banks will be interleaved (banks 2 & 4) and one bank will not be interleaved (bank 3).
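The boot-time check described above can be sketched as follows. The pair list is partly an assumption: the text names the motherboard pair and banks 2 & 4, and the remaining pairs are inferred from the slot 1 / slot 2 example. The function name is hypothetical, and this is not the actual ROM code.

```python
# Bank pairs inferred from the text: the two motherboard banks, then the
# matching banks of slots 1-2 and of slots 3-4. The pairs other than
# (0, 1) and (2, 4) are an assumption based on the example above.
BANK_PAIRS = [(0, 1), (2, 4), (3, 5), (6, 8), (7, 9)]

def interleaved_pairs(bank_mb):
    """bank_mb: sizes in MB of banks 0-9, with 0 for an empty bank."""
    return [(a, b) for a, b in BANK_PAIRS
            if bank_mb[a] != 0 and bank_mb[a] == bank_mb[b]]

# 8 MB on the motherboard, a double-sided 8 MB SIMM in slot 1,
# a single-sided 4 MB SIMM in slot 2, slots 3 and 4 empty
banks = [4, 4, 4, 4, 4, 0, 0, 0, 0, 0]
print(interleaved_pairs(banks))  # [(0, 1), (2, 4)]
```

Bank 3 has no same-sized partner in bank 5, so it stays non-interleaved, matching the example in the text.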
The gist of all this is that in order to maximally enable memory interleaving, memory upgrades should be performed with a pair of SIMMs, both of the same size. A single SIMM can be used for memory expansion, but will result in a portion of memory being non-interleaved. The system takes care of configuring everything automatically at boot, regardless of what memory is installed. However, by physically configuring DRAM in identically sized bank pairs, the fastest overall memory access is achieved (i.e., the highest performance). The actual performance difference between an interleaved and a non-interleaved memory system will vary from application to application.
- Dale Adams
Apple Computer
-------------------------------------------------------------------------------------------------------------------------------
These questions and answers appeared in the Apple Information Alley
Q. Is there any information available on the performance gains or losses when using interleaved instead of non-interleaved memory on PCI-based Power Macintosh computers?
Specifically how much of a speed advantage would be lost if you use one 16 MB SIMM rather than two 8 MB SIMMs in an interleaved arrangement?
A. For increased performance it is better to configure a PCI-based Power Macintosh computer for memory interleaving rather than installing memory in a non-interleaved configuration. This means that you will get better performance if you configure your system with two 16 MB DIMMs rather than one 32 MB DIMM. This applies to all other combinations of same-sized DIMMs.
The actual performance will vary from computer to computer. In general, a Power Macintosh with a PowerPC 604 microprocessor, such as the Power Macintosh 8500 or 9500 series computer, gets anywhere from a 5% to 15% boost in performance. The average is about an 8% increase in performance. On a Power Macintosh with a PowerPC 601 microprocessor, such as the Power Macintosh 7500 series, you may see only a slight performance gain from memory interleaving compared with non-interleaved DIMMs. Some third-party benchmarking applications may report exaggerated performance differences between interleaved and non-interleaved computers.
Q. How do I populate DIMMs in my PCI-based Power Macintosh computer to maximize performance using memory interleaving? If I have an odd number of DIMMs, where should I place the odd DIMM to get the best performance from memory interleaving?
A. Interleaving is accomplished by 'pairing' two DIMMs in corresponding slots. That is, one DIMM in A1 and another DIMM in B1 will set the machine up to use memory interleaving.
If you have an odd number of DIMMs, the matched pairs will run the memory interleaved. The odd DIMM will then run non-interleaved. For the interleaving to be most effective, the DIMMs must be the same size and speed (they should usually be from the same manufacturer as well, although this is not required). As for memory addressing, A1/B1 hold the lowest addresses, going up to A6/B6, which hold the highest.
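The pairing rule can be sketched as a small helper. This is an illustration only: the function name, the data layout, and the 70 ns speed figure are assumptions, and the sketch treats a pair as interleaved only when both size and speed match, which is a simplification of the guidance above.

```python
def classify_dimms(slots):
    """slots: dict mapping slot name (e.g. 'A1') to (size_mb, speed_ns)."""
    interleaved, alone = [], []
    for n in range(1, 7):                      # slot pairs A1/B1 .. A6/B6
        a, b = slots.get(f"A{n}"), slots.get(f"B{n}")
        if a is not None and a == b:           # both present, same size and speed
            interleaved.append(f"A{n}/B{n}")
        else:
            alone += [f"{bank}{n}" for bank, d in (("A", a), ("B", b))
                      if d is not None]
    return interleaved, alone

# Two matched 16 MB DIMMs plus one odd 32 MB DIMM (speeds are placeholders)
installed = {"A1": (16, 70), "B1": (16, 70), "A2": (32, 70)}
print(classify_dimms(installed))  # (['A1/B1'], ['A2'])
```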
In relation to performance, it really does not matter where the DIMMs are placed. The software is intelligent enough to figure out which banks are being used, and is able to "stitch" the memory together as required.
Note: Memory interleaving is only available in the Power Macintosh 7500, 8500, and 9500 series computers. The Power Macintosh 7200 uses a different memory controller which does not support interleaving.
http://macspeedzone.com/archive/Comparison/WhatisInterleaved.html