The Findings
Under the conventional interleaved memory mapping, DRAM row buffer conflicts come from three sources. (1) Conflict misses in the last-level on-chip cache lead to DRAM row buffer misses. (2) Write-back conflicts in the last-level on-chip cache also lead to DRAM row buffer conflicts. (3) Certain sequential memory access patterns cause row buffer conflicts: those in which the distance between consecutively accessed data elements is a multiple of the accumulated size of all the row buffers of the memory banks.
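Source (3) can be illustrated with a minimal sketch. It assumes a toy configuration that is not taken from the original work: 4 banks, a 2 KiB row buffer per bank, and a conventional page-interleaved mapping in which the bank index is the low-order bits of the page number. The names conventional_map and count_row_conflicts are hypothetical helpers introduced only for this illustration.

```python
# Toy parameters (assumptions for illustration only).
PAGE_BITS = 11   # 2 KiB row buffer (one DRAM page) per bank
BANK_BITS = 2    # 4 banks


def conventional_map(addr):
    """Return (bank, row) under conventional page interleaving."""
    page = addr >> PAGE_BITS                 # drop the offset within the page
    bank = page & ((1 << BANK_BITS) - 1)     # low page bits select the bank
    row = page >> BANK_BITS                  # remaining bits select the row
    return bank, row


def count_row_conflicts(addresses, mapper):
    """Count accesses that find a different row open in their target bank."""
    open_row = {}
    conflicts = 0
    for addr in addresses:
        bank, row = mapper(addr)
        if bank in open_row and open_row[bank] != row:
            conflicts += 1
        open_row[bank] = row
    return conflicts


# Stride equal to the accumulated size of all row buffers (4 x 2 KiB = 8 KiB).
TOTAL_ROW_BUFFER_BYTES = (1 << PAGE_BITS) << BANK_BITS
stride_accesses = [i * TOTAL_ROW_BUFFER_BYTES for i in range(8)]
conflicts = count_row_conflicts(stride_accesses, conventional_map)
print(conflicts)
```

In this toy setting every access after the first targets the same bank but a different row, so each one replaces the currently open row.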
The Permutation-based Solution for Interleaved Memory
The two architecture-related sources of DRAM row buffer conflicts, cache conflict misses and write-back conflicts, indicate that the address mapping symmetry between cache and DRAM is a structural problem in the memory hierarchy under the conventional interleaved memory mapping. Breaking this symmetry requires an external force, and permutation-based page interleaving effectively serves this purpose. The permutation-based interleaved memory has the following three properties. (1) Conflicting addresses of the last-level on-chip cache are distributed onto different DRAM banks. (2) All addresses in the same memory page remain in the same page, as in the conventional interleaved memory. (3) Memory pages are uniformly mapped among memory banks. The cost of the permutation that generates each memory bank index is trivial: it is not on the critical path of the deep memory hierarchy and can be overlapped with operations at the cache level.
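The three properties can be checked on a small model. The sketch below assumes that the permutation is a bitwise XOR of the conventional bank index with the lowest bits of the cache tag, which is how such schemes are commonly described; the text above does not spell out the exact operation. The parameters (4 banks, 2 KiB pages, a 1 MiB direct-mapped last-level cache) and the name permuted_map are hypothetical choices made for illustration.

```python
from collections import Counter

# Toy parameters (assumptions for illustration only).
PAGE_BITS = 11    # 2 KiB row buffer (one DRAM page) per bank
BANK_BITS = 2     # 4 banks
CACHE_BITS = 20   # 1 MiB direct-mapped cache: the tag starts at bit 20
BANK_MASK = (1 << BANK_BITS) - 1


def permuted_map(addr):
    """Return (bank, row); the bank index is XORed with low cache-tag bits."""
    page = addr >> PAGE_BITS
    bank_index = page & BANK_MASK              # conventional bank index
    row = page >> BANK_BITS                    # row is left unchanged
    tag_bits = (addr >> CACHE_BITS) & BANK_MASK
    return bank_index ^ tag_bits, row


# Property (1): addresses that conflict in the cache (same index, different
# tags) are spread over different banks.
base = 0x1800
conflict_banks = [permuted_map(base + i * (1 << CACHE_BITS))[0] for i in range(4)]

# Property (2): every address of one page maps to the same (bank, row).
page_start = 0x12345 << PAGE_BITS
same_page_targets = {
    permuted_map(page_start + offset) for offset in range(0, 1 << PAGE_BITS, 64)
}

# Property (3): pages are spread uniformly over the banks (first 4 MiB).
pages_per_bank = Counter(permuted_map(p << PAGE_BITS)[0] for p in range(2048))

print(conflict_banks, len(same_page_targets), dict(pages_per_bank))
```

Under these assumptions the four cache-conflicting addresses land in four different banks, whereas the conventional mapping would send all of them to one bank. The XOR is a single gate delay per bank bit, which is consistent with the claim that the permutation cost is trivial.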
The Usage of the Permutation-based Page Interleaving in Computers
The permutation-based solution for interleaved memory was first used by Sun Microsystems in the UltraSPARC IIIi processor in 2001 for its entry-level servers, workstations, and desktop products. Today, permutation-based page interleaving can be found in almost all commercial microprocessors, such as those from AMD, Intel, and NVIDIA, for embedded systems, laptops, desktops, and enterprise servers.