Stampede2 RAM Arrangement
Stampede2 includes 4,200 Intel Xeon Phi compute nodes with KNL processors, and 1,736 Intel Xeon compute nodes with Skylake processors. While the two sides of the system have notable differences, the general organization of Stampede2 follows the typical pattern for clusters, which can be summarized and illustrated as follows:
- KNL nodes: distributed memory
  - Each KNL node has 96GB DDR4 RAM
  - Each KNL node also has 16GB of MCDRAM high-bandwidth memory (HBM)
  - Memory is local to each node and is not directly accessible from other nodes
- Skylake nodes: distributed memory
  - Each Skylake Xeon (SKX) node has 192GB DDR4 RAM
  - Again, memory is local to each node and is not directly accessible from other nodes
- Memory spans all cores on a node (of either type): shared memory
  - A node's full local memory is addressable from any core in any socket
- One or two sockets per node
  - Each KNL node has one socket (to hold one Intel "KNL" processor)
  - Each Xeon node has two sockets (to hold two Intel "Skylake" processors)
- Multiple cores per socket
  - Each KNL socket (processor) has 68 cores
  - Each Xeon socket (processor) has 24 cores
- Memory is attached to sockets
  - Cores sharing a socket have the fastest access to the memory attached to that socket
  - KNL has additional core-level memory locality (to be discussed later)
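To make the figures in the list above easier to compare, here is a minimal Python sketch that derives cores per node, DDR4 per core, and total core counts from them. The `NodeType` class and its field names are hypothetical, introduced only for this illustration; the numbers are the ones quoted above. The per-core memory figure is a simple average, not a statement about how memory is actually allocated to cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """Hypothetical record of the per-node figures quoted in the list above."""
    name: str
    node_count: int
    sockets_per_node: int
    cores_per_socket: int
    ddr4_gb: int
    mcdram_gb: int = 0  # only the KNL nodes have MCDRAM

    @property
    def cores_per_node(self) -> int:
        # All cores on a node share that node's memory.
        return self.sockets_per_node * self.cores_per_socket

    @property
    def ddr4_gb_per_core(self) -> float:
        # Simple average: node DDR4 divided evenly among its cores.
        return self.ddr4_gb / self.cores_per_node

    @property
    def total_cores(self) -> int:
        # Cores summed across all nodes of this type.
        return self.node_count * self.cores_per_node


KNL = NodeType("KNL", node_count=4200, sockets_per_node=1,
               cores_per_socket=68, ddr4_gb=96, mcdram_gb=16)
SKX = NodeType("SKX", node_count=1736, sockets_per_node=2,
               cores_per_socket=24, ddr4_gb=192)

for node in (KNL, SKX):
    print(f"{node.name}: {node.cores_per_node} cores/node, "
          f"{node.ddr4_gb_per_core:.2f} GB DDR4 per core, "
          f"{node.total_cores} cores in total")
```

With these inputs, a KNL node works out to 68 cores with about 1.41GB of DDR4 per core, while a Skylake node works out to 48 cores with 4GB per core. That difference is one reason memory-hungry codes may need to run fewer tasks per node on KNL.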
In most of the following diagrams, we will use the Xeon Skylake processor as a model in order to make the figures easier to follow (which wouldn’t be the case if we always drew KNL cores!).
That is a lot of money...