[Protocol Forest] NUMA Basics

Category: Linux Internals. Tags: numa

Article link: https://blog.csdn.net/u012503639/article/details/121210352


1. Overview

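In early computers the memory controller was not yet integrated into the CPU, and every memory access went through the northbridge chip. The CPU connected to the northbridge over the front-side bus (FSB), and the northbridge, which contained the memory controller, connected to memory.
[Figure: UMA architecture, with the CPU connected over the FSB to the northbridge and the northbridge connected to memory]
This architecture is called UMA (Uniform Memory Access). It is simple for software: the bus model guarantees that memory access is uniform, meaning all processor cores share one memory address space and reach any part of it at the same cost. As the number of CPU cores grows, however, the design runs into trouble: the shared bus becomes a bandwidth bottleneck, and cores contend when they access the same memory. NUMA was introduced to address these problems.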

2. Basic NUMA Architecture

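NUMA stands for Non-Uniform Memory Access. In this architecture, memory devices and CPU cores are grouped into nodes, and each node has its own integrated memory controller (IMC). Inside a node the architecture resembles SMP, and the cores communicate over the IMC bus; different nodes communicate over QPI (QuickPath Interconnect).
[Figure: NUMA architecture, with nodes connected by QPI]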

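CPU vendors integrated the memory controller into the CPU; typically each CPU socket has its own memory controller.
Each CPU socket is directly attached to part of the memory, which is called that CPU's "local memory".
CPUs are connected to one another by the QPI bus, over which a CPU can access "remote memory" that is not directly attached to it.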

Unlike UMA, NUMA distinguishes local from remote memory access: accessing remote memory has noticeably higher latency than accessing local memory.

3. Basic NUMA Operations

Run numactl --hardware to see the NUMA topology that the hardware exposes:

# numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
node 0 size: 96920 MB
node 0 free: 2951 MB
node 1 cpus: 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
node 1 size: 98304 MB
node 1 free: 33 MB
node distances:
node   0   1 
  0:  10  21 
  1:  21  10

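The CPUs are split into two groups, node 0 and node 1 (this machine has two CPU sockets).
Each group has roughly 96 GB of memory (the machine has 192 GB in total).
node distances is a two-dimensional matrix in which node[i][j] is the relative distance from node i to the memory of node j. For example, the distance from node 0 to its own memory is 10, while the distance from node 0 to the memory of node 1 is 21.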
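The distance matrix can be read as a table of relative costs. The following is a minimal sketch, assuming the matrix has already been copied into a vector; DistanceMatrix, relative_cost and nearest_remote_node are hypothetical names chosen for illustration, not part of numactl or libnuma. The ratio is only a hint reported by the platform, not a measured latency.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for the "node distances" table printed by numactl.
using DistanceMatrix = std::vector<std::vector<int>>;

// Cost of node `from` reaching memory on node `to`, relative to its own
// local distance (10 in the output above).
double relative_cost(const DistanceMatrix& d, std::size_t from, std::size_t to) {
  return static_cast<double>(d[from][to]) / d[from][from];
}

// Index of the closest node other than `from`.
// Returns `from` itself when the machine has only one node.
std::size_t nearest_remote_node(const DistanceMatrix& d, std::size_t from) {
  std::size_t best = from;
  for (std::size_t to = 0; to < d[from].size(); ++to) {
    if (to == from) continue;
    if (best == from || d[from][to] < d[from][best]) best = to;
  }
  return best;
}
```

With the matrix {{10, 21}, {21, 10}} from the output above, relative_cost(d, 0, 1) is 21 / 10, and nearest_remote_node(d, 0) is node 1 because it is the only other node.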

Run numactl --show to display the NUMA policy settings of the current process:

# numactl --show
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 
cpubind: 0 1 
nodebind: 0 1 
membind: 0 1

Other operations (this numactl build has no -h option, so passing it prints the usage text):

root@ubuntu:~# numactl -h
numactl: invalid option -- 'h'
usage: numactl [--all | -a] [--interleave= | -i <nodes>] [--preferred= | -p <node>]
               [--physcpubind= | -C <cpus>] [--cpunodebind= | -N <nodes>]
               [--membind= | -m <nodes>] [--localalloc | -l] command args ...
       numactl [--show | -s]
       numactl [--hardware | -H]
       numactl [--length | -l <length>] [--offset | -o <offset>] [--shmmode | -M <shmmode>]
               [--strict | -t]
               [--shmid | -I <id>] --shm | -S <shmkeyfile>
               [--shmid | -I <id>] --file | -f <tmpfsfile>
               [--huge | -u] [--touch | -T]
               memory policy | --dump | -d | --dump-nodes | -D

memory policy is --interleave | -i, --preferred | -p, --membind | -m, --localalloc | -l
<nodes> is a comma delimited list of node numbers or A-B ranges or all.
Instead of a number a node can also be:
  netdev:DEV the node connected to network device DEV
  file:PATH  the node the block device of path is connected to
  ip:HOST    the node of the network device host routes through
  block:PATH the node of block device path
  pci:[seg:]bus:dev[:func] The node of a PCI device
<cpus> is a comma delimited list of cpu numbers or A-B ranges or all
all ranges can be inverted with !
all numbers and ranges can be made cpuset-relative with +
the old --cpubind argument is deprecated.
use --cpunodebind or --physcpubind instead
<length> can have g (GB), m (MB) or k (KB) suffixes
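The usage text describes <cpus> as a comma-delimited list of CPU numbers or A-B ranges. As an illustration of that format only, here is a small sketch that expands such a list; expand_cpu_list is a hypothetical helper, not how numactl itself parses arguments, and it deliberately ignores the "all", "!" and "+" forms.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Expands a list such as "0-3,8,10-11" into {0, 1, 2, 3, 8, 10, 11}.
// Only plain numbers and A-B ranges are handled (assumption: well-formed input).
std::vector<int> expand_cpu_list(const std::string& spec) {
  std::vector<int> cpus;
  std::size_t pos = 0;
  while (pos < spec.size()) {
    std::size_t comma = spec.find(',', pos);
    if (comma == std::string::npos) comma = spec.size();
    const std::string item = spec.substr(pos, comma - pos);
    pos = comma + 1;
    if (item.empty()) continue;
    // A single number has no dash, so first and last are the same CPU.
    const std::size_t dash = item.find('-');
    const int first = std::stoi(item.substr(0, dash));
    const int last =
        dash == std::string::npos ? first : std::stoi(item.substr(dash + 1));
    for (int cpu = first; cpu <= last; ++cpu) cpus.push_back(cpu);
  }
  return cpus;
}
```

For example, expand_cpu_list("0-23,48-71") yields the 48 CPU numbers that the numactl --hardware output earlier lists for node 0.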

4. Testing NUMA

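#include <sys/time.h>

#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

int main(int argc, char** argv) {
  if (argc < 2) {
    std::cerr << "usage: " << argv[0] << " <size>" << std::endl;
    return 1;
  }
  // Matrix dimension; the matrix holds size * size 8-byte elements.
  int size = std::stoi(argv[1]);
  std::vector<std::vector<std::uint64_t>> data(size,
                                               std::vector<std::uint64_t>(size));

  struct timeval b;
  gettimeofday(&b, nullptr);
  // Traverse column by column so that the CPU cache cannot hide memory latency.
  for (int col = 0; col < size; ++col) {
    for (int row = 0; row < size; ++row) {
      data[row][col] = rand();
    }
  }

  struct timeval e;
  gettimeofday(&e, nullptr);

  // Elapsed wall-clock time in microseconds.
  std::cout << "Use time "
            << e.tv_sec * 1000000 + e.tv_usec - b.tv_sec * 1000000 - b.tv_usec
            << "us" << std::endl;
}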
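As a variation on the test program, the same column-by-column pass can be timed with std::chrono::steady_clock, which is monotonic, unlike gettimeofday. This is only a sketch under a few assumptions: fill_column_major is a hypothetical helper name, the matrix is stored in one contiguous buffer rather than a vector of vectors, and the stored value is computed from the indices instead of calling rand().

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Writes every element of a size x size matrix in column-major order and
// returns the elapsed time in microseconds. `data` must hold size * size elements.
std::int64_t fill_column_major(std::vector<std::uint64_t>& data, std::size_t size) {
  const auto begin = std::chrono::steady_clock::now();
  for (std::size_t col = 0; col < size; ++col) {
    for (std::size_t row = 0; row < size; ++row) {
      // Consecutive writes are size * 8 bytes apart, so for size >= 8 each
      // write touches a different 64-byte cache line.
      data[row * size + col] = row ^ col;
    }
  }
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count();
}
```

A caller would allocate std::vector<std::uint64_t> data(size * size) first and then run the binary under numactl exactly as with the original program.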

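Run the program twice on the CPUs of node 0, first with memory bound to the local node 0 and then to the remote node 1:

# numactl --cpubind=0 --membind=0 ./numa_test 20000
Use time 16465637us
# numactl --cpubind=0 --membind=1 ./numa_test 20000
Use time 21402436us

With remote memory the same workload takes about 21.4 s instead of about 16.5 s. Note that the usage text above marks --cpubind as deprecated in favor of --cpunodebind or --physcpubind.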
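To put the two timings in proportion, the relative slowdown can be computed directly. The helper below is a hypothetical illustration (slowdown_percent is not from the article), and the result describes only this single pair of runs on this machine.

```cpp
#include <cstdint>

// Extra time of the remote run, as a percentage of the local run.
double slowdown_percent(std::int64_t local_us, std::int64_t remote_us) {
  return 100.0 * static_cast<double>(remote_us - local_us) /
         static_cast<double>(local_us);
}
```

Calling slowdown_percent(16465637, 21402436) gives roughly 30, so in this run remote memory made the program about 30% slower. That is much less than the 21 to 10 ratio in the node distances table, since that table holds relative distance values, not measured latencies.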