How to read lscpu output

The example machine here is a DGX A100 workstation:

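  • CPU(s): number of logical CPUs
  • Thread(s) per core: number of hardware threads on each physical core; each thread is one logical CPU
  • Socket(s): number of physical CPU sockets
  • Core(s) per socket: number of physical cores in each socket
  • NUMA node(s): number of NUMA nodes. A NUMA node is a group of logical CPUs; communication between nodes is slower than communication within a node.
  • NUMA node0 CPU(s): the logical CPUs that belong to node0. Here node0 contains logical CPUs 0-15 and 128-143.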

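From this we can tell that the machine has 2 physical CPU sockets with 64 physical cores each, for 2 × 64 = 128 physical cores in total. Each core runs two threads (hyper-threading; each thread counts as one logical CPU), so the machine has 128 × 2 = 256 logical CPUs.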
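The same arithmetic can be done directly from the lscpu fields. This is only a minimal sketch: the sample text is copied from the output of this machine, and `lscpu_field` is a helper name I made up for illustration, not part of lscpu.

```shell
# Sample lines copied from the lscpu output of the example machine.
# On a real system, replace the here-document with: lscpu_text=$(lscpu)
lscpu_text=$(cat <<'EOF'
CPU(s):              256
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
EOF
)

# Hypothetical helper: print the value of one "Key: value" field.
lscpu_field() {
    printf '%s\n' "$lscpu_text" | awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

sockets=$(lscpu_field "Socket(s)")
cores_per_socket=$(lscpu_field "Core(s) per socket")
threads_per_core=$(lscpu_field "Thread(s) per core")

# sockets x cores per socket = physical cores; x threads per core = logical CPUs
physical_cores=$((sockets * cores_per_socket))
logical_cpus=$((physical_cores * threads_per_core))

echo "physical cores: $physical_cores"
echo "logical CPUs:   $logical_cpus"
```

The computed number of logical CPUs should match the `CPU(s)` field that lscpu reports.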

These 256 logical CPUs are grouped into NUMA nodes: there are 8 nodes, each with 32 logical CPUs. The output also shows which logical CPUs belong to each node.

apps@dgx-a100:~/jinrf/test/nwchem/large_offical$ numactl -s
policy: default
preferred node: current
physcpubind: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 
cpubind: 0 1 2 3 4 5 6 7 
nodebind: 0 1 2 3 4 5 6 7 
membind: 0 1 2 3 4 5 6 7 
apps@dgx-a100:~/jinrf/test/nwchem/large_offical$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        8
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               49
Model name:          AMD EPYC 7742 64-Core Processor
Stepping:            0
CPU MHz:             3382.761
CPU max MHz:         2250.0000
CPU min MHz:         1500.0000
BogoMIPS:            4491.52
Virtualization:      AMD-V
L1d cache:           32K
L1i cache:           32K
L2 cache:            512K
L3 cache:            16384K
NUMA node0 CPU(s):   0-15,128-143
NUMA node1 CPU(s):   16-31,144-159
NUMA node2 CPU(s):   32-47,160-175
NUMA node3 CPU(s):   48-63,176-191
NUMA node4 CPU(s):   64-79,192-207
NUMA node5 CPU(s):   80-95,208-223
NUMA node6 CPU(s):   96-111,224-239
NUMA node7 CPU(s):   112-127,240-255
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdrfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm  perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseedoccup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists succor smca
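To check that every node really has 32 logical CPUs, the CPU lists on the `NUMA nodeN CPU(s)` lines above can be counted. A small sketch; `count_cpus` is a hypothetical helper name, and it only handles plain comma-separated ids and `a-b` ranges as they appear in this output.

```shell
# Hypothetical helper: count the CPUs in an lscpu-style list such as "0-15,128-143".
count_cpus() {
    printf '%s\n' "$1" | awk -F',' '{
        total = 0
        for (i = 1; i <= NF; i++) {
            n = split($i, range, "-")
            if (n == 2) total += range[2] - range[1] + 1   # a range "a-b"
            else        total += 1                          # a single CPU id
        }
        print total
    }'
}

count_cpus "0-15,128-143"      # node0
count_cpus "112-127,240-255"   # node7
```

Each range here covers 16 CPUs, so each of the two calls prints 32.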

To run a program only on the CPUs of one node, use numactl

numactl runs processes with a specific NUMA scheduling or memory placement policy.

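This is the description from the man page. In other words, numactl lets you control which CPUs your process runs on and which node its memory is allocated from. Under NUMA (Non-Uniform Memory Access), CPUs are not all equivalent with respect to memory: CPUs in different nodes access the same memory location at different speeds (there are probably other differences too, but I don't know them yet). The command therefore lets the user decide manually where a process runs and where its memory comes from.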

Some of the options it provides:

--membind=nodes, -m nodes
    Only allocate memory from nodes.  
    Allocation will fail when there is not enough memory available on these nodes.  
    nodes may be specified as noted above.

 --cpunodebind=nodes, -N nodes
    Only execute command on the CPUs of nodes.  
    Note that nodes may consist of multiple CPUs.  
    nodes may be specified as N,N,N or  N-N or N,N-N or  N-N,N-N and so forth.

 --physcpubind=cpus, -C cpus
     Only execute process on cpus.
     This accepts cpu numbers as shown in the processor fields of /proc/cpuinfo, or relative cpus as in relative to the current cpuset.
     You may specify "all", which means all cpus in the current cpuset.
     Physical cpus may be specified as N,N,N or N-N or N,N-N or N-N,N-N and so forth.
     Relative cpus may be specified as +N,N,N or +N-N or +N,N-N and so forth.
     The + indicates that the cpu numbers are relative to the process' set of allowed cpus in its current cpuset.
     A !N-N notation indicates the inverse of N-N, in other words all cpus except N-N.
     If used with + notation, specify !+N-N.
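To connect these options back to the lscpu output: the CPU list of a node can be passed to `--physcpubind`, which should restrict the process to the same CPUs as `--cpunodebind` for that node. The sketch below only prints the two command lines instead of running them; `./my_app` is a placeholder program name and the sample line is copied from the lscpu output of this machine.

```shell
# Sample line copied from the lscpu output of the example machine.
# On a real system, use: lscpu_text=$(lscpu)
lscpu_text=$(cat <<'EOF'
NUMA node2 CPU(s):   32-47,160-175
EOF
)

node=2

# Look up the CPU list that lscpu reports for the chosen node.
cpus=$(printf '%s\n' "$lscpu_text" |
    awk -F': *' -v key="NUMA node$node CPU(s)" '$1 == key { print $2 }')

# Dry run: print the command lines only. ./my_app is a placeholder.
echo "numactl --cpunodebind=$node --membind=$node -- ./my_app"
echo "numactl --physcpubind=$cpus --membind=$node -- ./my_app"
```

Binding by node is shorter and does not depend on the CPU numbering; binding by explicit CPU list is useful when only part of a node should be used.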

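The following command runs the program only on the CPUs of node 2 and allocates memory only from node 2. Note that --membind is strict: as the man page excerpt above says, if node 2 does not have enough memory the allocation fails; it does not fall back to other nodes (numactl's --preferred option is the one that allows falling back).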

numactl --cpunodebind=2 --membind=2 -- mpirun -np "$1" nwchem simple.nw | tee "$2"
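Before launching a long job it can help to check what the binding looks like. One way, sketched below, is to run `numactl -s` itself under the same binding, since the policy is inherited by the child process. `run_on_node` and `DRY_RUN` are names I made up for this sketch, and `-np 4` is just an assumed process count; with `DRY_RUN=1` (the default here) the commands are only printed, not executed.

```shell
# With DRY_RUN=1 (the default in this sketch) commands are printed, not run.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical wrapper: run a command bound to the CPUs and memory of one node.
run_on_node() {
    node=$1
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "numactl --cpunodebind=$node --membind=$node -- $*"
    else
        numactl --cpunodebind="$node" --membind="$node" -- "$@"
    fi
}

# First show the policy a child process would inherit, then the real job.
run_on_node 2 numactl -s
run_on_node 2 mpirun -np 4 nwchem simple.nw
```

Run with `DRY_RUN=0` on a machine that has numactl installed, the first call should report cpubind, nodebind and membind as 2 instead of the full 0-7 list shown earlier.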