Enabling NVLink on Dual RTX 2080 Ti GPUs

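This article describes how to enable NVLink between two RTX 2080 Ti cards on Ubuntu 22.04.
Before starting, make sure CUDA is installed and its environment variables are configured.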

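1. Enable

Turn on persistence mode (requires root), reboot, then print the GPU topology. According to the nvidia-smi documentation, a persistence-mode setting made with -pm does not persist across reboots, so rerun the first command after the reboot if you need it to stay on.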

nvidia-smi -pm 1
sudo reboot
nvidia-smi topo -m

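The output looks like this. The NV2 entry between GPU0 and GPU1 means the two cards are connected through a bonded set of two NVLinks: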

(base) root@myd-gpu:~# nvidia-smi topo -m
        GPU0    GPU1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      NV2     0-11,24-35      0               N/A
GPU1    NV2      X      12-23,36-47     1               N/A

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  NV#  = Connection traversing a bonded set of # NVLinks
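To check the link type from a script instead of reading the matrix by eye, the output above can be parsed. The following is a minimal sketch, assuming the column layout shown in this article; `parse_topo` and `nvlink_count` are hypothetical helper names, not part of any NVIDIA tool, and other driver versions may print extra columns.

```python
import re

def parse_topo(text):
    """Return {(src, dst): link_type} parsed from `nvidia-smi topo -m` output."""
    links = {}
    header = None
    for line in text.splitlines():
        cols = line.split()
        if not cols:
            continue
        if header is None:
            # The header row starts with the GPU names; keep only those columns.
            if cols[0].startswith("GPU"):
                header = [c for c in cols if re.fullmatch(r"GPU\d+", c)]
            continue
        if re.fullmatch(r"GPU\d+", cols[0]):
            # The first len(header) cells after the row label are the GPU-to-GPU links.
            for dst, cell in zip(header, cols[1:1 + len(header)]):
                links[(cols[0], dst)] = cell
    return links

def nvlink_count(link_type):
    """Number of bonded NVLinks for an 'NV#' entry, or 0 for any other link type."""
    m = re.fullmatch(r"NV(\d+)", link_type)
    return int(m.group(1)) if m else 0

# Sample text copied from the output shown in this article.
SAMPLE = """\
        GPU0    GPU1    CPU Affinity    NUMA Affinity   GPU NUMA ID
GPU0     X      NV2     0-11,24-35      0               N/A
GPU1    NV2      X      12-23,36-47     1               N/A
"""

links = parse_topo(SAMPLE)
print(links[("GPU0", "GPU1")], nvlink_count(links[("GPU0", "GPU1")]))
```

On a real machine the text could come from `subprocess.run(["nvidia-smi", "topo", "-m"], capture_output=True, text=True).stdout`. A count of 0 for the GPU0/GPU1 pair would mean the cards are communicating over PCIe only.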

2. Test

Download the official samples

git clone https://github.com/NVIDIA/cuda-samples.git

Build and run

pip install cmake
cd cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest
mkdir build && cd build
cmake ..
make -j$(nproc)
./p2pBandwidthLatencyTest

Check the results

(base) root@myd-gpu:~/cuda-samples/Samples/5_Domain_Specific/p2pBandwidthLatencyTest/build# ./p2pBandwidthLatencyTest 
[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
Device: 0, NVIDIA GeForce RTX 2080 Ti, pciBusID: 4, pciDeviceID: 0, pciDomainID:0
Device: 1, NVIDIA GeForce RTX 2080 Ti, pciBusID: 81, pciDeviceID: 0, pciDomainID:0
Device=0 CAN Access Peer Device=1
Device=1 CAN Access Peer Device=0

***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

P2P Connectivity Matrix
     D\D     0     1
     0       1     1
     1       1     1
Unidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 541.95   5.67 
     1   5.72 536.94 
Unidirectional P2P=Enabled Bandwidth (P2P Writes) Matrix (GB/s)
   D\D     0      1 
     0 523.63  47.11 
     1  47.11 536.57 
Bidirectional P2P=Disabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 535.84   8.49 
     1   8.44 533.98 
Bidirectional P2P=Enabled Bandwidth Matrix (GB/s)
   D\D     0      1 
     0 534.00  94.18 
     1  94.13 533.34 
P2P=Disabled Latency Matrix (us)
   GPU     0      1 
     0   1.48  16.92 
     1  14.64   1.34 

   CPU     0      1 
     0   3.10   9.39 
     1   9.35   3.30 
P2P=Enabled Latency (P2P Writes) Matrix (us)
   GPU     0      1 
     0   1.34   1.46 
     1   1.53   1.34 

   CPU     0      1 
     0   2.96   2.60 
     1   2.73   3.30 

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
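The effect of NVLink is easiest to see by comparing the off-diagonal (GPU0 to GPU1) entries with P2P disabled and enabled. The sketch below only does arithmetic on the numbers printed above; `MEASURED` and `p2p_gain` are hypothetical names introduced for illustration. As the sample itself notes, these figures are not rigorous performance measurements.

```python
from statistics import mean

# Off-diagonal entries copied from the output above.
# Each value is ((0->1, 1->0) with P2P disabled, (0->1, 1->0) with P2P enabled).
MEASURED = {
    "unidirectional bandwidth (GB/s)": ((5.67, 5.72), (47.11, 47.11)),
    "bidirectional bandwidth (GB/s)": ((8.49, 8.44), (94.18, 94.13)),
    "GPU latency (us)": ((16.92, 14.64), (1.46, 1.53)),
}

def p2p_gain(disabled, enabled, lower_is_better=False):
    """Factor by which enabling P2P improves the averaged metric."""
    off, on = mean(disabled), mean(enabled)
    # For latency a smaller value is better, so the ratio is inverted.
    return off / on if lower_is_better else on / off

for name, (disabled, enabled) in MEASURED.items():
    gain = p2p_gain(disabled, enabled, lower_is_better="latency" in name)
    print(f"{name}: {gain:.1f}x better with P2P enabled")
```

For this run, inter-GPU bandwidth is roughly 8 times higher in one direction and roughly 11 times higher bidirectionally with P2P enabled, and GPU-to-GPU latency drops by about a factor of 10.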