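Using the kcores VRAM ladder for large language model inference as a reference, this post works out how many tokens per second each GPU generates when running the llama-3.1-70b-instruct-4bit model, and ranks the hardware by that figure. All numbers are theoretical performance with no losses accounted for, and are for reference only.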
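The per-GPU figure used for the ranking can be reproduced from the other two columns of the table: it is the total tokens per second of a configuration divided by the number of devices in it. The following is a minimal sketch of that arithmetic and of the resulting ordering, not the author's actual script. The names `Entry`, `per_gpu`, and `rank` are hypothetical, and the sample values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    name: str
    gpu_count: int              # number of devices in the configuration, as listed
    total_tokens_per_s: float   # theoretical total for the whole configuration


def per_gpu(entry: Entry) -> float:
    """Average theoretical tokens/s contributed by one device."""
    return entry.total_tokens_per_s / entry.gpu_count


def rank(entries: list[Entry]) -> list[tuple[int, Entry, float]]:
    """Sort by per-GPU average, highest first, and number positions from 1."""
    ordered = sorted(entries, key=per_gpu, reverse=True)
    return [(i, e, per_gpu(e)) for i, e in enumerate(ordered, start=1)]


# Sample rows copied from the table (illustrative subset only)
sample = [
    Entry("NVIDIA GeForce RTX 4090 24GB", 2, 42.08),
    Entry("NVIDIA H100 SXM5 80GB", 1, 70.0),
    Entry("NVIDIA GeForce RTX 3080 10GB", 8, 126.72),
    Entry("NVIDIA RTX 6000 Ada 48GB", 1, 20.0),
]

for position, entry, avg in rank(sample):
    print(f"{position}. {entry.name}: {avg:.2f} tokens/s per GPU")
```

Ranking by the per-GPU average rather than the total explains why an eight-card RTX 3080 10GB setup, despite the highest total in this sample, lands below a single RTX 6000 Ada.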
Ranking by theoretical tokens per second per GPU (or cluster):
GPU | GPU count | Total tokens/s | Average tokens/s per GPU | Rank |
NVIDIA GB200 NVL72 | 1 | 12000 | 12000 | 1 |
NVIDIA GB200 Grace Blackwell Superchip | 1 | 333.33 | 333.33 | 2 |
NVIDIA B200 SXM 192GB | 1 | 170.83 | 170.83 | 3 |
NVIDIA H100 PCIe/SXM5 96GB | 1 | 70 | 70 | 4 |
NVIDIA H100 SXM5 80GB | 1 | 70 | 70 | 5 |
NVIDIA H800 SXM5 80GB | 1 | 70 | 70 | 6 |
NVIDIA H100 PCIe/CNX 80GB | 1 | 42.5 | 42.5 | 7 |
NVIDIA A100/A100X SXM4 80GB | 1 | 42.5 | 42.5 | 8 |
NVIDIA A800 SXM4 80GB | 1 | 42.5 | 42.5 | 9 |
NVIDIA H800 PCIe 80GB | 1 | 42.5 | 42.5 | 10 |
NVIDIA H100 SXM5 64GB | 1 | 42.08 | 42.08 | 11 |
NVIDIA A100 PCIe 80GB | 1 | 40.42 | 40.42 | 12 |
NVIDIA A800 80GB Active Ampere | 1 | 40.42 | 40.42 | 13 |
NVIDIA GRID/DRIVE A100A 32GB | 2 | 77.92 | 38.96 | 14 |
NVIDIA GRID A100B 48GB | 1 | 38.96 | 38.96 | 15 |
NVIDIA GeForce RTX 5090 32GB (Preliminary) | 2 | 74.67 | 37.335 | 16 |
NVIDIA A100 PCIe/SXM4 40GB | 2 | 65 | 32.5 | 17 |
NVIDIA A800 40GB Active Ampere | 2 | 65 | 32.5 | 18 |
NVIDIA A30X 24GB | 2 | 50.83 | 25.415 | 19 |
NVIDIA Tesla V100 SXM2 16GB (version 2019) | 4 | 94.17 | 23.5425 | 20 |
NVIDIA Tesla V100S PCIe 32GB | 2 | 47.08 | 23.54 | 21 |
NVIDIA GeForce RTX 5080 16GB (Preliminary) | 4 | 85.33 | 21.3325 | 22 |
NVIDIA GeForce RTX 4090 24GB | 2 | 42.08 | 21.04 | 23 |
NVIDIA GeForce RTX 3090 Ti 24GB | 2 | 42.08 | 21.04 | 24 |
NVIDIA Tesla V100 SXM3 32GB | 2 | 40.88 | 20.44 | 25 |
NVIDIA RTX 6000 Ada 48GB | 1 | 20 | 20 | 26 |
NVIDIA GeForce RTX 3090 24GB | 2 | 39.01 | 19.505 | 27 |
NVIDIA A30 PCIe 24GB | 2 | 38.88 | 19.44 | 28 |
NVIDIA GeForce RTX 3080 12GB (Ti 12GB) | 4 | 76.03 | 19.0075 | 29 |
NVIDIA Tesla V100 PCIe/SXM2/DGXS 32GB | 2 | 37.42 | 18.71 | 30 |
NVIDIA Tesla V100 PCIe/SXM2 16GB | 4 | 74.75 | 18.6875 | 31 |
NVIDIA GeForce RTX 5070 Ti 16GB (Preliminary) | 4 | 74.67 | 18.6675 | 32 |
NVIDIA Quadro GV100 32GB | 2 | 36.18 | 18.09 | 33 |
NVIDIA TITAN V CEO Edition 32GB | 2 | 36.18 | 18.09 | 34 |
NVIDIA L40/L40G 24GB | 2 | 36 | 18 | 35 |
NVIDIA L20 48GB | 1 | 18 | 18 | 36 |
NVIDIA L40/L40S 48GB | 1 | 18 | 18 | 37 |
NVIDIA RTX 5880 Ada 48GB | 1 | 18 | 18 | 38 |
Apple MacStudio M1 Ultra 64GB | 1 | 17.07 | 17.07 | 39 |
Apple MacStudio M1 Ultra 128GB | 1 | 17.07 | 17.07 | 40 |
Apple MacStudio M2 Ultra 64GB | 1 | 17.07 | 17.07 | 41 |
Apple MacStudio M2 Ultra 128GB | 1 | 17.07 | 17.07 | 42 |
Apple MacStudio M2 Ultra 192GB | 1 | 17.07 | 17.07 | 43 |
DDR6 12 Channel 8400 512GB | 1 | 16.8 | 16.8 | 44 |
NVIDIA A16 PCIe 64GB | 1 | 16.68 | 16.68 | 45 |
NVIDIA RTX A5500 Ampere 24GB | 2 | 32 | 16 | 46 |
NVIDIA RTX A5000 Ampere 24GB | 2 | 32 | 16 | 47 |
NVIDIA RTX A6000 Ampere 48GB | 1 | 16 | 16 | 48 |
NVIDIA GeForce RTX 3080 10GB | 8 | 126.72 | 15.84 | 49 |
NVIDIA GeForce RTX 3080 Ti 20GB | 4 | 63.36 | 15.84 | 50 |
NVIDIA GeForce RTX 4080 SUPER 16GB | 4 | 61.36 | 15.34 | 51 |
NVIDIA Quadro GP100 16GB | 4 | 61.02 | 15.255 | 52 |
NVIDIA Tesla P100 SXM2/DGXS 16GB | 4 | 61.02 | 15.255 | 53 |
NVIDIA GeForce RTX 4080 16GB | 4 | 59.73 | 14.9325 | 54 |
NVIDIA A40 PCIe 48GB | 1 | 14.5 | 14.5 | 55 |
NVIDIA Tesla P10 24GB | 2 | 28.93 | 14.465 | 56 |
NVIDIA GeForce RTX 4070 Ti SUPER 16GB | 4 | 56.03 | 14.0075 | 57 |
NVIDIA GeForce RTX 5070 12GB (Preliminary) | 4 | 56 | 14 | 58 |
NVIDIA Quadro RTX 6000 24GB | 2 | 28 | 14 | 59 |
NVIDIA TITAN RTX 24GB | 2 | 28 | 14 | 60 |
NVIDIA Quadro RTX 8000 48GB | 1 | 14 | 14 | 61 |
NVIDIA TITAN V 12GB | 4 | 54.28 | 13.57 | 62 |
NVIDIA RTX A4500 Ampere 20GB | 4 | 53.33 | 13.3325 | 63 |
NVIDIA GeForce RTX 2080 Ti 11GB | 8 | 102.67 | 12.83375 | 64 |
NVIDIA GeForce RTX 3070 Ti 8GB | 8 | 101.38 | 12.6725 | 65 |
NVIDIA GeForce RTX 3060 Ti GDDR6X 8GB | 8 | 101.38 | 12.6725 | 66 |
NVIDIA GeForce RTX 3070 Ti 16GB | 4 | 50.69 | 12.6725 | 67 |
NVIDIA A10M 24GB | 2 | 25.01 | 12.505 | 68 |
NVIDIA A10/A10G PCIe 24GB | 2 | 25.01 | 12.505 | 69 |
NVIDIA RTX 5000 Ada 32GB | 2 | 24 | 12 | 70 |
Intel Arc A770 16GB | 4 | 46.67 | 11.6675 | 71 |
NVIDIA Tesla P100 PCIe 12GB | 4 | 45.76 | 11.44 | 72 |
NVIDIA TITAN Xp 12GB | 4 | 45.63 | 11.4075 | 73 |
Apple MacBook Pro M4 Max 64GB | 1 | 11.38 | 11.38 | 74 |
Apple MacBook Pro M4 Max 128GB | 1 | 11.38 | 11.38 | 75 |
Apple MacBook Pro M4 Max 48GB | 2 | 22.75 | 11.375 | 76 |
NVIDIA Project DIGITS 128GB | 1 | 10.67 | 10.67 | 77 |
Intel Arc A750 8GB | 8 | 85.33 | 10.66625 | 78 |
Intel Arc A580 8GB | 8 | 85.33 | 10.66625 | 79 |
NVIDIA GeForce RTX 4080 12GB | 4 | 42.02 | 10.505 | 80 |
NVIDIA GeForce RTX 4070 Ti 12GB | 4 | 42.02 | 10.505 | 81 |
NVIDIA GeForce RTX 4070 SUPER 12GB | 4 | 42.02 | 10.505 | 82 |
NVIDIA GeForce RTX 4070 12GB | 4 | 42.02 | 10.505 | 83 |
NVIDIA Tesla K80 24GB | 2 | 20.05 | 10.025 | 84 |
Intel Arc B580 12GB | 4 | 38 | 9.5 | 85 |
DDR5 12 Channel 4800 512GB | 1 | 9.38 | 9.38 | 86 |
NVIDIA GeForce RTX 3070 8GB | 8 | 74.67 | 9.33375 | 87 |
NVIDIA GeForce RTX 3060 Ti 8GB | 8 | 74.67 | 9.33375 | 88 |
NVIDIA GeForce RTX 2070 8GB | 8 | 74.67 | 9.33375 | 89 |
NVIDIA GeForce RTX 2080 8GB | 8 | 74.67 | 9.33375 | 90 |
NVIDIA GeForce RTX 5060 8GB (Preliminary) | 8 | 74.67 | 9.33375 | 91 |
NVIDIA RTX A4000 Ampere 16GB | 4 | 37.33 | 9.3325 | 92 |
NVIDIA Quadro RTX 5000 16GB | 4 | 37.33 | 9.3325 | 93 |
NVIDIA Quadro P6000 24GB | 2 | 18.03 | 9.015 | 94 |
NVIDIA GeForce RTX 4080 Mobile 12GB | 4 | 36 | 9 | 95 |
NVIDIA RTX 4500 Ada 24GB | 2 | 18 | 9 | 96 |
NVIDIA Tesla T10 16GB | 4 | 35.85 | 8.9625 | 97 |
NVIDIA Quadro RTX 4000 8GB | 8 | 69.33 | 8.66625 | 98 |
Apple MacStudio M1 Max 32GB | 2 | 17.07 | 8.535 | 99 |
Apple MacStudio M2 Max 32GB | 2 | 17.07 | 8.535 | 100 |
Apple MacBook Pro M3 Max 48GB | 2 | 17.07 | 8.535 | 101 |
Apple MacStudio M1 Max 64GB | 1 | 8.53 | 8.53 | 102 |
Apple MacStudio M2 Max 64GB | 1 | 8.53 | 8.53 | 103 |
Apple MacStudio M2 Max 96GB | 1 | 8.53 | 8.53 | 104 |
Apple MacBook Pro M3 Max 64GB | 1 | 8.53 | 8.53 | 105 |
Apple MacBook Pro M3 Max 128GB | 1 | 8.53 | 8.53 | 106 |
Intel Arc B570 10GB | 8 | 63.33 | 7.91625 | 107 |
NVIDIA GeForce RTX 3060 12GB | 4 | 30 | 7.5 | 108 |
NVIDIA RTX 4000 Ada 20GB | 4 | 30 | 7.5 | 109 |
NVIDIA Tesla P40 24GB | 2 | 14.46 | 7.23 | 110 |
NVIDIA Tesla M10 32GB | 2 | 13.87 | 6.935 | 111 |
NVIDIA Tesla M60 16GB | 4 | 26.73 | 6.6825 | 112 |
NVIDIA RTX 4000 SFF Ada 20GB | 4 | 26.67 | 6.6675 | 113 |
NVIDIA Tesla T4/T4G 16GB | 4 | 26.67 | 6.6675 | 114 |
NVIDIA L4 24GB | 2 | 12.5 | 6.25 | 115 |
DDR5 8 Channel 4800 512GB | 1 | 6.25 | 6.25 | 116 |
NVIDIA Tesla M40 24GB | 2 | 12.02 | 6.01 | 117 |
NVIDIA Tesla M40 12GB | 4 | 24.03 | 6.0075 | 118 |
NVIDIA GeForce RTX 4060 Ti 8GB | 8 | 48 | 6 | 119 |
NVIDIA RTX A2000 12GB Ampere | 4 | 24 | 6 | 120 |
NVIDIA GeForce RTX 4060 Ti 16GB | 4 | 24 | 6 | 121 |
Apple MacMini M4 Pro 48GB | 2 | 11.38 | 5.69 | 122 |
Apple MacMini M4 Pro 64GB | 1 | 5.69 | 5.69 | 123 |
Apple MacBook Pro M4 Pro 64GB | 1 | 5.69 | 5.69 | 124 |
Apple MacMini M4 Pro 24GB | 4 | 22.75 | 5.6875 | 125 |
NVIDIA GeForce RTX 4060 8GB | 8 | 45.33 | 5.66625 | 126 |
NVIDIA Quadro P4000 8GB | 8 | 40.55 | 5.06875 | 127 |
NVIDIA GeForce RTX 3060 8GB | 8 | 40 | 5 | 128 |
NVIDIA RTX 2000 Ada 16GB | 4 | 18.67 | 4.6675 | 129 |
NVIDIA GeForce RTX 3050 8GB | 8 | 37.33 | 4.66625 | 130 |
NVIDIA Jetson AGX Orin 64GB | 1 | 4.27 | 4.27 | 131 |
NVIDIA Jetson AGX Orin 32GB | 2 | 8.53 | 4.265 | 132 |
NVIDIA A2 16GB | 4 | 16.68 | 4.17 | 133 |
DDR4 8 Channel 3200 512GB (EPYC SP3 / LGA-4189) | 1 | 4.17 | 4.17 | 134 |
Apple MacMini M2 Pro 16GB | 8 | 33.33 | 4.16625 | 135 |
Apple MacMini M2 Pro 32GB | 2 | 8.33 | 4.165 | 136 |
NVIDIA Tesla P4 8GB | 8 | 32.05 | 4.00625 | 137 |
NVIDIA RTX A1000 Ampere 8GB | 8 | 32 | 4 | 138 |
NVIDIA T1000 8GB Turing | 8 | 26.67 | 3.33375 | 139 |
Apple MacBook Pro M3 Pro 36GB | 2 | 6.4 | 3.2 | 140 |
DDR4 6 Channel 2933 384GB (LGA-3647) | 1 | 2.85 | 2.85 | 141 |
NVIDIA Jetson AGX Xavier 16GB | 4 | 11.38 | 2.845 | 142 |
NVIDIA Jetson AGX Xavier 32GB | 2 | 5.69 | 2.845 | 143 |
Apple MacMini M4 16GB | 8 | 20 | 2.5 | 144 |
Apple MacMini M4 24GB | 4 | 10 | 2.5 | 145 |
Apple MacMini M4 32GB | 2 | 5 | 2.5 | 146 |
Apple MacBook Pro M4 32GB | 2 | 5 | 2.5 | 147 |
NVIDIA Jetson Orin NX 8GB | 8 | 17.07 | 2.13375 | 148 |
NVIDIA Jetson Orin NX 16GB | 4 | 8.53 | 2.1325 | 149 |
Apple MacBook Pro M3 24GB | 4 | 8.53 | 2.1325 | 150 |
NVIDIA Jetson Orin Nano Super 8GB | 8 | 17 | 2.125 | 151 |
Apple MacMini M2 16GB | 8 | 16.67 | 2.08375 | 152 |
Apple MacMini M2 24GB | 4 | 8.33 | 2.0825 | 153 |
DDR4 4 Channel 3200 256GB (LGA-2011-3) | 1 | 1.56 | 1.56 | 154 |
NVIDIA Jetson Orin Nano 8GB | 8 | 11.38 | 1.4225 | 155 |
Apple MacMini M1 16GB | 8 | 11.11 | 1.38875 | 156 |
NVIDIA Jetson Xavier NX 16GB | 4 | 4.98 | 1.245 | 157 |
NVIDIA Jetson Xavier NX 8GB | 8 | 9.95 | 1.24375 | 158 |
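For readers who want to work with the table programmatically, the rows above can be split on the `|` separator. The sketch below assumes each row has exactly five fields followed by a trailing `|`, as in the table. The helper `parse_row` is a hypothetical name, and the three rows are copied from the table as a small sample. It checks that the listed average matches total divided by count, then filters for configurations that need only one device.

```python
def parse_row(line: str) -> dict:
    """Split one 'name | count | total | average | rank |' row into typed fields."""
    cells = [c.strip() for c in line.split("|")]
    cells = [c for c in cells if c]  # drop the empty cell after the trailing '|'
    name, count, total, average, rank = cells
    return {
        "name": name,
        "count": int(count),
        "total": float(total),
        "average": float(average),
        "rank": int(rank),
    }


# Three rows copied from the table (illustrative subset only)
rows_text = """\
NVIDIA GeForce RTX 5090 32GB (Preliminary) | 2 | 74.67 | 37.335 | 16 |
NVIDIA L20 48GB | 1 | 18 | 18 | 36 |
NVIDIA GeForce RTX 2080 Ti 11GB | 8 | 102.67 | 12.83375 | 64 |
"""

rows = [parse_row(line) for line in rows_text.splitlines() if line.strip()]

# The listed average should equal total / count for every row
for row in rows:
    assert abs(row["total"] / row["count"] - row["average"]) < 1e-6

# Keep only configurations that run on a single device
single_device = [row["name"] for row in rows if row["count"] == 1]
print(single_device)
```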
The original ladder follows: