NVIDIA GPU model | BF16 throughput (TFLOPS) | Float32 throughput (TFLOPS) | Power (W) |
---|---|---|---|
P40 | 11.76② | 11.76 | 250 |
P100 PCI | 18.7 | 9.3 | 250 |
P100 NVL | 21.2 | 10.6 | |
RTX3060 | 24② (INT8: 102) | 12.8 | 170 |
RTX4060 | 30.22① (INT8: 242) | 15.11 | 115 |
RTX4060 ti | 44.12① (INT8: 353) | 22.06 | 165 |
RTX4070 | 58.3① (INT8:466) | 29.15 | 200 |
RTX3080 | 59.5② | 29.77 | 350 |
T4 | 65 | 8.1 | 70 |
RTX3080 ti | 70② | 34.1 | 350 |
RTX4070 super | 70.96① (INT8:568) | 35.48 | 220 |
RTX3090 | 71.16② | 35.58 (10496 CUDA cores) | 350 |
RTX A4000 | 76.7 (INT8: 153.4) | 19.17 | 140 |
RTX3090 ti | 80② | 40 (10752 CUDA cores) | 450 |
RTX4070 ti | 80.18① (INT8:641) | 40.09 | 285 |
RTX4070 ti super | 88.2① (INT8:706) | 44.1 | 285 |
V100 PCI | 112 | 14 | 250 |
RTX A5000 | 117② | 27.77 | 230 |
V100 NVL | 125 | 15.7 | 300 |
A10 | 125 | 31.2 | 150 |
V100S PCI | 130 | 16.4 | 250 |
A40 | 149.7② | 37.42 | 300 |
A30 | 165 | 10.3 | 165 |
RTX4090 | 165.2② (INT8: 1321) | 82.58 | 450 |
A100 PCI | 312 | 19.5 | 300 |
A800 NVL | 623.8 | 19.5 | 240 |
A100 SXM | 624 | 19.5 | 400 |
H100 PCI | 1513 | 51 | 350 |
H100 SXM | 1979 | 67 | 700 |
H100 NVL | 3958 | 134 | 400 |
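
Notes

① Estimated from the Float32 throughput (2x Float32) and possibly inaccurate; figures found online list BF16 at a 1:1 ratio with Float32.

② Data from https://www.autodl.com/home

Remark: at the time of writing, second-hand V100 cards retired from company use offer the best value for money.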
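
The estimate behind the ① rows can be reproduced with a few lines of Python. This is only a sketch: the 2x factor is an assumption inferred from the table values (for example 15.11 x 2 = 30.22 for the RTX4060), and the helper names `estimate_bf16`, `tflops_per_watt` and `summarize` are hypothetical. The throughput-per-watt figure is added here purely as an illustration of how the power column could be used; it says nothing about purchase price.

```python
# Rows copied from the table: model -> (Float32 TFLOPS, power in W).
FP32_AND_POWER = {
    "RTX4060": (15.11, 115),
    "RTX4060 ti": (22.06, 165),
    "RTX4070": (29.15, 200),
    "RTX4070 super": (35.48, 220),
    "RTX4070 ti": (40.09, 285),
    "RTX4070 ti super": (44.1, 285),
}

# Assumption: the rows marked with ① use BF16 = 2 x Float32.
BF16_PER_FP32 = 2.0


def estimate_bf16(fp32_tflops: float, ratio: float = BF16_PER_FP32) -> float:
    """Estimate BF16 throughput from Float32 throughput using a fixed ratio."""
    return round(fp32_tflops * ratio, 2)


def tflops_per_watt(tflops: float, watts: float) -> float:
    """Throughput divided by rated board power, rounded to 3 decimals."""
    return round(tflops / watts, 3)


def summarize(rows: dict) -> list:
    """Return (model, estimated BF16, BF16 per watt), most efficient first."""
    result = []
    for model, (fp32, watts) in rows.items():
        bf16 = estimate_bf16(fp32)
        result.append((model, bf16, tflops_per_watt(bf16, watts)))
    return sorted(result, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for model, bf16, efficiency in summarize(FP32_AND_POWER):
        print(f"{model:<18} {bf16:>7.2f} TFLOPS  {efficiency:.3f} TFLOPS/W")
```

Passing `ratio=1.0` to `estimate_bf16` gives the 1:1 figure mentioned in note ① instead, which halves every estimated BF16 value.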