Blackwell product family specifications

Platform | GB200 | B200 | B100 | HGX B200 | HGX B100 |
--- | --- | --- | --- | --- | --- |
Configuration | 2x B200 GPU, 1x Grace CPU | Blackwell GPU | Blackwell GPU | 8x B200 GPU | 8x B100 GPU |
FP4 Tensor Dense/Sparse | 20/40 petaflops | 9/18 petaflops | 7/14 petaflops | 72/144 petaflops | 56/112 petaflops |
FP6/FP8 Tensor Dense/Sparse | 10/20 petaflops | 4.5/9 petaflops | 3.5/7 petaflops | 36/72 petaflops | 28/56 petaflops |
INT8 Tensor Dense/Sparse | 10/20 petaops | 4.5/9 petaops | 3.5/7 petaops | 36/72 petaops | 28/56 petaops |
FP16/BF16 Tensor Dense/Sparse | 5/10 petaflops | 2.25/4.5 petaflops | 1.8/3.5 petaflops | 18/36 petaflops | 14/28 petaflops |
TF32 Tensor Dense/Sparse | 2.5/5 petaflops | 1.12/2.25 petaflops | 0.9/1.8 petaflops | 9/18 petaflops | 7/14 petaflops |
FP64 Tensor Dense | 90 teraflops | 40 teraflops | 30 teraflops | 320 teraflops | 240 teraflops |
Memory | 384GB (2x8x24GB) | 192GB (8x24GB) | 192GB (8x24GB) | 1536GB (8x8x24GB) | 1536GB (8x8x24GB) |
Memory Bandwidth | 16 TB/s | 8 TB/s | 8 TB/s | 64 TB/s | 64 TB/s |
NVLink Bandwidth | 2x 1.8 TB/s | 1.8 TB/s | 1.8 TB/s | 14.4 TB/s | 14.4 TB/s |
Power | Up to 2700W | 1000W | 700W | 8000W? | 5600W? |
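
The HGX columns in the table above are consistent with multiplying the per-GPU figures by eight. Here is a minimal Python sketch of that arithmetic, using values copied from the B200 and B100 columns. The dictionary keys and the `hgx_totals` helper are illustrative names chosen for this example, not part of any NVIDIA tool.

```python
# Per-GPU figures copied from the B200 and B100 columns of the table.
# Key names are illustrative, not an official schema.
PER_GPU = {
    "B200": {
        "fp4_dense_pflops": 9,
        "fp8_dense_pflops": 4.5,
        "fp64_tflops": 40,
        "memory_gb": 192,
        "memory_bw_tbps": 8,
        "nvlink_bw_tbps": 1.8,
        "power_w": 1000,
    },
    "B100": {
        "fp4_dense_pflops": 7,
        "fp8_dense_pflops": 3.5,
        "fp64_tflops": 30,
        "memory_gb": 192,
        "memory_bw_tbps": 8,
        "nvlink_bw_tbps": 1.8,
        "power_w": 700,
    },
}

GPUS_PER_HGX = 8  # "8x B200 GPU" / "8x B100 GPU" in the Configuration row


def hgx_totals(gpu: str) -> dict:
    """Scale every per-GPU figure by the number of GPUs on the HGX board."""
    return {
        key: round(value * GPUS_PER_HGX, 2)
        for key, value in PER_GPU[gpu].items()
    }


if __name__ == "__main__":
    for gpu in PER_GPU:
        print(f"HGX {gpu}: {hgx_totals(gpu)}")
```

The power results (8000W and 5600W) equal the question-marked entries in the Power row, which suggests those entries are simple sums of GPU power rather than measured full-system figures.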

The following specifications for the main Tesla product lines were generated with ChatGPT:

Product | Release Year | Tensor Core FLOPS | CUDA Core FLOPS | Memory Capacity | Interconnect | Key Features |
--- | --- | --- | --- | --- | --- | --- |
Tesla K80 | 2014 | N/A | 8.74 TFLOPS | 24 GB GDDR5 | PCIe Gen3 x16 | Kepler Architecture, Dual-GPU |
Tesla P100 | 2016 | N/A | Up to 10.6 TFLOPS | 12 GB - 16 GB HBM2 | NVLink, PCIe Gen3 x16 | Pascal Architecture, NVLink and NVLink Bridge |
Tesla V100 | 2017 | Up to 125 TFLOPS | Up to 14.1 TFLOPS | 16 GB - 32 GB HBM2 | NVLink, PCIe Gen3 x16 | Tensor Core, NVLink and NVLink Bridge, Volta Architecture |
Tesla T4 | 2018 | 65.6 TFLOPS | 8.1 TFLOPS | 16 GB GDDR6 | PCIe Gen3 x16 | Turing Tensor Core, Power Efficiency |
Tesla A10 | 2021 | 125 TFLOPS | 31.2 TFLOPS | 24 GB GDDR6 | PCIe Gen4 x16 | Ampere Architecture, AI Inference |
Tesla A30 | 2021 | 165 TFLOPS | 10.3 TFLOPS | 24 GB HBM2 | NVLink, PCIe Gen4 x16 | Ampere Architecture, AI Inference |
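RTX 3090 | 2020 | 142 TFLOPS (285 TFLOPS with sparsity) | 35.6 TFLOPS (10,496 CUDA cores) | 24 GB GDDR6X | PCIe Gen4 x16 | Ampere Architecture, Gaming & Professional GPU |
RTX 4090 | 2022 | 330 TFLOPS (661 TFLOPS with sparsity) | 82.6 TFLOPS (16,384 CUDA cores) | 24 GB GDDR6X | PCIe Gen4 x16 | Ada Lovelace Architecture, Gaming & Professional GPU |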
Reference
- https://en.wikipedia.org/wiki/Nvidia_Tesla