Kepler cards (CUDA 5 until CUDA 10)
Deprecated from CUDA 11.
- SM30 or SM_30, compute_30 – Kepler architecture (e.g. generic Kepler, GeForce 700, GT-730). Adds support for unified memory programming (see the sketch after this list). Completely dropped from CUDA 11 onwards.
- SM35 or SM_35, compute_35 – Tesla K40. Adds support for dynamic parallelism. Deprecated from CUDA 11, will be dropped in future versions.
- SM37 or SM_37, compute_37 – Tesla K80. Adds a few more registers. Deprecated from CUDA 11, will be dropped in future versions; strongly suggest replacing with a 32 GB PCIe Tesla V100.
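Both Kepler-era features called out above fit in a few lines. The sketch below is illustrative only (file name, sizes and launch configuration are mine, not NVIDIA's): it puts data in managed memory and has a parent kernel launch a child kernel, which is what dynamic parallelism on compute_35 and newer allows. A build line along the lines of nvcc -arch=sm_35 -rdc=true is assumed.

```
// Minimal sketch: unified memory (compute_30+) plus dynamic parallelism (compute_35+).
// Hypothetical build line:
//   nvcc -arch=sm_35 -rdc=true kepler_features.cu -lcudadevrt -o kepler_features
#include <cstdio>
#include <cuda_runtime.h>

__global__ void child(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;          // double every element
}

__global__ void parent(float *data, int n) {
    // Dynamic parallelism: a kernel launching another kernel from the device.
    if (threadIdx.x == 0 && blockIdx.x == 0) {
        child<<<(n + 255) / 256, 256>>>(data, n);
    }
}

int main() {
    const int n = 1 << 20;
    float *data = nullptr;

    // Unified (managed) memory: one pointer usable from both host and device.
    cudaMallocManaged(&data, n * sizeof(float));
    for (int i = 0; i < n; ++i) data[i] = 1.0f;

    parent<<<1, 32>>>(data, n);
    cudaDeviceSynchronize();             // wait for the parent and its child grid

    printf("data[0] = %f\n", data[0]);   // expect 2.0
    cudaFree(data);
    return 0;
}
```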
Maxwell cards (CUDA 6 until CUDA 11)
- SM50 or SM_50, compute_50 – Tesla/Quadro M series. Deprecated from CUDA 11, will be dropped in future versions; strongly suggest replacing with a Quadro RTX 4000 or A6000.
- SM52 or SM_52, compute_52 – Quadro M6000, GeForce 900, GTX-970, GTX-980, GTX Titan X.
- SM53 or SM_53, compute_53 – Tegra (Jetson) TX1 / Tegra X1, Drive CX, Drive PX, Jetson Nano.
Pascal (CUDA 8 and later)
- SM60 or SM_60, compute_60 – Quadro GP100, Tesla P100, DGX-1 (Generic Pascal)
- SM61 or SM_61, compute_61 – GTX 1080, GTX 1070, GTX 1060, GTX 1050, GT 1030 (GP108), GT 1010 (GP108), Titan Xp, Tesla P40, Tesla P4, Discrete GPU on the NVIDIA Drive PX2
- SM62 or SM_62, compute_62 – Integrated GPU on the NVIDIA Drive PX2, Tegra (Jetson) TX2
Volta (CUDA 9 and later)
- SM70 or SM_70, compute_70 – DGX-1 with Volta, Tesla V100, Titan V, Quadro GV100
- SM72 or SM_72, compute_72 – Jetson AGX Xavier, Drive AGX Pegasus, Xavier NX
Turing (CUDA 10 and later)
- SM75 or SM_75, compute_75 – GTX/RTX Turing – GTX 1660 Ti, RTX 2060, RTX 2070, RTX 2080, Titan RTX, Quadro RTX 4000, Quadro RTX 5000, Quadro RTX 6000, Quadro RTX 8000, Quadro T1000/T2000, Tesla T4
Ampere (CUDA 11.1 and later)
- SM80 or SM_80, compute_80 – NVIDIA A100 (the name “Tesla” has been dropped – GA100), NVIDIA DGX-A100
- SM86 or SM_86, compute_86 – (from CUDA 11.1 onwards) – GA10x cards, RTX Ampere – RTX 3080, GA102 – RTX 3090, RTX A2000, A3000, RTX A4000, A5000, A6000, NVIDIA A40, GA106 – RTX 3060, GA104 – RTX 3070, GA107 – RTX 3050, RTX A10, RTX A16, RTX A40, A2 Tensor Core GPU, A800 40GB
- SM87 or SM_87, compute_87 – (from CUDA 11.4 onwards, introduced with PTX ISA 7.4 / Driver r470 and newer) – for Jetson AGX Orin and Drive AGX Orin only
“Devices of compute capability 8.6 have 2x more FP32 operations per cycle per SM than devices of compute capability 8.0. While a binary compiled for 8.0 will run as is on 8.6, it is recommended to compile explicitly for 8.6 to benefit from the increased FP32 throughput.”
https://docs.nvidia.com/cuda/ampere-tuning-guide/index.html#improved_fp32
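Acting on that recommendation usually means building a fat binary that carries native code for both compute capability 8.0 and 8.6, plus PTX for forward compatibility. A minimal sketch, with a hypothetical file name and the relevant -gencode flags shown in the comment:

```
// Sketch: one fat binary with native cubins for CC 8.0 and 8.6, plus compute_86 PTX.
// Hypothetical build line (not from the tuning guide):
//   nvcc -gencode arch=compute_80,code=sm_80 \
//        -gencode arch=compute_86,code=sm_86 \
//        -gencode arch=compute_86,code=compute_86 \
//        which_arch.cu -o which_arch
#include <cstdio>
#include <cuda_runtime.h>

__global__ void which_arch() {
#ifdef __CUDA_ARCH__
    // __CUDA_ARCH__ is 800 in the sm_80 cubin and 860 in the sm_86 cubin;
    // the driver loads the best match for the GPU it finds at runtime.
    printf("compiled for __CUDA_ARCH__ = %d\n", __CUDA_ARCH__);
#endif
}

int main() {
    which_arch<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```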
Ada Lovelace (CUDA 11.8 and later)
- SM89 or SM_89, compute_89 – NVIDIA GeForce RTX 4090, RTX 4080, RTX 6000 Ada, Tesla L40, L40s Ada, L4 Ada
Hopper (CUDA 12 and later)
- SM90 or SM_90, compute_90 – NVIDIA H100 (GH100), NVIDIA H200
- SM90a or SM_90a, compute_90a – (for PTX ISA version 8.0) – adds acceleration for features like wgmma and setmaxnreg. This is required for NVIDIA CUTLASS.
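Those arch-accelerated instructions are only exposed when compiling specifically for the “a” target; a plain sm_90 build does not get them. A hedged sketch of how such a Hopper-only path is typically guarded (the actual wgmma/setmaxnreg usage lives in libraries such as CUTLASS and is omitted here):

```
// Sketch only: guarding a Hopper-specific code path. Hypothetical build line:
//   nvcc -arch=sm_90a hopper_path.cu -o hopper_path
#include <cstdio>
#include <cuda_runtime.h>

__global__ void kernel() {
#if defined(__CUDA_ARCH__) && (__CUDA_ARCH__ >= 900)
    // Compiled with -arch=sm_90a, this branch is where the arch-accelerated
    // features (wgmma, setmaxnreg, ...) would be used, typically via inline
    // PTX or CUTLASS primitives.
    printf("Hopper path\n");
#else
    printf("generic fallback path\n");
#endif
}

int main() {
    kernel<<<1, 1>>>();
    cudaDeviceSynchronize();
    return 0;
}
```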
Blackwell (CUDA 12.8 and later)
- SM100 or SM_100, compute_100 – NVIDIA B100 (GB100)
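To find out where an installed card falls in the lists above without looking it up by name, the compute capability can be queried at runtime through the CUDA runtime API; a minimal sketch:

```
// Minimal sketch: print the compute capability (SM version) of each visible GPU.
// Hypothetical build line:  nvcc query_sm.cu -o query_sm
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        printf("No CUDA devices found\n");
        return 1;
    }
    for (int dev = 0; dev < count; ++dev) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, dev);
        // prop.major / prop.minor map onto the SM numbers above,
        // e.g. 8.6 -> sm_86 / compute_86.
        printf("Device %d: %s, compute capability %d.%d (sm_%d%d)\n",
               dev, prop.name, prop.major, prop.minor, prop.major, prop.minor);
    }
    return 0;
}
```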