A100 GPU key parameters

Global view

Start with the top-level diagram.
GA100 is the full, defect-free chip: every unit on it is functional, as shown below.
![GA100 full-chip block diagram](https://i-blog.csdnimg.cn/direct/18df1f1b7dbd43a8b2b73499f502cd25.png)

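A100 has 108 SMs. Each SM can hold up to 2048 resident threads, and each block can hold up to 1024 threads.
Put simply, an A100 is a GA100 die with its defective units disabled: the GPC that cannot be used is switched off.
[Figure: A100 block diagram]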
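The SM counts can be cross-checked with a little arithmetic. This is only a sketch: the full-GA100 layout of 8 GPCs x 8 TPCs x 2 SMs is an assumption taken from NVIDIA's published A100 architecture description, not from the figures in this post. Under that assumption, disabling one whole GPC does not by itself give 108 SMs, so a few more SMs in the remaining GPCs must be disabled as well.

```python
# Assumed full-GA100 layout (from NVIDIA's published architecture
# description, not from this post): 8 GPCs x 8 TPCs x 2 SMs.
GPCS_FULL = 8
TPCS_PER_GPC = 8
SMS_PER_TPC = 2

A100_SMS = 108  # multiProcessorCount reported by the device

# SMs on the defect-free die.
full_ga100_sms = GPCS_FULL * TPCS_PER_GPC * SMS_PER_TPC
# SMs left if exactly one whole GPC is disabled.
sms_after_dropping_one_gpc = (GPCS_FULL - 1) * TPCS_PER_GPC * SMS_PER_TPC
# SMs that must be disabled in addition to that GPC to reach 108.
extra_disabled_sms = sms_after_dropping_one_gpc - A100_SMS

print("full GA100 SMs:           ", full_ga100_sms)
print("SMs with one GPC disabled:", sms_after_dropping_one_gpc)
print("A100 SMs:                 ", A100_SMS)
print("additional SMs disabled:  ", extra_disabled_sms)
```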

SM view

Below is the view of a single SM:
[Figure: SM block diagram]

Compute throughput

Peak compute throughput:
[Figure: peak compute throughput table]

Process technology

The chip is manufactured on a 7 nm process:
[Figure: process technology table]

Key parameters

[Figure: key parameter tables]

Measured parameters

The key parameters reported by the device are listed below:

device properties : 
	name : NVIDIA A100-PCIE-40GB
	totalGlobalMem : 42298834944
	sharedMemPerBlock : 49152
	regsPerBlock : 65536
	warpSize : 32
	memPitch : 2147483647
	maxThreadsPerBlock : 1024
	maxThreadsDim[0] : 1024
	maxThreadsDim[1] : 1024
	maxThreadsDim[2] : 64
	maxGridSize[0] : 2147483647
	maxGridSize[1] : 65535
	maxGridSize[2] : 65535
	clockRate : 1410000
	totalConstMem : 65536
	major : 8
	minor : 0
	textureAlignment : 512
	texturePitchAlignment : 32
	deviceOverlap : 1
	multiProcessorCount : 108
	kernelExecTimeoutEnabled : 0
	integrated : 0
	canMapHostMemory : 1
	computeMode : 0
	concurrentKernels : 1
	ECCEnabled : 1
	pciBusID : 64
	pciDeviceID : 0
	pciDomainID : 0
	tccDriver : 0
	asyncEngineCount : 3
	unifiedAddressing : 1
	memoryClockRate : 1215000
	memoryBusWidth : 5120
	l2CacheSize : 41943040
	persistingL2CacheMaxSize : 31457280
	maxThreadsPerMultiProcessor : 2048
	streamPrioritiesSupported : 1
	globalL1CacheSupported : 1
	localL1CacheSupported : 1
	sharedMemPerMultiprocessor : 167936
	regsPerMultiprocessor : 65536
	managedMemory : 1
	isMultiGpuBoard : 0
	multiGpuBoardGroupID : 0
	singleToDoublePrecisionPerfRatio : 2
	pageableMemoryAccess : 0
	concurrentManagedAccess : 1
	computePreemptionSupported : 1
	canUseHostPointerForRegisteredMem : 1
	cooperativeLaunch : 1
	cooperativeMultiDeviceLaunch : 1
	pageableMemoryAccessUsesHostPageTables : 0
	directManagedMemAccessFromHost : 0
	accessPolicyMaxWindowSize : 134213632

device limit : 
	deviceLimitStackSize : 1024
	deviceLimitPrintfFifoSize : 7077888
	deviceLimitMallocHeapSize : 8388608
	deviceLimitDevRuntimeSyncDepth : 2
	deviceLimitDevRuntimePendingLaunchCount : 2048
	deviceLimitMaxL2FetchGranularity : 64
	deviceLimitPersistingL2CacheSize : 7864320

summary : 
	register total size : 6.75 Mi registers (27.00 MiB at 4 bytes per register)
	shared memory size per sm : 164.00 KiB
	shared memory total size : 17.30 MiB
	constant memory total size : 64.00 KiB
	level 2 cache total size : 40.00 MiB
	device memory total size : 39.39 GiB
	device memory bandwidth : 1.56 TB/s
	stack memory total size : 216.00 MiB
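
The summary figures can be re-derived from the raw properties above. The sketch below is one plausible way to do it, not necessarily how the original numbers were produced. Two things in it are assumptions: the bandwidth formula (two transfers per memory clock, i.e. double data rate) and the register width (4 bytes). Note that 65536 x 108 is 6.75 Mi registers, which corresponds to 27 MiB of register file.

```python
# Values copied from the device-property dump above.
props = {
    "regsPerMultiprocessor": 65536,
    "sharedMemPerMultiprocessor": 167936,   # bytes
    "totalConstMem": 65536,                 # bytes
    "l2CacheSize": 41943040,                # bytes
    "totalGlobalMem": 42298834944,          # bytes
    "memoryClockRate": 1215000,             # kHz
    "memoryBusWidth": 5120,                 # bits
    "multiProcessorCount": 108,
    "maxThreadsPerMultiProcessor": 2048,
}
STACK_BYTES_PER_THREAD = 1024  # deviceLimitStackSize
BYTES_PER_REGISTER = 4         # assumption: 32-bit registers

KIB, MIB, GIB = 2**10, 2**20, 2**30


def summarize(p):
    """Derive the summary figures from the raw device properties."""
    sms = p["multiProcessorCount"]
    registers = p["regsPerMultiprocessor"] * sms
    return {
        "register count (Mi)": registers / MIB,
        "register bytes (MiB)": registers * BYTES_PER_REGISTER / MIB,
        "shared memory per SM (KiB)": p["sharedMemPerMultiprocessor"] / KIB,
        "shared memory total (MiB)": p["sharedMemPerMultiprocessor"] * sms / MIB,
        "constant memory (KiB)": p["totalConstMem"] / KIB,
        "L2 cache (MiB)": p["l2CacheSize"] / MIB,
        "device memory (GiB)": p["totalGlobalMem"] / GIB,
        # Assumption: double data rate, so 2 transfers per memory clock.
        "bandwidth (TB/s)": p["memoryClockRate"] * 1e3 * 2
        * p["memoryBusWidth"] / 8 / 1e12,
        "stack total (MiB)": STACK_BYTES_PER_THREAD
        * p["maxThreadsPerMultiProcessor"] * sms / MIB,
    }


summary = summarize(props)
for name, value in summary.items():
    print(f"{name:28s}: {value:.2f}")
```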

Distribution of blocks across SMs

  • Work is assigned to SMs in units of blocks.
  • Even-numbered SMs are filled first, then odd-numbered SMs.
  • A launch of <<<108,1024>>> occupies every SM.
The measurements record the number of threads resident on each SM (sm0 to sm107) for grid_dim in {1, 16, 32, 64, 108} and block_dim in {1, 32, 64, 128, 256, 512, 1024}. The placement depends only on grid_dim: for every block_dim, each occupied SM holds one block (block_dim threads) and every other SM holds 0.

grid_dim   SMs holding one block                                      all other SMs
1          sm0                                                        0
16         even SMs sm0, sm2, ..., sm30                               0
32         even SMs sm0, sm2, ..., sm62                               0
64         all even SMs sm0, ..., sm106, plus odd SMs sm1, ..., sm19  0
108        all SMs sm0, ..., sm107                                    (none)
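
The even-first pattern in the measurements above can be written down as a tiny model. This is a hypothetical sketch inferred from those measurements only: `place_blocks` is an illustrative name, the model covers grids of at most one block per SM, and nothing here is a documented guarantee of the hardware scheduler.

```python
def place_blocks(grid_dim, block_dim, num_sms=108):
    """Return the threads resident on each SM under the even-first model.

    Hypothetical model inferred from the measurements; valid only for
    grid_dim <= num_sms, i.e. at most one block per SM.
    """
    if not 0 <= grid_dim <= num_sms:
        raise ValueError("model only covers 0 <= grid_dim <= num_sms")
    # Fill order: sm0, sm2, sm4, ... and only then sm1, sm3, sm5, ...
    fill_order = list(range(0, num_sms, 2)) + list(range(1, num_sms, 2))
    threads_per_sm = [0] * num_sms
    for sm in fill_order[:grid_dim]:
        threads_per_sm[sm] = block_dim
    return threads_per_sm


layout = place_blocks(64, 32)
occupied_odd = [sm for sm, threads in enumerate(layout)
                if threads and sm % 2 == 1]
print("odd SMs in use for <<<64,32>>>:", occupied_odd)
```

With 64 blocks, the 54 even SMs are used up first and the remaining 10 blocks spill onto the first ten odd SMs, which matches the grid_dim = 64 rows of the measurements.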

SM/TPC/GPC grouping on A100

Note that all SM numbers here are logical indices.
[Figure: SM/TPC/GPC grouping]

Compute capability

[Figure: compute capability table]

MIG

A single A100 GPU can be partitioned into up to 7 independent instances.
[Figure: MIG partitioning diagrams]

Parameter list

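On A100, one block can hold at most 1024 threads.
One SM can hold at most 2048 resident threads, i.e. two full blocks.
With 108 SMs, the GPU can therefore hold at most <216,1024> resident at once: 216 blocks of 1024 threads, or 221,184 threads.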
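
The same limit can be worked out for other block sizes. This is a rough sketch under stated assumptions: it takes the thread limits from the device properties, assumes a cap of 32 resident blocks per SM for compute capability 8.0 (not shown in this post), and ignores register and shared-memory pressure, which can lower the real number. `max_resident_blocks` is an illustrative helper name.

```python
import math

NUM_SMS = 108                 # multiProcessorCount
MAX_THREADS_PER_SM = 2048     # maxThreadsPerMultiProcessor
MAX_THREADS_PER_BLOCK = 1024  # maxThreadsPerBlock
WARP_SIZE = 32                # warpSize
MAX_BLOCKS_PER_SM = 32        # assumption: cap for compute capability 8.0


def max_resident_blocks(block_dim):
    """Upper bound on blocks resident on the whole GPU at once.

    Ignores register and shared-memory limits.
    """
    if not 1 <= block_dim <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block_dim must be in 1..1024")
    # Threads are scheduled in whole warps, so round the block up.
    warps_per_block = math.ceil(block_dim / WARP_SIZE)
    max_warps_per_sm = MAX_THREADS_PER_SM // WARP_SIZE
    blocks_per_sm = min(max_warps_per_sm // warps_per_block, MAX_BLOCKS_PER_SM)
    return blocks_per_sm * NUM_SMS


blocks = max_resident_blocks(1024)
print("resident blocks of 1024 threads:", blocks)
print("resident threads:", blocks * 1024)
```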
[Figure: parameter tables]
