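1. Compute performance
TOPS (Tera Operations Per Second): 1 TOPS means the processor can perform one trillion (10^12) operations per second.
FLOPS (floating-point operations per second): the number of floating-point operations executed per second. 1 TFLOPS (tera FLOPS) equals one trillion (10^12) floating-point operations per second.
Floating-point arithmetic is computationally expensive. Provided the AI model's accuracy stays acceptable, floating-point values can be converted to integers for computation (quantization), which substantially reduces compute resource consumption and improves throughput. Common integer formats are INT4, INT8, INT16, and INT32, where the number after INT is the bit width of the integer.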
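
The float-to-integer conversion described above can be sketched as follows. This is a minimal illustration that assumes symmetric per-tensor INT8 quantization; the function names (`quantize_int8`, `int_dot`) and the sample values are hypothetical, and real toolchains for the chips listed below may use different schemes (per-channel scales, zero points, calibration).

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumption: symmetric per-tensor quantization, chosen for illustration.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def int_dot(qa, qb):
    """Dot product computed purely on integers."""
    return sum(x * y for x, y in zip(qa, qb))


# Hypothetical weights and inputs, used only to show the round trip.
weights = [0.5, -1.0, 0.25]
inputs = [2.0, 1.0, -4.0]

qw, scale_w = quantize_int8(weights)
qx, scale_x = quantize_int8(inputs)

# The integer result is rescaled once at the end to recover a float estimate.
approx = int_dot(qw, qx) * scale_w * scale_x
exact = sum(w * x for w, x in zip(weights, inputs))

print("INT8 estimate:", approx)
print("float result: ", exact)
```

The multiply-accumulate work happens on 8-bit integers, and only the final rescale touches floating point, which is why INT8 ratings are typically higher than FP16 or FP32 ratings for the same chip.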
2. Surveillance SoC chips
Chip | Compute performance |
---|---|
RK3566 | 0.8 TOPS |
RK3568 | 0.8 TOPS |
HiSilicon Hi3519A V100 | 4.0 TOPS |
RK3399Pro | 3.0 TOPS (INT8) |
HiSilicon Hi3559A V100 | 4.0 TOPS |
RK3588 | 6.0 TOPS |
3. AI chips
Chip | Compute performance |
---|---|
Rockchip RK1808/RK1806 | INT8: 3 TOPS |
Cambricon-1M-1K | INT8: 2 TOPS, INT16: 1 TOPS, INT32: 0.25 TOPS |
Cambricon-1M-2K | INT8: 4 TOPS, INT16: 2 TOPS, INT32: 0.5 TOPS |
Cambricon-1M-4K | INT8: 8 TOPS, INT16: 4 TOPS, INT32: 1 TOPS |
Horizon Robotics Journey® 3 / Sunrise® 3 | 5 TOPS |
Huawei Ascend 310 | INT8: 16 TOPS, FP16: 8 TFLOPS |
Huawei Ascend 910 | INT8: 640 TOPS, FP16: 320 TFLOPS |
4. PCIe accelerator cards / GPUs
AI card | Compute performance | Power (W) |
---|---|---|
Cambricon MLU270-S4 (思元270-S4) | INT4: 256 TOPS, INT8: 128 TOPS, INT16: 64 TOPS | 70 |
Cambricon MLU270-F4 (思元270-F4) | INT4: 256 TOPS, INT8: 128 TOPS, INT16: 64 TOPS | 150 |
NVIDIA RTX4000 | 7.1 TFLOPS | 160 |
NVIDIA Tesla T4 | 8.1 TFLOPS, INT4: 260 TOPS, INT8: 130 TOPS, FP16/FP32 mixed precision: 65 TFLOPS | 70 |
NVIDIA RTX5000 | 11.2 TFLOPS | 265 |
Intellifusion “初芯” accelerator card | 12 TOPS | |
NVIDIA TITAN RTX | 13.6 TFLOPS | 280 |
NVIDIA V100 | 14 TFLOPS | 250 |
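
Because the table lists both compute and power, one rough way to compare cards is peak operations per watt. The sketch below uses only the INT8 and power figures from the rows above; the dictionary layout and the `tops_per_watt` helper are hypothetical. The figures are peak ratings and rated board power, so the result is only an indicative ratio, not a measured efficiency.

```python
# (INT8 TOPS, power in W), copied from the table above.
cards = {
    "Cambricon 270-S4": (128, 70),
    "Cambricon 270-F4": (128, 150),
    "NVIDIA Tesla T4": (130, 70),
}


def tops_per_watt(tops, watts):
    """Peak INT8 operations per watt (hypothetical helper for illustration)."""
    return tops / watts


efficiency = {name: tops_per_watt(tops, watts) for name, (tops, watts) in cards.items()}

# Highest peak TOPS/W first.
ranked = sorted(efficiency, key=efficiency.get, reverse=True)

for name in ranked:
    print(f"{name}: {efficiency[name]:.2f} TOPS/W")
```

Only cards with a listed INT8 rating are included, since mixing TOPS and TFLOPS figures in one ratio would compare different kinds of operations.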