[system-track][computing][GPU][Intel HD 530 Gen9] architecture and performance issues

Gen9 arch

EU

[figure omitted]

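  • simultaneous multithreading (SMT) combined with interleaved multithreading (IMT)
  • up to 4 instructions issued per cycle, each from a different thread; execution is pipelined across multiple threads
  • GRF (general register file):
    • 28 KB/EU = 128 registers/thread × 32 bytes (SIMD-8 × 32-bit) × 7 threads
    • 4 KB/thread
  • 16 32-bit float operations / cycle:
    • 2 (add + mul) × 2 FPUs × SIMD-4 (each FPU is physically 4 lanes wide)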
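
The per-EU figures above can be checked with a few lines of arithmetic. This is only a sketch of the multiplication behind the notes; the helper `peak_gflops` and the example clock of 1.0 GHz are illustrative assumptions, not values taken from the whitepaper (24 EUs is the usual GT2 count for HD 530).

```python
# Per-EU figures quoted in the notes above (Gen9).
THREADS_PER_EU = 7
REGS_PER_THREAD = 128
BYTES_PER_REG = 8 * 4          # SIMD-8 x 32-bit = 32 bytes per register

grf_bytes_per_thread = REGS_PER_THREAD * BYTES_PER_REG
grf_bytes_per_eu = grf_bytes_per_thread * THREADS_PER_EU

OPS_PER_LANE = 2               # add + mul counted as two float ops
FPUS_PER_EU = 2
LANES_PER_FPU = 4              # each FPU is physically SIMD-4
flops_per_cycle = OPS_PER_LANE * FPUS_PER_EU * LANES_PER_FPU


def peak_gflops(num_eus, clock_ghz):
    """Hypothetical helper: theoretical fp32 peak for a given EU count and clock."""
    return flops_per_cycle * num_eus * clock_ghz


print(grf_bytes_per_thread)    # 4096  (4 KB per thread)
print(grf_bytes_per_eu)        # 28672 (28 KB per EU)
print(flops_per_cycle)         # 16
print(peak_gflops(24, 1.0))    # 384.0, assuming 24 EUs at an illustrative 1.0 GHz
```

The real peak scales linearly with the actual GPU clock, which varies by SKU and power state.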

interconnect

[figure omitted]

slice

[figure omitted]

memory hierarchy

  • eDRAM: used as a cache, or bypassed
  • coherent region: what is the overhead?
    [figure omitted]

configuration

[figure omitted]

issues

  • subslice data port

    • SIMD gather and scatter
    • can access shared local memory
    • coalesces scattered memory reads into fewer cache-line requests // so the memory access pattern matters
  • L3 data cache on GPU

    • banked data cache
    • highly banked shared memory // watch for bank conflicts; OpenCL refers to it as work-group local memory, and its data is programmer-managed
    • used for atomics and barriers; the allocation ratio among its three parts is configurable
  • LLC-shared
    • shared between the CPU cores and Intel HD graphics // how to make use of it?
    • distributed shared cache // coherence overhead? cache-line ping-pong?

  • DRAM shared with the CPU

    • zero copy
    • bandwidth contention
  • eDRAM: memory-side cache, or bypassed

  • 64-byte data paths in many places

    • 1. a SIMD-16 instruction can source 64-byte-wide operands from 64-byte-wide registers
    • 2. such 64-byte-wide registers are read from or written to L3 over a 64-byte-wide data bus
    • 3. within the L3 data cache, a cache line is 64 bytes wide
    • 4. the L3 cache's bus to the SoC-shared LLC is also 64 bytes wide
  • EU: flexible SIMD width; 4 KB register file / thread; 28 KB / EU

  • 16-bit float support: mixed-precision computing

  • several consistency (coherence) mechanisms might influence performance

  • the same virtual address can be shared seamlessly across devices, programmable via SVM in OpenCL 2.0

    • net effect: pointer-rich data structures can be shared directly between code running on the CPU and code running on the GPU
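
To make the data-port coalescing and 64-byte cache-line notes above more concrete, here is a rough sketch that counts how many distinct 64-byte lines one SIMD-16 gather touches for a given stride. It is a simplified model for reasoning about access patterns, not a description of how the hardware actually coalesces; the function names `simd_addresses` and `cache_lines_touched` are made up for this illustration.

```python
CACHE_LINE_BYTES = 64  # L3 cache line width quoted in the notes


def simd_addresses(base, stride_bytes, simd_width=16):
    """Byte address accessed by each SIMD lane for a strided gather."""
    return [base + lane * stride_bytes for lane in range(simd_width)]


def cache_lines_touched(addresses, access_bytes=4):
    """Count distinct 64-byte lines covered by per-lane accesses.

    Simplified model: every lane reads `access_bytes` bytes, and requests
    that fall in the same line are assumed to merge into one.
    """
    lines = set()
    for addr in addresses:
        first = addr // CACHE_LINE_BYTES
        last = (addr + access_bytes - 1) // CACHE_LINE_BYTES
        lines.update(range(first, last + 1))
    return len(lines)


# 16 lanes x 4-byte floats, contiguous and aligned: exactly one line.
unit_stride = cache_lines_touched(simd_addresses(0, 4))
# Same data shifted by 32 bytes: straddles two lines.
misaligned = cache_lines_touched(simd_addresses(32, 4))
# One float per 64-byte stride: every lane hits its own line.
strided = cache_lines_touched(simd_addresses(0, 64))

print(unit_stride, misaligned, strided)  # 1 2 16
```

In this model a contiguous, 64-byte-aligned SIMD-16 float access maps to a single line, while a 64-byte stride costs sixteen lines for the same amount of useful data, which is why the access pattern is listed as a performance issue.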

source:
The Compute Architecture of Intel® Processor Graphics Gen9, Version 1.0
