PowerVR GPU: The Architecture Concepts
(1) Tile Based Deferred Renderer (TBDR)
On-chip tile buffer (tile size 32x32)
Hardware modules
A) Tiling Accelerator (TA): clips, projects, and culls geometry
B) Parameter Buffer (PB):
Data stored in system memory: Too much for on-chip memory
Essential for deferring/tiling process: Allows geometry and fragment processing to be separated
Stores Vertex Data: All data attached to each vertex passed from the TA (???)
Stores Primitive Lists: Lists of which primitives belong to which tile
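The primitive lists can be pictured as a map from tile coordinates to the primitives that touch each tile. A minimal sketch, assuming screen-space triangles, 32x32 tiles, and conservative bounding-box binning; `bin_triangles` and its parameters are hypothetical names for illustration, and real hardware binning is likely more precise than a bounding-box test:

```python
import math
from collections import defaultdict

TILE_SIZE = 32  # tile edge length, matching the 32x32 tile size noted above


def bin_triangles(triangles, width, height, tile_size=TILE_SIZE):
    """Return {(tile_x, tile_y): [primitive indices]} via bounding boxes.

    Each triangle is a sequence of three (x, y) screen-space points.
    """
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    lists = defaultdict(list)
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Primitives entirely off screen never reach any tile list.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
            continue
        # Clamp the bounding box to the tile grid.
        tx0 = max(math.floor(min(xs)) // tile_size, 0)
        tx1 = min(math.floor(max(xs)) // tile_size, tiles_x - 1)
        ty0 = max(math.floor(min(ys)) // tile_size, 0)
        ty1 = min(math.floor(max(ys)) // tile_size, tiles_y - 1)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                lists[(tx, ty)].append(index)
    return dict(lists)
```

Because each tile only needs its own list during fragment processing, geometry work and fragment work can be separated as described above.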
D) Image Synthesis Processor (ISP)
Performs Hidden Surface Removal (HSR) and other depth/stencil operations
Passes visible fragments to the ‘Tag Buffer’
A buffer used to track visible fragments
Visible fragments passed to the TSP
Fragments are grouped by primitive for cache efficiency
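The ISP steps can be sketched for a single tile as a depth test that records, per pixel, which primitive is currently visible, followed by a grouping pass. This is only an illustration under stated assumptions: opaque geometry, a "less than" depth test, and fragments given as `(x, y, depth, primitive_id)` tuples. The function names are hypothetical and do not describe the actual hardware interface.

```python
def hidden_surface_removal(fragments, tile_size=32):
    """Return a tag buffer: tags[y][x] is the visible primitive id or None."""
    depth = [[float("inf")] * tile_size for _ in range(tile_size)]
    tags = [[None] * tile_size for _ in range(tile_size)]
    for x, y, z, prim in fragments:
        # Keep only the nearest fragment seen so far at this pixel.
        if z < depth[y][x]:
            depth[y][x] = z
            tags[y][x] = prim
    return tags


def group_by_primitive(tags):
    """Group visible pixels by primitive id, as done before shading."""
    groups = {}
    for y, row in enumerate(tags):
        for x, prim in enumerate(row):
            if prim is not None:
                groups.setdefault(prim, []).append((x, y))
    return groups
```

Only the pixels left in the tag buffer are shaded, so occluded fragments cost no texturing or shading work in this model.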
E) Texture & Shading Processor (TSP)
Interpolates vertex data for each fragment: ‘varyings’ in a shader
Fetches texture samples: “non-dependent” texture reads only
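Interpolating a varying for a fragment can be sketched with barycentric weights. This is a simplified illustration assuming plain screen-space (not perspective-correct) interpolation and a non-degenerate triangle; the function names are hypothetical and the notes do not say how the TSP implements this.

```python
def barycentric(p, a, b, c):
    """Return the barycentric weights of point p in triangle (a, b, c)."""
    def edge(p0, p1, q):
        # Twice the signed area of triangle (p0, p1, q).
        return (p1[0] - p0[0]) * (q[1] - p0[1]) - (p1[1] - p0[1]) * (q[0] - p0[0])

    area = edge(a, b, c)  # zero for a degenerate triangle (not handled here)
    return edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area


def interpolate_varying(p, triangle, values):
    """Blend per-vertex values (tuples of floats) at fragment position p."""
    w_a, w_b, w_c = barycentric(p, *triangle)
    v_a, v_b, v_c = values
    return tuple(w_a * x + w_b * y + w_c * z for x, y, z in zip(v_a, v_b, v_c))
```

For example, with vertex colours red, green, and blue, a fragment at a vertex receives exactly that vertex's colour, and fragments inside the triangle receive a weighted mix.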
F) Arithmetic Logic Units (ALUs)
Unified architecture
Processes vertex, fragment, and compute tasks (scalar SIMD style)
SIMD style execution (vector operations)
Fed by the Coarse Grain Scheduler (CGS)
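As a rough picture of scalar SIMD execution, one instruction is applied in lockstep across several lanes, with each lane holding one scalar from a different vertex, fragment, or compute item. The sketch below is a generic model only; the lane count and the `simd_execute` helper are assumptions for illustration and say nothing about the real ALU width or scheduling.

```python
import operator


def simd_execute(op, lanes_a, lanes_b):
    """Apply the same scalar operation to every lane in lockstep."""
    if len(lanes_a) != len(lanes_b):
        raise ValueError("both operands must have the same number of lanes")
    return [op(a, b) for a, b in zip(lanes_a, lanes_b)]


# Four lanes, each carrying the red channel of a different fragment
# (a hypothetical lane count chosen for the example).
red = [0.5, 0.25, 1.0, 0.0]
light = [2.0, 2.0, 0.5, 4.0]
lit_red = simd_execute(operator.mul, red, light)
```

The contrast with a vector approach is that a vector unit would process the several components of one fragment together, whereas here each lane works on a single component of a different fragment.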
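G) Unified Architecture (this was discussed at length during the live lecture)
Most of the discussion concerned the scheduling of matrix operations and scalar operations
H) Pixel Back End (PBE)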
Series5/5XT: 4x MSAA
Series6: 8x MSAA (needs further study)
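To make the MSAA figures concrete: with Nx MSAA each pixel carries N colour samples that are combined into one output colour. A minimal sketch, assuming a simple box-filter (average) resolve; `resolve_msaa` is a hypothetical name, and the notes do not state how the PBE actually performs the resolve.

```python
def resolve_msaa(samples_per_pixel):
    """Average the colour samples of each pixel into one output colour.

    samples_per_pixel: list of pixels, each a non-empty list of colour
    tuples (4 samples per pixel for 4x MSAA, 8 for 8x MSAA).
    """
    resolved = []
    for samples in samples_per_pixel:
        count = len(samples)
        channels = len(samples[0])
        resolved.append(
            tuple(sum(s[c] for s in samples) / count for c in range(channels))
        )
    return resolved
```

A pixel on a triangle edge where half of the samples are covered ends up with a colour halfway between the triangle and the background, which is what smooths the edge.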