Virtual Texture Mapping (VTM) is a technique that reduces the amount of graphics memory required for textures to a point where it depends only on the screen resolution: for a given viewpoint, we only keep the visible parts of the textures in graphics memory, at the appropriate MIP map level.
While early texture management schemes were designed for a single large texture, recent VTM systems are more flexible and mimic the virtual memory management of the OS: textures are divided into small tiles, or pages. These are automatically cached and loaded onto the GPU as required for rendering the current viewpoint. However, it is necessary to redirect accesses to missing data to a fallback texture. This prevents "holes" from appearing in the rendering, and avoids blocking and waiting until the load request finishes.
As shown in Figure “@Wang”, we begin each frame by determining which tiles are visible. We identify the ones not cached and request them from disk. After the tiles have been uploaded into the tile cache on the GPU, we update an indirection texture, or page table. Finally, we render the scene, performing an initial lookup into the indirection texture to determine where to sample in the tile cache.
In the figure, blue squares are pixels (tile IDs) that are not visible, red squares are newly visible pixels, and green squares are pixel blocks that were visible within a certain number of frames.
The indirection texture is a scaled-down version of the complete virtual texture, where each texel points to a tile in the tile cache.
Tiles from different MIP map levels cover differently sized areas of the virtual texture, but keeping all tiles at the same resolution simplifies the management of the tile cache considerably.
Page Fault Generation
For each frame we determine the visible tiles, identify the ones not yet loaded onto the GPU, and request them from disk. Future hardware might simplify this with native page faults, but we would still need to determine visible tiles, substitute data, and redirect memory accesses.
A simple approach is to render the complete scene with a special shader that translates the virtual texture coordinates into a tile ID. By rendering the actual geometry of the scene, we trivially handle occlusion. The framebuffer is then read back and processed on the CPU along with other management tasks. As tiles typically cover several pixels, it is possible to render tile IDs at a lower resolution to reduce bandwidth and processing costs. Also, in order to pre-fetch tiles that will be visible "soon," the field of view can be slightly increased.
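The CPU side of this step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `(x, y, mip_level)` tuple encoding, the names `tile_id` and `collect_visible_tiles`, and the convention that MIP level 0 is the finest level are hypothetical and not taken from the original implementation.

```python
def tile_id(u, v, mip_level, finest_tiles_per_side):
    """Map a virtual uv in [0, 1) and a MIP level to an (x, y, level) tile ID.

    Assumption: level 0 is the finest level and each coarser level
    halves the number of tiles per side, down to a single tile.
    """
    tiles = max(finest_tiles_per_side >> mip_level, 1)
    x = min(int(u * tiles), tiles - 1)
    y = min(int(v * tiles), tiles - 1)
    return (x, y, mip_level)


def collect_visible_tiles(id_buffer, cached):
    """Process a read-back tile-ID buffer on the CPU.

    id_buffer: iterable of tile IDs, one per rendered pixel
               (None marks background pixels without a texture).
    cached:    set of tile IDs already resident in the tile cache.
    Returns (visible, missing); missing tiles must be requested from disk.
    """
    visible = {t for t in id_buffer if t is not None}  # duplicates collapse
    missing = visible - cached
    return visible, missing
```

Because many pixels map to the same tile, collapsing the buffer into a set keeps the request list small even when the ID buffer is rendered at a reduced resolution.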
Page Handler
The page handler loads requested tiles from disk, uploads them onto the GPU, and updates the indirection texture. Depending on disk latency and camera movement, loading the tiles might become a bottleneck.
Rendering
When texturing the surface we perform a fast unfiltered lookup into the indirection texture, using the uv-coordinate of the fragment in virtual texture space. This provides the position of the target tile in the cache and the actual resolution of its MIP map level in the pyramid of the indirection texture. The latter might be different from the resolution computed from the fragment's MIP map level due to our tile upload limit. We add the offset inside the tile to the tile position and sample from the tile cache. The offset is simply the fractional part of the uv-coordinate scaled by the actual resolution:
$\mathrm{offset} = \mathrm{frac}(uv \cdot \mathrm{actualResolution})$
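The address translation can be mimicked on the CPU as follows. This is a minimal sketch that ignores tile borders. The entry layout `(cache_x, cache_y, tiles_per_side)` and the function name are assumptions made for illustration; in a real renderer this logic lives in the fragment shader.

```python
def virtual_to_cache_uv(u, v, entry, tile_size, cache_size):
    """Translate a virtual uv into a uv inside the tile cache texture.

    entry:      (cache_x, cache_y, tiles_per_side) read from the indirection
                texture; cache_x/cache_y are the tile position in the cache
                measured in tiles, tiles_per_side is the actual resolution
                of the tile's MIP map level (hypothetical layout).
    tile_size:  tile edge length in texels (borders ignored here).
    cache_size: edge length of the square tile cache texture in texels.
    """
    cache_x, cache_y, tiles_per_side = entry
    # Offset inside the tile: fractional part of uv scaled by the resolution.
    off_u = (u * tiles_per_side) % 1.0
    off_v = (v * tiles_per_side) % 1.0
    # Tile position plus offset, converted from tiles to cache uv.
    cache_u = (cache_x + off_u) * tile_size / cache_size
    cache_v = (cache_y + off_v) * tile_size / cache_size
    return cache_u, cache_v
```

For example, with 128-texel tiles in a 1024-texel cache, `virtual_to_cache_uv(0.3125, 0.625, (2, 1, 4), 128, 1024)` lands a quarter of the way into the tile at cache position (2, 1) horizontally and halfway into it vertically.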
Implementation Details
In this section we will investigate various implementation issues with a strong emphasis on texture filtering. Again we will follow the processing of one frame, from page fault generation over page handling to rendering.
Page Fault Generation
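MIP map level. To compute the tile ID in the tile shader we need the virtual texture coordinates and the current MIP map level. The former are directly the interpolated uvs used for texturing, but on DX 9 and 10 hardware we have to compute the latter manually using gradient instructions [Ewins et al. 98, Wu 98]: let
$\mathrm{ddx} = \left(\frac{\partial u}{\partial x}, \frac{\partial v}{\partial x}\right)$
and
$\mathrm{ddy} = \left(\frac{\partial u}{\partial y}, \frac{\partial v}{\partial y}\right)$
be the uv gradients in x- and y-direction. Using their maximal length we compute the MIP map level as
$\mathrm{level} = \log_2\left(\max\left(|\mathrm{ddx}|, |\mathrm{ddy}|\right)\right)$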
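The MIP map level computation from the gradients can be tried out on the CPU. The sketch below assumes that the uv gradients are first scaled to texels of the virtual texture and that the result is clamped to the valid range of levels; both the function name and the clamping are illustrative choices, not the original shader.

```python
import math


def mip_level(ddx, ddy, texture_size, max_level):
    """Estimate the MIP map level from per-pixel uv gradients.

    ddx, ddy:     (du, dv) change of uv per screen pixel in x and y.
    texture_size: edge length of the virtual texture in texels
                  (assumption: gradients are scaled to texel units).
    max_level:    coarsest available level, used for clamping.
    """
    len_x = math.hypot(ddx[0], ddx[1]) * texture_size
    len_y = math.hypot(ddy[0], ddy[1]) * texture_size
    rho = max(len_x, len_y)  # maximal gradient length in texels per pixel
    if rho <= 1.0:
        return 0.0  # magnification: the finest level is sufficient
    return min(math.log2(rho), float(max_level))
```

A fragment that steps four texels per screen pixel in x therefore selects level 2, which matches the intuition that each level halves the resolution.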
Compressed tiles.
For efficient rendering it is desirable to have a DXTC compressed tile cache. It requires less memory on the GPU and reduces the upload and rendering bandwidth. However, as the compression ratio of DXTC is fixed and quite low, we store the tiles using JPEG and transcode them to DXTC before we upload them. This also allows us to reduce quality selectively and, e.g., compress tiles of inaccessible areas more strongly.
Disk I/O.
For our tutorial implementation we store tiles as individual JPEG files for the sake of simplicity. However, reading many small files requires slow seeks and wastes bandwidth. Packing the tiles into a single file is thus very important, especially for slow devices with large sectors like DVDs.
It is possible to cut down the storage requirements by storing only every second MIP map level and computing two additional MIP maps for each tile: if an intermediate level is requested, we load the corresponding four pages from the finer level instead.
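One way to picture the reconstruction of an intermediate level is to stitch the four finer tiles together and downsample the result by a factor of two. The sketch below uses a plain 2x2 box filter on single-channel tiles stored as nested lists. The filter choice and the function name are assumptions for illustration; the original text does not specify them.

```python
def build_intermediate_tile(tl, tr, bl, br):
    """Build one n x n tile of a skipped MIP level from four finer n x n tiles.

    tl, tr, bl, br: top-left, top-right, bottom-left and bottom-right tiles
                    of the finer level, each a list of n rows of n values.
    Returns the n x n tile obtained with a 2x2 box filter (an assumption).
    """
    n = len(tl)
    # Stitch the four tiles into one 2n x 2n image.
    full = [tl[y] + tr[y] for y in range(n)] + [bl[y] + br[y] for y in range(n)]
    # Average each 2x2 block into one texel of the coarser tile.
    return [
        [
            (full[2 * y][2 * x] + full[2 * y][2 * x + 1]
             + full[2 * y + 1][2 * x] + full[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(n)
        ]
        for y in range(n)
    ]
```

The trade-off is four tile loads instead of one whenever a skipped level is requested, in exchange for a smaller data set on disk.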
Cache saturation.
Unused tiles are overwritten with newly requested tiles using an LRU policy. However, the current working set might still not fit into the cache. In this case we remove tiles that promise low impact on visual quality. We replace the tiles with the finest resolution with their lower-resolution ancestors. This plays nicely with our progressive update strategy and quickly frees the tile cache. Other good candidates for removal are tiles with low variance or small screen space area.
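The LRU part of this policy can be sketched with an ordered dictionary. The class below is a hypothetical helper that only tracks which tile occupies which cache slot; it does not implement the saturation handling (replacing fine tiles by their ancestors), and its names are not from the original code.

```python
from collections import OrderedDict


class TileCache:
    """Tracks which tile occupies which slot of the GPU tile cache (LRU)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = OrderedDict()  # tile_id -> slot, least recently used first
        self.free = list(range(capacity - 1, -1, -1))  # pop() yields 0, 1, ...

    def touch(self, tile_id):
        """Mark a tile as used this frame. Returns False if it is not cached."""
        if tile_id in self.slots:
            self.slots.move_to_end(tile_id)
            return True
        return False

    def insert(self, tile_id):
        """Reserve a slot for tile_id. Returns (slot, evicted_tile_id or None)."""
        if self.touch(tile_id):
            return self.slots[tile_id], None
        evicted = None
        if self.free:
            slot = self.free.pop()
        else:
            # Cache is full: overwrite the least recently used tile.
            evicted, slot = self.slots.popitem(last=False)
        self.slots[tile_id] = slot
        return slot, evicted
```

Calling `touch` for every visible tile each frame keeps the working set at the recently used end, so `insert` only evicts tiles that have not been seen for the longest time.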
Tile upload.
Uploading the tiles to the GPU should be fast, with minimum stalling. Using DX 9, we create a managed texture and let the driver handle the upload to the GPU. Other approaches for DX 9 are described in detail by [Mittring 08]. For DX 10 and 11, we create a set of intermediate textures and update these in turn. The textures are copied individually into the tile cache [Thibieroz 08]. DX 11 adds the possibility to update the tiles concurrently, which further increases performance.
Indirection texture update.
After the tiles have been uploaded, we update the indirection texture by recreating it from scratch. We start by initializing the top of its MIP map pyramid with an entry for the tile with the lowest resolution, so each fragment has a valid fallback. For each finer level we copy the entries of the parent texels, but replace the copy with entries for tiles from that level, should they reside in the cache. We continue this process until the complete indirection texture pyramid is filled (see Figure).
Figure: the entries seen by the in-game camera, from the initialization on entering the game to the detail-refined intermediate texture on the GPU.
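The top-down fill of the indirection pyramid can be written down compactly. The following sketch assumes that MIP level 0 is the finest level, that tile IDs are `(x, y, level)` tuples, and that each entry stores `(cache_slot, level)`; these layouts and names are hypothetical.

```python
def build_indirection_pyramid(num_levels, resident):
    """Recreate the indirection texture pyramid from scratch.

    num_levels: number of MIP levels; level num_levels - 1 is the 1x1 top.
    resident:   dict mapping (x, y, level) tile IDs to cache slots. The
                lowest-resolution tile (0, 0, top) must always be resident.
    Returns a dict level -> 2D list (rows of entries), where every entry
    is (cache_slot, level_of_the_tile_it_points_to).
    """
    top = num_levels - 1
    pyramid = {top: [[(resident[(0, 0, top)], top)]]}
    for level in range(top - 1, -1, -1):
        size = 1 << (top - level)  # entries per side at this level
        parent = pyramid[level + 1]
        rows = []
        for y in range(size):
            row = []
            for x in range(size):
                entry = parent[y // 2][x // 2]  # inherit the parent's fallback
                slot = resident.get((x, y, level))
                if slot is not None:
                    entry = (slot, level)  # tile of this level is cached
                row.append(entry)
            rows.append(row)
        pyramid[level] = rows
    return pyramid
```

Every entry therefore points either to a tile of its own level or to the closest coarser ancestor that is resident, which is what guarantees a valid fallback for each fragment.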
If tiles are usually seen at a single resolution, we can upload only the finest level to the GPU. This reduces the required upload bandwidth, simplifies the lookup, and improves performance. This is sufficient when every object uses a unique texture, in particular for terrain rendering.
Rendering
While rendering with a virtual texture is straightforward, correct filtering, especially at tile edges, is less obvious. Neighboring tiles in texture space are very likely not adjacent to each other in the tile cache. Filtering is especially challenging if the hardware filtering units should be used, as those rely on having MIP maps and correct gradients available. The following paragraphs describe how to use HW filtering with an anisotropy of up to 4:1, as shown in Figure “@1”. The corresponding shader code can be found in Section 4.5.3.
Texture gradients. When two adjacent tiles in texture space are not adjacent in the tile cache, as shown in Figure “@2”, the uv-coordinates for the final texture lookup will vary a great deal between neighboring fragments. This results in large texture gradients, and the graphics hardware will use a very wide filter for sampling, producing blurry seams. To address this, we manually compute the gradients from the original virtual texture coordinates, scale them depending on the MIP map level of the tile, and pass them on to the texture sampler.
Figure “@1”: a MIP-mapped texture displayed with 4:1 anisotropic filtering.
Figure “@2”: in the lower left, pixels that are contiguous on the surface are not contiguous in the cache.
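The scaling of the gradients follows from the two coordinate spaces involved: at a level with N tiles per side, one tile spans 1/N of the virtual uv range but `tile_size / cache_size` of the cache uv range. A small sketch of this conversion, with hypothetical parameter names and without borders, could look like this:

```python
def cache_gradients(ddx, ddy, tiles_per_side, tile_size, cache_size):
    """Convert virtual-texture uv gradients into tile-cache uv gradients.

    ddx, ddy:       (du, dv) gradients of the virtual uv per screen pixel.
    tiles_per_side: resolution (in tiles) of the MIP level of the tile
                    that is actually sampled.
    tile_size:      tile edge length in texels (borders ignored here).
    cache_size:     edge length of the tile cache texture in texels.
    """
    # One tile covers 1 / tiles_per_side in virtual uv,
    # but tile_size / cache_size in cache uv.
    scale = tiles_per_side * tile_size / cache_size
    return (ddx[0] * scale, ddx[1] * scale), (ddy[0] * scale, ddy[1] * scale)
```

In a shader, the scaled gradients would then be handed to a gradient-based sampling instruction instead of letting the hardware derive them from the discontinuous cache coordinates.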
Tile borders.
Even with correct texture gradients, we still have to filter into neighboring pages, which very likely contain a completely different part of the virtual texture. To avoid the resulting color bleeding we need to add borders. Depending on what constraints we want to place on the size of the tile, we can use inner or outer borders.
We use the latter and surround our tiles with a four-pixel border, making them eight pixels larger in each dimension. This keeps the resolution a multiple of four, allowing us to compress them using DXTC and perform 4:1 anisotropic filtering in hardware.
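The arithmetic behind outer borders is easy to check. The sketch below assumes a 128-texel tile payload purely as an example value, since the exact tile size is not given in the text above; the function names are hypothetical as well.

```python
def bordered_size(payload, border):
    """Edge length of a tile including an outer border on every side."""
    return payload + 2 * border


def texel_in_cache(slot_x, slot_y, off_u, off_v, payload, border):
    """Texel position in the tile cache for an offset inside a tile.

    slot_x, slot_y: tile position in the cache, measured in tiles.
    off_u, off_v:   offset inside the tile payload in [0, 1).
    The border is skipped so that filtering near the payload edge reads
    border texels instead of an unrelated neighboring tile.
    """
    size = bordered_size(payload, border)
    x = slot_x * size + border + off_u * payload
    y = slot_y * size + border + off_v * payload
    return x, y
```

With the example values, `bordered_size(128, 4)` is 136, which is still divisible by four and therefore compatible with the 4x4 blocks of DXTC.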
DXTC border blocks.
As Figure “@3” illustrates, adding a border to tiles might lead to different DXTC blocks at the edges of tiles. As the different blocks will be compressed differently, texels that represent the same points in virtual texture space will not have the same values in both tiles. This leads to color bleeding across tile edges, despite the border. By using a four-pixel outer border, these compression-related artifacts vanish.
Conclusion:
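This article is based on GPU Pro 1. It describes a basic virtual texture mapping system: the textures seen in the game frame are converted to visible tile IDs -> the tile cache is updated -> a new indirection texture is generated -> the scene is rendered and displayed.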
Shader Code
Tile ID Shader