GPU textures
Android GPU Inspector (AGI) lets us peek into the inner GPU workings on Android. One of the most demanding GPU tasks is to fetch and filter texture data from within a shader. We can use AGI to monitor texture-related GPU workloads by capturing texture GPU Performance counters in three categories: bandwidth, cache behavior and filtering.
I always begin by looking at texture bandwidth, because it indicates how much texture data is being transferred to the GPU during a frame and can quickly highlight potential texturing performance issues.
When it comes to texture bandwidth, a good rule of thumb is to make sure that the Texture Read Bandwidth is not much higher than 1 GB/s on average and that its peaks stay well under 5 GB/s.
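As a rough illustration, the rule of thumb above can be turned into a small check over counter samples taken from a capture. This is only a sketch and not part of AGI: the function name `check_texture_bandwidth` and the sample values are hypothetical, and for simplicity it treats "not much higher than 1 GB/s" as a strict 1 GB/s limit.

```python
from statistics import mean

# Guideline values from the text, in GB/s.
AVG_LIMIT_GBPS = 1.0
PEAK_LIMIT_GBPS = 5.0


def check_texture_bandwidth(samples_gbps):
    """Return (average, peak, warnings) for Texture Read Bandwidth samples in GB/s."""
    if not samples_gbps:
        raise ValueError("need at least one sample")
    avg = mean(samples_gbps)
    peak = max(samples_gbps)
    warnings = []
    if avg > AVG_LIMIT_GBPS:
        warnings.append(f"average {avg:.2f} GB/s is above {AVG_LIMIT_GBPS} GB/s")
    if peak >= PEAK_LIMIT_GBPS:
        warnings.append(f"peak {peak:.2f} GB/s is not under {PEAK_LIMIT_GBPS} GB/s")
    return avg, peak, warnings


if __name__ == "__main__":
    # Made-up samples for illustration only.
    frame_samples = [3.8, 4.1, 4.0, 4.3, 6.2]
    print(check_texture_bandwidth(frame_samples))
```

Looking at both the average and the peak matters because a frame can have an acceptable average while still spiking in a single pass.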
This game, for example, is consuming a lot of texture bandwidth, with an average of more than 4GB/s and peaks, towards the end of the frame, of more than 6GB/s.
It’s expected that Post Processing steps can be particularly heavy on texture bandwidth; you might be ok with spending a portion of your bandwidth budget towards the end of the frame for special effects like bloom and tone mapping. But if the color pass of your game has a high texture read bandwidth peak, you might have potential performance issues to investigate.
For this game, texture bandwidth consumption is very high and needs further investigation.
To investigate a potential texture bandwidth issue, I first look at texture cache behavior. My focus is on the percentage of texture stalls and on the percentages of L1 and L2 fetch misses. When the data for a texture fetch is not found in the L1 cache, the request is forwarded to the L2 cache and, if it misses there as well, to system memory. Each step introduces more latency and consumes more power. Average L1 cache misses should be below 10%, with peaks below 50%.
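To see why the miss percentages matter, here is a minimal sketch of the L1, L2 and system memory path described above as an expected-cost formula. The function name `average_fetch_cost` and the default cost values are assumptions chosen purely for illustration; they are not measurements from any real GPU.

```python
def average_fetch_cost(l1_miss_rate, l2_miss_rate,
                       l1_cost=1.0, l2_cost=10.0, memory_cost=100.0):
    """Expected cost of one texture fetch in a simple two-level cache model.

    l1_miss_rate: fraction of fetches that miss L1 (0.0 to 1.0).
    l2_miss_rate: fraction of L1 misses that also miss L2 (0.0 to 1.0).
    The *_cost defaults are arbitrary illustrative units, not measured values.
    """
    for rate in (l1_miss_rate, l2_miss_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("miss rates must be between 0 and 1")
    # Every fetch pays for L1; L1 misses additionally pay for L2, and
    # L2 misses additionally pay for system memory.
    return l1_cost + l1_miss_rate * (l2_cost + l2_miss_rate * memory_cost)


if __name__ == "__main__":
    # Compare the 10% guideline with higher L1 miss rates.
    for l1_miss in (0.10, 0.20, 0.80):
        print(l1_miss, average_fetch_cost(l1_miss, l2_miss_rate=0.5))
```

Under these assumed costs, the expected cost per fetch grows linearly with the L1 miss rate, which is why a high miss percentage shows up as both stalls and extra bandwidth.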
The GPU system capture of this game shows an average percentage of L1 cache misses over 20% and peaks up to 80% or more.
These numbers are again very high.
Typical reasons for a high percentage of texture stalls are uncompressed textures, complex filtering like anisotropic filtering and textures not being mipmapped.
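To give a sense of how much data an uncompressed texture adds, the sketch below compares the storage needed for one texture level in a few formats. The block sizes are the standard ones for these formats (RGBA8 at 4 bytes per texel, ETC2 RGB8 at 8 bytes per 4x4 block, ASTC at 16 bytes per block); the `FORMATS` table and the function name `level_bytes` are hypothetical helpers, and the 2048x2048 size is just an example.

```python
import math

# (block width, block height, bytes per block) for a few common formats.
FORMATS = {
    "RGBA8 (uncompressed)": (1, 1, 4),
    "ETC2 RGB8": (4, 4, 8),
    "ASTC 6x6": (6, 6, 16),
}


def level_bytes(width, height, fmt):
    """Bytes needed to store one texture level of the given size and format."""
    block_w, block_h, block_bytes = FORMATS[fmt]
    # Compressed formats store whole blocks, so round up to full blocks.
    blocks_x = math.ceil(width / block_w)
    blocks_y = math.ceil(height / block_h)
    return blocks_x * blocks_y * block_bytes


if __name__ == "__main__":
    for name in FORMATS:
        size_mib = level_bytes(2048, 2048, name) / (1024 * 1024)
        print(f"{name}: {size_mib:.2f} MiB")
```

Fewer bytes per texel means more texels fit in each cache line, which is one reason compressed textures tend to behave better in the texture caches.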
To investigate potential causes of texture cache misses, I look at the percentage of texture fetches that use anisotropic filtering, which is very expensive on mobile, and at the percentage of Non Base Level texture fetches.
The percentage of Non Base Level texture fetches is an estimate of how efficiently texture mipmaps are being fetched. When this number is 0, it means that the GPU is always accessing the top level, the biggest slice, of the texture mipmap chain or that textures are not mipmapped at all.
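The idea behind this counter can be sketched with a toy model of a mipmap chain. The exact way the hardware computes the counter is not described here, so `non_base_level_percent` is only one plausible reading (fetches from any level other than level 0, divided by all fetches), and both function names are hypothetical.

```python
def mip_chain(width, height):
    """Return the (width, height) of every level in a full mipmap chain."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        # Each level halves both dimensions, never going below 1.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def non_base_level_percent(fetches_per_level):
    """Percentage of fetches that hit any level other than level 0."""
    total = sum(fetches_per_level)
    if total == 0:
        return 0.0
    return 100.0 * sum(fetches_per_level[1:]) / total


if __name__ == "__main__":
    print(len(mip_chain(1024, 1024)))            # number of levels in the chain
    print(non_base_level_percent([100, 0, 0]))   # all fetches from the base level
    print(non_base_level_percent([25, 50, 25]))  # most fetches from smaller levels
```

In this model, a texture without mipmaps has a single level, so every fetch is a base level fetch and the percentage is always 0.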
This can be an issue in most 3D games, while it’s usually acceptable in 2D games.
Accessing non-mipmapped textures is ok when rendering GUI or during post processing, but in any other scenario it comes with a large performance penalty and is a cause of poor cache behavior.
In fact, fetching textures consumes a lot of system bandwidth and can potentially introduce latency, reduce battery life and cause thermal issues that will further degrade performance in the long run.
Analyzing GPU counters related to texturing behavior can help uncover potentially big low-hanging fruit that can improve user experience substantially when fixed.
To find these kinds of GPU performance issues related to texturing, take a trace of your game using Android GPU Inspector and compare the values and trends of the GPU counters to the guidelines given here.