GPU Instancing in Unity

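GPU instancing draws large numbers of objects that share the same mesh and the same material together, which reduces the number of batches needed to render them. To use it in Unity, at least one pass of the shader must contain #pragma multi_compile_instancing, and the material's Enable GPU Instancing checkbox must be ticked. Because each instance may need different per-object data, the shader also has to receive an instance ID in its vertex input: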

struct VertexData {
	UNITY_VERTEX_INPUT_INSTANCE_ID
	float4 vertex : POSITION;
	…
};

The UNITY_VERTEX_INPUT_INSTANCE_ID macro is defined as follows:

// - UNITY_VERTEX_INPUT_INSTANCE_ID     Declare instance ID field in vertex shader input / output struct.
#   define UNITY_VERTEX_INPUT_INSTANCE_ID DEFAULT_UNITY_VERTEX_INPUT_INSTANCE_ID

#if defined(UNITY_INSTANCING_ENABLED) || defined(UNITY_PROCEDURAL_INSTANCING_ENABLED) || defined(UNITY_STEREO_INSTANCING_ENABLED)
    #ifdef SHADER_API_PSSL
        #define DEFAULT_UNITY_VERTEX_INPUT_INSTANCE_ID uint instanceID;
    #else
        #define DEFAULT_UNITY_VERTEX_INPUT_INSTANCE_ID uint instanceID : SV_InstanceID;
    #endif

#else
    #define DEFAULT_UNITY_VERTEX_INPUT_INSTANCE_ID
#endif

In other words, it simply declares an instanceID field when GPU instancing is enabled, and expands to nothing otherwise.

In addition, we need to call the UNITY_SETUP_INSTANCE_ID macro at the very beginning of the vertex program:

InterpolatorsVertex MyVertexProgram (VertexData v) {
	InterpolatorsVertex i;
	UNITY_INITIALIZE_OUTPUT(InterpolatorsVertex, i);
	UNITY_SETUP_INSTANCE_ID(v);
	i.pos = UnityObjectToClipPos(v.vertex);
	…
}

The UNITY_SETUP_INSTANCE_ID macro expands as follows:

// - UNITY_SETUP_INSTANCE_ID        Should be used at the very beginning of the vertex shader / fragment shader,
//                                  so that succeeding code can have access to the global unity_InstanceID.
//                                  Also procedural function is called to setup instance data.
#   define UNITY_SETUP_INSTANCE_ID(input) DEFAULT_UNITY_SETUP_INSTANCE_ID(input)

#define DEFAULT_UNITY_SETUP_INSTANCE_ID(input)          { UnitySetupInstanceID(UNITY_GET_INSTANCE_ID(input)); UnitySetupCompoundMatrices(); }

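This macro does two things. First, it sets the global unity_InstanceID variable, which is used to index the arrays of built-in matrices (such as object-to-world) that the shader uses: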

void UnitySetupInstanceID(uint inputInstanceID)
    {
        #ifdef UNITY_STEREO_INSTANCING_ENABLED
            #if defined(SHADER_API_GLES3)
                // We must calculate the stereo eye index differently for GLES3
                // because otherwise,  the unity shader compiler will emit a bitfieldInsert function.
                // bitfieldInsert requires support for glsl version 400 or later.  Therefore the
                // generated glsl code will fail to compile on lower end devices.  By changing the
                // way we calculate the stereo eye index,  we can help the shader compiler to avoid
                // emitting the bitfieldInsert function and thereby increase the number of devices we
                // can run stereo instancing on.
                unity_StereoEyeIndex = round(fmod(inputInstanceID, 2.0));
                unity_InstanceID = unity_BaseInstanceID + (inputInstanceID >> 1);
            #else
                // stereo eye index is automatically figured out from the instance ID
                unity_StereoEyeIndex = inputInstanceID & 0x01;
                unity_InstanceID = unity_BaseInstanceID + (inputInstanceID >> 1);
            #endif
        #else
            unity_InstanceID = inputInstanceID + unity_BaseInstanceID;
        #endif
    }

Second, it recomputes the commonly used compound matrices:

        void UnitySetupCompoundMatrices()
        {
            unity_MatrixMVP_Instanced = mul(unity_MatrixVP, unity_ObjectToWorld);
            unity_MatrixMV_Instanced = mul(unity_MatrixV, unity_ObjectToWorld);
            unity_MatrixTMV_Instanced = transpose(unity_MatrixMV_Instanced);
            unity_MatrixITMV_Instanced = transpose(mul(unity_WorldToObject, unity_MatrixInvV));
        }

Note that unity_ObjectToWorld and unity_WorldToObject have themselves been redefined here:

        #define unity_ObjectToWorld     UNITY_ACCESS_INSTANCED_PROP(unity_Builtins0, unity_ObjectToWorldArray)
        #define MERGE_UNITY_BUILTINS_INDEX(X) unity_Builtins##X
        #define unity_WorldToObject     UNITY_ACCESS_INSTANCED_PROP(MERGE_UNITY_BUILTINS_INDEX(UNITY_WORLDTOOBJECTARRAY_CB), unity_WorldToObjectArray)

        inline float4 UnityObjectToClipPosInstanced(in float3 pos)
        {
            return mul(UNITY_MATRIX_VP, mul(unity_ObjectToWorld, float4(pos, 1.0)));
        }
        inline float4 UnityObjectToClipPosInstanced(float4 pos)
        {
            return UnityObjectToClipPosInstanced(pos.xyz);
        }
        #define UnityObjectToClipPos UnityObjectToClipPosInstanced

When GPU instancing is enabled, these macros simply use the instance ID to index into the corresponding matrix arrays.
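To make the indexing concrete, here is a hand-expanded sketch of what UnityObjectToClipPosInstanced effectively computes. It is obtained by applying the UNITY_ACCESS_INSTANCED_PROP definition quoted later in this article (arr##Array[unity_InstanceID].var) to the unity_ObjectToWorld define above. The helper name ObjectToClipExpanded is made up for illustration, and the array name unity_Builtins0Array assumes the macro definitions quoted in this article; other Unity versions may lay out the built-in buffers differently.

```hlsl
// Sketch only: place inside a CGPROGRAM block that includes "UnityCG.cginc"
// and declares #pragma multi_compile_instancing.
#if defined(UNITY_INSTANCING_ENABLED)
// unity_ObjectToWorld
//   -> UNITY_ACCESS_INSTANCED_PROP(unity_Builtins0, unity_ObjectToWorldArray)
//   -> unity_Builtins0Array[unity_InstanceID].unity_ObjectToWorldArray
// Call this only after UNITY_SETUP_INSTANCE_ID(v), which sets unity_InstanceID.
float4 ObjectToClipExpanded(float3 pos)
{
    // Per-instance object-to-world matrix, fetched by instance ID.
    float4x4 objectToWorld =
        unity_Builtins0Array[unity_InstanceID].unity_ObjectToWorldArray;
    return mul(UNITY_MATRIX_VP, mul(objectToWorld, float4(pos, 1.0)));
}
#endif
```

In real shaders there is no reason to write this by hand; UnityObjectToClipPos already resolves to the instanced version.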

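Because each batch has to upload arrays of matrices to the GPU rather than a single matrix, the batch size must be limited: only a bounded number of objects can be merged into one instanced batch. Unity defines the UNITY_INSTANCED_ARRAY_SIZE macro to express this maximum.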
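Unity also lets a shader cap the batch size itself through #pragma instancing_options. A minimal sketch, assuming a cap of 50 instances (an arbitrary value chosen for illustration). As far as I know, maxcount is only honored on platforms that need a fixed array size, while forcemaxcount applies everywhere, and the details vary between Unity versions:

```hlsl
#pragma multi_compile_instancing
// Hypothetical cap of 50 instances per batch; pick a value that suits your data.
#pragma instancing_options maxcount:50
// Alternative that requests the same cap on every platform:
// #pragma instancing_options forcemaxcount:50
```

A smaller cap means smaller per-batch arrays but more batches for the same number of objects.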

GPU instancing also works with shadows and with multiple lights. For shadows, just add the corresponding instancing declarations to the shadow caster pass:

#pragma multi_compile_shadowcaster
#pragma multi_compile_instancing

struct VertexData {
	UNITY_VERTEX_INPUT_INSTANCE_ID
	…
};

InterpolatorsVertex MyShadowVertexProgram (VertexData v) {
	InterpolatorsVertex i;
	UNITY_SETUP_INSTANCE_ID(v);
	…
}
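For reference, a complete instanced shadow caster pass can be quite short when it relies on the helper macros from UnityCG.cginc. This is a sketch under the assumption that the built-in appdata_base input struct (which already declares UNITY_VERTEX_INPUT_INSTANCE_ID) is sufficient; it is not the pass used in this article's project, and it has to be placed inside a SubShader:

```hlsl
Pass
{
    Tags { "LightMode" = "ShadowCaster" }

    CGPROGRAM
    #pragma vertex vert
    #pragma fragment frag
    #pragma multi_compile_shadowcaster
    #pragma multi_compile_instancing
    #include "UnityCG.cginc"

    struct v2f
    {
        V2F_SHADOW_CASTER;
    };

    v2f vert (appdata_base v)
    {
        v2f o;
        // Must come first so the per-instance matrices are indexed correctly.
        UNITY_SETUP_INSTANCE_ID(v);
        TRANSFER_SHADOW_CASTER_NORMALOFFSET(o)
        return o;
    }

    float4 frag (v2f i) : SV_Target
    {
        SHADOW_CASTER_FRAGMENT(i)
    }
    ENDCG
}
```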

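With multiple lights, the deferred rendering path is needed: in the forward path only the base pass can make effective use of instancing, not the additive passes for the extra lights.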

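However, GPU instancing by default only batches objects that use the same material, which is inconvenient when all we want is to vary a single material property. For example, giving each sphere its own color here causes instancing to stop working.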

We can use a MaterialPropertyBlock to change the color without creating a new material:

			MaterialPropertyBlock properties = new MaterialPropertyBlock();
			properties.SetColor(
				"_Color", new Color(Random.value, Random.value, Random.value)
			);
			t.GetComponent<MeshRenderer>().SetPropertyBlock(properties);

To use this property in shader code, it has to be declared in an instancing buffer:

UNITY_INSTANCING_BUFFER_START(InstanceProperties)
	UNITY_DEFINE_INSTANCED_PROP(float4, _Color)
#define _Color_arr InstanceProperties
UNITY_INSTANCING_BUFFER_END(InstanceProperties)

Expanding the macros shows that this declares a cbuffer containing an array of structs, where the struct holds the properties we added:

    #define UNITY_INSTANCING_BUFFER_START(buf)      UNITY_INSTANCING_CBUFFER_SCOPE_BEGIN(UnityInstancing_##buf) struct {
    #define UNITY_INSTANCING_BUFFER_END(arr)        } arr##Array[UNITY_INSTANCED_ARRAY_SIZE]; UNITY_INSTANCING_CBUFFER_SCOPE_END
    #define UNITY_DEFINE_INSTANCED_PROP(type, var)  type var;

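To pass the instance ID used in the vertex shader on to the fragment shader, use the UNITY_TRANSFER_INSTANCE_ID macro that Unity provides. The interpolator struct must also declare UNITY_VERTEX_INPUT_INSTANCE_ID so that it has an instanceID field to receive the value: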

InterpolatorsVertex MyVertexProgram (VertexData v) {
	InterpolatorsVertex i;
	UNITY_INITIALIZE_OUTPUT(InterpolatorsVertex, i);
	UNITY_SETUP_INSTANCE_ID(v);
	UNITY_TRANSFER_INSTANCE_ID(v, i);
	…
}

This macro's definition is simple:

    #define UNITY_TRANSFER_INSTANCE_ID(input, output)   output.instanceID = UNITY_GET_INSTANCE_ID(input)

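So how do we actually read the property from this cbuffer? Unity provides a matching macro for that as well. The fragment program has to call UNITY_SETUP_INSTANCE_ID(i) first, so that unity_InstanceID is set there too: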

float3 GetAlbedo (Interpolators i) {
	float3 albedo =
		tex2D(_MainTex, i.uv.xy).rgb * UNITY_ACCESS_INSTANCED_PROP(_Color_arr, _Color).rgb;
	...
}

This macro is also simple: it indexes the struct array defined earlier with the instance ID and then reads the requested field:

    #define UNITY_ACCESS_INSTANCED_PROP(arr, var)   arr##Array[unity_InstanceID].var

After these changes, run the scene again: the batch count drops, which shows that instancing is working once more.
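Putting the pieces together, the following is a minimal unlit shader with a per-instance _Color, written as a sketch rather than as the shader used in this article. The shader name Custom/InstancedColorSketch and the struct and function names are placeholders, and the macro signatures assume the versions quoted above, where the name passed to UNITY_INSTANCING_BUFFER_END is also the first argument of UNITY_ACCESS_INSTANCED_PROP:

```hlsl
Shader "Custom/InstancedColorSketch"
{
    Properties
    {
        _Color ("Color", Color) = (1, 1, 1, 1)
    }
    SubShader
    {
        Tags { "RenderType" = "Opaque" }

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #pragma multi_compile_instancing
            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct v2f
            {
                float4 pos : SV_POSITION;
                // Needed so the fragment program can recover the instance ID.
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            // Per-instance properties live in their own instancing buffer.
            UNITY_INSTANCING_BUFFER_START(InstanceProperties)
                UNITY_DEFINE_INSTANCED_PROP(float4, _Color)
            UNITY_INSTANCING_BUFFER_END(InstanceProperties)

            v2f vert (appdata v)
            {
                v2f o;
                UNITY_SETUP_INSTANCE_ID(v);
                UNITY_TRANSFER_INSTANCE_ID(v, o);
                o.pos = UnityObjectToClipPos(v.vertex);
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                UNITY_SETUP_INSTANCE_ID(i);
                return UNITY_ACCESS_INSTANCED_PROP(InstanceProperties, _Color);
            }
            ENDCG
        }
    }
}
```

With this shader on a material that has GPU instancing enabled, the MaterialPropertyBlock code shown earlier can give every renderer a different _Color while the objects still share one material.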
