You can use GPU instancing to draw many identical objects with only a few draw calls. There are some restrictions that you need to bear in mind.
A Standard Surface Shader that supports instancing is available in the Unity Editor. Add one to your project by selecting Shader > Standard Surface Shader (Instanced).
Apply this Shader to your GameObject’s Material. In your Material’s Inspector window, click the Shader drop-down, roll over the Instanced field, and choose your instanced Shader from the list.
Even though the instanced GameObjects share the same Mesh and Material, you can set Shader properties on a per-object basis using the MaterialPropertyBlock API. In the example below, each GameObject is assigned a random color value using the _Color property:
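```csharp
// "objects" is the collection of GameObjects that share the instanced Mesh and Material.
MaterialPropertyBlock props = new MaterialPropertyBlock();
MeshRenderer renderer;

foreach (GameObject obj in objects)
{
    // Pick a random color for this instance.
    float r = Random.Range(0.0f, 1.0f);
    float g = Random.Range(0.0f, 1.0f);
    float b = Random.Range(0.0f, 1.0f);
    props.SetColor("_Color", new Color(r, g, b));

    // SetPropertyBlock copies the values, so the same block can be reused.
    renderer = obj.GetComponent<MeshRenderer>();
    renderer.SetPropertyBlock(props);
}
```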
The following example takes a simple unlit Shader and makes it capable of instancing:
```shaderlab
Shader "SimplestInstancedShader"
{
    Properties
    {
        _Color ("Color", Color) = (1, 1, 1, 1)
    }

    SubShader
    {
        Tags { "RenderType"="Opaque" }
        LOD 100

        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #pragma multi_compile_instancing
            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
                UNITY_INSTANCE_ID
            };

            struct v2f
            {
                float4 vertex : SV_POSITION;
                UNITY_INSTANCE_ID
            };

            UNITY_INSTANCING_CBUFFER_START (MyProperties)
            UNITY_DEFINE_INSTANCED_PROP (float4, _Color)
            UNITY_INSTANCING_CBUFFER_END

            v2f vert (appdata v)
            {
                v2f o;
                UNITY_SETUP_INSTANCE_ID (v);
                UNITY_TRANSFER_INSTANCE_ID (v, o);
                o.vertex = UnityObjectToClipPos (v.vertex);
                return o;
            }

            fixed4 frag (v2f i) : SV_Target
            {
                UNITY_SETUP_INSTANCE_ID (i);
                return UNITY_ACCESS_INSTANCED_PROP (_Color);
            }
            ENDCG
        }
    }
}
```
Addition | Function |
---|---|
#pragma multi_compile_instancing | multi_compile_instancing generates a Shader with two variants: one with built-in keyword INSTANCING_ON defined (allowing instancing), the other with nothing defined. This allows the Shader to fall back to a non-instanced version if instancing isn’t supported on the GPU. |
UNITY_INSTANCE_ID | This is used in the vertex Shader input/output structure to define an instance ID. See SV_InstanceID for more information. |
UNITY_INSTANCING_CBUFFER_START(name) / UNITY_INSTANCING_CBUFFER_END | Every per-instance property must be defined in a specially named constant buffer. Use this pair of macros to wrap the properties you want to be made unique to each instance. |
UNITY_DEFINE_INSTANCED_PROP(float4, _Color) | This defines a per-instance Shader property with a type and a name. In this example, the _Color property is unique to each instance. |
UNITY_SETUP_INSTANCE_ID(v); | This makes the instance ID accessible to Shader functions. It must be used at the very beginning of a vertex Shader, and is optional for fragment Shaders. |
UNITY_TRANSFER_INSTANCE_ID(v, o); | This copies the instance ID from the input structure to the output structure in the vertex Shader. This is only necessary if you need to access per-instance data in the fragment Shader. |
UNITY_ACCESS_INSTANCED_PROP(_Color) | This accesses a per-instance Shader property. It uses an instance ID to index into the instance data array. |
Note: As long as Material properties are instanced, Renderers can always be rendered instanced, even if you put different instanced properties into different Renderers. Normal (non-instanced) properties cannot be batched, so do not put them in the MaterialPropertyBlock. Instead, create different Materials for them.
UnityObjectToClipPos(v.vertex) is always preferred where mul(UNITY_MATRIX_MVP,v.vertex) would otherwise be used. While you can continue to use UNITY_MATRIX_MVP as normal in instanced Shaders, UnityObjectToClipPos is the most efficient way of transforming vertex positions from object space into clip space.
In instanced Shaders, UNITY_MATRIX_MVP (among other built-in matrices) is transparently modified to include an extra matrix multiply. Specifically, it is expanded to mul(UNITY_MATRIX_VP, unity_ObjectToWorld), and unity_ObjectToWorld is in turn expanded to unity_ObjectToWorldArray[unity_InstanceID].
UnityObjectToClipPos is optimized to perform two matrix-vector multiplications simultaneously, and is therefore more efficient than performing the multiplication manually, because the Shader compiler does not automatically perform this optimization.
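As a rough illustration of the two forms discussed above, the vertex function below shows the preferred call next to the manual multiply it replaces (left commented out). This is only a sketch: it assumes the appdata and v2f structures from the SimplestInstancedShader example and is meant to sit inside that Shader's CGPROGRAM block.

```hlsl
v2f vert (appdata v)
{
    v2f o;
    UNITY_SETUP_INSTANCE_ID (v);

    // Preferred: the optimized object-space to clip-space transform.
    o.vertex = UnityObjectToClipPos (v.vertex);

    // Still valid, but less efficient in an instanced Shader, where
    // UNITY_MATRIX_MVP expands to mul(UNITY_MATRIX_VP, unity_ObjectToWorld):
    // o.vertex = mul (UNITY_MATRIX_MVP, v.vertex);

    return o;
}
```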
For vertex and fragment Shaders, Unity needs to change the way vertex transformations are calculated in multi-pass scenarios (for example, in the ForwardAdd pass) to avoid z-fighting artifacts against the base/first passes due to floating-point errors in matrix calculation. To do this, add #pragma force_concat_matrix to the Shader.
Specifically, the vertex transformation in the ForwardAdd pass is calculated by multiplying the M (model) matrix with the VP (view and projection) matrix instead of using a CPU-precomputed MVP matrix.
This is not necessary for surface Shaders, because the correct calculation is automatically substituted.
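A minimal sketch of where the directive might go in a vertex and fragment Shader is shown below. It assumes the same pragmas and include as the SimplestInstancedShader example; the structures and functions are elided and would be the ones from that example.

```hlsl
CGPROGRAM
#pragma vertex vert
#pragma fragment frag
#pragma multi_compile_instancing
// Calculate the vertex transformation by multiplying the M matrix with the
// VP matrix instead of using a CPU-precomputed MVP matrix, so that
// additive passes do not z-fight with the base pass.
#pragma force_concat_matrix
#include "UnityCG.cginc"

// ... appdata, v2f, vert and frag as in SimplestInstancedShader ...
ENDCG
```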
Static batching takes priority over instancing. If a GameObject is marked for static batching and is successfully batched, instancing is disabled even if its Renderer uses an instancing Shader. When this happens, a warning box appears in the Inspector suggesting that the Static Batching flag be unchecked in the Player Settings.
Instancing takes priority over dynamic batching. If Meshes can be instanced, dynamic batching is disabled.
For a surface Shader, use the addshadow option to force the generation of an instanced shadow pass.
Defining UNITY_MAX_INSTANCE_COUNT with an integer before including any .cginc file allows you to limit the maximum number of instances an instanced draw call can draw. This allows for more properties per instance in the instance constant buffer. You can achieve the same result when using a surface Shader with #pragma instancing_options maxcount:number. The default value of this max instance count is 500. For OpenGL, the actual value is one quarter of the value you specify, so 125 by default.
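The two ways of setting the instance limit described above might look like the sketch below. The value 100 is an arbitrary example, and surf is a placeholder name for your surface function. The two options belong to different Shaders and are shown together only for comparison.

```hlsl
// Option A: vertex and fragment Shader.
// The define must come before any .cginc include.
// With a value of 100, OpenGL would use one quarter of it, that is 25.
#define UNITY_MAX_INSTANCE_COUNT 100
#include "UnityCG.cginc"

// Option B: surface Shader.
// addshadow also forces the generation of an instanced shadow pass.
#pragma surface surf Standard addshadow
#pragma multi_compile_instancing
#pragma instancing_options maxcount:100
```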