Compute shaders are shader programs that run on the GPU, outside of the normal rendering pipeline.
They can be used for massively parallel GPGPU algorithms, or to accelerate parts of game rendering. To use them efficiently, you often need in-depth knowledge of GPU architectures and parallel algorithms, as well as knowledge of DirectCompute, OpenGL Compute, CUDA, or OpenCL.
Compute shaders in Unity closely match DirectX 11 DirectCompute technology. Platforms where compute shaders work:
Windows and Windows Store, with a DirectX 11 or DirectX 12 graphics API and Shader Model 5.0 GPU
macOS and iOS using Metal graphics API
Android, Linux and Windows platforms with Vulkan API
Modern OpenGL platforms (OpenGL 4.3 on Linux or Windows; OpenGL ES 3.1 on Android). Note that Mac OS X does not support OpenGL 4.3
Modern consoles
Compute shader support can be queried at runtime using SystemInfo.supportsComputeShaders.
Similar to shader assets, compute shader assets are files in your project with a .compute file extension. They are written in DirectX 11 style HLSL, with a minimal number of #pragma compilation directives to indicate which functions to compile as compute shader kernels.
Here’s a basic example of a compute shader file, which fills the output texture with red:
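// test.compute
// Declare FillWithRed as a kernel that scripts can dispatch.
#pragma kernel FillWithRed
// Output texture with random write access.
RWTexture2D<float4> res;
[numthreads(1,1,1)]
void FillWithRed (uint3 dtid : SV_DispatchThreadID)
{
    // Each thread writes opaque red to the texel at its dispatch thread ID.
    res[dtid.xy] = float4(1,0,0,1);
}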
The language is standard DX11 HLSL, with an additional #pragma kernel FillWithRed directive. One compute shader Asset file must contain at least one compute kernel that can be invoked, and that function is indicated by the #pragma directive. There can be more kernels in the file; just add multiple #pragma kernel lines.
When using multiple #pragma kernel lines, note that comments of the style // text are not permitted on the same line as the #pragma kernel directives, and cause compilation errors if used.
The #pragma kernel line can optionally be followed by a number of preprocessor macros to define while compiling that kernel, for example:
#pragma kernel KernelOne SOME_DEFINE DEFINE_WITH_VALUE=1337
#pragma kernel KernelTwo OTHER_DEFINE
// ...
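As a rough sketch of how such per-kernel macros might be consumed, the file below compiles the same helper function differently for each kernel. The file name, the kernel names, USE_TINT, and TINT_DIVISOR are all hypothetical names chosen for illustration.

```hlsl
// tint.compute (hypothetical file name)
#pragma kernel FillSolid
#pragma kernel FillTinted USE_TINT TINT_DIVISOR=2

RWTexture2D<float4> res;

// Shared helper. Its body depends on the macros that are defined
// while the current kernel is being compiled.
float4 Shade()
{
    float4 color = float4(1,0,0,1);
#ifdef USE_TINT
    // Only present when compiling FillTinted.
    color.rgb /= TINT_DIVISOR;
#endif
    return color;
}

// Assumption: the texture width and height are multiples of 8,
// so no thread lands outside the texture.
[numthreads(8,8,1)]
void FillSolid (uint3 dtid : SV_DispatchThreadID)
{
    res[dtid.xy] = Shade();
}

[numthreads(8,8,1)]
void FillTinted (uint3 dtid : SV_DispatchThreadID)
{
    res[dtid.xy] = Shade();
}
```

Under these assumptions, FillSolid writes full red and FillTinted writes red at half intensity, even though both kernels call the same function.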
In your script, define a variable of ComputeShader type and assign a reference to the Asset. This allows you to invoke its kernels with the ComputeShader.Dispatch function. See Unity documentation on the ComputeShader class for more details.
Closely related to compute shaders is the ComputeBuffer class, which defines an arbitrary data buffer (“structured buffer” in DX11 lingo). Render Textures can also be written into from compute shaders, if they have the “random access” flag set (“unordered access view” in DX11). See RenderTexture.enableRandomWrite to learn more about this.
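On the shader side, a buffer of this kind is typically declared as a structured buffer. The following is a minimal sketch, assuming a hypothetical Particle layout and hypothetical resource names; the script-side buffer would need a matching element layout (here two float3 fields, so a 24-byte stride).

```hlsl
// particles.compute (hypothetical file name)
#pragma kernel Integrate

// Hypothetical element layout. The buffer created in script
// must use the same field order and a stride of 24 bytes.
struct Particle
{
    float3 position;
    float3 velocity;
};

// Read-write structured buffer, bound from script.
RWStructuredBuffer<Particle> particles;
// Number of valid elements and the time step, both set from script.
uint particleCount;
float deltaTime;

[numthreads(64,1,1)]
void Integrate (uint3 dtid : SV_DispatchThreadID)
{
    // Skip threads that fall past the end of the buffer.
    if (dtid.x >= particleCount)
        return;

    Particle p = particles[dtid.x];
    p.position += p.velocity * deltaTime;
    particles[dtid.x] = p;
}
```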
Textures and samplers aren’t separate objects in Unity, so to use them in compute shaders you must follow one of the following Unity-specific rules:
Use the same name as the Texture name, with sampler at the beginning (for example, Texture2D MyTex; SamplerState samplerMyTex). In this case, the sampler is initialized to that Texture’s filter/wrap/aniso settings.
Use a predefined sampler. For this, the name has to have Linear or Point (for filter mode) and Clamp or Repeat (for wrap mode). For example, SamplerState MyLinearClampSampler creates a sampler that has Linear filter mode and Clamp wrap mode.
For more information, see documentation on Sampler States.
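Both naming rules can appear in the same file. The sketch below is illustrative only: the kernel name, res, resWidth, and resHeight are hypothetical, and it uses SampleLevel with an explicit mip level because compute shaders have no screen-space derivatives for automatic mip selection.

```hlsl
// sample.compute (hypothetical file name)
#pragma kernel CopySampled

Texture2D<float4> MyTex;
// Rule 1: "sampler" + texture name, takes MyTex's filter/wrap/aniso settings.
SamplerState samplerMyTex;
// Rule 2: predefined sampler, Linear filter mode and Clamp wrap mode.
SamplerState MyLinearClampSampler;

RWTexture2D<float4> res;
// Output size in texels, set from script (hypothetical names).
uint resWidth;
uint resHeight;

[numthreads(8,8,1)]
void CopySampled (uint3 dtid : SV_DispatchThreadID)
{
    // Skip threads outside the output texture.
    if (dtid.x >= resWidth || dtid.y >= resHeight)
        return;

    // Texel center in normalized [0,1] coordinates.
    float2 uv = (dtid.xy + 0.5) / float2(resWidth, resHeight);

    float4 a = MyTex.SampleLevel(samplerMyTex, uv, 0);
    float4 b = MyTex.SampleLevel(MyLinearClampSampler, uv, 0);

    // Average the two lookups purely to show both samplers in use.
    res[dtid.xy] = 0.5 * (a + b);
}
```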
As with regular shaders, Unity is capable of translating compute shaders from HLSL to other shader languages. Therefore, for the easiest cross-platform builds, you should write compute shaders in HLSL. However, there are some factors that need to be considered when doing this.
DirectX 11 (DX11) supports many actions that are not supported on other platforms (such as Metal or OpenGL ES). Therefore, you should always ensure your shader has well-defined behavior on platforms that offer less support, rather than only on DX11. Here are a few things to consider:
Out-of-bounds memory accesses are bad. DX11 might consistently return zero when reading out of bounds, and might tolerate some out-of-bounds writes without issues, but platforms that offer less support might crash the GPU when doing this. Watch out for DX11-specific hacks, buffer sizes that are not a multiple of your thread group size, attempts to read neighboring data elements past the beginning or end of the buffer, and similar incompatibilities.
Initialize your resources. The contents of new buffers and Textures are undefined. Some platforms might provide all zeroes, but on others, there could be anything including NaNs.
Bind all the resources your compute shader declares. Even if you know for sure that the shader does not use resources in its current state because of branching, you must still ensure a resource is bound to it.
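The points above can be sketched in a small kernel that averages each element with its neighbors. This is a minimal illustration rather than a definitive pattern: the kernel and resource names are hypothetical, and it assumes the script passes the element count as a constant and binds both buffers before dispatching.

```hlsl
// smooth.compute (hypothetical file name)
#pragma kernel Smooth

StructuredBuffer<float> source;
RWStructuredBuffer<float> result;
// Number of valid elements in both buffers, set from script.
uint elementCount;

[numthreads(64,1,1)]
void Smooth (uint3 dtid : SV_DispatchThreadID)
{
    uint i = dtid.x;

    // The dispatch is rounded up to whole groups of 64 threads, so the
    // last group can contain threads past the end of the buffers.
    if (i >= elementCount)
        return;

    // Clamp neighbor indices instead of reading before the first
    // or after the last element.
    uint left  = (i == 0) ? 0 : i - 1;
    uint right = min(i + 1, elementCount - 1);

    // Every valid element of result is written, so none is left
    // with undefined contents.
    result[i] = (source[left] + source[i] + source[right]) / 3.0;
}
```

With a group size of 64, the script would dispatch enough groups to cover elementCount, rounding up, and the guard discards the surplus threads.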
GetDimensions queries on buffers. Pass the buffer size info as constant to the shader if needed.
RWTextures<T> that are not write-only.

GraphicsFormat | RenderTextureFormat | HLSL type | GLSL image format qualifier |
---|---|---|---|
R32G32B32A32_SFloat | ARGBFloat | float4 | rgba32f |
R16G16B16A16_SFloat | ARGBHalf | min16float4/half4 | rgba16f |
R32G32_SFloat | RGFloat | float2 | rg32f |
R16G16_SFloat | RGHalf | min16float2/half2 | rg16f |
B10G11R11_UFloatPack32 | RGB111110Float | min10float3 | r11f_g11f_b10f |
R32_SFloat | RFloat | float | r32f |
R16_SFloat | RHalf | min16float/half | r16f |
R16G16B16A16_UNorm | ARGB64 | unorm min16float4/half4 | rgba16 |
A2B10G10R10_UNormPack32 | ARGB2101010 | unorm min10float4 | rgb10_a2 |
R8G8B8A8_UNorm | ARGB32 | unorm float4 | rgba8 |
R16G16_UNorm | RG32 | unorm min16float2/half2 | rg16 |
R8G8_UNorm | RG16 | unorm float2 | rg8 |
R16_UNorm | R16 | unorm min16float/half | r16 |
R8_UNorm | R8 | unorm float | r8 |
R16G16B16A16_SNorm | unsupported | snorm min16float4/half4 | rgba16_snorm |
R8G8B8A8_SNorm | unsupported | snorm float4 | rgba8_snorm |
R16G16_SNorm | unsupported | snorm min16float2/half2 | rg16_snorm |
R8G8_SNorm | unsupported | snorm float2 | rg8_snorm |
R16_SNorm | unsupported | snorm min16float/half | r16_snorm |
R8_SNorm | unsupported | snorm float | r8_snorm |
R32G32B32A32_SInt | ARGBInt | int4 | rgba32i |
R16G16B16A16_SInt | unsupported | min16int4 | rgba16i |
R8G8B8A8_SInt | unsupported | min12int4 | rgba8i |
R32G32_SInt | RGInt | int2 | rg32i |
R16G16_SInt | unsupported | min16int2 | rg16i |
R8G8_SInt | unsupported | min12int2 | rg8i |
R32_SInt | RInt | int | r32i |
R16_SInt | unsupported | min16int | r16i |
R8_SInt | unsupported | min12int | r8i |
R32G32B32A32_UInt | unsupported | uint4 | rgba32ui |
R16G16B16A16_UInt | RGBAUShort | min16uint4 | rgba16ui |
R8G8B8A8_UInt | unsupported | unsupported | rgba8ui |
R32G32_UInt | unsupported | uint2 | rg32ui |
R16G16_UInt | unsupported | min16uint2 | rg16ui |
R8G8_UInt | unsupported | unsupported | rg8ui |
R32_UInt | unsupported | uint | r32ui |
R16_UInt | unsupported | min16uint | r16ui |
R8_UInt | unsupported | unsupported | r8ui |
A2B10G10R10_UIntPack32 | unsupported | unsupported | rgb10_a2ui |
Usually, compute shader files are written in HLSL, and compiled or translated for all necessary platforms automatically. However, it is possible to either prevent translation to other languages (that is, only keep HLSL platforms), or to write GLSL compute code manually.
The following information only applies to HLSL-only or GLSL-only compute shaders, not cross-platform builds. This is because this information can result in compute shader source being excluded from some platforms.
Compute shader source surrounded by CGPROGRAM and ENDCG keywords is not processed for non-HLSL platforms.
Compute shader source surrounded by GLSLPROGRAM and ENDGLSL keywords is treated as GLSL source, and emitted verbatim. This only works when targeting OpenGL or GLSL platforms. You should also note that while automatically translated shaders follow HLSL data layout on buffers, manually written GLSL shaders follow GLSL layout rules.
You can use keywords to produce multiple variants of compute shaders, the same as you can for graphics shaders.
For general information on variants, see Shader variants. For information on how to implement these features in compute shaders, see Declaring and using shader keywords in HLSL and the ComputeShader API documentation.
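As a hedged sketch of what a keyword-driven variant might look like, the file below assumes the same #pragma multi_compile directive used for graphics shaders is available to compute shaders in your Unity version. The file name, the kernel name, and the FILL_GREEN keyword are hypothetical.

```hlsl
// keywords.compute (hypothetical file name)
#pragma kernel Fill
#pragma multi_compile __ FILL_GREEN

RWTexture2D<float4> res;

[numthreads(1,1,1)]
void Fill (uint3 dtid : SV_DispatchThreadID)
{
#if defined(FILL_GREEN)
    // Variant compiled with the FILL_GREEN keyword enabled.
    res[dtid.xy] = float4(0,1,0,1);
#else
    // Variant compiled with no keyword enabled.
    res[dtid.xy] = float4(1,0,0,1);
#endif
}
```

The script would then select a variant by enabling or disabling the keyword before dispatching; the ComputeShader API documentation describes the calls for this.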