
Compute Shaders

Compute shaders are programs that run on the graphics card, outside of the normal rendering pipeline. They can be used for massively parallel GPGPU algorithms, or to accelerate parts of game rendering. Using them efficiently often requires in-depth knowledge of GPU architectures and parallel algorithms, as well as knowledge of DirectCompute, OpenCL or CUDA.

Compute shaders in Unity closely match DirectX 11 DirectCompute technology. Platforms where compute shaders work:

  • Windows and Windows Store, with a DirectX 11 graphics API and Shader Model 5.0 GPU.
  • Modern OpenGL platforms (OpenGL 4.3 on Linux or Windows; OpenGL ES 3.1 on Android). Note that Mac OS X does not support OpenGL 4.3, so no compute shaders there yet.
  • Modern consoles (Sony PS4 and Microsoft Xbox One).

Compute shader assets

Similar to regular shaders, compute shaders are asset files in your project, with a *.compute file extension. They are written in DirectX 11 style HLSL, with a minimal number of #pragma compilation directives to indicate which functions to compile as compute shader kernels.

Here’s a minimal example of a compute shader file:

// test.compute

#pragma kernel FillWithRed

RWTexture2D<float4> res;

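// Every kernel must declare its thread group size.
[numthreads(1,1,1)]
void FillWithRed (uint3 dtid : SV_DispatchThreadID)
{
    // Each thread writes one red texel at its dispatch coordinates.
    res[dtid.xy] = float4(1,0,0,1);
}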

Note that the example above does not do anything remotely interesting; it just fills the output texture with red.

The language is standard DX11 HLSL; the only exception is the #pragma kernel FillWithRed directive. Each compute shader asset file must contain at least one “compute kernel” that can be invoked, and that function is indicated by the #pragma directive. A file can contain more kernels; just add multiple #pragma kernel lines.

When using multiple #pragma kernel lines, note that comments of the style // text are not permitted on the same line as a #pragma kernel directive and will cause compilation errors.

The #pragma kernel line can optionally be followed by a number of preprocessor macros to define while compiling that kernel, for example:

#pragma kernel KernelOne SOME_DEFINE DEFINE_WITH_VALUE=1337
#pragma kernel KernelTwo OTHER_DEFINE
// ...
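
As a minimal sketch of how those macros might be consumed, the file below branches on them with ordinary preprocessor conditionals. The kernel and macro names mirror the example above; the result buffer and the values written are hypothetical and chosen only for illustration.

```hlsl
// macros.compute (hypothetical file name)

#pragma kernel KernelOne SOME_DEFINE DEFINE_WITH_VALUE=1337
#pragma kernel KernelTwo OTHER_DEFINE

// Hypothetical output buffer, one float per thread.
RWStructuredBuffer<float> result;

[numthreads(64,1,1)]
void KernelOne (uint3 id : SV_DispatchThreadID)
{
#ifdef SOME_DEFINE
    // Taken when this file is compiled for KernelOne.
    result[id.x] = DEFINE_WITH_VALUE;
#else
    result[id.x] = 0;
#endif
}

[numthreads(64,1,1)]
void KernelTwo (uint3 id : SV_DispatchThreadID)
{
#ifdef OTHER_DEFINE
    // Taken when this file is compiled for KernelTwo.
    result[id.x] = 1;
#else
    result[id.x] = 0;
#endif
}
```

Each kernel is compiled with only the macros listed on its own #pragma kernel line, so both functions are written to remain valid whichever set of macros is defined.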

Invoking compute shaders

In your script, define a variable of ComputeShader type and assign a reference to the asset; you can then invoke its kernels with the ComputeShader.Dispatch function. See the scripting reference for the ComputeShader class for more details.

Closely related to compute shaders is the ComputeBuffer class, which defines an arbitrary data buffer (“structured buffer” in DX11 lingo). Render Textures can also be written to from compute shaders if they have the “random access” flag set (“unordered access view” in DX11); see RenderTexture.enableRandomWrite.
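
To make the relationship between a dispatch and these resources more concrete, here is a rough shader-side sketch. The kernel name, resource names and the 8 x 8 group size are assumptions made for illustration. The comments describe what a script would be expected to do with ComputeShader.SetBuffer, ComputeShader.SetTexture and ComputeShader.Dispatch.

```hlsl
// dispatch_sketch.compute (hypothetical file name)

#pragma kernel Fill

// Backed by a ComputeBuffer bound from script with ComputeShader.SetBuffer.
// Assumed to hold at least width * height floats.
RWStructuredBuffer<float> values;

// Backed by a RenderTexture with enableRandomWrite set,
// bound from script with ComputeShader.SetTexture.
RWTexture2D<float4> target;

// Each thread group runs 8 x 8 x 1 threads, so a script covering the whole
// texture would dispatch ceil(width / 8) by ceil(height / 8) by 1 groups.
[numthreads(8,8,1)]
void Fill (uint3 id : SV_DispatchThreadID)
{
    uint width, height;
    target.GetDimensions(width, height);

    // Skip the surplus threads when the size is not a multiple of 8.
    if (id.x >= width || id.y >= height)
        return;

    float v = (float)id.x / width;
    values[id.y * width + id.x] = v;
    target[id.xy] = float4(v, 0, 0, 1);
}
```

The bounds check matters because the group counts are rounded up, which can launch threads beyond the texture edge.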

Texture samplers in compute shaders

Textures and samplers aren’t separate objects in Unity, so to use them in a compute shader you have to follow some Unity-specific rules:

  • Either use the same name as the texture, with “sampler” in front (e.g. Texture2D MyTex; SamplerState samplerMyTex). In this case, the sampler is initialized to that texture’s filter/wrap/aniso settings.
  • Or use one of the “predefined” samplers; the name has to contain “Linear” or “Point” (for filter mode) and “Clamp” or “Repeat” (for wrap mode). For example, "SamplerState MyLinearClampSampler" has a linear filter and clamp wrap mode.
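
Both naming rules can be sketched in one kernel, as below. MyTex, samplerMyTex and MyLinearClampSampler come from the rules above; the kernel name, the Result texture and the blend are hypothetical. SampleLevel is used because compute shaders have no screen-space derivatives from which to pick a mip level automatically.

```hlsl
// samplers.compute (hypothetical file name)

#pragma kernel BlendSamples

Texture2D<float4> MyTex;

// Rule 1: "sampler" + texture name, inherits MyTex's filter/wrap/aniso settings.
SamplerState samplerMyTex;

// Rule 2: predefined sampler, linear filter and clamp wrap mode.
SamplerState MyLinearClampSampler;

// Hypothetical output texture.
RWTexture2D<float4> Result;

[numthreads(8,8,1)]
void BlendSamples (uint3 id : SV_DispatchThreadID)
{
    uint width, height;
    Result.GetDimensions(width, height);

    // Texel centre in normalized [0,1] coordinates.
    float2 uv = (id.xy + 0.5) / float2(width, height);

    // Explicit mip level 0 for both samplers.
    float4 a = MyTex.SampleLevel(samplerMyTex, uv, 0);
    float4 b = MyTex.SampleLevel(MyLinearClampSampler, uv, 0);

    Result[id.xy] = 0.5 * (a + b);
}
```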

Cross-platform support

As with regular shaders, Unity can translate compute shaders from HLSL to GLSL. For the easiest cross-platform builds, it is therefore recommended to write compute shaders in HLSL.

OpenGL compute differences from D3D

To get shaders working across multiple platforms, consider these limitations:

  • D3D and OpenGL have different data layout rules. Automatically translated GLSL shaders use the std430 layout on compute buffers. For example, float3-based structured buffers cause compatibility issues, because DX allows tight packing while OpenGL pads float3 to float4. Scalars, two-component vectors and four-component vectors are safe to use as they are. Take extra care when constructing structs.
  • OpenGL ES 3.1 guarantees support for only 4 simultaneous shader storage buffers. Actual implementations typically support a few more, but in general consider grouping related data in structs rather than putting each data item in its own buffer.
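
One way to act on both points is sketched below: related per-element data is grouped into a single struct so that it occupies one buffer, and the struct uses only four-component vectors so that the tightly packed D3D layout and the std430 layout should agree. The Particle struct, its field names and the Integrate kernel are hypothetical.

```hlsl
// layout_sketch.compute (hypothetical file name)

#pragma kernel Integrate

// Only float4 members: 16 bytes each, 32 bytes per element, so no
// layout-dependent padding is needed. Spare w components carry extra scalars.
struct Particle
{
    float4 positionAndRadius;   // xyz = position, w = radius
    float4 velocityAndMass;     // xyz = velocity, w = mass
};

// One buffer for all per-particle data instead of one buffer per field,
// which keeps the shader storage buffer count low on OpenGL ES 3.1.
RWStructuredBuffer<Particle> particles;

// Hypothetical time step, set from script.
float deltaTime;

[numthreads(64,1,1)]
void Integrate (uint3 id : SV_DispatchThreadID)
{
    Particle p = particles[id.x];
    p.positionAndRadius.xyz += p.velocityAndMass.xyz * deltaTime;
    particles[id.x] = p;
}
```

Under these assumptions, the ComputeBuffer created in script would use a stride of 32 bytes to match the struct.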

HLSL-only or GLSL-only compute shaders

Typically, compute shader files are written in HLSL and compiled or translated for all needed platforms automatically. However, it is possible either to prevent translation to GLSL (i.e. keep only HLSL platforms) or to write GLSL compute code manually.

  • Compute shader source surrounded by CGPROGRAM and ENDCG keywords will not be processed for OpenGL/GLSL platforms.
  • Compute shader source surrounded by GLSLPROGRAM and ENDGLSL keywords will be treated as GLSL source, and emitted verbatim. This will only work when targeting OpenGL/GLSL platforms.

Note that for cross-platform builds neither of the above is recommended, since either approach excludes the compute shader source from some platforms.
