Compute shaders are programs that run on the graphics card, outside of the normal rendering pipeline. They can be used for massively parallel GPGPU algorithms, or to accelerate parts of game rendering. To use them efficiently, you often need in-depth knowledge of GPU architectures and parallel algorithms, as well as of DirectCompute, OpenGL Compute, CUDA, or OpenCL.
Compute shaders in Unity closely match DirectX 11 DirectCompute technology. Platforms where compute shaders work:
Windows and Windows Store, with a DirectX 11 or DirectX 12 graphics API and Shader Model 5.0 GPU
macOS and iOS using Metal graphics API
Android, Linux and Windows platforms with Vulkan API
Modern OpenGL platforms (OpenGL 4.3 on Linux or Windows; OpenGL ES 3.1 on Android). Note that Mac OS X does not support OpenGL 4.3
Modern consoles (Sony PS4 and Microsoft Xbox One)
Compute shader support can be queried at runtime using SystemInfo.supportsComputeShaders.
Similar to regular shaders, compute shaders are Asset files in your project, with a .compute file extension. They are written in DirectX 11 style HLSL, with a minimal number of #pragma compilation directives to indicate which functions to compile as compute shader kernels.
Here’s a basic example of a compute shader file, which fills the output texture with red:
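// test.compute
#pragma kernel FillWithRed

// Output texture; it must be created with random write access enabled.
RWTexture2D<float4> res;

// One thread per group; each dispatched thread writes one texel.
[numthreads(1,1,1)]
void FillWithRed (uint3 dtid : SV_DispatchThreadID)
{
    res[dtid.xy] = float4(1,0,0,1);
}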
The language is standard DX11 HLSL, with an additional #pragma kernel FillWithRed directive. A compute shader Asset file must contain at least one compute kernel that can be invoked, and that function is indicated by the #pragma directive. There can be more kernels in the file; just add multiple #pragma kernel lines.
When using multiple #pragma kernel lines, note that comments of the style // text are not permitted on the same line as the #pragma kernel directives, and cause compilation errors if used.
Optionally, the #pragma kernel line can be followed by a number of preprocessor macros to define while compiling that kernel, for example:
#pragma kernel KernelOne SOME_DEFINE DEFINE_WITH_VALUE=1337
#pragma kernel KernelTwo OTHER_DEFINE
// ...
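As a rough sketch of how such per-kernel macros might be consumed, the file below shares one helper between two kernels and lets the preprocessor pick its behavior. The names Result, Count, and KernelValue are hypothetical and not part of the original example; Count is assumed to be set from script. The macro is referenced only inside a guarded branch so that the file still compiles when a kernel does not define it.

```hlsl
#pragma kernel KernelOne SOME_DEFINE DEFINE_WITH_VALUE=1337
#pragma kernel KernelTwo OTHER_DEFINE

// Hypothetical output buffer and element count (assumed to be set from script).
RWStructuredBuffer<int> Result;
uint Count;

// Shared helper whose body depends on the macros of the kernel being compiled.
int KernelValue()
{
#if defined(SOME_DEFINE)
    // Only reached while compiling KernelOne, where the macro expands to 1337.
    return DEFINE_WITH_VALUE;
#elif defined(OTHER_DEFINE)
    // Only reached while compiling KernelTwo.
    return -1;
#else
    return 0;
#endif
}

[numthreads(64,1,1)]
void KernelOne (uint3 id : SV_DispatchThreadID)
{
    // Skip threads that fall past the end of the buffer.
    if (id.x >= Count)
        return;
    Result[id.x] = KernelValue();
}

[numthreads(64,1,1)]
void KernelTwo (uint3 id : SV_DispatchThreadID)
{
    if (id.x >= Count)
        return;
    Result[id.x] = KernelValue();
}
```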
Invoking compute shaders
In your script, define a variable of ComputeShader type and assign a reference to the Asset. This allows you to invoke its kernels with the ComputeShader.Dispatch function. See Unity documentation on ComputeShader class for more details.
Closely related to compute shaders is the ComputeBuffer class, which defines an arbitrary data buffer (“structured buffer” in DX11 lingo). Render Textures can also be written into from compute shaders, if they have the “random access” flag set (“unordered access view” in DX11). See RenderTexture.enableRandomWrite to learn more about this.
Texture samplers in compute shaders
Textures and samplers aren’t separate objects in Unity, so to use them in compute shaders you must follow one of the following Unity-specific rules:
Use the same name as the Texture name, with sampler at the beginning (for example, Texture2D MyTex; SamplerState samplerMyTex). In this case, the sampler is initialized to that Texture’s filter/wrap/aniso settings.
Use a predefined sampler. For this, the name has to have Linear or Point (for filter mode) and Clamp or Repeat (for wrap mode). For example, SamplerState MyLinearClampSampler creates a sampler that has Linear filter mode and Clamp wrap mode.
For more information, see documentation on Sampler States.
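The two naming rules can be illustrated with a minimal sketch that samples the same Texture through both kinds of sampler. The kernel name SampleBoth and the variables Result and ResultSize are hypothetical; ResultSize is assumed to be set from script to the output size in pixels. The sketch uses SampleLevel with an explicit mip level, because compute kernels have no screen-space derivatives from which a mip level could be selected automatically.

```hlsl
#pragma kernel SampleBoth

Texture2D<float4> MyTex;

// Rule 1: "sampler" + Texture name picks up that Texture's own settings.
SamplerState samplerMyTex;

// Rule 2: predefined sampler, Linear filter mode and Clamp wrap mode.
SamplerState MyLinearClampSampler;

// Hypothetical output and its size in pixels (assumed to be set from script).
RWTexture2D<float4> Result;
uint2 ResultSize;

[numthreads(8,8,1)]
void SampleBoth (uint3 id : SV_DispatchThreadID)
{
    // Skip threads outside the output texture.
    if (any(id.xy >= ResultSize))
        return;

    // Normalized coordinates at the center of the output texel.
    float2 uv = (float2(id.xy) + 0.5) / float2(ResultSize);

    // Explicit mip level 0 for both lookups.
    float4 a = MyTex.SampleLevel(samplerMyTex, uv, 0);
    float4 b = MyTex.SampleLevel(MyLinearClampSampler, uv, 0);

    // Average the two lookups, purely for illustration.
    Result[id.xy] = 0.5 * (a + b);
}
```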
As with regular shaders, Unity is capable of translating compute shaders from HLSL to other shader languages. Therefore, for the easiest cross-platform builds, you should write compute shaders in HLSL. However, there are some factors that need to be considered when doing this.
DirectX 11 (DX11) supports many actions that are not supported on other platforms (such as Metal or OpenGL ES). Therefore, you should always ensure your shader has well-defined behavior on platforms that offer less support, rather than only on DX11. Here are a few things to consider:
Out-of-bounds memory accesses are bad. DX11 might consistently return zero when reading, and discard out-of-bounds writes without issues, but platforms that offer less support might crash the GPU when doing this. Watch out for DX11-specific hacks, buffer sizes that are not a multiple of your thread group size, trying to read neighboring data elements from the beginning or end of the buffer, and similar incompatibilities.
Initialize your resources. The contents of new buffers and Textures are undefined. Some platforms might provide all zeroes, but on others, there could be anything including NaNs.
Bind all the resources your compute shader declares. Even if you know for sure that the shader does not use resources in its current state because of branching, you must still ensure a resource is bound to it.
Metal (for iOS and tvOS platforms) does not support atomic operations on Textures. Metal also does not support GetDimensions queries on buffers. Pass the buffer size info as a constant to the shader if needed.
OpenGL ES 3.1 (for Android, iOS, tvOS platforms) only guarantees support for 4 compute buffers at a time. Actual implementations typically support more, but in general if developing for OpenGL ES, you should consider grouping related data in structs rather than having each data item in its own buffer.
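Several of the points above can be combined in one small sketch: related data is grouped into a single struct so that only one buffer is bound, the element count is passed in as a constant rather than queried with GetDimensions, and each thread checks that count before touching memory. The Particle struct, the Integrate kernel, and the ParticleCount and DeltaTime constants are hypothetical names chosen for illustration, and are assumed to be set from script.

```hlsl
#pragma kernel Integrate

// Related per-element data grouped in one struct, so a single buffer is bound
// instead of one buffer per field.
struct Particle
{
    float3 position;
    float3 velocity;
};

RWStructuredBuffer<Particle> Particles;

// Element count passed from script instead of calling GetDimensions.
uint ParticleCount;
float DeltaTime;

[numthreads(64,1,1)]
void Integrate (uint3 id : SV_DispatchThreadID)
{
    // The dispatch is rounded up to whole groups of 64 threads, so the last
    // group may contain threads past the end of the buffer.
    if (id.x >= ParticleCount)
        return;

    Particle p = Particles[id.x];
    p.position += p.velocity * DeltaTime;
    Particles[id.x] = p;
}
```

With this layout, the script side would dispatch the element count divided by 64, rounded up, as the number of thread groups, and the guard makes the surplus threads in the last group return without reading or writing.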
Usually, compute shader files are written in HLSL, and compiled or translated for all necessary platforms automatically. However, it is possible to either prevent translation to other languages (that is, only keep HLSL platforms), or to write GLSL compute code manually.
The following information only applies to HLSL-only or GLSL-only compute shaders, not cross-platform builds, because following it can result in compute shader source being excluded from some platforms.
Compute shader source surrounded by CGPROGRAM and ENDCG keywords is not processed for non-HLSL platforms.
Compute shader source surrounded by GLSLPROGRAM and ENDGLSL keywords is treated as GLSL source, and emitted verbatim. This only works when targeting OpenGL or GLSL platforms. You should also note that while automatically translated shaders follow HLSL data layout on buffers, manually written GLSL shaders follow GLSL layout rules.
You can use the #pragma multi_compile and #pragma multi_compile_local directives to compile multiple variants of compute shaders, the same way you can for regular shaders. These directives affect all kernels in a given file.
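As a minimal sketch of a variant directive in a compute shader, the file below compiles its kernel twice, once with no keyword and once with a keyword that inverts the output color. The keyword INVERT_COLOR, the kernel Tint, and the variables Result and ResultSize are hypothetical names used only for illustration; ResultSize is assumed to be set from script.

```hlsl
#pragma kernel Tint
#pragma multi_compile __ INVERT_COLOR

// Hypothetical output and its size in pixels (assumed to be set from script).
RWTexture2D<float4> Result;
uint2 ResultSize;

[numthreads(8,8,1)]
void Tint (uint3 id : SV_DispatchThreadID)
{
    // Skip threads outside the output texture.
    if (any(id.xy >= ResultSize))
        return;

    float4 color = float4(1, 0, 0, 1);

#if defined(INVERT_COLOR)
    // Compiled only into the variant where the keyword is enabled.
    color.rgb = 1.0 - color.rgb;
#endif

    Result[id.xy] = color;
}
```

Which variant runs is selected at dispatch time by the keyword state, so the branch costs nothing inside the kernel itself.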
Note that regular and compute shaders share global keywords. Enabling or disabling a global keyword affects all regular shaders and all compute shaders. Global keywords set in compute shaders count towards the global keyword limit.
To enable or disable global keywords on all regular shaders and compute shaders, use the Shader.EnableKeyword, Shader.DisableKeyword, CommandBuffer.EnableKeyword, and CommandBuffer.DisableKeyword APIs.
To enable or disable a local keyword on a compute shader, use the ComputeShader.EnableKeyword and ComputeShader.DisableKeyword APIs.
For more information, see Making multiple shader program variants and the ComputeShader API documentation.